imputeKNN
Uses the k-nearest neighbors (kNN) method to fill in missing values.
imputeKNN(labData, idColName, k)
labData: a file or data frame of laboratory test data with at least four columns: patient ID, lab item, test value, and test date.
idColName: the name of the column in labData that records the patient ID.
k: the number of nearest neighbors to use.
A new, wide-formatted, imputed data.table.
Given wide-format data, this function performs kNN imputation. If the lab item code is composed of multiple columns, they are automatically combined into one. All other columns are treated as value columns.
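As a rough illustration of the general kNN idea only (not the package's implementation, whose distance measure and aggregation are not documented here), the following base R sketch fills each missing cell with the mean of that column over the k nearest rows. The helper name `knn_impute_sketch`, the root-mean-square distance over shared columns, and the `toy` matrix are all assumptions made for this example.

```r
# Hypothetical helper, not part of the package: impute NAs in a numeric
# matrix with the mean of the k nearest rows (an assumed, simplified scheme).
knn_impute_sketch <- function(x, k = 2) {
  x <- as.matrix(x)
  out <- x
  for (i in seq_len(nrow(x))) {
    miss <- which(is.na(x[i, ]))
    if (length(miss) == 0) next
    # Distance from row i to every other row, using only the columns
    # that both rows have observed
    d <- vapply(seq_len(nrow(x)), function(j) {
      if (j == i) return(Inf)
      shared <- !is.na(x[i, ]) & !is.na(x[j, ])
      if (!any(shared)) return(Inf)
      sqrt(mean((x[i, shared] - x[j, shared])^2))
    }, numeric(1))
    for (col in miss) {
      # Donor rows: comparable to row i and observed in this column
      cand <- which(is.finite(d) & !is.na(x[, col]))
      if (length(cand) == 0) next
      nearest <- cand[order(d[cand])][seq_len(min(k, length(cand)))]
      out[i, col] <- mean(x[nearest, col])
    }
  }
  out
}

# Toy wide-format values: one row per record, one column per lab item
toy <- rbind(c(1, 10, 100),
             c(2, 20, NA),
             c(3, 30, 300),
             c(10, 100, 1000))
imputed <- knn_impute_sketch(toy, k = 2)
```

In this toy input, rows 1 and 3 are the two closest neighbors of row 2, so the missing cell is filled with the mean of their third-column values. A real implementation may scale the columns first or weight neighbors by distance.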
timeSeriesData <- getTimeSeriesLab(labData = labSample,
                                   idColName = SUBJECT_ID,
                                   labItemColName = ITEMID,
                                   dateColName = CHARTTIME,
                                   valueColName = VALUENUM,
                                   indexDate = first,
                                   gapDate = 360,
                                   completeWindows = TRUE)
wideData <- wideTimeSeriesLab(labData = timeSeriesData,
                              idColName = ID,
                              labItemColName = ITEMID,
                              windowColName = Window,
                              valueColName = Max)
imputeKNN(labData = wideData,
          idColName = ID + Window,
          k = 2)
#>      ID Window    50811  50861    50904   50906  50912  50931  51222
#>  1:  36      1 12.70000   12.0 41.50000 114.000  2.000 161.00 14.600
#>  2:  36      2 12.70000  619.0 46.25000 117.000  2.050 296.50 14.250
#>  3:  36      3 12.70000  315.5 43.87500 115.500  2.025 228.75 14.425
#>  4:  36      4 12.85000   12.0 36.75000 111.000  1.000 126.00 14.400
#>  5: 109      1 12.77500   51.0 39.12500 112.500 13.600 151.00 15.700
#>  6: 109      2 12.73750  638.5 45.06250 116.250  7.850 291.50 14.800
#>  7: 109      3 12.73750   38.0 40.31250 113.250  6.600 150.00 14.400
#>  8: 109      4  7.20000   80.0 39.71875 112.875  8.800 256.00 11.300
#>  9: 109      5  6.90000   78.0 39.71875 112.875 10.500 154.00 13.100
#> 10: 132      1 12.75625   59.0 39.71875 112.875  0.600 138.00 15.100
#> 11: 143      1 12.70000 1226.0 51.00000 120.000  2.100 432.00 13.900
#> 12: 145      1 13.00000  252.0 32.00000 108.000 10.800 231.00 12.900