imputeTimeSeriesLab
Imputes missing values in time-series lab data.
imputeTimeSeriesLab(
  labData,
  idColName,
  labItemColName,
  windowColName,
  valueColName,
  impMethod,
  imputeOverallMean = FALSE
)
labData
a file or data frame of laboratory test data with at least four columns: patient ID, lab item, test value, and test date.
idColName
the column name that records patient ID in labData.
labItemColName
the column name that records lab item in labData. If the lab code is spread across multiple columns, join the column names with the + operator, e.g., A + B.
windowColName
the column name that records time window sequence in labData.
valueColName
the column name that records lab test value in labData. If more than one value column is to be imputed, join the column names with the & operator, e.g., A & B; all listed columns are then imputed simultaneously.
impMethod
desired imputation method: mean, interpolation, or nocb.
imputeOverallMean
If TRUE and an individual has no result for a test before the data point, the mean of that test across all individuals in the dataset is used for imputation. Default is FALSE.
A data.table with imputed data.
Three imputation methods are provided: mean, interpolation, and nocb. With mean, a missing value is imputed with the mean of all non-null values across all windows of the specific lab item for the given patient. With interpolation, missing values are imputed by linear interpolation, and null values outside the range of known values are imputed with the mean of the known values. With nocb, the method is "next observation carried backward": a missing value takes the value of the next observed window.
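The three methods can be illustrated on a single patient and lab item with plain vectors. This is only a minimal base R sketch of the ideas described above, not the package's implementation: the helper names impute_mean, impute_interp and impute_nocb are hypothetical, and the handling of edge cases (fewer than two known values, a trailing gap under nocb) is an assumption.

```r
# Toy series: one value per window for one patient and one lab item.
# NA marks a window without a test result.
x <- c(NA, 10, NA, NA, 16, NA)

# "mean": fill every gap with the mean of the known values.
impute_mean <- function(v) {
  v[is.na(v)] <- mean(v, na.rm = TRUE)
  v
}

# "interpolation": linear interpolation between known windows;
# windows outside the known range fall back to the mean of known values.
impute_interp <- function(v) {
  known <- which(!is.na(v))
  # Assumption: with fewer than two known values, fall back to the mean.
  if (length(known) < 2) return(impute_mean(v))
  out <- approx(x = known, y = v[known], xout = seq_along(v))$y
  out[is.na(out)] <- mean(v, na.rm = TRUE)
  out
}

# "nocb": next observation carried backward.
# Assumption: a trailing gap has no next observation and stays NA here.
impute_nocb <- function(v) {
  for (i in rev(seq_along(v))[-1]) {
    if (is.na(v[i])) v[i] <- v[i + 1]
  }
  v
}

impute_mean(x)    # 13 10 13 13 16 13
impute_interp(x)  # 13 10 12 14 16 13
impute_nocb(x)    # 10 10 16 16 16 NA
```

In the toy series, windows 3 and 4 lie between two known values, so interpolation fills them along the line from 10 to 16, while windows 1 and 6 are out of range and receive the mean (13).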
timeSeriesData <- getTimeSeriesLab(labData = labSample,
                                   idColName = SUBJECT_ID,
                                   labItemColName = ITEMID,
                                   dateColName = CHARTTIME,
                                   valueColName = VALUENUM,
                                   indexDate = first,
                                   gapDate = 360,
                                   completeWindows = TRUE)
imputeTSData <- imputeTimeSeriesLab(labData = timeSeriesData,
                                    idColName = ID,
                                    labItemColName = ITEMID,
                                    windowColName = Window,
                                    valueColName = Max & Min & Mean & Nearest,
                                    impMethod = mean,
                                    imputeOverallMean = FALSE)
head(imputeTSData)
#>    ID ITEMID Window  Max Min   Mean Nearest imputed
#> 1: 36  50811      1 12.7  10 11.675    12.3   FALSE
#> 2: 36  50811      2 12.7  10 11.675    12.3    TRUE
#> 3: 36  50811      3 12.7  10 11.675    12.3    TRUE
#> 4: 36  50811      4 12.7  10 11.675    12.3    TRUE
#> 5: 36  50861      1 12.0   8 10.000     8.0   FALSE
#> 6: 36  50861      2 12.0  10 11.000    10.0    TRUE
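The imputed column in the output above appears to flag rows whose values were filled in rather than observed; that reading is an assumption, since the column is not described in the arguments. Under that assumption, a rough base R sketch of how one might check the share of imputed windows per lab item follows. The data frame is rebuilt by hand from the six rows printed above so the snippet runs without the package.

```r
# Rebuild the six printed rows (a subset of columns) as a plain data frame.
res <- data.frame(
  ID      = 36,
  ITEMID  = c(50811, 50811, 50811, 50811, 50861, 50861),
  Window  = c(1, 2, 3, 4, 1, 2),
  Mean    = c(11.675, 11.675, 11.675, 11.675, 10, 11),
  imputed = c(FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)
)

# Share of imputed windows per lab item (the mean of a logical is a proportion).
imputed_share <- aggregate(imputed ~ ITEMID, data = res, FUN = mean)
imputed_share

# Keep only the rows that were observed rather than imputed.
observed_only <- res[!res$imputed, ]
```

A high share of imputed windows for an item suggests that downstream summaries of that item rest mostly on filled-in values.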