Data integration for condition era calculation

A condition era integrates scattered clinical diagnosis records into a single progression record.

getConditionEra(
  dxDataFile,
  idColName,
  icdColName,
  dateColName,
  icdVerColName = NULL,
  icd10usingDate = NULL,
  groupDataType = ccs,
  customGroupingTable,
  isDescription = TRUE,
  gapDate = 30,
  selectedCaseFile = NULL
)

Arguments

dxDataFile: A data frame of clinical diagnostic data with at least 3 columns: ID, ICD, and Date. Dates should be formatted as YYYY/MM/DD or YYYY-MM-DD.
idColName: Column name of the ID column in dxDataFile. Supply the name without quotation marks.
icdColName: Column name of the ICD column in dxDataFile. Supply the name without quotation marks.
dateColName: Column name of the date column in dxDataFile. The column should be an R date type or a string in YYYY/MM/DD or YYYY-MM-DD format. Supply the name without quotation marks.
icdVerColName: (Optional) Column name of the column in dxDataFile that records the ICD version (ICD-9 or ICD-10), if there is one. Values in this column should be 9L or 10L, indicating which ICD version each row uses. See the examples below for more information.
icd10usingDate: The date on which ICD-10 came into use in the dxDataFile dataset, formatted as YYYY/MM/DD or YYYY-MM-DD. Required if icdVerColName is NULL.
groupDataType: Grouping method. Choose one of: CCS (ccs), CCSR (ccsr), multiple-level CCS (ccslvl1, ccslvl2, ccslvl3, ccslvl4), PheWAS (PheWAS), comorbidities (ahrq, charlson, elix), or a precise or fuzzy customized method (customIcdGroup, customGrepIcdGroup). Supply the value as stated above, without quotation marks. The default is ccs. To select cases by ungrouped ICD codes, use the ICD method (ICD).
customGroupingTable: User-defined grouping categories. icdDxToCustom needs a dataset with two columns named "Group" and "ICD": "Group" defines one or more disease categories, and "ICD" lists the ICD codes that belong to each category. icdDxToCustomGrep needs a dataset with two columns named "Group" and "grepIcd": "Group" defines one or more disease categories, and "grepIcd" defines the disease-related ICD codes as character strings containing regular expressions.
isDescription: Logical. If TRUE, the category description of the classification method is used in the group column; if FALSE, the category name is used. The default is TRUE (standard category description).
gapDate: Length of the condition gap in days, as a number. The default is 30.
selectedCaseFile: A data frame that labels each case as selected or not. It can be generated by the selectCases function, or it can be a self-defined data frame with the same column names and data types as the output of selectCases (not recommended). The default is NULL.
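
The input shapes described in the arguments above can be sketched in base R as follows. This is only an illustration: the patient IDs, dates, and the object names dxData and icdTable are made up, and the commented-out call at the end is an untested sketch that assumes the package providing getConditionEra is loaded.

```r
# Hypothetical diagnosis records (IDs and dates are made up for illustration).
dxData <- data.frame(
  ID      = c("P1", "P1", "P2"),
  ICD     = c("5856", "N186", "N186"),
  Date    = as.Date(c("2014-03-01", "2016-07-15", "2017-01-20")),
  Version = c(9L, 10L, 10L),  # ICD version per row; a candidate for icdVerColName
  stringsAsFactors = FALSE
)

# Precise custom grouping table: columns named "Group" and "ICD".
icdTable <- data.frame(
  Group = "Chronic kidney disease",
  ICD   = c("5856", "N186"),
  stringsAsFactors = FALSE
)

# Fuzzy custom grouping table: columns named "Group" and "grepIcd",
# where grepIcd holds regular expressions (same pattern as in the Examples).
grepTable <- data.frame(
  Group   = "Chronic kidney disease",
  grepIcd = "^58|^N18",
  stringsAsFactors = FALSE
)

# Check which hypothetical records the regular expression would match.
matched <- grepl(grepTable$grepIcd, dxData$ICD)

# Untested sketch of a call using the version column instead of icd10usingDate:
# getConditionEra(dxData, ID, ICD, Date,
#                 icdVerColName = Version,
#                 groupDataType = customIcdGroup,
#                 customGroupingTable = icdTable)
```

Column names are passed unquoted in the sketched call, as the argument descriptions require.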

Value

A new data.table in which the records of dxDataFile are classified by groupDataType and condition eras are calculated for each patient.

Details

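This function calculates condition eras for each patient within each grouped category. Diagnoses of the same category that fall within the gap set by gapDate are merged into a single era; a longer gap starts a new era.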
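
The gap logic can be illustrated with a minimal base R sketch for one patient and one category. This is not the package's implementation: assignEra is a hypothetical helper, the dates are made up, and the rule that a gap strictly greater than gapDate days starts a new era is an assumption.

```r
# Hypothetical helper (not part of the package): number the eras for the
# diagnosis dates of one patient within one category.
assignEra <- function(dates, gapDate = 30) {
  dates <- sort(as.Date(dates))
  gaps <- c(0, as.numeric(diff(dates)))  # days since the previous record
  # Assumption: a gap strictly longer than gapDate starts a new era.
  cumsum(gaps > gapDate) + 1
}

# Made-up, already sorted diagnosis dates for one patient.
dates <- as.Date(c("2020-01-01", "2020-01-20", "2020-03-15", "2020-03-30"))
era <- assignEra(dates)

# Summarise each era using the column names from the function's output.
firstCaseDate <- dates[!duplicated(era)]                   # first record of each era
endCaseDate   <- dates[!duplicated(era, fromLast = TRUE)]  # last record of each era
eraSummary <- data.frame(
  era           = unique(era),
  firstCaseDate = firstCaseDate,
  endCaseDate   = endCaseDate,
  count         = as.vector(table(era)),
  period        = endCaseDate - firstCaseDate
)
print(eraSummary)
```

With the default gap of 30 days, the first two dates fall in one era and the last two in a second era, because the only gap longer than 30 days lies between 2020-01-20 and 2020-03-15.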

Examples

# sample file for example

head(sampleDxFile)
#>     ID  ICD       Date Version
#> 1:  A2 Z992 2020-05-22      10
#> 2:  A5 Z992 2020-01-24      10
#> 3:  A8 Z992 2015-10-27      10
#> 4: A13 Z992 2020-04-26      10
#> 5: A13 Z992 2025-02-02      10
#> 6: A15 Z992 2023-05-12      10

# Select cases with "Diseases of the urinary system" by level 2 of the CCS classification

selectedCaseFile <- selectCases(sampleDxFile, ID, ICD, Date,
                                icdVerColName = NULL,
                                icd10usingDate = "2015/10/01",
                                groupDataType = ccslvl2,
                                caseCondition = "Diseases of the urinary system",
                                caseCount = 1)
#> Wrong ICD format: total 9 ICD codes (the number of occurrences is in brackets)
#> c("A0.11 (20)", "E114 (8)", "Z9.90 (6)", "F42 (6)", "001 (5)", "75.52 (4)", "755.2 (3)", "123.45 (3)", "7552 (2)")
#> 	
#> Wrong ICD version: total 7 ICD codes (the number of occurrences is in brackets)
#> c("V27.0 (18)", "A01.05 (8)", "42761 (7)", "V24.1 (6)", "A0105 (5)", "E03.0 (4)", "650 (4)")
#> 	
#> Warning: The ICD mentioned above matches to "NA" due to the format or other issues.
#> Warning: "Wrong ICD format" means the ICD has wrong format
#> Warning: "Wrong ICD version" means the ICD classify to wrong ICD version (cause the "icd10usingDate" or other issues)

# Condition era calculation with case selection

Era1 <- getConditionEra(sampleDxFile, ID, ICD, Date,
                        icdVerColName = NULL,
                        icd10usingDate = "2015-10-01",
                        groupDataType = ccslvl3,
                        selectedCaseFile = selectedCaseFile)
#> Wrong ICD format: total 9 ICD codes (the number of occurrences is in brackets)
#> c("A0.11 (20)", "E114 (8)", "Z9.90 (6)", "F42 (6)", "001 (5)", "75.52 (4)", "755.2 (3)", "123.45 (3)", "7552 (2)")
#> 	
#> Wrong ICD version: total 7 ICD codes (the number of occurrences is in brackets)
#> c("V27.0 (18)", "A01.05 (8)", "42761 (7)", "V24.1 (6)", "A0105 (5)", "E03.0 (4)", "650 (4)")
#> 	
#> Warning: The ICD mentioned above matches to "NA" due to the format or other issues.
#> Warning: "Wrong ICD format" means the ICD has wrong format
#> Warning: "Wrong ICD version" means the ICD classify to wrong ICD version (cause the "icd10usingDate" or other issues)
head(Era1)
#>    ID        CCS_LVL_3_LABEL era firstCaseDate endCaseDate count period
#> 1: A0 Chronic kidney disease   1    2009-07-25  2009-07-25     1 0 days
#> 2: A0 Chronic kidney disease   2    2010-03-28  2010-03-28     1 0 days
#> 3: A0 Chronic kidney disease   3    2010-10-29  2010-10-29     1 0 days
#> 4: A0 Chronic kidney disease   4    2012-04-05  2012-04-05     1 0 days
#> 5: A0 Chronic kidney disease   5    2013-12-20  2013-12-20     1 0 days
#> 6: A1 Chronic kidney disease   1    2006-11-29  2006-11-29     1 0 days
#>    selectedCase
#> 1:     Selected
#> 2:     Selected
#> 3:     Selected
#> 4:     Selected
#> 5:     Selected
#> 6:     Selected

# Define the grouping categories

grepTable <- data.frame(Group = "Chronic kidney disease",
                        grepIcd = "^58|^N18",
                        stringsAsFactors = FALSE)

# Condition era calculation with a custom (regular expression) grouping method

Era2 <- getConditionEra(sampleDxFile, ID, ICD, Date,
                        icdVerColName = NULL,
                        icd10usingDate = "2015-10-01",
                        groupDataType = customGrepIcdGroup,
                        customGroupingTable = grepTable)
head(Era2)
#>    ID            GrepedGroup era firstCaseDate endCaseDate count period
#> 1: A0 Chronic kidney disease   1    2009-07-25  2009-07-25     1 0 days
#> 2: A0 Chronic kidney disease   2    2010-10-29  2010-10-29     1 0 days
#> 3: A0 Chronic kidney disease   3    2013-12-20  2013-12-20     1 0 days
#> 4: A1 Chronic kidney disease   1    2006-11-29  2006-11-29     1 0 days
#> 5: A1 Chronic kidney disease   2    2012-06-19  2012-06-19     1 0 days
#> 6: A1 Chronic kidney disease   3    2013-04-28  2013-04-28     1 0 days
