Researchers can define their own grouping categories, which gives them more flexibility when grouping ICD diagnostic codes.

icdDxToCustom(
  dxDataFile,
  idColName,
  icdColName,
  dateColName,
  customGroupingTable
)

icdDxToCustomGrep(
  dxDataFile,
  idColName,
  icdColName,
  dateColName,
  customGroupingTable
)

Arguments

dxDataFile

A data frame of clinical diagnostic data with at least three columns: ID, ICD, and Date. Values in the date column should be in YYYY/MM/DD or YYYY-MM-DD format.

idColName

Name of the ID column in dxDataFile. This argument should be given as a string without quotation marks.

icdColName

Name of the ICD column in dxDataFile. This argument should be given as a string without quotation marks.

dateColName

Name of the date column in dxDataFile. The column should be either a date class in R or a string holding a date in YYYY/MM/DD or YYYY-MM-DD format. This argument should be given as a string without quotation marks.

customGroupingTable

User-defined grouping categories. icdDxToCustom needs a dataset with two columns, "Group" and "ICD": the "Group" column defines one or more disease categories, and the "ICD" column lists the ICD codes that belong to each category. icdDxToCustomGrep needs a dataset with two columns, "Group" and "grepIcd": "Group" defines one or more disease categories, and "grepIcd" defines the regular expressions that match the ICD codes of each category.
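As an illustration of the input shapes described above, the following sketch builds a small diagnosis data frame and both kinds of grouping table in base R. The object names (`dxExample`, `exactTable`, `regexTable`), the patient records, and the second category ("Diabetes mellitus" with its codes) are assumptions made for illustration and are not part of the package.

```r
# Hypothetical diagnosis records; IDs, codes, and dates are illustrative only.
dxExample <- data.frame(
  ID   = c("P1", "P1", "P2"),
  ICD  = c("N181", "5853", "E119"),
  Date = c("2015/03/09", "2016-08-08", "2014-08-20"),
  stringsAsFactors = FALSE
)

# Normalise YYYY/MM/DD to YYYY-MM-DD, then convert the strings to Date.
dxExample$Date <- as.Date(gsub("/", "-", dxExample$Date, fixed = TRUE))

# Table for exact matching: one row per (category, ICD code) pair.
exactTable <- data.frame(
  Group = c(rep("Chronic kidney disease", 2), rep("Diabetes mellitus", 2)),
  ICD   = c("N181", "5853", "E119", "25000"),
  stringsAsFactors = FALSE
)

# Table for pattern matching: one regular expression per category.
regexTable <- data.frame(
  Group   = c("Chronic kidney disease", "Diabetes mellitus"),
  grepIcd = c("^585|^N18", "^250|^E11"),
  stringsAsFactors = FALSE
)
```

Tables shaped like `exactTable` and `regexTable` correspond to the customGroupingTable argument of icdDxToCustom and icdDxToCustomGrep, respectively.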

Value

Two new data.tables. 1) groupedDT: based on dxDataFile, with two new columns for uniform format diagnostic codes and classified standard categories. 2) summarised_groupedDT: a summary of groupedDT, sorted by memberID.
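To make the summary columns concrete, here is a minimal base R sketch of how a per-ID summary with the columns firstCaseDate, endCaseDate, count, and period could be derived from grouped records. It is not the package's implementation; the `grouped` data and the name `summaryDT` are hypothetical.

```r
# Hypothetical grouped records (illustrative values, not package output).
grouped <- data.frame(
  ID    = c("P1", "P1", "P2", "P3"),
  Date  = as.Date(c("2010-01-01", "2010-01-31", "2012-06-15", "2013-02-02")),
  Group = c("Chronic kidney disease", "Chronic kidney disease",
            "Chronic kidney disease", NA),
  stringsAsFactors = FALSE
)

# Keep only the records that were assigned to a category.
matched <- grouped[!is.na(grouped$Group), ]

# Build one summary row per ID and Group combination.
pieces <- split(matched, list(matched$ID, matched$Group), drop = TRUE)
summaryDT <- do.call(rbind, lapply(pieces, function(d) {
  data.frame(
    ID            = d$ID[1],
    Group         = d$Group[1],
    firstCaseDate = min(d$Date),
    endCaseDate   = max(d$Date),
    count         = nrow(d),
    stringsAsFactors = FALSE
  )
}))

# Span between first and last record, then sort by ID.
summaryDT$period <- summaryDT$endCaseDate - summaryDT$firstCaseDate
summaryDT <- summaryDT[order(summaryDT$ID), ]
rownames(summaryDT) <- NULL
```

In this sketch, records without a category (here `P3`) are left out of the summary, and an ID with a single record gets a period of 0 days.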

Details

Two functions implement custom grouping. `icdDxToCustom` uses a precise grouping method, matching codes against a listed set of ICD codes; `icdDxToCustomGrep` uses a fuzzy grouping method, matching codes against regular expressions.
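The difference between the two matching styles can be sketched in base R with `%in%` and `grepl()`. This only illustrates the idea and is not how the package is necessarily implemented; the vector names and the selection of codes are assumptions.

```r
# Hypothetical ICD codes to classify (illustrative only).
codes <- c("N181", "N183", "5853", "5859", "Z992")

# Precise grouping: a code is assigned only if it equals a listed code.
exactCodes <- c("N181", "5853")
exactGroup <- ifelse(codes %in% exactCodes,
                     "Chronic kidney disease", NA_character_)

# Fuzzy grouping: a code is assigned if the regular expression matches it.
pattern <- "^585|^N18"
regexGroup <- ifelse(grepl(pattern, codes),
                     "Chronic kidney disease", NA_character_)

# Side-by-side comparison of the two assignments.
comparison <- data.frame(codes, exactGroup, regexGroup,
                         stringsAsFactors = FALSE)
```

Here the exact list assigns only "N181" and "5853", while the pattern also picks up "N183" and "5859" because they share the same prefixes. "Z992" is left unassigned by both.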

See also

Other code classification functions: dxPheWAS, dxComorbid, DxCCS

Examples

# Sample file used in the examples

head(sampleDxFile)
#>     ID  ICD       Date Version
#> 1:  A2 Z992 2020-05-22      10
#> 2:  A5 Z992 2020-01-24      10
#> 3:  A8 Z992 2015-10-27      10
#> 4: A13 Z992 2020-04-26      10
#> 5: A13 Z992 2025-02-02      10
#> 6: A15 Z992 2023-05-12      10

# Group diagnostic codes into "Chronic kidney disease" with precise grouping method

groupingTable <- data.frame(Group = rep("Chronic kidney disease",6),
                            ICD = c("N181","5853","5854","5855","5856","5859"),
                            stringsAsFactors = FALSE)

icdDxToCustom(sampleDxFile, ID, ICD, Date, customGroupingTable = groupingTable)
#> $groupedDT
#>         ICD  ID       Date Group
#>   1:   Z992  A2 2020-05-22  <NA>
#>   2:   Z992  A5 2020-01-24  <NA>
#>   3:   Z992  A8 2015-10-27  <NA>
#>   4:   Z992 A13 2020-04-26  <NA>
#>   5:   Z992 A13 2025-02-02  <NA>
#>  ---                            
#> 296:    650  C2 2017-10-10  <NA>
#> 297:    650  D4 2022-04-04  <NA>
#> 298: 123.45  C3 2022-09-28  <NA>
#> 299: 123.45  D8 2018-06-17  <NA>
#> 300: 123.45  C0 2017-05-09  <NA>
#> 
#> $summarised_groupedDT
#>      ID                  Group firstCaseDate endCaseDate count    period
#>  1:  A0 Chronic kidney disease    2009-07-25  2013-12-20     3 1609 days
#>  2:  A1 Chronic kidney disease    2006-11-29  2014-09-24     4 2856 days
#>  3: A10 Chronic kidney disease    2007-11-04  2007-11-04     1    0 days
#>  4: A11 Chronic kidney disease    2008-03-09  2010-02-21     2  714 days
#>  5: A12 Chronic kidney disease    2006-05-14  2011-02-25     3 1748 days
#>  6: A13 Chronic kidney disease    2006-04-29  2010-02-21     2 1394 days
#>  7: A14 Chronic kidney disease    2011-11-07  2013-07-26     2  627 days
#>  8: A17 Chronic kidney disease    2007-02-19  2013-09-30     2 2415 days
#>  9: A18 Chronic kidney disease    2007-04-05  2012-02-10     2 1772 days
#> 10:  A2 Chronic kidney disease    2011-09-20  2015-01-06     3 1204 days
#> 11:  A3 Chronic kidney disease    2010-11-12  2014-02-24     2 1200 days
#> 12:  A4 Chronic kidney disease    2015-03-09  2015-03-09     1    0 days
#> 13:  A5 Chronic kidney disease    2010-08-24  2010-08-24     1    0 days
#> 14:  A6 Chronic kidney disease    2007-10-01  2007-10-01     1    0 days
#> 15:  A7 Chronic kidney disease    2007-02-01  2013-12-16     2 2510 days
#> 16:  A8 Chronic kidney disease    2014-08-20  2014-08-20     1    0 days
#> 

# Group diagnostic codes into "Chronic kidney disease" with fuzzy grouping method

grepTable <- data.frame(Group = "Chronic kidney disease",
                        grepIcd = "^585|^N18",
                        stringsAsFactors = FALSE)

icdDxToCustomGrep(sampleDxFile, ID, ICD, Date, customGroupingTable = grepTable)
#> $groupedDT
#>       ID    ICD       Date GrepedGroup
#>   1:  A2   Z992 2020-05-22        <NA>
#>   2:  A5   Z992 2020-01-24        <NA>
#>   3:  A8   Z992 2015-10-27        <NA>
#>   4: A13   Z992 2020-04-26        <NA>
#>   5: A13   Z992 2025-02-02        <NA>
#>  ---                                  
#> 296:  C2    650 2017-10-10        <NA>
#> 297:  D4    650 2022-04-04        <NA>
#> 298:  C3 123.45 2022-09-28        <NA>
#> 299:  D8 123.45 2018-06-17        <NA>
#> 300:  C0 123.45 2017-05-09        <NA>
#> 
#> $summarised_groupedDT
#>      ID            GrepedGroup firstCaseDate endCaseDate count    period
#>  1:  A0 Chronic kidney disease    2009-07-25  2013-12-20     3 1609 days
#>  2:  A1 Chronic kidney disease    2006-11-29  2014-09-24     4 2856 days
#>  3: A10 Chronic kidney disease    2007-11-04  2007-11-04     1    0 days
#>  4: A11 Chronic kidney disease    2008-03-09  2010-02-21     2  714 days
#>  5: A12 Chronic kidney disease    2006-05-14  2011-02-25     3 1748 days
#>  6: A13 Chronic kidney disease    2006-04-29  2010-02-21     2 1394 days
#>  7: A14 Chronic kidney disease    2011-11-07  2013-07-26     2  627 days
#>  8: A17 Chronic kidney disease    2007-02-19  2013-09-30     2 2415 days
#>  9: A18 Chronic kidney disease    2007-04-05  2012-02-10     2 1772 days
#> 10:  A2 Chronic kidney disease    2011-09-20  2015-01-06     3 1204 days
#> 11:  A3 Chronic kidney disease    2010-11-12  2014-02-24     2 1200 days
#> 12:  A4 Chronic kidney disease    2015-03-09  2015-03-09     1    0 days
#> 13:  A5 Chronic kidney disease    2010-08-24  2010-08-24     1    0 days
#> 14:  A6 Chronic kidney disease    2007-10-01  2007-10-01     1    0 days
#> 15:  A7 Chronic kidney disease    2007-02-01  2013-12-16     2 2510 days
#> 16:  A8 Chronic kidney disease    2014-08-20  2014-08-20     1    0 days
#> 17:  B0 Chronic kidney disease    2015-12-26  2023-08-28     4 2802 days
#> 18:  B1 Chronic kidney disease    2016-08-08  2024-03-04     5 2765 days
#> 19:  B2 Chronic kidney disease    2016-03-20  2024-09-20     5 3106 days
#> 20:  B3 Chronic kidney disease    2019-05-07  2025-05-25     6 2210 days
#> 21:  B4 Chronic kidney disease    2015-12-02  2025-07-21     6 3519 days
#>      ID            GrepedGroup firstCaseDate endCaseDate count    period
#>