R/fun_emr.R, R/icdDxToCustom.R, R/icdDxToCustomGrep.R
dxCustom.Rd
Researchers can define their own grouping categories, which gives more flexibility when grouping ICD diagnostic codes.
icdDxToCustom(
dxDataFile,
idColName,
icdColName,
dateColName,
customGroupingTable
)
icdDxToCustomGrep(
dxDataFile,
idColName,
icdColName,
dateColName,
customGroupingTable
)
dxDataFile: A data frame of clinical diagnostic data with at least three columns: ID, ICD, and Date. The date column should be in YYYY/MM/DD or YYYY-MM-DD format.
idColName: Name of the ID column in dxDataFile. Supply it as a string without quotation marks.
icdColName: Name of the ICD column in dxDataFile. Supply it as a string without quotation marks.
dateColName: Name of the date column in dxDataFile. The column should be either an R date type or a string in YYYY/MM/DD or YYYY-MM-DD format. Supply the name as a string without quotation marks.
customGroupingTable: User-defined grouping categories. icdDxToCustom needs a dataset with two columns, "Group" and "ICD": "Group" defines one or more disease categories, and "ICD" lists the ICD codes belonging to each category. icdDxToCustomGrep needs a dataset with two columns, "Group" and "grepIcd": "Group" defines one or more disease categories, and "grepIcd" holds regular expressions that match the disease-related ICD codes.
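Because the date column may arrive as text in either YYYY/MM/DD or YYYY-MM-DD form, it can help to normalise it to R's Date class before grouping. The following is a minimal base R sketch, not part of the package; the helper name `normalise_dx_dates` and the sample strings are hypothetical.

```r
# Hypothetical helper (not part of the package): coerce date strings written
# as YYYY/MM/DD or YYYY-MM-DD into R Date objects.
normalise_dx_dates <- function(x) {
  if (inherits(x, "Date")) return(x)                   # already a Date
  x <- gsub("/", "-", as.character(x), fixed = TRUE)   # unify the separator
  as.Date(x, format = "%Y-%m-%d")                      # unparseable entries become NA
}

# Toy input mixing both accepted formats plus one invalid entry
raw_dates <- c("2020/05/22", "2020-01-24", "not a date")
parsed <- normalise_dx_dates(raw_dates)

# Entries that failed to parse can be reviewed before grouping
needs_review <- raw_dates[is.na(parsed)]
```

Replacing the separator first avoids relying on automatic format detection, which may not handle a column that mixes both styles.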
Two new data.tables. 1) groupedDT: Based on dxDataFile, with two new columns for uniform-format diagnostic codes and classified categories. 2) summarised_groupedDT: A summary of groupedDT, sorted by member ID.
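The summary columns shown in the examples (firstCaseDate, endCaseDate, count, period) can be approximated in base R as a per-ID, per-group aggregation. This is a sketch under assumptions, not the package's implementation: the toy data and IDs are hypothetical, rows without a group are assumed to be dropped, and period is stored as a plain number of days.

```r
# Hypothetical toy data: one row per diagnosis, already labelled with a group
dx <- data.frame(
  ID    = c("P1", "P1", "P1", "P2", "P2"),
  Date  = as.Date(c("2020-01-01", "2020-01-11", "2020-01-31",
                    "2021-06-15", "2021-07-01")),
  Group = c(rep("Chronic kidney disease", 4), NA),
  stringsAsFactors = FALSE
)

# Assumption: diagnoses that matched no group are left out of the summary
dx <- dx[!is.na(dx$Group), ]

# One summary row per ID/group combination
pieces <- split(dx, list(dx$ID, dx$Group), drop = TRUE)
summary_df <- do.call(rbind, lapply(pieces, function(g) {
  data.frame(
    ID            = g$ID[1],
    Group         = g$Group[1],
    firstCaseDate = min(g$Date),
    endCaseDate   = max(g$Date),
    count         = nrow(g),
    period        = as.numeric(max(g$Date) - min(g$Date), units = "days"),
    stringsAsFactors = FALSE
  )
}))

# Sort by ID and drop the row names created by split()
summary_df <- summary_df[order(summary_df$ID), ]
rownames(summary_df) <- NULL
```

A patient with a single matching diagnosis gets a period of 0 days, which is consistent with the single-count rows in the example output.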
Two functions support user-defined grouping: `icdDxToCustom` groups codes by precise (exact) matching, and `icdDxToCustomGrep` groups codes by fuzzy matching with regular expressions.
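The difference between the two matching styles can be illustrated in base R. This sketch assumes that precise grouping behaves like set membership and fuzzy grouping like `grepl()`; it does not reproduce any code-format standardisation the package may apply, and the `codes` vector is a hypothetical toy input.

```r
# Hypothetical toy vector of diagnosis codes
codes <- c("N181", "5853", "N189", "5851", "Z992")

# Precise grouping: a code matches only if it is listed explicitly
exact_set <- c("N181", "5853", "5854", "5855", "5856", "5859")
exact_hit <- codes %in% exact_set

# Fuzzy grouping: a code matches if the regular expression finds it
fuzzy_hit <- grepl("^585|^N18", codes)

# Side-by-side comparison of the two approaches
comparison <- data.frame(code = codes, exact = exact_hit, fuzzy = fuzzy_hit,
                         stringsAsFactors = FALSE)
```

Here "N189" and "5851" are caught only by the regular expression, since they share a prefix with the listed codes but are not listed themselves. The pattern-based table is shorter to write but can also match codes that were not intended, so the patterns are worth checking against the data.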
Other code classification functions: dxPheWAS, dxComorbid, DxCCS
# sample file for example
head(sampleDxFile)
#> ID ICD Date Version
#> 1: A2 Z992 2020-05-22 10
#> 2: A5 Z992 2020-01-24 10
#> 3: A8 Z992 2015-10-27 10
#> 4: A13 Z992 2020-04-26 10
#> 5: A13 Z992 2025-02-02 10
#> 6: A15 Z992 2023-05-12 10
# Group diagnostic codes into "Chronic kidney disease" with precise grouping method
groupingTable <- data.frame(Group = rep("Chronic kidney disease", 6),
                            ICD = c("N181", "5853", "5854", "5855", "5856", "5859"),
                            stringsAsFactors = FALSE)
icdDxToCustom(sampleDxFile, ID, ICD, Date, customGroupingTable = groupingTable)
#> $groupedDT
#> ICD ID Date Group
#> 1: Z992 A2 2020-05-22 <NA>
#> 2: Z992 A5 2020-01-24 <NA>
#> 3: Z992 A8 2015-10-27 <NA>
#> 4: Z992 A13 2020-04-26 <NA>
#> 5: Z992 A13 2025-02-02 <NA>
#> ---
#> 296: 650 C2 2017-10-10 <NA>
#> 297: 650 D4 2022-04-04 <NA>
#> 298: 123.45 C3 2022-09-28 <NA>
#> 299: 123.45 D8 2018-06-17 <NA>
#> 300: 123.45 C0 2017-05-09 <NA>
#>
#> $summarised_groupedDT
#> ID Group firstCaseDate endCaseDate count period
#> 1: A0 Chronic kidney disease 2009-07-25 2013-12-20 3 1609 days
#> 2: A1 Chronic kidney disease 2006-11-29 2014-09-24 4 2856 days
#> 3: A10 Chronic kidney disease 2007-11-04 2007-11-04 1 0 days
#> 4: A11 Chronic kidney disease 2008-03-09 2010-02-21 2 714 days
#> 5: A12 Chronic kidney disease 2006-05-14 2011-02-25 3 1748 days
#> 6: A13 Chronic kidney disease 2006-04-29 2010-02-21 2 1394 days
#> 7: A14 Chronic kidney disease 2011-11-07 2013-07-26 2 627 days
#> 8: A17 Chronic kidney disease 2007-02-19 2013-09-30 2 2415 days
#> 9: A18 Chronic kidney disease 2007-04-05 2012-02-10 2 1772 days
#> 10: A2 Chronic kidney disease 2011-09-20 2015-01-06 3 1204 days
#> 11: A3 Chronic kidney disease 2010-11-12 2014-02-24 2 1200 days
#> 12: A4 Chronic kidney disease 2015-03-09 2015-03-09 1 0 days
#> 13: A5 Chronic kidney disease 2010-08-24 2010-08-24 1 0 days
#> 14: A6 Chronic kidney disease 2007-10-01 2007-10-01 1 0 days
#> 15: A7 Chronic kidney disease 2007-02-01 2013-12-16 2 2510 days
#> 16: A8 Chronic kidney disease 2014-08-20 2014-08-20 1 0 days
#>
# Group diagnostic codes into "Chronic kidney disease" with fuzzy grouping method
grepTable <- data.frame(Group = "Chronic kidney disease",
                        grepIcd = "^585|^N18",
                        stringsAsFactors = FALSE)
icdDxToCustomGrep(sampleDxFile, ID, ICD, Date, customGroupingTable = grepTable)
#> $groupedDT
#> ID ICD Date GrepedGroup
#> 1: A2 Z992 2020-05-22 <NA>
#> 2: A5 Z992 2020-01-24 <NA>
#> 3: A8 Z992 2015-10-27 <NA>
#> 4: A13 Z992 2020-04-26 <NA>
#> 5: A13 Z992 2025-02-02 <NA>
#> ---
#> 296: C2 650 2017-10-10 <NA>
#> 297: D4 650 2022-04-04 <NA>
#> 298: C3 123.45 2022-09-28 <NA>
#> 299: D8 123.45 2018-06-17 <NA>
#> 300: C0 123.45 2017-05-09 <NA>
#>
#> $summarised_groupedDT
#> ID GrepedGroup firstCaseDate endCaseDate count period
#> 1: A0 Chronic kidney disease 2009-07-25 2013-12-20 3 1609 days
#> 2: A1 Chronic kidney disease 2006-11-29 2014-09-24 4 2856 days
#> 3: A10 Chronic kidney disease 2007-11-04 2007-11-04 1 0 days
#> 4: A11 Chronic kidney disease 2008-03-09 2010-02-21 2 714 days
#> 5: A12 Chronic kidney disease 2006-05-14 2011-02-25 3 1748 days
#> 6: A13 Chronic kidney disease 2006-04-29 2010-02-21 2 1394 days
#> 7: A14 Chronic kidney disease 2011-11-07 2013-07-26 2 627 days
#> 8: A17 Chronic kidney disease 2007-02-19 2013-09-30 2 2415 days
#> 9: A18 Chronic kidney disease 2007-04-05 2012-02-10 2 1772 days
#> 10: A2 Chronic kidney disease 2011-09-20 2015-01-06 3 1204 days
#> 11: A3 Chronic kidney disease 2010-11-12 2014-02-24 2 1200 days
#> 12: A4 Chronic kidney disease 2015-03-09 2015-03-09 1 0 days
#> 13: A5 Chronic kidney disease 2010-08-24 2010-08-24 1 0 days
#> 14: A6 Chronic kidney disease 2007-10-01 2007-10-01 1 0 days
#> 15: A7 Chronic kidney disease 2007-02-01 2013-12-16 2 2510 days
#> 16: A8 Chronic kidney disease 2014-08-20 2014-08-20 1 0 days
#> 17: B0 Chronic kidney disease 2015-12-26 2023-08-28 4 2802 days
#> 18: B1 Chronic kidney disease 2016-08-08 2024-03-04 5 2765 days
#> 19: B2 Chronic kidney disease 2016-03-20 2024-09-20 5 3106 days
#> 20: B3 Chronic kidney disease 2019-05-07 2025-05-25 6 2210 days
#> 21: B4 Chronic kidney disease 2015-12-02 2025-07-21 6 3519 days
#> ID GrepedGroup firstCaseDate endCaseDate count period
#>