The PheWAS classification for ICD-9-CM codes is a diagnostic categorization scheme that can be employed in many types of projects that analyze diagnosis data.
icdDxToPheWAS(
dxDataFile,
idColName,
icdColName,
dateColName,
icdVerColName = NULL,
icd10usingDate = NULL,
isDescription = TRUE
)
dxDataFile: A data frame of clinical diagnostic data with at least 3 columns: ID, ICD, and Date. Values in the date column should be in YYYY/MM/DD or YYYY-MM-DD format.
idColName: Name of the ID column in dxDataFile. Give the column name as a string without quotation marks.
icdColName: Name of the ICD column in dxDataFile. Give the column name as a string without quotation marks.
dateColName: Name of the date column in dxDataFile. The column itself should be an R date type or a string holding a date in YYYY/MM/DD or YYYY-MM-DD format. Give the column name as a string without quotation marks.
icdVerColName: (Optional) Name of the column, if there is one, that records the ICD-9/10 version used in dxDataFile. Values in this column should be numeric 9L or 10L to indicate which ICD version applies to each row. See the examples below for more information.
icd10usingDate: The date on which ICD-10 started to be used in the dxDataFile dataset, in YYYY/MM/DD or YYYY-MM-DD format. Required if icdVerColName is NULL.
isDescription: Logical. If TRUE, the category description of the classification method is used in the group column. If FALSE, the category name is used. The default is TRUE (standard category description).
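As a rough illustration of what icd10usingDate expresses, the base R sketch below tags each record with an ICD version according to a cut-over date. It is not the package's implementation: the toy rows are made up, and treating records dated on or after the cut-over date as ICD-10 is an assumption.

```r
# Hypothetical toy data; the rows are invented for illustration only.
dx <- data.frame(
  ID   = c("A1", "A1", "B2"),
  ICD  = c("5856", "Z992", "N186"),
  Date = c("2014/03/10", "2015-10-01", "2019/07/04"),
  stringsAsFactors = FALSE
)

# Normalize YYYY/MM/DD to YYYY-MM-DD so a single explicit format parses every row.
dx$Date <- as.Date(gsub("/", "-", dx$Date, fixed = TRUE), format = "%Y-%m-%d")

# Assumed rule: on or after the cut-over date counts as ICD-10, earlier as ICD-9.
icd10_start <- as.Date("2015-10-01")
dx$Version <- ifelse(dx$Date >= icd10_start, 10L, 9L)

print(dx)
```

A column built this way has the 9L/10L shape described for icdVerColName, which may be useful when a dataset switched coding systems on different dates at different sites.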
Three new data.tables:
1) groupedDT: Based on dxDataFile, with two new columns for the uniform-format diagnostic codes and the classified categories.
2) summarised_groupedDT: A summary of the groupedDT dataset, sorted by memberID.
3) Error: Potential error codes from the standardization step, dxUniform (icdDxShortToDecimal and icdDxDecimalToShort).
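To make the shape of the summary table more concrete, here is a minimal base R sketch of a per-ID, per-category summary (first date, last date, count, and span in days). It only approximates the idea: the package itself returns data.table objects, the toy rows and the period_days column name are hypothetical, and the real function may compute its summary differently.

```r
# Hypothetical grouped records: one row per diagnosis, already mapped to a category.
grouped <- data.frame(
  ID      = c("P1", "P1", "P2"),
  PheCode = c("585.31", "585.31", "300.3"),
  Date    = as.Date(c("2015-08-12", "2020-05-22", "2022-02-22")),
  stringsAsFactors = FALSE
)

# Split the rows by ID and category, then summarise each piece.
key <- interaction(grouped$ID, grouped$PheCode, drop = TRUE)
pieces <- lapply(split(grouped, key), function(g) {
  data.frame(
    ID            = g$ID[1],
    PheCode       = g$PheCode[1],
    firstCaseDate = min(g$Date),
    endCaseDate   = max(g$Date),
    count         = nrow(g),
    # Span between first and last record, as a plain number of days.
    period_days   = as.numeric(difftime(max(g$Date), min(g$Date), units = "days")),
    stringsAsFactors = FALSE
  )
})
summarised <- do.call(rbind, pieces)

print(summarised)
```

A patient with a single record in a category gets a span of 0 days, which matches the "0 days" rows that appear in the example output further down.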
This function collapses ICD codes into a smaller number of clinically meaningful categories, which are more useful for presenting descriptive statistics than individual ICD diagnostic codes.
Other code classification functions: dxCustom (icdDxToCustom and icdDxToCustomGrep), dxComorbid (icdDxToComorbid), DxCCS (icdDxToCCS and icdDxToCCSLvl)
# sample file for example
head(sampleDxFile)
#> ID ICD Date Version
#> 1: A2 Z992 2020-05-22 10
#> 2: A5 Z992 2020-01-24 10
#> 3: A8 Z992 2015-10-27 10
#> 4: A13 Z992 2020-04-26 10
#> 5: A13 Z992 2025-02-02 10
#> 6: A15 Z992 2023-05-12 10
# Group diagnostic codes into PheWAS
icdDxToPheWAS(sampleDxFile, ID, ICD, Date, icdVerColName = NULL, icd10usingDate = "2015-10-01", isDescription = FALSE)
#> Wrong ICD format: total 9 ICD codes (the number of occurrences is in brackets)
#> c("A0.11 (20)", "E114 (8)", "Z9.90 (6)", "F42 (6)", "001 (5)", "75.52 (4)", "755.2 (3)", "123.45 (3)", "7552 (2)")
#>
#> Wrong ICD version: total 7 ICD codes (the number of occurrences is in brackets)
#> c("V27.0 (18)", "A01.05 (8)", "42761 (7)", "V24.1 (6)", "A0105 (5)", "E03.0 (4)", "650 (4)")
#>
#> Warning: The ICD mentioned above matches to "NA" due to the format or other issues.
#> Warning: "Wrong ICD format" means the ICD has wrong format
#> Warning: "Wrong ICD version" means the ICD classify to wrong ICD version (cause the "icd10usingDate" or other issues)
#> $groupedDT
#> Decimal ID ICD Date PheCode
#> 1: Z99.2 A2 Z992 2020-05-22 585.31
#> 2: Z99.2 A5 Z992 2020-01-24 585.31
#> 3: Z99.2 A8 Z992 2015-10-27 585.31
#> 4: Z99.2 A13 Z992 2020-04-26 585.31
#> 5: Z99.2 A13 Z992 2025-02-02 585.31
#> ---
#> 300: 650 C2 650 2017-10-10 <NA>
#> 301: 650 D4 650 2022-04-04 <NA>
#> 302: 123.45 C3 123.45 2022-09-28 <NA>
#> 303: 123.45 D8 123.45 2018-06-17 <NA>
#> 304: 123.45 C0 123.45 2017-05-09 <NA>
#>
#> $summarised_groupedDT
#> ID PheCode firstCaseDate endCaseDate count period
#> 1: A2 585.31 2015-08-12 2020-05-22 2 1745 days
#> 2: A5 585.31 2014-10-27 2020-01-24 3 1915 days
#> 3: A8 585.31 2007-11-22 2015-10-27 4 2896 days
#> 4: A13 585.31 2010-09-29 2025-02-02 3 5240 days
#> 5: A15 585.31 2007-05-25 2023-05-12 4 5831 days
#> ---
#> 146: D7 755.4 2009-01-12 2009-01-12 1 0 days
#> 147: C4 300.3 2022-01-23 2022-09-27 2 247 days
#> 148: D3 300.3 2022-02-22 2022-02-22 1 0 days
#> 149: D0 300.3 2021-02-26 2024-07-15 2 1235 days
#> 150: C2 300.3 2020-03-24 2020-03-24 1 0 days
#>
#> $Error
#> ICD count IcdVersionInFile WrongType Suggestion
#> 1: A0.11 20 ICD 10 Wrong format
#> 2: V27.0 18 ICD 10 Wrong version
#> 3: E114 8 ICD 10 Wrong format
#> 4: A01.05 8 ICD 9 Wrong version
#> 5: 42761 7 ICD 10 Wrong version
#> 6: Z9.90 6 ICD 10 Wrong format
#> 7: F42 6 ICD 10 Wrong format
#> 8: V24.1 6 ICD 10 Wrong version
#> 9: A0105 5 ICD 9 Wrong version
#> 10: 001 5 ICD 9 Wrong format 0019
#> 11: 75.52 4 ICD 9 Wrong format
#> 12: E03.0 4 ICD 9 Wrong version
#> 13: 650 4 ICD 10 Wrong version
#> 14: 123.45 3 ICD 10 Wrong format
#> 15: 755.2 3 ICD 9 Wrong format 755.29
#> 16: 7552 2 ICD 9 Wrong format 75529
#>