The PheWAS classification for ICD-9-CM codes is a diagnostic categorization scheme that can be used in many types of projects that analyze diagnosis data.

icdDxToPheWAS(
  dxDataFile,
  idColName,
  icdColName,
  dateColName,
  icdVerColName = NULL,
  icd10usingDate = NULL,
  isDescription = TRUE
)

Arguments

dxDataFile

A data frame of clinical diagnostic data with at least three columns: ID, ICD, and Date. Values in the date column should be formatted as YYYY/MM/DD or YYYY-MM-DD.
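As a minimal sketch of the expected input shape, the following builds a small data frame with the three required columns. The object name `dxData` is hypothetical and not part of the package; the rows reuse codes and dates that appear in the sample output further down this page.

```r
# Hypothetical toy input: one row per diagnosis record.
dxData <- data.frame(
  ID   = c("A2", "A5", "A8"),
  ICD  = c("Z992", "Z992", "Z992"),
  Date = as.Date(c("2020-05-22", "2020-01-24", "2015-10-27")),
  stringsAsFactors = FALSE
)

# Check the column names and types before passing the data on.
str(dxData)
```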

idColName

Name of the ID column in dxDataFile. Supply the column name without quotation marks.

icdColName

Name of the ICD column in dxDataFile. Supply the column name without quotation marks.

dateColName

Name of the date column in dxDataFile. The column should be either an R date type or a string holding a date in YYYY/MM/DD or YYYY-MM-DD format. Supply the column name without quotation marks.
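If you prefer to convert text dates to R's `Date` class yourself before calling the function, one possible approach is sketched below. It is optional, since both text formats are accepted, and the variable names are hypothetical. The separators are unified first because base R's `as.Date()` picks a single format for the whole vector based on its first non-missing element, so a vector mixing both styles would otherwise yield `NA` values.

```r
# Hypothetical raw dates in both accepted text formats.
rawDate <- c("2020/05/22", "2020-01-24", "2015/10/27")

# Replace "/" with "-" so every value follows YYYY-MM-DD, then parse.
cleanDate <- as.Date(gsub("/", "-", rawDate, fixed = TRUE), format = "%Y-%m-%d")

print(cleanDate)
```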

icdVerColName

(Optional) Name of the column in dxDataFile that records which ICD version (ICD-9 or ICD-10) each row uses. Values in this column should be the integers 9L or 10L, one per row. See the examples below for more information.

icd10usingDate

The date on which ICD-10 came into use in the dxDataFile dataset. The date should be formatted as YYYY/MM/DD or YYYY-MM-DD. Required if icdVerColName is NULL.
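To illustrate how a single adoption date can stand in for a per-row version column, the sketch below derives 9L or 10L from each record's date. This is an illustration only, not the package's implementation: the variable names are hypothetical, and treating the cutoff date itself as ICD-10 is an assumption, since this page does not say whether the boundary is inclusive.

```r
# Hypothetical record dates on either side of the ICD-10 adoption date.
dxDate <- as.Date(c("2014-10-27", "2015-10-01", "2020-01-24"))
icd10Start <- as.Date("2015-10-01")

# Assumption: records dated on or after the cutoff use ICD-10, earlier ones ICD-9.
version <- ifelse(dxDate >= icd10Start, 10L, 9L)

print(version)
```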

isDescription

Logical. If TRUE, the category description of the classification method is used in the group column. If FALSE, the category name is used. The default is TRUE (standard category description).

Value

A list of three new data.tables. 1) groupedDT: dxDataFile with two new columns, one for the diagnostic codes in a uniform format and one for the classified categories. 2) summarised_groupedDT: A summary of the groupedDT dataset, sorted by memberID. 3) Error: Potentially erroneous codes found in the standardization step: dxUniform (icdDxShortToDecimal and icdDxDecimalToShort).
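The summary table can be pictured with the base R sketch below, which reproduces the columns shown in the sample output (firstCaseDate, endCaseDate, count, period) for a hypothetical toy table. It is inferred from that output, where period equals endCaseDate minus firstCaseDate, and is not the package's own code; the object names `grouped`, `groups`, and `summarised` are made up for illustration.

```r
# Hypothetical grouped records: one row per diagnosis that mapped to a category.
grouped <- data.frame(
  ID      = c("A2", "A2", "D7"),
  PheCode = c("585.31", "585.31", "755.4"),
  Date    = as.Date(c("2015-08-12", "2020-05-22", "2009-01-12")),
  stringsAsFactors = FALSE
)

# Split the rows by ID and category, keeping only combinations that occur.
groups <- split(grouped, list(grouped$ID, grouped$PheCode), drop = TRUE)

# Summarise each group: first and last date plus the number of records.
summarised <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    ID            = g$ID[1],
    PheCode       = g$PheCode[1],
    firstCaseDate = min(g$Date),
    endCaseDate   = max(g$Date),
    count         = nrow(g),
    stringsAsFactors = FALSE
  )
}))
rownames(summarised) <- NULL

# Time span between the first and last record, as a difftime in days.
summarised$period <- summarised$endCaseDate - summarised$firstCaseDate

print(summarised)
```

Base R is used here only to keep the sketch dependency-free; the function itself returns data.tables.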

Details

Collapses ICD codes into a smaller number of clinically meaningful categories, which are more useful for presenting descriptive statistics than individual ICD diagnostic codes.

See also

Other code classification functions: dxCustom (icdDxToCustom and icdDxToCustomGrep), dxComorbid (icdDxToComorbid), dxCCS (icdDxToCCS and icdDxToCCSLvl)

Examples

# sample file for example

head(sampleDxFile)
#>     ID  ICD       Date Version
#> 1:  A2 Z992 2020-05-22      10
#> 2:  A5 Z992 2020-01-24      10
#> 3:  A8 Z992 2015-10-27      10
#> 4: A13 Z992 2020-04-26      10
#> 5: A13 Z992 2025-02-02      10
#> 6: A15 Z992 2023-05-12      10

# Group diagnostic codes into PheWAS

icdDxToPheWAS(sampleDxFile, ID, ICD, Date, icdVerColName = NULL, icd10usingDate = "2015-10-01", isDescription = FALSE)
#> Wrong ICD format: total 9 ICD codes (the number of occurrences is in brackets)
#> c("A0.11 (20)", "E114 (8)", "Z9.90 (6)", "F42 (6)", "001 (5)", "75.52 (4)", "755.2 (3)", "123.45 (3)", "7552 (2)")
#> 	
#> Wrong ICD version: total 7 ICD codes (the number of occurrences is in brackets)
#> c("V27.0 (18)", "A01.05 (8)", "42761 (7)", "V24.1 (6)", "A0105 (5)", "E03.0 (4)", "650 (4)")
#> 	
#> Warning: The ICD mentioned above matches to "NA" due to the format or other issues.
#> Warning: "Wrong ICD format" means the ICD has wrong format
#> Warning: "Wrong ICD version" means the ICD classify to wrong ICD version (cause the "icd10usingDate" or other issues)
#> $groupedDT
#>      Decimal  ID    ICD       Date PheCode
#>   1:   Z99.2  A2   Z992 2020-05-22  585.31
#>   2:   Z99.2  A5   Z992 2020-01-24  585.31
#>   3:   Z99.2  A8   Z992 2015-10-27  585.31
#>   4:   Z99.2 A13   Z992 2020-04-26  585.31
#>   5:   Z99.2 A13   Z992 2025-02-02  585.31
#>  ---                                      
#> 300:     650  C2    650 2017-10-10    <NA>
#> 301:     650  D4    650 2022-04-04    <NA>
#> 302:  123.45  C3 123.45 2022-09-28    <NA>
#> 303:  123.45  D8 123.45 2018-06-17    <NA>
#> 304:  123.45  C0 123.45 2017-05-09    <NA>
#> 
#> $summarised_groupedDT
#>       ID PheCode firstCaseDate endCaseDate count    period
#>   1:  A2  585.31    2015-08-12  2020-05-22     2 1745 days
#>   2:  A5  585.31    2014-10-27  2020-01-24     3 1915 days
#>   3:  A8  585.31    2007-11-22  2015-10-27     4 2896 days
#>   4: A13  585.31    2010-09-29  2025-02-02     3 5240 days
#>   5: A15  585.31    2007-05-25  2023-05-12     4 5831 days
#>  ---                                                      
#> 146:  D7   755.4    2009-01-12  2009-01-12     1    0 days
#> 147:  C4   300.3    2022-01-23  2022-09-27     2  247 days
#> 148:  D3   300.3    2022-02-22  2022-02-22     1    0 days
#> 149:  D0   300.3    2021-02-26  2024-07-15     2 1235 days
#> 150:  C2   300.3    2020-03-24  2020-03-24     1    0 days
#> 
#> $Error
#>        ICD count IcdVersionInFile     WrongType Suggestion
#>  1:  A0.11    20           ICD 10  Wrong format           
#>  2:  V27.0    18           ICD 10 Wrong version           
#>  3:   E114     8           ICD 10  Wrong format           
#>  4: A01.05     8            ICD 9 Wrong version           
#>  5:  42761     7           ICD 10 Wrong version           
#>  6:  Z9.90     6           ICD 10  Wrong format           
#>  7:    F42     6           ICD 10  Wrong format           
#>  8:  V24.1     6           ICD 10 Wrong version           
#>  9:  A0105     5            ICD 9 Wrong version           
#> 10:    001     5            ICD 9  Wrong format       0019
#> 11:  75.52     4            ICD 9  Wrong format           
#> 12:  E03.0     4            ICD 9 Wrong version           
#> 13:    650     4           ICD 10 Wrong version           
#> 14: 123.45     3           ICD 10  Wrong format           
#> 15:  755.2     3            ICD 9  Wrong format     755.29
#> 16:   7552     2            ICD 9  Wrong format      75529
#>