R/fun_emr.R, R/icdDxDecimalToShort.R, R/icdDxShortToDecimal.R
dxUniform.Rd
These two functions convert ICD diagnostic codes to a uniform format, either short (no decimal point, e.g. Z992) or decimal (e.g. Z99.2).
icdDxDecimalToShort(
dxDataFile,
icdColName,
dateColName,
icdVerColName = NULL,
icd10usingDate = NULL
)
icdDxShortToDecimal(
dxDataFile,
icdColName,
dateColName,
icdVerColName = NULL,
icd10usingDate = NULL
)
dxDataFile: A data frame of clinical diagnostic data with at least 2 columns: ICD and Date. The date column should be formatted as YYYY/MM/DD or YYYY-MM-DD.
icdColName: Name of the ICD column in dxDataFile. Supply the column name as a string without quotation marks.
dateColName: Name of the date column in dxDataFile. The column should be of an R date type or a string in YYYY/MM/DD or YYYY-MM-DD format. Supply the column name as a string without quotation marks.
icdVerColName: (Optional) Name of the column, if any, that records the ICD version (ICD-9 or ICD-10) used in dxDataFile. Values in this column should be numeric, 9L or 10L, indicating which ICD version applies to each row. See the examples below for more information.
icd10usingDate: The date on which ICD-10 started to be used in dxDataFile. The format should be YYYY/MM/DD or YYYY-MM-DD. Required if icdVerColName is NULL.
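To illustrate how a cutoff date such as icd10usingDate can stand in for a version column, here is a minimal base R sketch. The helper `assign_icd_version` is hypothetical and not part of the package, and it assumes that records dated on or after the cutoff are ICD-10 and earlier records are ICD-9; the package's actual boundary handling may differ.

```r
# Hypothetical helper (not part of the package): derive an ICD version for
# each record from its date. Assumption: dates on or after the cutoff are
# treated as ICD-10, earlier dates as ICD-9.
assign_icd_version <- function(dates, icd10usingDate) {
  # Accept both YYYY/MM/DD and YYYY-MM-DD by normalising the separator first
  to_date <- function(x) as.Date(gsub("/", "-", as.character(x)))
  ifelse(to_date(dates) >= to_date(icd10usingDate), 10L, 9L)
}

# One record before the cutoff, one on it, one after it
versions <- assign_icd_version(
  c("2015/09/30", "2015-10-01", "2020-05-22"),
  icd10usingDate = "2015/10/01"
)
versions  # 9, 10, 10
```

When the data already carries a reliable per-row version, passing it through icdVerColName avoids depending on a single cutoff date.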
Two new data.tables. 1) ICD: Uniform format diagnostic codes with column name "ICD". 2) Error: Potential error codes with 5 columns: ICD, count, IcdVersionInFile, WrongType and Suggestion.
icdDxShortToDecimal can be used for grouping diagnostic codes to the PheWAS classification (dxPheWAS: icdDxToPheWAS). icdDxDecimalToShort can be used for grouping to the other classification methods (dxCCS: icdDxToCCS & icdDxToCCSLvl; dxComorbid: icdDxToComorbid). These transformation functions not only convert ICD codes to a uniform format but also check for potential coding errors in the ICD format and version.
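The difference between the two formats can be sketched in base R for ICD-10 codes, where the decimal point follows the three-character category. Both helpers below are hypothetical and are not the package's implementation; they cover ICD-10 only and perform none of the format or version checks described above.

```r
# Hypothetical helpers (not the package's implementation), ICD-10 only.
# Assumption: the category is the first three characters and the decimal
# point, when present, comes directly after it.
icd10_short_to_decimal <- function(x) {
  x <- toupper(trimws(x))
  # Insert a point after the third character when more characters follow;
  # codes that are three characters long or already decimal are left as-is
  sub("^([A-Z][0-9][0-9A-Z])([0-9A-Z]+)$", "\\1.\\2", x)
}

icd10_decimal_to_short <- function(x) {
  # Remove the decimal point, if any
  sub(".", "", toupper(trimws(x)), fixed = TRUE)
}

dec <- icd10_short_to_decimal(c("Z992", "A0105", "F42"))
dec  # "Z99.2" "A01.05" "F42"

sht <- icd10_decimal_to_short(c("Z99.2", "A01.05", "F42"))
sht  # "Z992" "A0105" "F42"
```

ICD-9 codes need separate rules (for example, E codes carry four characters before the point), which is one reason the functions need to know the version of each code.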
# sample file for example
head(sampleDxFile)
#>     ID  ICD       Date Version
#> 1:  A2 Z992 2020-05-22      10
#> 2:  A5 Z992 2020-01-24      10
#> 3:  A8 Z992 2015-10-27      10
#> 4: A13 Z992 2020-04-26      10
#> 5: A13 Z992 2025-02-02      10
#> 6: A15 Z992 2023-05-12      10
# convert the diagnostic codes to the decimal format
icdDxShortToDecimal(sampleDxFile, ICD, Date, icd10usingDate = "2015/10/01")
#> Wrong ICD format: total 9 ICD codes (the number of occurrences is in brackets)
#> c("A0.11 (20)", "E114 (8)", "Z9.90 (6)", "F42 (6)", "001 (5)", "75.52 (4)", "755.2 (3)", "123.45 (3)", "7552 (2)")
#>
#> Wrong ICD version: total 7 ICD codes (the number of occurrences is in brackets)
#> c("V27.0 (18)", "A01.05 (8)", "42761 (7)", "V24.1 (6)", "A0105 (5)", "E03.0 (4)", "650 (4)")
#>
#> Warning: The ICD mentioned above matches to "NA" due to the format or other issues.
#> Warning: "Wrong ICD format" means the ICD has wrong format
#> Warning: "Wrong ICD version" means the ICD classify to wrong ICD version (cause the "icd10usingDate" or other issues)
#> $ICD
#>         ICD
#>   1:  Z99.2
#>   2:  Z99.2
#>   3:  Z99.2
#>   4:  Z99.2
#>   5:  Z99.2
#>  ---
#> 296:    650
#> 297:    650
#> 298: 123.45
#> 299: 123.45
#> 300: 123.45
#>
#> $Error
#>        ICD count IcdVersionInFile     WrongType Suggestion
#>  1:  A0.11    20           ICD 10  Wrong format
#>  2:  V27.0    18           ICD 10 Wrong version
#>  3:   E114     8           ICD 10  Wrong format
#>  4: A01.05     8            ICD 9 Wrong version
#>  5:  42761     7           ICD 10 Wrong version
#>  6:  Z9.90     6           ICD 10  Wrong format
#>  7:    F42     6           ICD 10  Wrong format
#>  8:  V24.1     6           ICD 10 Wrong version
#>  9:  A0105     5            ICD 9 Wrong version
#> 10:    001     5            ICD 9  Wrong format       0019
#> 11:  75.52     4            ICD 9  Wrong format
#> 12:  E03.0     4            ICD 9 Wrong version
#> 13:    650     4           ICD 10 Wrong version
#> 14: 123.45     3           ICD 10  Wrong format
#> 15:  755.2     3            ICD 9  Wrong format     755.29
#> 16:   7552     2            ICD 9  Wrong format      75529
#>
# convert the diagnostic codes to the short format
icdDxDecimalToShort(sampleDxFile, ICD, Date, icd10usingDate = "2015/10/01")
#> Wrong ICD format: total 9 ICD codes (the number of occurrences is in brackets)
#> c("A0.11 (20)", "E114 (8)", "Z9.90 (6)", "F42 (6)", "001 (5)", "75.52 (4)", "755.2 (3)", "123.45 (3)", "7552 (2)")
#>
#> Wrong ICD version: total 7 ICD codes (the number of occurrences is in brackets)
#> c("V27.0 (18)", "A01.05 (8)", "42761 (7)", "V24.1 (6)", "A0105 (5)", "E03.0 (4)", "650 (4)")
#>
#> Warning: The ICD mentioned above matches to "NA" due to the format or other issues.
#> Warning: "Wrong ICD format" means the ICD has wrong format
#> Warning: "Wrong ICD version" means the ICD classify to wrong ICD version (cause the "icd10usingDate" or other issues)
#> $ICD
#>         ICD
#>   1:   Z992
#>   2:   Z992
#>   3:   Z992
#>   4:   Z992
#>   5:   Z992
#>  ---
#> 296:    650
#> 297:    650
#> 298: 123.45
#> 299: 123.45
#> 300: 123.45
#>
#> $Error
#>        ICD count IcdVersionInFile     WrongType Suggestion
#>  1:  A0.11    20           ICD 10  Wrong format
#>  2:  V27.0    18           ICD 10 Wrong version
#>  3:   E114     8           ICD 10  Wrong format
#>  4: A01.05     8            ICD 9 Wrong version
#>  5:  42761     7           ICD 10 Wrong version
#>  6:  Z9.90     6           ICD 10  Wrong format
#>  7:    F42     6           ICD 10  Wrong format
#>  8:  V24.1     6           ICD 10 Wrong version
#>  9:  A0105     5            ICD 9 Wrong version
#> 10:    001     5            ICD 9  Wrong format       0019
#> 11:  75.52     4            ICD 9  Wrong format
#> 12:  E03.0     4            ICD 9 Wrong version
#> 13:    650     4           ICD 10 Wrong version
#> 14: 123.45     3           ICD 10  Wrong format
#> 15:  755.2     3            ICD 9  Wrong format     755.29
#> 16:   7552     2            ICD 9  Wrong format      75529
#>
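The Error table can be summarised to see how much of the data each problem type affects. The sketch below is base R only and retypes the ICD, count and WrongType values from the $Error output printed above so that it runs without the package; with a real result, the Error element of the returned list would presumably be used in place of the hand-built `err` data frame, which is an illustrative stand-in.

```r
# Rows copied from the $Error table printed above (three of its five columns).
# `err` is an illustrative stand-in for the Error element of the result.
err <- data.frame(
  ICD = c("A0.11", "V27.0", "E114", "A01.05", "42761", "Z9.90", "F42", "V24.1",
          "A0105", "001", "75.52", "E03.0", "650", "123.45", "755.2", "7552"),
  count = c(20, 18, 8, 8, 7, 6, 6, 6, 5, 5, 4, 4, 4, 3, 3, 2),
  WrongType = c("Wrong format", "Wrong version", "Wrong format", "Wrong version",
                "Wrong version", "Wrong format", "Wrong format", "Wrong version",
                "Wrong version", "Wrong format", "Wrong format", "Wrong version",
                "Wrong version", "Wrong format", "Wrong format", "Wrong format"),
  stringsAsFactors = FALSE
)

# Number of distinct codes per error type (matches the "total 9" and
# "total 7" messages in the output above)
n_codes <- table(err$WrongType)

# Number of affected records per error type
n_records <- tapply(err$count, err$WrongType, sum)
```

Here `n_codes` gives 9 wrong-format and 7 wrong-version codes, and `n_records` gives 57 and 52 affected records respectively, out of the 300 rows in the example data.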