Introduction

Alessandro Gasparini

2024-07-16

This vignette illustrates how to use the {comorbidity} package to identify comorbid conditions and to compute weighted (or unweighted) comorbidity scores.

For this, we will simulate a dataset with 100 patients and 10000 ICD-10 codes using the sample_diag() function:

library(comorbidity)

set.seed(1)
df <- data.frame(
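  # 100 patient IDs, sampled with replacement to fill 10000 rows
  id = sample(seq(100), size = 10000, replace = TRUE),
  # One simulated ICD-10 code per row
  code = sample_diag(n = 10000)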
)
# Sort by patient ID, then by code
df <- df[order(df$id, df$code), ]
str(df)
#> 'data.frame':    10000 obs. of  2 variables:
#>  $ id  : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ code: chr  "A671" "B170" "C33" "C33" ...

By default, the sample_diag() function simulates ICD-10 data; however, it is also possible to simulate ICD-9 codes, as we will see later on.
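As a brief sketch of the ICD-9 option, the snippet below assumes that the version argument of sample_diag() accepts the value "ICD9_2015"; check ?sample_diag for the values available in your installed version of the package.

```r
library(comorbidity)

set.seed(1)
# Draw five codes from the ICD-9 code list instead of the ICD-10 default
# (assumption: "ICD9_2015" is a valid value for the `version` argument)
icd9_codes <- sample_diag(n = 5, version = "ICD9_2015")
icd9_codes
```

The result is a character vector, so it can be placed in the code column of a data frame in the same way as the ICD-10 codes above.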

Mapping Comorbidities

The comorbidity() function can be used to apply mapping algorithms to a dataset. Here, for instance, we use the Quan et al. (2005) version of the Charlson Comorbidity Index:

charlson_df <- comorbidity(
  x = df,
  id = "id",
  code = "code",
  map = "charlson_icd10_quan",
  assign0 = FALSE
)
str(charlson_df)
#> Classes 'comorbidity' and 'data.frame':  100 obs. of  18 variables:
#>  $ id      : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ mi      : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ chf     : int  0 1 1 0 0 1 1 0 1 1 ...
#>  $ pvd     : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ cevd    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ dementia: int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ cpd     : int  1 1 1 1 0 0 1 1 1 1 ...
#>  $ rheumd  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ pud     : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ mld     : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ diab    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ diabwc  : int  1 0 1 1 1 1 1 1 1 1 ...
#>  $ hp      : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ rend    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ canc    : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ msld    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ metacanc: int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ aids    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  - attr(*, "variable.labels")= chr [1:18] "ID" "Myocardial infarction" "Congestive heart failure" "Peripheral vascular disease" ...
#>  - attr(*, "map")= chr "charlson_icd10_quan"

The resulting data frame has a row per subject, a column for IDs, and a column for each condition included in a given score (e.g. 17 conditions for the Charlson score).

length(unique(df$id)) == nrow(charlson_df)
#> [1] TRUE
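
Conceptually, a mapping algorithm flags a condition for a subject when at least one of their codes matches the code list for that condition. The base R sketch below illustrates that idea on toy data. The two prefix patterns are simplified, illustrative assumptions rather than the full Quan et al. (2005) definitions, and this is not how the package implements comorbidity() internally.

```r
# Toy data: three subjects with a handful of ICD-10 codes (made up for illustration)
toy <- data.frame(
  id = c(1, 1, 2, 3, 3),
  code = c("I21", "J44", "E110", "I500", "I21"),
  stringsAsFactors = FALSE
)

# Hypothetical, simplified patterns: one regular expression per condition
patterns <- c(mi = "^I21|^I22", chf = "^I50")

# For each condition, flag subjects with at least one matching code
flags <- sapply(patterns, function(p) {
  as.integer(tapply(grepl(p, toy$code), toy$id, any))
})

# One row per subject, one 0/1 column per condition
toy_map <- data.frame(id = sort(unique(toy$id)), flags)
toy_map
#>   id mi chf
#> 1  1  1   0
#> 2  2  0   0
#> 3  3  1   1
```

Subject 3 appears twice in the input but only once in the output, which mirrors the one-row-per-subject structure checked above.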

The columns are also labelled for compatibility with the RStudio viewer; see, for example, View(charlson_df) after running the code above on your computer.

All supported mapping algorithms are listed in the following vignette:

vignette("B-comorbidity-scores", package = "comorbidity")

Comorbidity Scores

Once a data frame of comorbid conditions has been computed, it can be passed to the score() function to calculate comorbidity scores. Say we want to calculate the Charlson comorbidity score, weighted using the Quan et al. (2011) weights:

quan_cci <- score(x = charlson_df, weights = "quan", assign0 = FALSE)
table(quan_cci)
#> quan_cci
#>  2  3  4  5  6 
#>  7 18 20 30 25

This returns a single value per subject:

length(quan_cci) == nrow(charlson_df)
#> [1] TRUE

If a simple count of conditions is required (i.e. an unweighted score), pass NULL to the weights argument of the score() function:

unw_cci <- score(x = charlson_df, weights = NULL, assign0 = FALSE)
table(unw_cci)
#> unw_cci
#>  1  2  3  4 
#>  7 25 43 25
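
At its core, a weighted score is a weighted sum of the condition indicators for each subject, and an unweighted score is a plain count. The base R sketch below shows that arithmetic on a toy indicator table. The column names and weights are hypothetical values chosen for illustration (they are not the Quan et al. (2011) weights), and score() may apply additional rules, such as those controlled by assign0, that this sketch ignores.

```r
# Toy indicator table: one row per subject, one 0/1 column per condition
toy_map <- data.frame(
  id = 1:3,
  cond_a = c(1, 0, 1),
  cond_b = c(0, 0, 1),
  cond_c = c(1, 1, 1)
)

# Hypothetical weights, named after the condition columns
w <- c(cond_a = 2, cond_b = 1, cond_c = 3)

# Weighted score: matrix product of the indicators with the weight vector
weighted <- as.vector(as.matrix(toy_map[, names(w)]) %*% w)
weighted
#> [1] 5 3 6

# Unweighted score: number of conditions flagged for each subject
unweighted <- unname(rowSums(toy_map[, names(w)]))
unweighted
#> [1] 2 1 3
```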

Once again, the available weighting systems are described in the vignette mentioned above.

References