As outlined in the “loading data” article, the first step in calculating CHAMPS statistics is to load the R package and load the data. Suppose we have downloaded the csv files into the directory ~/Downloads/CHAMPS_de_identified_data_2020-08-01 and loaded them into an object d, which is used in all of the examples below.
The functions outlined below are designed to be general, providing parameters for specifying different conditions, pathogens, sites, specimen types, TAC variables, and ICD10 codes.
The selection and combination of these parameters will be dictated by the nature of the requested analysis, and will often require domain expertise to determine meaningful combinations.
To see what valid values of different parameters are, a number of functions are provided.
All summary functions have a sites parameter with which the user can specify which sites to include in the calculation. The valid sites found in the data can be listed by calling valid_sites().
valid_sites(d)
#> [1] "Bangladesh"   "Ethiopia"     "Kenya"        "Mali"         "Mozambique"   "Sierra Leone"
#> [7] "South Africa"
In all of the examples below, we are going to use the following 5 sites:
sites <- c("Bangladesh", "Kenya", "Mali", "Mozambique", "South Africa")
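As a quick sanity check, a site selection can be compared against the valid sites before it is passed to a summary function. The sketch below is self-contained base R: it hard-codes the vector printed by valid_sites(d) above rather than calling the package, and the names valid, all_valid, and excluded_sites are illustrative only.

```r
# Valid sites, copied from the valid_sites(d) output shown above.
valid <- c("Bangladesh", "Ethiopia", "Kenya", "Mali", "Mozambique",
           "Sierra Leone", "South Africa")

# The five sites used in the examples.
sites <- c("Bangladesh", "Kenya", "Mali", "Mozambique", "South Africa")

# TRUE only if every requested site is a valid site.
all_valid <- all(sites %in% valid)

# Sites present in the data but left out of the examples.
excluded_sites <- setdiff(valid, sites)

print(all_valid)
print(excluded_sites)
```

With real data, valid would presumably be replaced by the result of valid_sites(d).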
The valid DeCoDe conditions can be listed by calling valid_conditions(). It returns a list of valid etiologies and valid group descriptions.
valid_conditions(d)$etiol
#>  [1] "Acinetobacter baumannii"
#>  [2] "Adenovirus"
#>  [3] "Adenovirus 40/41"
#>  [4] "B. pertussis"
#>  [5] "Campylobacter jejuni"
#>  [6] "Candida albicans"
#>  [7] "Candida glabrata"
#>  [8] "Candida parapsilosis"
#>  [9] "Candida spp."
#> [10] "Candida tropicalis"
...
valid_conditions(d)$champs_group_desc
#>  [1] "Anemias"
#>  [2] "BIrth trauma"
#>  [3] "Cancer"
#>  [4] "Cesarean delivery"
#>  [5] "Chorioamnionitis and membrane complications"
#>  [6] "Congenital birth defects"
#>  [7] "Congenital infection"
#>  [8] "Congenital Infection"
#>  [9] "Diabetes"
#> [10] "Diarrheal Diseases"
...
These lists are truncated for brevity; the full counts are given by length(valid_conditions(d)$etiol) for etiologies and length(valid_conditions(d)$champs_group_desc) for group descriptions. “Streptococcus agalactiae” is one of the valid etiologies, and it is the condition we will check for in our Group B Streptococcus examples below.
Often we want to tabulate results for specific sets of TAC specimen types. To get a list of these, we can call valid_specimen_types().
valid_specimen_types(d)
#> [1] "Cerebrospinal fluid sample"            "Nasopharyngeal and Oropharyngeal swab"
#> [3] "Plasma or spun blood specimen"         "Rectal swab"
#> [5] "Tissue specimen from lung"             "Whole blood"
In our examples below, we will often compute statistics both for all specimen types excluding the nasopharyngeal and oropharyngeal swab and for all specimen types. We will specify these as specimen_types1 and specimen_types2, respectively:
specimen_types1 <- c(
  "Cerebrospinal fluid sample",
  "Tissue specimen from lung",
  "Whole blood",
  "Rectal swab",
  "Plasma or spun blood specimen"
)
specimen_types2 <- c(
  "Nasopharyngeal and Oropharyngeal swab",
  "Cerebrospinal fluid sample",
  "Tissue specimen from lung",
  "Whole blood",
  "Rectal swab",
  "Plasma or spun blood specimen"
)
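Since the first set is simply the second set without the nasopharyngeal and oropharyngeal swab, it can also be derived rather than typed out. The following is a minimal base R sketch of that idea; the names np_swab and specimen_types1_alt are illustrative and not part of the package.

```r
# All specimen types, as defined above.
specimen_types2 <- c(
  "Nasopharyngeal and Oropharyngeal swab",
  "Cerebrospinal fluid sample",
  "Tissue specimen from lung",
  "Whole blood",
  "Rectal swab",
  "Plasma or spun blood specimen"
)

# The specimen type to leave out.
np_swab <- "Nasopharyngeal and Oropharyngeal swab"

# setdiff() keeps the order of its first argument, so the result lists
# the remaining types in the same order as specimen_types2.
specimen_types1_alt <- setdiff(specimen_types2, np_swab)

print(specimen_types1_alt)
```

Deriving one vector from the other avoids the two definitions drifting apart if a specimen type name changes in a later data release.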
For summaries that involve TAC results, we often want to specify a pathogen of interest. To see a list of valid pathogens, we can call valid_pathogens().
valid_pathogens()
#>  [1] "Acinetobacter baumannii"
#>  [2] "Adenovirus"
#>  [3] "Adenovirus 40/41"
#>  [4] "Aeromonas spp."
#>  [5] "Arenavirus/Lassa Fever"
#>  [6] "Ascaris lumbricoides"
#>  [7] "Astrovirus"
#>  [8] "Bartonella spp."
#>  [9] "Bordatella spp."
#> [10] "Bordetella parapertussi, Bordetella bronchiseptica (Insertion sequence IS1001)"
...
In all of our examples below, we will be using “Group B Streptococcus”.
Some calculations require a list of ICD10 codes, which are used to filter the DeCoDe results using the “Other Significant Condition” variables. For a list of ICD10 codes that exist in the CHAMPS data, we can call valid_icds().
valid_icds(d) #> [1] "A00" "A02" "A02.1" "A02.8" "A03" "A04.0" "A04.1" "A04.4" "A04.5" "A04.8" "A07.1" #> [12] "A08.0" "A08.1" "A08.2" "A08.3" "A08.5" "A09" "A09.0" "A09.9" "A16.7" "A17.0" "A18.8" #> [23] "A19.1" "A19.9" "A37" "A39.0" "A39.2" "A40" "A40.0" "A40.1" "A40.3" "A40.8" "A40.9" #> [34] "A41.0" "A41.3" "A41.4" "A41.5" "A41.8" "A41.9" "A48.3" "A49.9" "A50" "A50.0" "A50.9" #> [45] "A78" "A82.9" "A85.0" "A85.1" "A86" "A87.8" "B00.7" "B05" "B05.2" "B05.8" "B06" #> [56] "B17.2" "B18.2" "B20" "B20.0" "B20.1" "B20.2" "B20.3" "B20.6" "B20.7" "B20.8" "B20.9" #> [67] "B22" "B22.1" "B22.2" "B22.7" "B23" "B24" "B25" "B25.0" "B25.1" "B25.8" "B25.9" #> [78] "B33.2" "B33.8" "B34.0" "B34.1" "B34.3" "B36.0" "B37.1" "B37.5" "B37.7" "B37.8" "B44.7" #> [89] "B50" "B50.0" "B50.8" "B50.9" "B54" "B59" "B65.1" "B77.9" "C22.2" "C49.9" "C74.9" #> [100] "C80.9" "C91.0" "C92.9" "C95.9" "D41.0" "D48.0" "D50.0" "D50.8" "D50.9" "D53.8" "D53.9" #> [111] "D57.0" "D57.1" "D57.8" "D59.3" "D59.8" "D61.9" "D64" "D64.9" "D65" "D66" "D68" #> [122] "D70" "D72.1" "D82.1" "D84.9" "E10" "E10.1" "E16.1" "E16.2" "E22.2" "E27.1" "E40" #> [133] "E41" "E42" "E43" "E44.0" "E45" "E46" "E50" "E55.0" "E64.3" "E83.5" "E87.0" #> [144] "E87.1" "E88.8" "E88.9" "F82" "F89" "G00.1" "G00.2" "G00.3" "G00.8" "G00.9" "G01" #> [155] "G03.9" "G04.8" "G06.0" "G12.9" "G31.9" "G40.9" "G41.9" "G71.2" "G80.9" "G90.2" "G91" #> [166] "G91.0" "G91.8" "G91.9" "G93.1" "G93.4" "G93.5" "G93.8" "G96.9" "H35.1" "H66.3" "H66.4" #> [177] "I27.2" "I27.9" "I31.3" "I33.0" "I41.1" "I42.0" "I42.9" "I50" "I51.4" "I51.9" "I61.9" #> [188] "I63.9" "I67.8" "I81" "J05.0" "J06.9" "J10.0" "J12.0" "J12.1" "J12.2" "J12.8" "J12.9" #> [199] "J13" "J14" "J15" "J15.0" "J15.1" "J15.2" "J15.3" "J15.4" "J15.5" "J15.6" "J15.8" #> [210] "J15.9" "J16.8" "J17" "J17.1" "J17.2" "J17.3" "J17.8" "J18" "J18.0" "J18.9" "J21.9" #> [221] "J35.3" "J44.8" "J46" "J47" "J60" "J69" "j69.0" "J69.0" "J69.8" "J84.9" "J85.1" #> [232] "J86" "J86.9" "J93" "J93.9" "J95.5" "J98.4" 
"K10.9" "K12.2" "K20" "K21.9" "K40.3" #> [243] "K52.9" "K55.0" "K56.1" "K56.2" "K56.6" "K56.7" "K62.5" "K65.0" "K71.6" "K72.0" "K72.1" #> [254] "K72.9" "K74.5" "K75.4" "K75.9" "K76.0" "K76.5" "K76.6" "K76.9" "K83.1" "K91.4" "K92.0" #> [265] "K92.9" "L01.0" "L02.0" "L03" "L12.2" "L26" "L30.9" "L51.2" "L89" "L98.4" "M00" #> [276] "M79.8" "M86.9" "N05" "N12" "N17.0" "N17.9" "N18.9" "N25.8" "N26" "N28.9" "N39.0" #> [287] "O14.2" "O32.9" "O69.9" "P00.2" "P00.8" "P01.2" "P01.3" "P01.5" "P01.6" "P02.1" "P02.2" #> [298] "P02.3" "P02.4" "P02.5" "P02.7" "P03.0" "P03.1" "P03.4" "P05.0" "P05.1" "P05.9" "P07.0" #> [309] "P07.1" "P07.2" "P07.3" "P08.0" "P08.1" "P12.2" "P14.0" "P15.4" "P15.9" "P20" "P20.0" #> [320] "P20.1" "P20.9" "P21" "P21.0" "P21.1" "P21.9" "P22.0" "P22.9" "P23.2" "P23.3" "P23.4" #> [331] "P23.6" "P23.8" "P23.9" "P24" "P24.0" "P24.1" "P24.9" "P25.1" "P25.2" "P26" "P26.8" #> [342] "P26.9" "P27.1" "P28.0" "P28.4" "P29.0" "P29.2" "P29.3" "P29.8" "P35.0" "P35.1" "P35.8" #> [353] "P36.0" "P36.1" "P36.2" "P36.3" "P36.4" "P36.5" "P36.8" "P36.9" "P37.1" "P37.2" "P37.5" #> [364] "P38" "P39.2" "P39.9" "P51.0" "P52.0" "P52.1" "P52.2" "P52.3" "P52.4" "P52.8" "P52.9" #> [375] "P55.1" "P55.9" "P56.9" "P57.0" "P57.9" "P59" "P59.0" "P59.9" "P60" "P61.0" "P61.4" #> [386] "P70.4" "P76" "P77" "P78.0" "P78.8" "P80.9" "P83.2" "P90" "P91" "P91.0" "P91.2" #> [397] "P91.6" "P91.7" "P91.8" "P92.3" "P95" "Q00.0" "Q00.1" "Q02" "Q03" "Q03.1" "Q03.9" #> [408] "Q04.0" "Q04.2" "Q04.3" "Q04.6" "Q05" "Q05.2" "Q05.9" "Q11.1" "Q21.0" "Q21.1" "Q21.3" #> [419] "Q23.4" "Q24.8" "Q24.9" "Q25.0" "Q25.1" "Q27.0" "Q28.8" "Q31.0" "Q31.1" "Q31.8" "Q33.6" #> [430] "Q35" "Q35.9" "Q36.9" "Q37.8" "Q37.9" "Q39.1" "Q40.1" "Q41.1" "Q42.0" "Q42.3" "Q43.3" #> [441] "Q44.2" "Q44.5" "Q44.7" "Q60.6" "Q61.3" "Q61.9" "Q62.0" "Q64.2" "Q64.9" "Q66.0" "Q66.8" #> [452] "Q73.1" "Q75.3" "Q77.1" "Q78.9" "Q79.0" "Q79.2" "Q79.3" "Q79.9" "Q81.9" "Q87" "Q87.1" #> [463] "Q87.2" "Q87.8" "Q89" "Q89.4" "Q89.7" "Q89.9" "Q90" "Q90.9" 
"Q91.3" "Q91.7" "Q93.2" #> [474] "Q98.0" "Q99.9" "R02" "R04.8" "R06.8" "R11" "R19.8" "R56.8" "R57.1" "R62.8" "R64" #> [485] "R73.9" "R95" "R99" "S00" "S01.5" "S06.2" "S06.5" "S06.8" "S07" "S09.8" "S31.8" #> [496] "S72.9" "T07" "T09.1" "T17.4" "T17.9" "T20" "T21" "T31.1" "T31.2" "T31.3" "T31.5" #> [507] "T41.2" "T45.1" "T47.4" "T50.9" "T55" "T58" "T60.0" "T65" "T65.9" "T69" "T71" #> [518] "T74.0" "T74.1" "T74.8" "T80.2" "T81" "T81.4" "T85.5" "T85.6" "T85.7" "T88.3" "T88.9" #> [529] "V03.9" "W22" "W78" "Y07.9" "Y60.0" "Y63.8" "Y84.0" "Z03.6" "Z20.6" "Z35.3" "Z35.4" #> [540] "Z38.1" "Z87.0" "Z87.5" "Z93.1"
In our examples below, we will use ICD10 codes “P36.0”, “A40.1”, “P23.3”, “G00.2”.
A common summary to compute is a tabulation of the number of cases of a given condition found in the causal chain by site and by age, within the context of all cases.
The function calc_cc_allcases_by_site_age() takes a condition and a list of sites as input and produces this tabulation. For GBS and the sites specified above:
calc_cc_allcases_by_site_age(d,
  condition = "Streptococcus agalactiae",
  sites = sites)
#> $version
#> # A tibble: 1 x 3
#>   dataset_name       dataset_version dataset_release_date
#>   <chr>                        <dbl> <date>
#> 1 CHAMPS Level2 Data             4.1 2020-08-01
#>
#> $condition
#> [1] "Streptococcus agalactiae"
#>
#> $numerator
#>                                           nn$site
#> nn$age_group                               Bangladesh Kenya Mali Mozambique South Africa  Sum
#>   Stillbirth                                        1     1    0          1            8   11
#>   Death in the first 24 hours                       0     1    0          3           14   18
#>   Early Neonate (1 to 6 days)                       0     0    0          1            8    9
#>   Late Neonate (7 to 27 days)                       0     1    0          0            3    4
#>   Infant (28 days to less than 12 months)           0     0    0          0            1    1
#>   Child (12 months to less than 60 Months)          0     0    0          0            1    1
#>   Sum                                               1     3    0          5           35   44
#>
#> $denominator
#>                                           dd$site
#> dd$age_group                               Bangladesh Kenya Mali Mozambique South Africa  Sum
#>   Stillbirth                                       67    82   46         59           80  334
#>   Death in the first 24 hours                      38    50   18         48           99  253
#>   Early Neonate (1 to 6 days)                      28    26   17         15          142  228
#>   Late Neonate (7 to 27 days)                       6    15   11          5           83  120
#>   Infant (28 days to less than 12 months)           2    91   17         18          109  237
#>   Child (12 months to less than 60 Months)          1    78   15         34           56  184
#>   Sum                                             142   342  124        179          569 1356
#>
#> attr(,"class")
#> [1] "champs_computed"         "cc_allcases_by_site_age"
The output of this function is a list that contains the original input parameters as well as a table of the numerator statistics (number of GBS cases in the causal chain by age and site), and a table of the denominator statistics (number of DeCoDed cases by age and site).
Note that currently only a single condition can be passed, although this could be expanded in the future to allow multiple conditions.
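The numerator and denominator tables are typically combined into a percentage. The sketch below illustrates this in base R using only the "Sum" rows printed above, hard-coded so that it runs without the package or the data; the variable names are illustrative. With an actual result object, the same division could presumably be applied to its numerator and denominator elements, since both tables share the same layout.

```r
sites <- c("Bangladesh", "Kenya", "Mali", "Mozambique", "South Africa")

# "Sum" rows copied from the $numerator and $denominator tables above,
# with the overall total as the final "Sum" element.
numerator   <- c(1, 3, 0, 5, 35, 44)
denominator <- c(142, 342, 124, 179, 569, 1356)
names(numerator) <- names(denominator) <- c(sites, "Sum")

# Percentage of DeCoDed cases with GBS in the causal chain, per site.
pct <- round(100 * numerator / denominator, 1)

print(pct)
```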
The function calc_cc_detected_by_site_age() tabulates the number of cases of a given condition found in the causal chain by site and by age, within the context of cases where the pathogen(s) are detected in TAC results (dying from vs. dying with).
This function takes the same condition and sites parameters, plus an additional parameter, pathogen, which specifies what to look for in the TAC results. A further parameter, specimen_types, allows you to specify which TAC results to count.
To compute these statistics for TAC results in blood, CSF, or lung (that is, excluding NP-only detections), using specimen_types1:
calc_cc_detected_by_site_age(d, condition = "Streptococcus agalactiae", pathogen = "Group B Streptococcus", sites = sites, specimen_types = specimen_types1) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $condition #> [1] "Streptococcus agalactiae" #> #> $pathogen #> [1] "Group B Streptococcus" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $specimen_types #> [1] "Cerebrospinal fluid sample" "Tissue specimen from lung" #> [3] "Whole blood" "Rectal swab" #> [5] "Plasma or spun blood specimen" #> #> $numerator #> nn$site #> nn$age_group Bangladesh Kenya Mali Mozambique South Africa #> Stillbirth 1 1 0 1 7 #> Death in the first 24 hours 0 1 0 3 11 #> Early Neonate (1 to 6 days) 0 0 0 1 8 #> Late Neonate (7 to 27 days) 0 1 0 0 3 #> Infant (28 days to less than 12 months) 0 0 0 0 1 #> Child (12 months to less than 60 Months) 0 0 0 0 1 #> Sum 1 3 0 5 31 #> nn$site #> nn$age_group Ethiopia Sierra Leone Sum #> Stillbirth 0 0 10 #> Death in the first 24 hours 0 0 15 #> Early Neonate (1 to 6 days) 0 0 9 #> Late Neonate (7 to 27 days) 0 0 4 #> Infant (28 days to less than 12 months) 0 0 1 #> Child (12 months to less than 60 Months) 0 0 1 #> Sum 0 0 40 #> #> $denominator #> dd$site #> dd$age_group Bangladesh Kenya Mali Mozambique South Africa Sum #> Stillbirth 1 5 1 1 16 24 #> Death in the first 24 hours 1 1 0 4 20 26 #> Early Neonate (1 to 6 days) 0 2 0 1 24 27 #> Late Neonate (7 to 27 days) 1 3 0 0 19 23 #> Infant (28 days to less than 12 months) 0 2 1 1 14 18 #> Child (12 months to less than 60 Months) 0 3 2 4 12 21 #> Sum 3 16 4 11 105 139 #> #> attr(,"class") #> [1] "champs_computed" "cc_detected_site_case"
And for all specimen types:
calc_cc_detected_by_site_age(d, condition = "Streptococcus agalactiae", pathogen = "Group B Streptococcus", sites = sites, specimen_types = specimen_types2) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $condition #> [1] "Streptococcus agalactiae" #> #> $pathogen #> [1] "Group B Streptococcus" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $specimen_types #> [1] "Nasopharyngeal and Oropharyngeal swab" "Cerebrospinal fluid sample" #> [3] "Tissue specimen from lung" "Whole blood" #> [5] "Rectal swab" "Plasma or spun blood specimen" #> #> $numerator #> nn$site #> nn$age_group Bangladesh Kenya Mali Mozambique South Africa #> Stillbirth 1 1 0 1 7 #> Death in the first 24 hours 0 1 0 3 13 #> Early Neonate (1 to 6 days) 0 0 0 1 8 #> Late Neonate (7 to 27 days) 0 1 0 0 3 #> Infant (28 days to less than 12 months) 0 0 0 0 1 #> Child (12 months to less than 60 Months) 0 0 0 0 1 #> Sum 1 3 0 5 33 #> nn$site #> nn$age_group Ethiopia Sierra Leone Sum #> Stillbirth 0 0 10 #> Death in the first 24 hours 0 0 17 #> Early Neonate (1 to 6 days) 0 0 9 #> Late Neonate (7 to 27 days) 0 0 4 #> Infant (28 days to less than 12 months) 0 0 1 #> Child (12 months to less than 60 Months) 0 0 1 #> Sum 0 0 42 #> #> $denominator #> dd$site #> dd$age_group Bangladesh Kenya Mali Mozambique South Africa Sum #> Stillbirth 1 8 1 1 22 33 #> Death in the first 24 hours 2 4 2 6 27 41 #> Early Neonate (1 to 6 days) 1 3 0 1 31 36 #> Late Neonate (7 to 27 days) 2 3 1 2 21 29 #> Infant (28 days to less than 12 months) 0 10 3 2 15 30 #> Child (12 months to less than 60 Months) 0 11 2 5 13 31 #> Sum 6 39 9 17 129 200 #> #> attr(,"class") #> [1] "champs_computed" "cc_detected_site_case"
Note that currently only a single pathogen can be passed, although this could be expanded in the future to allow multiple pathogens.
To tabulate the number of cases where a pathogen is detected in TAC results in the context of all cases by site and by age, we can use the function calc_detected_allcases_by_site_age().
This has the same arguments as calc_cc_detected_by_site_age(), with an additional argument, tac_variable, which is used to ensure the denominator (all cases) only includes those where the appropriate test had a valid positive or negative result.
To compute these statistics for TAC results in blood, CSF, or lung (that is, excluding NP-only detections), using specimen_types1:
calc_detected_allcases_by_site_age(d, condition = "Streptococcus agalactiae", pathogen = "Group B Streptococcus", sites = sites, specimen_types = specimen_types1) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $condition #> [1] "Streptococcus agalactiae" #> #> $pathogen #> [1] "Group B Streptococcus" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $specimen_types #> [1] "Cerebrospinal fluid sample" "Tissue specimen from lung" #> [3] "Whole blood" "Rectal swab" #> [5] "Plasma or spun blood specimen" #> #> $numerator #> nn$site #> nn$age_group Bangladesh Kenya Mali Mozambique South Africa Sum #> Stillbirth 1 5 1 1 16 24 #> Death in the first 24 hours 1 1 0 4 20 26 #> Early Neonate (1 to 6 days) 0 2 0 1 24 27 #> Late Neonate (7 to 27 days) 1 3 0 0 19 23 #> Infant (28 days to less than 12 months) 0 2 1 1 14 18 #> Child (12 months to less than 60 Months) 0 3 2 4 12 21 #> Sum 3 16 4 11 105 139 #> #> $denominator #> dd$site #> dd$age_group Bangladesh Kenya Mali Mozambique South Africa Sum #> Stillbirth 67 82 46 59 80 334 #> Death in the first 24 hours 38 50 18 48 99 253 #> Early Neonate (1 to 6 days) 28 26 17 15 142 228 #> Late Neonate (7 to 27 days) 6 15 11 5 83 120 #> Infant (28 days to less than 12 months) 2 91 17 18 109 237 #> Child (12 months to less than 60 Months) 1 78 15 34 56 184 #> Sum 142 342 124 179 569 1356 #> #> attr(,"class") #> [1] "champs_computed" "detected_allcases_by_site_age"
And for all specimen types:
calc_detected_allcases_by_site_age(d, condition = "Streptococcus agalactiae", pathogen = "Group B Streptococcus", sites = sites, specimen_types = specimen_types2) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $condition #> [1] "Streptococcus agalactiae" #> #> $pathogen #> [1] "Group B Streptococcus" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $specimen_types #> [1] "Nasopharyngeal and Oropharyngeal swab" "Cerebrospinal fluid sample" #> [3] "Tissue specimen from lung" "Whole blood" #> [5] "Rectal swab" "Plasma or spun blood specimen" #> #> $numerator #> nn$site #> nn$age_group Bangladesh Kenya Mali Mozambique South Africa Sum #> Stillbirth 1 8 1 1 22 33 #> Death in the first 24 hours 2 4 2 6 27 41 #> Early Neonate (1 to 6 days) 1 3 0 1 31 36 #> Late Neonate (7 to 27 days) 2 3 1 2 21 29 #> Infant (28 days to less than 12 months) 0 10 3 2 15 30 #> Child (12 months to less than 60 Months) 0 11 2 5 13 31 #> Sum 6 39 9 17 129 200 #> #> $denominator #> dd$site #> dd$age_group Bangladesh Kenya Mali Mozambique South Africa Sum #> Stillbirth 67 82 46 59 80 334 #> Death in the first 24 hours 38 50 18 48 99 253 #> Early Neonate (1 to 6 days) 28 26 17 15 142 228 #> Late Neonate (7 to 27 days) 6 15 11 5 83 120 #> Infant (28 days to less than 12 months) 2 91 17 18 109 237 #> Child (12 months to less than 60 Months) 1 78 15 34 56 184 #> Sum 142 342 124 179 569 1356 #> #> attr(,"class") #> [1] "champs_computed" "detected_allcases_by_site_age"
Note that, in order to match the hand-computed outputs provided as a target, the tac_variable parameter is not actually used. An open question is whether the code should be modified to use this variable.
To tabulate the number of cases where a pathogen is detected in TAC results by DeCoDe result (in causal chain or not, etc.) and by either site or age, we can use the function calc_detected_by_decode().
This function has a new parameter, icds, with which we can specify a list of ICD10 codes used to count the cases where the condition of interest was not in the causal chain but was a contributing cause. Note that this function is limited, as using icds to classify cases as Contributing (P2) only works for a few cases.
To compute this breakdown by site, for TAC results in blood, CSF, or lung (that is, excluding NP-only detections):
calc_detected_by_decode(d, by = "site", condition = "Streptococcus agalactiae", pathogen = "Group B Streptococcus", icds = c("P36.0", "A40.1", "P23.3", "G00.2"), sites = sites, specimen_types = specimen_types1) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $condition #> [1] "Streptococcus agalactiae" #> #> $pathogen #> [1] "Group B Streptococcus" #> #> $icds #> [1] "P36.0" "A40.1" "P23.3" "G00.2" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $specimen_types #> [1] "Cerebrospinal fluid sample" "Tissue specimen from lung" #> [3] "Whole blood" "Rectal swab" #> [5] "Plasma or spun blood specimen" #> #> $numerator #> nn$site #> nn$result Bangladesh Kenya Mali Mozambique South Africa Sum #> In causal chain 1 3 0 5 31 40 #> Contributing (P2) 0 0 0 0 14 14 #> Other Infectious 1 9 3 5 42 60 #> Not in CC/no ID 1 4 1 1 18 25 #> Sum 3 16 4 11 105 139 #> #> attr(,"class") #> [1] "champs_computed" "detected_by_decode"
To compute this breakdown by age, for TAC results in blood, CSF, or lung (that is, excluding NP-only detections):
calc_detected_by_decode(d, by = "age", condition = "Streptococcus agalactiae", pathogen = "Group B Streptococcus", icds = c("P36.0", "A40.1", "P23.3", "G00.2"), sites = sites, specimen_types = specimen_types1) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $condition #> [1] "Streptococcus agalactiae" #> #> $pathogen #> [1] "Group B Streptococcus" #> #> $icds #> [1] "P36.0" "A40.1" "P23.3" "G00.2" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $specimen_types #> [1] "Cerebrospinal fluid sample" "Tissue specimen from lung" #> [3] "Whole blood" "Rectal swab" #> [5] "Plasma or spun blood specimen" #> #> $numerator #> nn$result #> nn$age_group In causal chain Contributing (P2) Other Infectious #> Stillbirth 10 3 1 #> Death in the first 24 hours 15 3 2 #> Early Neonate (1 to 6 days) 9 3 8 #> Late Neonate (7 to 27 days) 4 4 15 #> Infant (28 days to less than 12 months) 1 0 15 #> Child (12 months to less than 60 Months) 1 1 19 #> Sum 40 14 60 #> nn$result #> nn$age_group Not in CC/no ID Sum #> Stillbirth 10 24 #> Death in the first 24 hours 6 26 #> Early Neonate (1 to 6 days) 7 27 #> Late Neonate (7 to 27 days) 0 23 #> Infant (28 days to less than 12 months) 2 18 #> Child (12 months to less than 60 Months) 0 21 #> Sum 25 139 #> #> attr(,"class") #> [1] "champs_computed" "detected_by_decode"
Note that the package ships with a dataset, infectious_causes, which is used internally in this function to match against CHAMPS group descriptions to identify which cases fall into the “Other Infectious” category.
champs::infectious_causes
#>  [1] "Diarrheal Diseases"           "Congenital infection"
#>  [3] "HIV"                          "Lower respiratory infections"
#>  [5] "Malaria"                      "Measles"
#>  [7] "Meningitis/Encephalitis"      "Neonatal sepsis"
#>  [9] "Other infections"             "Sepsis"
#> [11] "Syphilis"                     "Tuberculosis"
#> [13] "Upper respiratory infections"
Also note that cases in the “Contributing (P2)” category are identified by matching the ICD10 codes provided to the function against the “Other significant condition” variables. Only for a small subset of pathogens can this categorization be made from ICD10 codes alone; all others would require searching the free-text field, which is not part of the L2 dataset.
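To compare sites of very different sizes, it can help to express the by-site DeCoDe breakdown as column percentages. The sketch below does this in base R with the counts from the by-site table above hard-coded into a matrix, so it runs without the package; counts and shares are illustrative names.

```r
sites <- c("Bangladesh", "Kenya", "Mali", "Mozambique", "South Africa")
results <- c("In causal chain", "Contributing (P2)",
             "Other Infectious", "Not in CC/no ID")

# Counts copied from the $numerator table of the by-site example above
# (margins omitted; they are recomputed below as a consistency check).
counts <- matrix(
  c(1, 3, 0, 5, 31,
    0, 0, 0, 0, 14,
    1, 9, 3, 5, 42,
    1, 4, 1, 1, 18),
  nrow = 4, byrow = TRUE,
  dimnames = list(result = results, site = sites)
)

# Column totals should reproduce the "Sum" row of the printed table.
print(colSums(counts))

# Share of each DeCoDe result within each site, in percent.
shares <- round(100 * prop.table(counts, margin = 2), 1)
print(shares)
```

Sites with very few detected cases (for example Bangladesh, with three) give percentages that should be read with caution.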
The function calc_top_tac_pathogens() takes the cases with a positive TAC result for a pathogen where the condition is not in the causal chain but the cause of death is infectious, and tabulates which pathogens are found in those cases.
calc_top_tac_pathogens(d, condition = "Streptococcus agalactiae", pathogen = "Group B Streptococcus", icds = c("P36.0", "A40.1", "P23.3", "G00.2"), sites = sites, specimen_types = specimen_types2) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $condition #> [1] "Streptococcus agalactiae" #> #> $pathogen #> [1] "Group B Streptococcus" #> #> $icds #> [1] "P36.0" "A40.1" "P23.3" "G00.2" #> #> $specimen_types #> [1] "Nasopharyngeal and Oropharyngeal swab" "Cerebrospinal fluid sample" #> [3] "Tissue specimen from lung" "Whole blood" #> [5] "Rectal swab" "Plasma or spun blood specimen" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $df #> # A tibble: 23 x 2 #> value n #> <chr> <int> #> 1 Klebsiella pneumoniae 33 #> 2 Acinetobacter baumannii 22 #> 3 Streptococcus pneumoniae 14 #> 4 Cytomegalovirus (CMV) 9 #> 5 Plasmodium falciparum 7 #> 6 Haemophilus influenzae 6 #> 7 Pneumocystis jirovecii 6 #> 8 Staphylococcus aureus 6 #> 9 Candida spp 3 #> 10 Enterococcus faecalis 3 #> # … with 13 more rows #> #> $n #> [1] 96 #> #> attr(,"class") #> [1] "champs_computed" "top_pathogens"
To calculate the average postmortem interval (PMI, the time from death to MITS in hours) by specimen type (NP Only vs. Blood/CSF/Lung) and by site, we can use the function calc_pmi_by_specimen_site().
Here, instead of displaying the raw results, which provide sums of hours for the numerator, and number of observations for the denominator, it is more interesting to look at the average time in hours, which is the numerator divided by the denominator.
res <- calc_pmi_by_specimen_site(d,
  pathogen = "Group B Streptococcus",
  sites = sites)
round(res$numerator / res$denominator, 1)
#>                      dd$site
#> dd$specimen_type2     Bangladesh Kenya Mali Mozambique South Africa  Sum
#>   NP Only                    1.3  15.5 12.2       13.5         25.0 17.9
#>   Blood, CSF, or Lung        1.3  23.7 11.0       12.9         26.5 24.0
#>   Sum                        1.3  18.4 11.7       13.1         26.2 22.2
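Because the numerator holds summed hours and the denominator holds case counts, a "Sum" cell obtained by this division is a pooled average rather than an average of the per-site averages. The sketch below illustrates the difference in base R. All numbers are hypothetical and chosen only to make the arithmetic easy to follow; it also assumes that the "Sum" margins of the numerator and denominator are plain totals.

```r
# Hypothetical inputs: total PMI hours and number of cases for two
# made-up sites (not CHAMPS data).
hours <- c(SiteA = 30, SiteB = 600)
n     <- c(SiteA = 20, SiteB = 25)

# Per-site average PMI: summed hours divided by number of cases.
site_means <- hours / n

# Pooled average: total hours over total cases. Under the stated
# assumption, this is what a "Sum" cell corresponds to.
pooled_mean <- sum(hours) / sum(n)

# Unweighted average of the site averages, shown for contrast.
mean_of_means <- mean(site_means)

print(site_means)
print(pooled_mean)
print(mean_of_means)
```

The pooled average weights each site by its number of cases, so sites contributing more cases have more influence on the overall figure.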
To calculate the average PMI by DeCoDe result and site, we can use the function calc_pmi_by_decode_site().
res <- calc_pmi_by_decode_site(d,
  condition = "Streptococcus agalactiae",
  pathogen = "Group B Streptococcus",
  icds = c("P36.0", "A40.1", "P23.3", "G00.2"),
  specimen_types = specimen_types2,
  sites = sites)
round(res$numerator / res$denominator, 1)
#>                    dd$site
#> dd$result           Bangladesh Kenya Mali Mozambique South Africa  Sum
#>   In causal chain          1.0  21.0            10.8         25.8 23.0
#>   Contributing (P2)                                          27.5 27.5
#>   Other Infectious         1.0  19.0 14.2       14.0         25.8 21.5
#>   Not in CC/no ID          1.7  11.3  6.7       14.2         27.0 20.8
#>   Sum                      1.3  18.4 11.7       13.1         26.2 22.2
To calculate the average PMI by age and DeCoDe result, we can use the function calc_pmi_by_age_decode().
res <- calc_pmi_by_age_decode(d, condition = "Streptococcus agalactiae", pathogen = "Group B Streptococcus", icds = c("P36.0", "A40.1", "P23.3", "G00.2"), specimen_types = specimen_types1, sites = sites) round(res$numerator / res$denominator, 1) #> dd$result #> dd$age_group In causal chain Contributing (P2) Other Infectious #> Stillbirth 18.8 27.3 44.0 #> Death in the first 24 hours 22.2 50.0 36.0 #> Early Neonate (1 to 6 days) 25.3 13.7 24.8 #> Late Neonate (7 to 27 days) 32.0 21.0 21.9 #> Infant (28 days to less than 12 months) 16.0 26.9 #> Child (12 months to less than 60 Months) 25.0 28.0 20.7 #> Sum 23.2 27.5 24.1 #> dd$result #> dd$age_group Not in CC/no ID Sum #> Stillbirth 16.2 21.0 #> Death in the first 24 hours 18.0 25.8 #> Early Neonate (1 to 6 days) 27.1 24.3 #> Late Neonate (7 to 27 days) 23.6 #> Infant (28 days to less than 12 months) 52.0 27.7 #> Child (12 months to less than 60 Months) 21.3 #> Sum 22.9 24.0
To tabulate the number of positive specimens for each case where the pathogen of interest is detected in the TAC results by postmortem interval range, we can use the function calc_nspecimen_by_pmi().
For GBS:
calc_nspecimen_by_pmi(d, pathogen = "Group B Streptococcus", sites = sites ) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $pathogen #> [1] "Group B Streptococcus" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $numerator #> dd$pmi_range #> dd$n 0 to 3 4 to 6 7 to 9 10 to 12 13 to 15 16 to 18 19 to 21 22 to 24 Over 24h Sum #> 0 0 0 0 0 0 0 0 0 1 1 #> 5 187 110 93 110 101 122 108 84 330 1245 #> Sum 187 110 93 110 101 122 108 84 331 1246 #> #> attr(,"class") #> [1] "champs_computed" "nspecimen_by_pmi"
Note that to match the hand-calculated results provided as a target, we need to calculate the results for all sites as opposed to the 5 sites we have specified here.
For Klebsiella pneumoniae:
calc_nspecimen_by_pmi(d, pathogen = "Klebsiella pneumoniae", sites = sites ) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $pathogen #> [1] "Klebsiella pneumoniae" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $numerator #> dd$pmi_range #> dd$n 0 to 3 4 to 6 7 to 9 10 to 12 13 to 15 16 to 18 19 to 21 22 to 24 Over 24h Sum #> 0 0 0 0 0 0 0 0 0 1 1 #> 5 187 110 93 110 101 122 108 84 330 1245 #> Sum 187 110 93 110 101 122 108 84 331 1246 #> #> attr(,"class") #> [1] "champs_computed" "nspecimen_by_pmi"
For Escherichia coli/Shigella:
calc_nspecimen_by_pmi(d, pathogen = "Escherichia coli/Shigella", sites = sites ) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $pathogen #> [1] "Escherichia coli/Shigella" #> #> $sites #> [1] "Bangladesh" "Kenya" "Mali" "Mozambique" "South Africa" #> #> $numerator #> dd$pmi_range #> dd$n 0 to 3 4 to 6 7 to 9 10 to 12 13 to 15 16 to 18 19 to 21 22 to 24 Over 24h Sum #> 0 0 0 0 0 0 0 0 0 1 1 #> 3 187 110 93 110 101 122 108 84 330 1245 #> Sum 187 110 93 110 101 122 108 84 331 1246 #> #> attr(,"class") #> [1] "champs_computed" "nspecimen_by_pmi"
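The column headers in the tables above group the postmortem interval into three-hour ranges. The sketch below shows one way such ranges could be built in base R with cut(). It is not the package's implementation: the break points and the treatment of non-integer hours are assumptions, and the input values are hypothetical.

```r
# Range labels copied from the column headers of the tables above.
pmi_labels <- c("0 to 3", "4 to 6", "7 to 9", "10 to 12", "13 to 15",
                "16 to 18", "19 to 21", "22 to 24", "Over 24h")

# Assumed break points: right-closed intervals (-Inf, 3], (3, 6], ...,
# (21, 24], (24, Inf). A non-integer value such as 3.5 would therefore
# fall into "4 to 6".
pmi_breaks <- c(-Inf, seq(3, 24, by = 3), Inf)

# Hypothetical PMI values in hours (not CHAMPS data).
pmi_hours <- c(2, 5, 24, 30)

pmi_range <- cut(pmi_hours, breaks = pmi_breaks, labels = pmi_labels)

# Tabulate cases per range; ranges with no cases are kept with count 0.
print(table(pmi_range))
```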
To calculate the top pathogens and the associated number of cases by site of acquisition, we can use calc_top_dcd_pathogens_by_acq(). In this function, we can specify the condition and the age groups we want to include in the calculation.
For example, to calculate the top pathogens associated with lower respiratory infection deaths for ages 1-59mo:
calc_top_dcd_pathogens_by_acq(d, condition = "Lower respiratory infections", age_groups = c( "Infant (28 days to less than 12 months)", "Child (12 months to less than 60 Months)") ) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $table #> # A tibble: 27 x 4 #> # Groups: etiol [27] #> etiol Community Facility Total #> <chr> <int> <int> <int> #> 1 Klebsiella pneumoniae 37 41 78 #> 2 Streptococcus pneumoniae 61 10 71 #> 3 Cytomegalovirus (CMV) 14 20 34 #> 4 Haemophilus influenzae 18 7 25 #> 5 Staphylococcus aureus 13 11 24 #> 6 Respiratory syncytial virus (RSV) 5 14 19 #> 7 Pneumocystis jirovecii 10 7 17 #> 8 Adenovirus 2 12 14 #> 9 Pseudomonas aeruginosa 3 8 11 #> 10 Escherichia coli 9 1 10 #> # … with 17 more rows #> #> attr(,"class") #> [1] "champs_computed" "calc_top_dcd_pathogens_by_acq"
calc_top_tac_pathogens_cc(d, condition = "Lower respiratory infections", age_groups = c( "Infant (28 days to less than 12 months)", "Child (12 months to less than 60 Months)"), specimen_types = c( "Nasopharyngeal and Oropharyngeal swab", "Tissue specimen from lung"), specimen_abbrv = c("# NP+", "# Lung+") ) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $table #> # A tibble: 40 x 5 #> pathogen `# NP+ (n=454)` `# Lung+ (n=454)` `# in CC (n=454… `# in CC as LRI (n=2… #> <chr> <int> <int> <int> <int> #> 1 Klebsiella pneumon… 301 193 124 78 #> 2 Cytomegalovirus (C… 262 194 48 34 #> 3 Streptococcus pneu… 240 151 82 71 #> 4 Haemophilus influe… 239 142 31 25 #> 5 Staphylococcus aur… 235 91 33 24 #> 6 Moraxella catarrha… 208 99 3 2 #> 7 Rhinovirus 182 69 8 8 #> 8 Acinetobacter baum… 127 34 32 12 #> 9 Adenovirus 83 33 25 14 #> 10 Pseudomonas aerugi… 79 35 17 11 #> # … with 30 more rows #> #> $n #> [1] 454 #> #> $n_condition #> [1] 250 #> #> attr(,"class") #> [1] "champs_computed" "calc_top_tac_pathogens_cc"
calc_top_etiol_by_age(d, age_groups = c( "Death in the first 24 hours", "Early Neonate (24-72 hours)", "Early Neonate (72+hrs to 6 days)", "Late Neonate (7 to 27 days)") ) #> $version #> # A tibble: 1 x 3 #> dataset_name dataset_version dataset_release_date #> <chr> <dbl> <date> #> 1 CHAMPS Level2 Data 4.1 2020-08-01 #> #> $etiol_counts #> # A tibble: 69 x 3 #> # Groups: age_group_subcat [4] #> age_group_subcat etiol n #> <fct> <chr> <int> #> 1 Death in the first 24 hours Candida spp 1 #> 2 Death in the first 24 hours Chlamydia trachomatis 1 #> 3 Death in the first 24 hours Enterococcus faecalis 2 #> 4 Death in the first 24 hours Escherichia coli 11 #> 5 Death in the first 24 hours Haemophilus influenzae 2 #> 6 Death in the first 24 hours Klebsiella pneumoniae 6 #> 7 Death in the first 24 hours Listeria monocytogenes 1 #> 8 Death in the first 24 hours Streptococcus agalactiae 18 #> 9 Death in the first 24 hours Streptococcus pneumoniae 2 #> 10 Death in the first 24 hours Toxoplasma gondii 1 #> # … with 59 more rows #> #> $top #> # A tibble: 33 x 2 #> etiol n #> <chr> <int> #> 1 Acinetobacter baumannii 116 #> 2 Klebsiella pneumoniae 109 #> 3 Escherichia coli 36 #> 4 Streptococcus agalactiae 32 #> 5 Candida spp 19 #> 6 Staphylococcus aureus 19 #> 7 Enterococcus faecalis 11 #> 8 Streptococcus pneumoniae 7 #> 9 Enterococcus faecium 6 #> 10 Listeria monocytogenes 6 #> # … with 23 more rows #> #> $denominators #> # A tibble: 4 x 2 #> age_group_subcat n #> <fct> <int> #> 1 Death in the first 24 hours 268 #> 2 Early Neonate (24-72 hours) 147 #> 3 Early Neonate (72+hrs to 6 days) 100 #> 4 Late Neonate (7 to 27 days) 126 #> #> $no_etiol #> # A tibble: 4 x 2 #> age_group_subcat n #> <fct> <int> #> 1 Death in the first 24 hours 9 #> 2 Early Neonate (24-72 hours) 5 #> 3 Early Neonate (72+hrs to 6 days) 3 #> 4 Late Neonate (7 to 27 days) 4 #> #> attr(,"class") #> [1] "champs_computed" "calc_top_etiol_by_age"
calc_cc_allcases_by_age_acq(d,
  condition = "Klebsiella pneumoniae",
  age_groups = valid_age_subcats(d)
)
#> $version
#> # A tibble: 1 x 3
#>   dataset_name       dataset_version dataset_release_date
#>   <chr>                        <dbl> <date>
#> 1 CHAMPS Level2 Data             4.1 2020-08-01
#>
#> $table
#> # A tibble: 9 x 4
#> # Groups:   age_group_subcat [9]
#>   age_group_subcat                         Community Facility Total
#>   <fct>                                        <int>    <int> <int>
#> 1 Stillbirth                                       2        0     2
#> 2 Death in the first 24 hours                      6        0     6
#> 3 Early Neonate (24-72 hours)                     10       12    22
#> 4 Early Neonate (72+hrs to 6 days)                 2       29    31
#> 5 Late Neonate (7 to 27 days)                      6       44    50
#> 6 Infant (28 days to less than 6 months)          10        4    14
#> 7 Infant (6 months to less than 12 months)        17       41    58
#> 8 Child (12 months to less than 60 Months)        23       24    47
#> 9 <NA>                                             1        4     5
#>
#> $denominators
#> # A tibble: 9 x 2
#>   age_group_subcat                             n
#>   <fct>                                    <int>
#> 1 Stillbirth                                 381
#> 2 Death in the first 24 hours                268
#> 3 Early Neonate (24-72 hours)                147
#> 4 Early Neonate (72+hrs to 6 days)           100
#> 5 Late Neonate (7 to 27 days)                126
#> 6 Infant (28 days to less than 6 months)      65
#> 7 Infant (6 months to less than 12 months)   162
#> 8 Child (12 months to less than 60 Months)   207
#> 9 <NA>                                        20
#>
#> attr(,"class")
#> [1] "champs_computed"             "calc_cc_allcases_by_age_acq"
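The counts and denominators in this result can be combined into per-age-group percentages. The following is a minimal base R sketch, not part of the package: it re-enters the numbers printed above by hand rather than reading them from the returned object, it leaves out the `<NA>` row, and it assumes that `$denominators` is the appropriate denominator for each row of `$table`. The names `kp_by_age`, `total`, and `denominator` are illustrative only.

```r
# Values copied by hand from the $table and $denominators output
# (the <NA> age group row is intentionally left out).
age_group <- c(
  "Stillbirth",
  "Death in the first 24 hours",
  "Early Neonate (24-72 hours)",
  "Early Neonate (72+hrs to 6 days)",
  "Late Neonate (7 to 27 days)",
  "Infant (28 days to less than 6 months)",
  "Infant (6 months to less than 12 months)",
  "Child (12 months to less than 60 Months)"
)
total       <- c(2, 6, 22, 31, 50, 14, 58, 47)          # "Total" column
denominator <- c(381, 268, 147, 100, 126, 65, 162, 207) # "n" column

# Assumption: each denominator applies to the matching row of the table.
kp_by_age <- data.frame(
  age_group   = age_group,
  total       = total,
  denominator = denominator,
  pct         = round(100 * total / denominator, 1)
)

print(kp_by_age)
```

With the real result object, the same division would presumably be done after joining the two tibbles on `age_group_subcat`, which avoids relying on row order.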
calc_syndrome_combinations(d,
  condition = "Streptococcus pneumoniae",
  syndrome_names = c(
    "Lower respiratory infections",
    "Meningitis/Encephalitis",
    "Neonatal sepsis",
    "Congenital infection"),
  syndrome_values = c(
    "Pneumonia",
    "Meningitis",
    "Sepsis",
    "Sepsis"),
  specimen_types = c(
    "Cerebrospinal fluid sample",
    "Tissue specimen from lung",
    "Whole blood")
)
#> $version
#> # A tibble: 1 x 3
#>   dataset_name       dataset_version dataset_release_date
#>   <chr>                        <dbl> <date>
#> 1 CHAMPS Level2 Data             4.1 2020-08-01
#>
#> $table
#> # A tibble: 6 x 2
#>   syndrome                          n
#>   <chr>                         <int>
#> 1 Pneumonia                        41
#> 2 Pneumonia, Sepsis                23
#> 3 Sepsis                           15
#> 4 Meningitis, Pneumonia, Sepsis     7
#> 5 Meningitis                        3
#> 6 Meningitis, Pneumonia             2
#>
#> $age_breakdown
#> # A tibble: 5 x 2
#>   age_group                                    n
#>   <fct>                                    <int>
#> 1 Stillbirth                                   2
#> 2 Death in the first 24 hours                  2
#> 3 Early Neonate (1 to 6 days)                  5
#> 4 Infant (28 days to less than 12 months)     41
#> 5 Child (12 months to less than 60 Months)    41
#>
#> $cc_leading_to_death
#> $cc_leading_to_death$numerator
#> [1] 91
#>
#> $cc_leading_to_death$denominator
#> [1] 1476
#>
#> $cc_leading_to_death$pct
#> [1] 6.165312
#>
#>
#> $tac_age_breakdown
#> # A tibble: 6 x 3
#>   age_group                                    n   pct
#>   <fct>                                    <int> <dbl>
#> 1 Stillbirth                                 254  7.40
#> 2 Death in the first 24 hours                203  5.91
#> 3 Early Neonate (1 to 6 days)                418 12.2
#> 4 Late Neonate (7 to 27 days)                347 10.1
#> 5 Infant (28 days to less than 12 months)   1044 30.4
#> 6 Child (12 months to less than 60 Months)  1167 34.0
#>
#> attr(,"class")
#> [1] "champs_computed"            "calc_syndrome_combinations"
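As a quick consistency check on this output, the syndrome combination counts in `$table` add up to the `$cc_leading_to_death` numerator, and `pct` is that numerator expressed as a percentage of the denominator. The sketch below reproduces the arithmetic in base R from the printed numbers only. It does not call the package, and the variable names are illustrative.

```r
# Counts copied by hand from the $table output above.
syndrome_n <- c(
  "Pneumonia"                     = 41,
  "Pneumonia, Sepsis"             = 23,
  "Sepsis"                        = 15,
  "Meningitis, Pneumonia, Sepsis" = 7,
  "Meningitis"                    = 3,
  "Meningitis, Pneumonia"         = 2
)

numerator   <- sum(syndrome_n)  # matches $cc_leading_to_death$numerator
denominator <- 1476             # copied from $cc_leading_to_death$denominator

# Percentage of the denominator accounted for by the numerator.
pct <- 100 * numerator / denominator

print(numerator)
print(round(pct, 6))
```

The `$age_breakdown` counts (2, 2, 5, 41, 41) add up to the same numerator, so the two tables appear to break down the same set of cases in two ways.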
calc_cc_by_age_syndrome(d,
  condition = "Streptococcus pneumoniae",
  syndrome_names = c(
    "Lower respiratory infections",
    "Meningitis/Encephalitis",
    "Neonatal sepsis",
    "Congenital infection"),
  syndrome_values = c(
    "Pneumonia",
    "Meningitis",
    "Sepsis",
    "Sepsis")
)
#> $version
#> # A tibble: 1 x 3
#>   dataset_name       dataset_version dataset_release_date
#>   <chr>                        <dbl> <date>
#> 1 CHAMPS Level2 Data             4.1 2020-08-01
#>
#> $table
#> # A tibble: 6 x 8
#>   age_group Meningitis `Meningitis, Pn… `Meningitis, Pn… Pneumonia `Pneumonia, Sep… Sepsis
#>   <fct>          <int>            <int>            <int>     <int>            <int>  <int>
#> 1 Stillbir…          0                0                0         0                0      2
#> 2 Death in…          0                0                0         1                0      1
#> 3 Early Ne…          1                0                0         1                0      3
#> 4 Late Neo…          0                0                0         0                0      0
#> 5 Infant (…          1                0                5        18               11      6
#> 6 Child (1…          1                2                2        21               12      3
#> # … with 1 more variable: `NA` <int>
#>
#> attr(,"class")
#> [1] "champs_computed"         "calc_cc_by_age_syndrome"
In the next article, we will show how the results of some of these calculations can be transformed into outputs such as tables and plots.