R/process_accel.R
process_mort.Rd
This function creates a clean mortality dataset which can be combined with data from the NHANES 2003-2004/2005-2006 waves.
process_mort(waves = c("C", "D"), mort_release_yr = 2011, localpath = NULL)
waves | Character vector indicating the waves to process. Defaults to a vector with "C" and "D", corresponding to the 2003-2004 and 2005-2006 waves. |
---|---|
mort_release_yr | Numeric value indicating the year associated with the raw mortality data to be processed. The default, 2011, corresponds to the most recent raw mortality data included in the data package. |
localpath | Character scalar describing the location where the raw data are stored. If NULL, the function will look in the package data directory for the requested raw mortality data. Defaults to NULL. |
This function returns a list whose number of elements is less than or equal to the number of waves specified by the "waves" argument. The exact number of elements depends on whether all files requested by the user are found in either: 1) the local directory indicated by the localpath argument; or 2) the data package. Because the mortality data provided change from release to release, the columns of each element depend on the release year.
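Because the returned list can be shorter than the "waves" argument, it may be worth checking which waves actually came back before indexing into the list. The following is a minimal sketch, assuming list elements are named Mortality_<release year>_<wave> (the pattern seen in Mortality_2011_C). The toy list stands in for the output of process_mort(), and returned_waves is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (not in the package): pull the wave letter out of
# element names such as "Mortality_2011_C".
returned_waves <- function(mort_ls) {
  sub("^Mortality_[0-9]+_", "", names(mort_ls))
}

# Toy stand-in for the list returned by process_mort(); only wave "C" was found.
mort_ls <- list(Mortality_2011_C = data.frame(SEQN = c(21005, 21006)))

# Compare the waves that were requested against the waves that were returned.
requested <- c("C", "D")
missing_waves <- setdiff(requested, returned_waves(mort_ls))
if (length(missing_waves) > 0) {
  message("No mortality data returned for wave(s): ",
          paste(missing_waves, collapse = ", "))
}
```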
For the 2011 release year data, each element of the list returned is a data frame with columns:
SEQN: Unique subject identifier
eligstat: Eligibility status for mortality follow-up
1: Eligible
2: Under age 18, not available for public release
3: Ineligible
mortstat: Indicator for whether the participant was found to be alive or deceased at the follow-up time given by permth_exm and permth_int
0: Assumed alive
1: Assumed deceased
NA: Under age 18, not available for public release or ineligible for mortality follow-up
permth_exm: Time in months from the mobile examination center (MEC) assessment to the point at which mortality status was assessed.
permth_int: Time in months from the household interview to the point at which mortality status was assessed.
ucod_leading: Underlying cause of death recode from UCOD_113 leading causes where available. Specific causes:
001: Diseases of the heart (I00-I09, I11, I13, I20-I51)
002: Malignant neoplasms (C00-C97)
003: Chronic lower respiratory diseases (J40-J47)
004: Accidents (unintentional injuries) (V01-X59, Y85-Y86)
005: Cerebrovascular diseases (I60-I69)
006: Alzheimer's disease (G30)
007: Diabetes mellitus (E10-E14)
008: Influenza and pneumonia (J09-J18)
009: Nephritis, nephrotic syndrome and nephrosis (N00-N07, N17-N19, N25-N27)
010: All other causes (residual)
NA: Ineligible, under age 18, assumed alive or no cause data
diabetes_mcod: Diabetes flag from multiple cause of death (mcod)
hyperten_mcod: Hypertension flag from multiple cause of death (mcod)
mortscrce_ndi: Mortality source: NDI match
mortscrce_ssa: Mortality source: SSA information
mortscrce_cms: Mortality source: CMS information
mortscrce_dc: Mortality source: death certificate match
mortscrce_dcl: Mortality source: data collection
For the 2015 release year data, only the first 8 columns described above are provided.
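The numeric codes above are easier to work with once they carry labels. The following is a minimal sketch on a made-up data frame that uses a few of the columns described above. It assumes ucod_leading may be stored either as three-character strings such as "002" or as numbers; the actual storage type in the returned data frames is not confirmed here. The new column names (eligstat_label, ucod_label, years_exm) are illustrative only.

```r
# Toy rows using a subset of the documented columns (all values are made up).
mort <- data.frame(
  SEQN         = c(1, 2, 3, 4),
  eligstat     = c(1, 1, 2, 3),
  permth_exm   = c(60, 24, NA, NA),
  ucod_leading = c(NA, "002", NA, NA),
  stringsAsFactors = FALSE
)

# Labels copied from the code lists in this documentation.
elig_labels <- c("Eligible",
                 "Under age 18, not available for public release",
                 "Ineligible")
ucod_labels <- c(
  "001" = "Diseases of the heart",
  "002" = "Malignant neoplasms",
  "003" = "Chronic lower respiratory diseases",
  "004" = "Accidents (unintentional injuries)",
  "005" = "Cerebrovascular diseases",
  "006" = "Alzheimer's disease",
  "007" = "Diabetes mellitus",
  "008" = "Influenza and pneumonia",
  "009" = "Nephritis, nephrotic syndrome and nephrosis",
  "010" = "All other causes (residual)"
)

# Eligibility status as a labelled factor.
mort$eligstat_label <- factor(mort$eligstat, levels = 1:3, labels = elig_labels)

# Normalise codes to three-digit strings so numeric and character storage both work.
codes <- ifelse(is.na(mort$ucod_leading), NA_character_,
                sprintf("%03d", as.integer(mort$ucod_leading)))
mort$ucod_label <- unname(ucod_labels[codes])

# Follow-up time from the MEC exam, converted from months to years.
mort$years_exm <- mort$permth_exm / 12
```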
As of writing, this function has only been tested on the 2011 release of the 2003-2004 and 2005-2006 NHANES mortality data. The raw data come in the form of a vector of strings, with each string associated with one participant. Assuming mortality releases for other waves use the same format, this function should be able to process them as well. As future mortality data are released, we will update the package with both the processed and raw mortality data for the NHANES 2003-2006 waves. If necessary, we will modify the code so that it can process all releases of the mortality data for 2011 and beyond. The documentation here will be updated as we confirm that future mortality data releases are processed correctly by this function.
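To illustrate what processing one string per participant can look like, here is a minimal sketch of fixed-width parsing with substr(). The field positions in layout are hypothetical and chosen only for this toy example; they do not reflect the real file specification or the package's internal code.

```r
# Hypothetical layout: start/end positions are illustrative only and do NOT
# correspond to the real raw mortality file.
layout <- data.frame(
  name  = c("SEQN", "eligstat", "permth_int"),
  start = c(1, 6, 7),
  end   = c(5, 6, 9),
  stringsAsFactors = FALSE
)

# Toy raw records: one fixed-width string per participant.
raw <- c("210051 60", "210062   ")

# Cut each field out of every string, trim padding, and convert to numeric.
# Blank fields become NA.
fields <- lapply(seq_len(nrow(layout)), function(i) {
  txt <- trimws(substr(raw, layout$start[i], layout$end[i]))
  as.numeric(ifelse(txt == "", NA, txt))
})
parsed <- as.data.frame(setNames(fields, layout$name))
```

A layout table of this kind keeps the column positions in one place, so a release with a different layout would only require a different table.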
National Center for Health Statistics. Office of Analysis and Epidemiology, Public-use Linked Mortality File, 2015. Hyattsville, Maryland. (Available at the following address: http://www.cdc.gov/nchs/data_access/data_linkage/mortality.htm)
library("rnhanesdata")

## process NHANES mortality data using the raw mortality data release from 2011
## that comes with the package
mort_ls <- process_mort()

## verify that this yields identical results to the processed data included in the package
identical(mort_ls$Mortality_2011_C, Mortality_2011_C)
#> [1] TRUE
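Since SEQN is the unique subject identifier, a processed mortality data frame can be combined with other participant-level NHANES data by joining on that column. The following is a minimal sketch using base R merge() on made-up data. Here covar is a hypothetical stand-in for any participant-level table and mort for one element of the list returned by process_mort().

```r
# Toy stand-ins (values are made up): `covar` for participant-level data,
# `mort` for one processed mortality data frame.
covar <- data.frame(SEQN = c(1, 2, 3), age = c(45, 60, 30))
mort  <- data.frame(SEQN = c(1, 2), eligstat = c(1, 1), permth_exm = c(60, 24))

# Left join on the subject identifier; participants without a mortality
# record keep their row and get NA in the mortality columns.
combined <- merge(covar, mort, by = "SEQN", all.x = TRUE)
```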