This function creates a clean mortality dataset which can be combined with data from the NHANES 2003-2004/2005-2006 waves.

process_mort(waves = c("C", "D"), mort_release_yr = 2011, localpath = NULL)

Arguments

waves

Character vector indicating the waves to process. Defaults to a vector with "C" and "D", corresponding to the 2003-2004 and 2005-2006 waves.

mort_release_yr

Numeric value indicating the year associated with the raw mortality data to be processed. The default, 2011, corresponds to the most recent raw mortality data included in the data package.

localpath

Character scalar describing the location where the raw data are stored. If NULL, the function will look in the package data directory for the requested raw mortality data. Defaults to NULL.

Value

This function returns a list whose number of elements is less than or equal to the number of waves specified by the "waves" argument. The exact number of elements returned depends on whether all files requested by the user are found in either 1) the local directory indicated by the localpath argument or 2) the data package. Because the mortality data provided change from year to year, the columns of each element depend on the release year.
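Because the returned list can be shorter than the "waves" argument, it may be worth checking which waves actually came back. The following is a minimal sketch, assuming the list elements are named following the Mortality_<release year>_<wave> pattern that appears in the Examples section (e.g. Mortality_2011_C). Here mort_ls is a hand-built stand-in rather than real output, and expected and missing_waves are illustrative names.

```r
# Stand-in for the list returned by process_mort(); here only wave "C" was found.
mort_ls <- list(Mortality_2011_C = data.frame(SEQN = c(21005, 21006)))

waves           <- c("C", "D")
mort_release_yr <- 2011

# Assumed naming pattern: Mortality_<release year>_<wave>
expected <- paste0("Mortality_", mort_release_yr, "_", waves)

# Waves that were requested but are absent from the result
missing_waves <- waves[!expected %in% names(mort_ls)]

if (length(missing_waves) > 0) {
  message("No mortality data returned for wave(s): ",
          paste(missing_waves, collapse = ", "))
}
```

With real output, replacing the stand-in with the result of process_mort() should be the only change needed, provided the naming assumption holds.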

For the 2011 release year data, each element of the list returned is a data frame with columns:

  • SEQN: Unique subject identifier

  • eligstat: Eligibility status for mortality follow-up

    • 1: Eligible

    • 2: Under age 18, not available for public release

    • 3: Ineligible

  • mortat: Indicator for whether participant was found to be alive or deceased at follow-up time given by permth_exm and permth_int

    • 0: Assumed alive

    • 1: Assumed deceased

    • NA: Under age 18, not available for public release or ineligible for mortality follow-up

  • permth_exm: Follow-up time, in months, from the mobile examination center (MEC) assessment to the point at which mortality was assessed.

  • permth_int: Follow-up time, in months, from the household interview to the point at which mortality was assessed.

  • ucod_leading: Underlying cause of death recode from UCOD_113 leading causes where available. Specific causes:

    • 001: Diseases of the heart (I00-I09, I11, I13, I20-I51)

    • 002: Malignant neoplasms (C00-C97)

    • 003: Chronic lower respiratory diseases (J40-J47)

    • 004: Accidents (unintentional injuries) (V01-X59, Y85-Y86)

    • 005: Cerebrovascular diseases (I60-I69)

    • 006: Alzheimer's disease (G30)

    • 007: Diabetes mellitus (E10-E14)

    • 008: Influenza and pneumonia (J09-J18)

    • 009: Nephritis, nephrotic syndrome and nephrosis (N00-N07, N17-N19, N25-N27)

    • 010: All other causes (residual)

    • NA: Ineligible, under age 18, assumed alive or no cause data

  • diabetes_mcod: Diabetes flag from multiple cause of death (mcod)

  • hyperten_mcod: Hypertension flag from multiple cause of death (mcod)

  • mortscrce_ndi: Mortality source: NDI match

  • mortscrce_ssa: Mortality source: SSA information

  • mortscrce_cms: Mortality source: CMS information

  • mortscrce_dc: Mortality source: death certificate match

  • mortscrce_dcl: Mortality source: data collection

For the 2015 release year data, only the first 8 columns described above are provided.
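The coded columns are often easier to work with once converted to labelled factors. The sketch below does this for eligstat and ucod_leading on a small toy data frame, using the code values and labels listed above. The data values are made up for illustration, and whether the real columns are stored as numeric or character is an assumption; the conversion through as.integer() is meant to cope with either.

```r
# Toy data using the documented column names; the values are made up.
mort <- data.frame(
  SEQN         = 1:5,
  eligstat     = c(1, 1, 2, 3, 1),
  ucod_leading = c(NA, "001", NA, NA, "010"),
  stringsAsFactors = FALSE
)

# Eligibility status: codes 1-3 as documented
mort$eligstat_lab <- factor(
  mort$eligstat,
  levels = 1:3,
  labels = c("Eligible",
             "Under age 18, not available for public release",
             "Ineligible")
)

# Leading underlying cause of death: codes 001-010 as documented
ucod_labels <- c(
  "Diseases of the heart",
  "Malignant neoplasms",
  "Chronic lower respiratory diseases",
  "Accidents (unintentional injuries)",
  "Cerebrovascular diseases",
  "Alzheimer's disease",
  "Diabetes mellitus",
  "Influenza and pneumonia",
  "Nephritis, nephrotic syndrome and nephrosis",
  "All other causes (residual)"
)

# as.integer() maps "001" -> 1, ..., "010" -> 10; NA stays NA
mort$ucod_lab <- factor(as.integer(mort$ucod_leading),
                        levels = 1:10, labels = ucod_labels)
```

Keeping the original coded columns alongside the labelled ones makes it easy to cross-check against the code lists in this documentation.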

Details

As of writing, this function has only been tested on the 2011 release for the 2003-2004 and 2005-2006 NHANES mortality data. The raw data come in the form of a vector of strings, with each string associated with one participant. Assuming mortality releases for other waves use the same format, this function should be able to process them as well. As future mortality data are released, we will update the package with both the processed and raw mortality data for the NHANES 2003-2006 waves. If necessary, we will modify the code to be able to process all releases of the mortality data for 2011 and beyond. The documentation here will be updated as we confirm future mortality data releases are processed correctly using this function.
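To illustrate what processing a vector of strings with one string per participant can look like, the sketch below slices fixed-width records with substr(). It is not the package's implementation: the column positions, the helper parse_mort_lines(), and the sample records are all hypothetical and do not reflect the actual NCHS file layout.

```r
# Hypothetical layout (NOT the real NCHS positions): name = c(start, end)
layout <- list(
  SEQN       = c(1, 5),
  eligstat   = c(6, 6),
  permth_exm = c(7, 9)
)

# Hypothetical helper: cut each record into fields and convert to numeric
parse_mort_lines <- function(lines, layout) {
  cols <- lapply(layout, function(pos) {
    field <- trimws(substr(lines, pos[1], pos[2]))
    # Blank or "." fields are treated as missing
    field[field %in% c("", ".")] <- NA
    as.numeric(field)
  })
  as.data.frame(cols)
}

# Made-up records, one string per participant
raw_lines <- c("210051 85", "210062  .", "210071120")
parsed <- parse_mort_lines(raw_lines, layout)
```

Keeping the layout in a named list separates the file format from the parsing logic, so a release with different positions would only require a different layout under this approach.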

References

National Center for Health Statistics. Office of Analysis and Epidemiology, Public-use Linked Mortality File, 2015. Hyattsville, Maryland. (Available at the following address: http://www.cdc.gov/nchs/data_access/data_linkage/mortality.htm)

Examples

library("rnhanesdata")

## process NHANES mortality data using the raw mortality data release from 2011 that comes
## with the package
mort_ls <- process_mort()
#>   |                                                                      |   0%
#>   |===================================                                   |  50%
#>   |======================================================================| 100%

## verify that this yields identical results to the processed data included in the package
identical(mort_ls$Mortality_2011_C, Mortality_2011_C)
#> [1] TRUE
identical(mort_ls$Mortality_2011_D, Mortality_2011_D)
#> [1] TRUE
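
The processed mortality data are intended to be combined with other NHANES data. A minimal sketch of such a join on SEQN with base R's merge() follows. Both data frames are toy stand-ins with made-up values, covar and its age column are hypothetical, and converting permth_exm to years by dividing by 12 is an illustrative choice rather than something the package does.

```r
# Toy mortality data using documented column names; values are made up.
mort <- data.frame(
  SEQN       = c(21005, 21006, 21007),
  eligstat   = c(1, 2, 1),
  permth_exm = c(85, NA, 120)
)

# Hypothetical covariate data keyed by the same subject identifier
covar <- data.frame(
  SEQN = c(21005, 21007, 21009),
  age  = c(54, 67, 41)
)

# Left join: keep every participant in covar, add mortality columns where available
combined <- merge(covar, mort, by = "SEQN", all.x = TRUE)

# Follow-up time in years since the MEC assessment
combined$followup_yrs <- combined$permth_exm / 12
```

Participants without a matching mortality record keep NA in the mortality columns, which makes unmatched rows easy to spot before any analysis.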