Variable transformation and harmonization for the Canadian Community Health Survey
Big-Life-Lab/cchsflow
cchsflow supports the use of the Canadian Community Health Survey (CCHS) by transforming variables from each cycle into harmonized, consistent versions that span survey cycles (currently, 2001 to 2018).
The CCHS is a population-based cross-sectional survey of Canadians that has been administered every two years since 2001. There are approximately 130,000 respondents per cycle. Studies use multiple CCHS cycles to examine trends over time and to increase sample size when sub-groups are too small to examine in a single cycle.
The CCHS is one of the largest and most robust ongoing population health surveys worldwide. The CCHS, administered by Statistics Canada, is Canada's main general population health survey. Information about the survey is found here. The CCHS has a Statistics Canada Open Licence.
Each cycle of the CCHS contains over 1000 variables that cover four main topics: sociodemographic measures, health behaviours, health status and health care use. The seemingly consistent questions across CCHS cycles entice you to combine them to increase sample size; however, you soon realize a challenge...
Imagine you want to use BMI (body mass index) for a study that spans CCHS 2001 to 2018. BMI seems like a straightforward measure that is routinely collected worldwide. Indeed, BMI is included in all CCHS cycles. You examine the documentation and find that the variable `HWTAGBMI` in the CCHS 2001 corresponds to body mass index, but that in other cycles the variable name changes to `HWTCGBMI`, `HWTDGBMI`, `HWTEGBMI`, etc. On reading the documentation, you notice that some cycles round the value to one decimal, whereas other cycles round to two decimals. Furthermore, some cycles don't calculate BMI for respondents younger than 20 or older than 64 years. Also, some cycles calculate BMI only if height and weight are within specific ranges. These types of changes occur for almost all CCHS variables. Sometimes the changes are subtle and difficult to find in the documentation, even for seemingly straightforward variables such as BMI.

cchsflow harmonizes the BMI variable across different cycles.
cchsflow creates harmonized variables (where possible) between CCHS cycles. Searching BMI in variables (described in the Introduction section of the variableDetails.csv vignette) shows that `HWTGBMI` calculates BMI with two decimal places for all cycles and all respondents, using the respondents' untruncated height and weight.
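To make the idea concrete, the following base R sketch mimics this kind of harmonization on made-up data. It does not call cchsflow and is not how the package is implemented: the toy data frames, the columns `height_m` and `weight_kg`, the pairing of `HWTCGBMI` with a 2003 cycle, and the helper `harmonize_bmi()` are all assumptions for illustration.

```r
# Toy stand-ins for two cycles; every value is invented for illustration.
cycle_2001 <- data.frame(
  HWTAGBMI  = c(24.3, 31.0),   # cycle-specific BMI, one decimal
  height_m  = c(1.70, 1.65),
  weight_kg = c(70.2, 84.4)
)
cycle_2003 <- data.frame(
  HWTCGBMI  = c(22.86),        # cycle-specific BMI, two decimals
  height_m  = c(1.80),
  weight_kg = c(74.1)
)

# Hypothetical helper: recompute BMI from height and weight and store it
# under one harmonized name, rounded to two decimals in every cycle.
harmonize_bmi <- function(df, cycle) {
  data.frame(
    cycle   = cycle,
    HWTGBMI = round(df$weight_kg / df$height_m^2, 2),
    stringsAsFactors = FALSE
  )
}

# Stack the cycles now that they share a variable name and precision.
combined <- rbind(
  harmonize_bmi(cycle_2001, "2001"),
  harmonize_bmi(cycle_2003, "2003")
)
print(combined)
```

Recomputing from height and weight, rather than renaming the published BMI columns, is what lets the rounding be made consistent across cycles in this sketch.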
Calculate a harmonized BMI variable for CCHS 2001 cycle
```r
# load test cchs data - included in cchsflow
library(cchsflow)

cchs2001_BMI <- rec_with_table(cchs2001_p, "HWTGBMI")
```

Notes printed to console indicate issues that may affect BMI classification for your study.
```
Loading cchsflow variable_details
Using the passed data variable name as database_name
NOTE for HWTGBMI : CCHS 2001 restricts BMI to ages 20-64
NOTE for HWTGBMI : CCHS 2001 and 2003 codes not applicable and missing variables as 999.6 and 999.7-999.9 respectively, while CCHS 2005 onwards codes not applicable and missing variables as 999.96 and 999.7-999.99 respectively
NOTE for HWTGBMI : Don't know (999.7) and refusal (999.8) not included
in 2001 CCHS"
```

Combining CCHS across survey cycles will result in misclassification error and other forms of bias that affect studies in different ways. The transformations that are described in this repository have been used in several research projects, but there are no guarantees regarding their accuracy or appropriate uses. Thomas and Wannell describe methodological issues when combining CCHS cycles.
Care must be taken to understand how specific variable transformation and harmonization with cchsflow affect your study or use of CCHS data. Across survey cycles, almost all CCHS variables have had at least some change in wording and category responses. Furthermore, there have been changes in survey sampling, response rates, weighting methods and other survey design changes that affect responses.
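As one illustration of the coding differences flagged in the console notes, the base R sketch below sets not-applicable and missing codes to `NA` before any analysis. It is a simplified assumption and not a description of how cchsflow treats these codes; the helper `bmi_codes_to_na()` and the example values are hypothetical, and the thresholds are read from the notes (codes begin at 999.6 in CCHS 2001 and 2003, and at 999.96 from CCHS 2005 onwards).

```r
# Hypothetical helper: replace not-applicable and missing codes with NA.
# Thresholds are taken from the console notes, not from cchsflow internals.
bmi_codes_to_na <- function(bmi, cycle_year) {
  first_code <- if (cycle_year <= 2003) 999.6 else 999.96
  # A small tolerance guards against floating-point representation of the codes.
  bmi[!is.na(bmi) & bmi >= first_code - 1e-6] <- NA
  bmi
}

bmi_2001 <- c(23.4, 999.6, 999.9, 31.2)  # invented values
bmi_2005 <- c(27.15, 999.96, 999.99)     # invented values

clean_2001 <- bmi_codes_to_na(bmi_2001, 2001)
clean_2005 <- bmi_codes_to_na(bmi_2005, 2005)
print(clean_2001)
print(clean_2005)
```

Without a step like this, a code such as 999.6 would be averaged in as if it were a real BMI value once cycles are stacked.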
```r
# Install release version from CRAN
install.packages("cchsflow")

# Install the most recent version from GitHub
devtools::install_github("Big-Life-Lab/cchsflow")
```

You can download and use the latest version of variables.csv and variable_details.csv from GitHub.
The cchsflow package includes:
- variables.csv - a list of variables that can be transformed across CCHS surveys.
- variable_details.csv - information that describes how the variables are recoded.
- Vignettes - describe how to use R to transform or generate new derived variables that are listed in variables.csv. Transformations are performed using `rec_with_table()`. variables.csv and variable_details.csv can be used with other statistics programs (see issue).
- Demonstration CCHS data - cchsflow includes a random sample of 200 respondents from each CCHS PUMF file from 2001 to 2018. These data are used for the vignettes.

The CCHS test data is stored in /data as .RData files. They can be read as a package database.
```r
# read the CCHS 2017-2018 PUMF test data
test_data <- cchs2017_2018_p
```

This repository does not include the full CCHS data. Information on how to access the CCHS data can be found here. The Canadian university community can also access the CCHS through ODESI (see health/Canada/Canadian Community Health Survey).
Projects on the roadmap can be found here.
Please follow this guide if you would like to contribute to the cchsflow package.
We encourage PRs for additional variable transformations and derived variables that you believe may be helpful to the broad CCHS community.
Currently, cchsflow supports R through the `rec_with_table()` function. The CCHS community commonly uses SAS, Stata and other statistical packages. Please feel free to contribute to cchsflow by making a PR that creates versions of `rec_with_table()` for other statistical and programming languages.
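For anyone considering such a port, the core idea of table-driven recoding can be sketched in a few lines. The example below is base R on an invented rules table. The column names `database`, `variable_start` and `variable_end` are simplified placeholders rather than the actual layout of variable_details.csv, and `recode_from_table()` is a hypothetical stand-in, not the `rec_with_table()` implementation.

```r
# Invented rules table: which source column feeds the harmonized variable
# in each database. Placeholder columns, not the variable_details.csv schema.
rules <- data.frame(
  database       = c("cchs2001_p", "cchs2003_p"),
  variable_start = c("HWTAGBMI", "HWTCGBMI"),
  variable_end   = c("HWTGBMI", "HWTGBMI"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the rule for one database and one harmonized
# variable, then return that source column under the harmonized name.
recode_from_table <- function(data, database_name, variable, rules) {
  rule <- rules[rules$database == database_name &
                  rules$variable_end == variable, ]
  if (nrow(rule) != 1) {
    stop("expected exactly one rule for ", variable, " in ", database_name)
  }
  out <- data.frame(data[[rule$variable_start]])
  names(out) <- variable
  out
}

toy_2003 <- data.frame(HWTCGBMI = c(22.5, 28.1))  # invented values
recoded <- recode_from_table(toy_2003, "cchs2003_p", "HWTGBMI", rules)
print(recoded)
```

Keeping the rules in a table rather than in code is what would make the same CSV files reusable from SAS, Stata or another language: only the small lookup function needs to be rewritten.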
CCHS data used in this library is accessed and adapted in accordance with the Statistics Canada Open Licence Agreement.
Source from Statistics Canada, Canadian Community Health Survey 2001 to 2018 PUMF, accessed March 2022. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Adapted from Statistics Canada, Canadian Community Health Surveys 2001 to 2018 PUMF, accessed March 2022. This does not constitute an endorsement by Statistics Canada of this product.