Type: | Package |
Title: | Transforming and Harmonizing CCHS Variables |
Version: | 2.1.0 |
Date: | 2022-05-05 |
Depends: | R (≥ 3.5.0), haven (≥ 1.1.2), dplyr (≥ 0.8.2), sjlabelled (≥ 1.0.17), stringr (≥ 1.2.0), magrittr |
Description: | Supporting the use of the Canadian Community Health Survey (CCHS) by transforming variables from each cycle into harmonized, consistent versions that span survey cycles (currently, 2001 to 2018). CCHS data used in this library is accessed and adapted in accordance to the Statistics Canada Open Licence Agreement. This package uses rec_with_table(), which was developed from 'sjmisc' rec(). Lüdecke D (2018). "sjmisc: Data and Variable Transformation Functions". Journal of Open Source Software, 3(26), 754. <doi:10.21105/joss.00754>. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
URL: | https://github.com/Big-Life-Lab/cchsflow |
BugReports: | https://github.com/Big-Life-Lab/cchsflow/issues |
RoxygenNote: | 7.1.2 |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2022-05-26 13:12:18 UTC; Kitty |
Author: | Doug Manuel |
Maintainer: | Kitty Chen <kitchen@ohri.ca> |
Repository: | CRAN |
Date/Publication: | 2022-05-26 14:10:02 UTC |
Type of drinker (12 months)
Description
NOTE: this is not a function.
This is a categorical variable derived by Statistics Canada that uses various intermediate alcohol variables to categorize individuals into 3 distinct groups:
Regular Drinker
Occasional Drinker
No drink in the last 12 months.
Usage
ALCDTTM(ALCDTTM)
Arguments
ALCDTTM |
cchsflow variable name for type of drinker (12 months) |
Details
This variable was introduced in the 2007-2008 cycle of the CCHS, and became the sole derived variable that categorized people into various drinker types from 2009 onwards. Unlike ALCDTYP, this variable does not distinguish between former and never drinkers.
Examples
library(cchsflow)
?ALCDTTM
Type of drinker
Description
NOTE: this is not a function.
This is a categorical variable derived by Statistics Canada that uses various intermediate alcohol variables to categorize individuals into 4 distinct groups:
Regular Drinker
Occasional Drinker
Former Drinker
Never Drinker
Usage
ALCDTYP(ALCDTYP)
Arguments
ALCDTYP |
cchsflow variable name for type of drinker |
Details
This variable is used in CCHS cycles from 2001 to 2007. How it was derived remained consistent during these years.
Starting in 2007, Statistics Canada created a derived variable that looked at drinking type in the last 12 months. This new derived variable did not distinguish between former and never drinkers. If your research requires you to differentiate between former and never drinkers, we recommend using earlier cycles of the CCHS.
Examples
library(cchsflow)
?ALCDTYP
Average daily alcohol consumption
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the mean daily consumption of alcohol. This takes the value of ALWDWKY and divides it by 7.
Usage
ALWDDLY(ALWDDLY)
Arguments
ALWDDLY |
cchsflow variable name for average daily alcohol consumption |
Details
This variable is present in every CCHS cycle used in cchsflow, and how it was derived remains consistent.
Examples
library(cchsflow)
?ALWDDLY
Number of drinks consumed in the past week
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the amount of alcohol that is consumed in a week. This is calculated by adding the number of drinks consumed each day in the past week. Respondents of each CCHS cycle are asked how much alcohol they have consumed each day in the past week (ie. how much alcohol did you consume on Sunday, how much did you consume on Monday etc.). Each day is considered an individual variable and ALWDWKY takes the sum of all daily variables.
Usage
ALWDWKY(ALWDWKY)
Arguments
ALWDWKY |
cchsflow variable name for number of drinks consumed in the past week |
Details
This variable is present in every CCHS cycle used in cchsflow, and how it was derived remains consistent.
Examples
library(cchsflow)
?ALWDWKY
Any alcohol past week
Description
NOTE: this is not a function.
This is a categorical variable derived by Statistics Canada that determines if alcohol was consumed in the past week. The variable is optional in selected provinces and territories.
Usage
ALW_1(ALW_1)
Arguments
ALW_1 |
cchsflow variable name for any alcohol past week |
Details
This variable is present in every CCHS cycle used in cchsflow. In 2007 and 2008, the variable is optional for Newfoundland and Labrador, Nova Scotia, Ontario, British Columbia and Nunavut.In 2009 and 2010, the variable is optional for Newfoundland and Labrador, Ontario, and Saskatchewan. In 2011, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, and Saskatchewan. In 2012, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, Nunavut, and Saskatchewan.In 2013, the variable is optional for Quebec, Ontario, Prince Edward Island, Manitoba, Yukon, and Saskatchewan. In 2014, the variable is optional for Nunavut, Quebec, Ontario, Prince Edward Island, Manitoba, Newfoundland and Labrador, Saskatchewan, and British Columbia.
Examples
library(cchsflow)
?ALW_1
Number of drinks - Sunday
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the number of alcoholic drinks consumed on Sunday. The variable is optional in selected provinces and territories.
Usage
ALW_2A1(ALW_2A1)
Arguments
ALW_2A1 |
cchsflow variable name for number of drinks on Sunday |
Details
This variable is present in every CCHS cycle used in cchsflow. In 2007 and 2008, the variable is optional for Newfoundland and Labrador, Nova Scotia, Ontario, British Columbia and Nunavut.In 2009 and 2010, the variable is optional for Newfoundland and Labrador, Ontario, and Saskatchewan. In 2011, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, and Saskatchewan. In 2012, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, Nunavut, and Saskatchewan.In 2013, the variable is optional for Quebec, Ontario, Prince Edward Island, Manitoba, Yukon, and Saskatchewan. In 2014, the variable is optional for Nunavut, Quebec, Ontario, Prince Edward Island, Manitoba, Newfoundland and Labrador, Saskatchewan, and British Columbia.
Examples
library(cchsflow)
?ALW_2A1
Number of drinks - Monday
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the number of alcoholic drinks consumed on Monday. The variable is optional in selected provinces and territories.
Usage
ALW_2A2(ALW_2A2)
Arguments
ALW_2A2 |
cchsflow variable name for number of drinks on Monday |
Details
This variable is present in every CCHS cycle used in cchsflow. In 2007 and 2008, the variable is optional for Newfoundland and Labrador, Nova Scotia, Ontario, British Columbia and Nunavut.In 2009 and 2010, the variable is optional for Newfoundland and Labrador, Ontario, and Saskatchewan. In 2011, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, and Saskatchewan. In 2012, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, Nunavut, and Saskatchewan.In 2013, the variable is optional for Quebec, Ontario, Prince Edward Island, Manitoba, Yukon, and Saskatchewan. In 2014, the variable is optional for Nunavut, Quebec, Ontario, Prince Edward Island, Manitoba, Newfoundland and Labrador, Saskatchewan, and British Columbia.
Examples
library(cchsflow)
?ALW_2A2
Number of drinks - Tuesday
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the number of alcoholic drinks consumed on Tuesday. The variable is optional in selected provinces and territories.
Usage
ALW_2A3(ALW_2A3)
Arguments
ALW_2A3 |
cchsflow variable name for number of drinks on Tuesday |
Details
This variable is present in every CCHS cycle used in cchsflow. In 2007 and 2008, the variable is optional for Newfoundland and Labrador, Nova Scotia, Ontario, British Columbia and Nunavut.In 2009 and 2010, the variable is optional for Newfoundland and Labrador, Ontario, and Saskatchewan. In 2011, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, and Saskatchewan. In 2012, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, Nunavut, and Saskatchewan.In 2013, the variable is optional for Quebec, Ontario, Prince Edward Island, Manitoba, Yukon, and Saskatchewan. In 2014, the variable is optional for Nunavut, Quebec, Ontario, Prince Edward Island, Manitoba, Newfoundland and Labrador, Saskatchewan, and British Columbia.
Examples
library(cchsflow)
?ALW_2A3
Number of drinks - Wednesday
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the number of alcoholic drinks consumed on Wednesday. The variable is optional in selected provinces and territories.
Usage
ALW_2A4(ALW_2A4)
Arguments
ALW_2A4 |
cchsflow variable name for number of drinks on Wednesday |
Details
This variable is present in every CCHS cycle used in cchsflow. In 2007 and 2008, the variable is optional for Newfoundland and Labrador, Nova Scotia, Ontario, British Columbia and Nunavut.In 2009 and 2010, the variable is optional for Newfoundland and Labrador, Ontario, and Saskatchewan. In 2011, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, and Saskatchewan. In 2012, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, Nunavut, and Saskatchewan.In 2013, the variable is optional for Quebec, Ontario, Prince Edward Island, Manitoba, Yukon, and Saskatchewan. In 2014, the variable is optional for Nunavut, Quebec, Ontario, Prince Edward Island, Manitoba, Newfoundland and Labrador, Saskatchewan, and British Columbia.
Examples
library(cchsflow)
?ALW_2A4
Number of drinks - Thursday
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the number of alcoholic drinks consumed on Thursday. The variable is optional in selected provinces and territories.
Usage
ALW_2A5(ALW_2A5)
Arguments
ALW_2A5 |
cchsflow variable name for number of drinks on Thursday |
Details
This variable is present in every CCHS cycle used in cchsflow. In 2007 and 2008, the variable is optional for Newfoundland and Labrador, Nova Scotia, Ontario, British Columbia and Nunavut.In 2009 and 2010, the variable is optional for Newfoundland and Labrador, Ontario, and Saskatchewan. In 2011, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, and Saskatchewan. In 2012, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, Nunavut, and Saskatchewan.In 2013, the variable is optional for Quebec, Ontario, Prince Edward Island, Manitoba, Yukon, and Saskatchewan. In 2014, the variable is optional for Nunavut, Quebec, Ontario, Prince Edward Island, Manitoba, Newfoundland and Labrador, Saskatchewan, and British Columbia.
Examples
library(cchsflow)
?ALW_2A5
Number of drinks - Friday
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the number of alcoholic drinks consumed on Friday. The variable is optional in selected provinces and territories.
Usage
ALW_2A6(ALW_2A6)
Arguments
ALW_2A6 |
cchsflow variable name for number of drinks on Friday |
Details
This variable is present in every CCHS cycle used in cchsflow. In 2007 and 2008, the variable is optional for Newfoundland and Labrador, Nova Scotia, Ontario, British Columbia and Nunavut.In 2009 and 2010, the variable is optional for Newfoundland and Labrador, Ontario, and Saskatchewan. In 2011, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, and Saskatchewan. In 2012, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, Nunavut, and Saskatchewan.In 2013, the variable is optional for Quebec, Ontario, Prince Edward Island, Manitoba, Yukon, and Saskatchewan. In 2014, the variable is optional for Nunavut, Quebec, Ontario, Prince Edward Island, Manitoba, Newfoundland and Labrador, Saskatchewan, and British Columbia.
Examples
library(cchsflow)
?ALW_2A6
Number of drinks - Saturday
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that quantifies the number of alcoholic drinks consumed on Saturday. The variable is optional in selected provinces and territories.
Usage
ALW_2A7(ALW_2A7)
Arguments
ALW_2A7 |
cchsflow variable name for number of drinks on Saturday |
Details
This variable is present in every CCHS cycle used in cchsflow. In 2007 and 2008, the variable is optional for Newfoundland and Labrador, Nova Scotia, Ontario, British Columbia and Nunavut.In 2009 and 2010, the variable is optional for Newfoundland and Labrador, Ontario, and Saskatchewan. In 2011, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, and Saskatchewan. In 2012, the variable is optional for Newfoundland and Labrador, Quebec, Ontario, Manitoba, Nunavut, and Saskatchewan.In 2013, the variable is optional for Quebec, Ontario, Prince Edward Island, Manitoba, Yukon, and Saskatchewan. In 2014, the variable is optional for Nunavut, Quebec, Ontario, Prince Edward Island, Manitoba, Newfoundland and Labrador, Saskatchewan, and British Columbia.
Examples
library(cchsflow)
?ALW_2A7
COPD_Emph_der_fun1
Description
This is one of 2 functions used to create a derived variable (COPD_Emph_der) that determines if a respondents has either COPD or Emphysema. 2 different functions have been created to account for the fact that different respiratory variables are used across CCHS cycles. This function is for CCHS cycles (2005-2008) that use COPD and Emphysema as a combined variable.
Usage
COPD_Emph_der_fun1(DHHGAGE_cont, CCC_91E, CCC_91F)
Arguments
DHHGAGE_cont |
continuous age variable. |
CCC_91E |
variable indicating if respondent has Emphysema |
CCC_91F |
variable indicating if respondent has COPD |
Value
a categorical variable (COPD_Emph_der) with 3 levels:
respondent is over the age of 35 and has a respiratory condition
respondent is under the age of 35 and has a respiratory condition
respondent does not have a respiratory condition
See Also
Examples
# COPD_Emph_der_fun1() to create values across CCHS cycles
# (2005-2008) COPD_Emph_der_fun1() is specified in
# variable_details.csv along with the CCHS variables and cycles included.
# To transform COPD_Emph_der, use rec_with_table() for each CCHS cycle
# and specify COPD_Emph_der, along with the various respiratory
# variables. Then by using merge_rec_data() you can combine COPD_Emph_der
# across cycles.
library(cchsflow)
COPD2005 <- suppressWarnings(rec_with_table(
cchs2005_p, c(
"DHHGAGE_cont", "CCC_91E", "CCC_91F",
"COPD_Emph_der"
)
))
head(COPD2005)
COPD2007_2008 <- suppressWarnings(rec_with_table(
cchs2007_2008_p, c(
"DHHGAGE_cont", "CCC_91E", "CCC_91F",
"COPD_Emph_der"
)
))
tail(COPD2007_2008)
combined_COPD <- suppressWarnings(merge_rec_data(COPD2005, COPD2007_2008))
head(combined_COPD)
tail(combined_COPD)
COPD_Emph_der_fun2
Description
This is one of 2 functions used to create a derived variable (COPD_Emph_der) that determines if a respondents has either COPD or Emphysema. 2 different functions have been created to account for the fact that different respiratory variables are used across CCHS cycles. This function is for CCHS cycles (2001-2003, 2009-2014) that use COPD and Emphysema as a combined variable.
Usage
COPD_Emph_der_fun2(DHHGAGE_cont, CCC_091)
Arguments
DHHGAGE_cont |
continuous age variable. |
CCC_091 |
variable indicating if respondent has either COPD or Emphysema |
Value
a categorical variable (COPD_Emph_der) with 3 levels:
respondent is over the age of 35 and has a respiratory condition
respondent is under the age of 35 and has a respiratory condition
respondent does not have a respiratory condition
See Also
Examples
# COPD_Emph_der_fun2() to create values across CCHS cycles
# (2001-2003, 2009-2014) COPD_Emph_der_fun2() is specified in
# variable_details.csv along with the CCHS variables and cycles included.
# To transform COPD_Emph_der, use rec_with_table() for each CCHS cycle
# and specify COPD_Emph_der, along with the various respiratory
# variables. Then by using merge_rec_data() you can combine COPD_Emph_der
# across cycles.
library(cchsflow)
COPD2001 <- suppressWarnings(rec_with_table(
cchs2001_p, c(
"DHHGAGE_cont", "CCC_091",
"COPD_Emph_der"
)
))
head(COPD2001)
COPD2014 <- suppressWarnings(rec_with_table(
cchs2007_2008_p, c(
"DHHGAGE_cont", "CCC_091",
"COPD_Emph_der"
)
))
tail(COPD2014)
combined_COPD <- suppressWarnings(merge_rec_data(COPD2001, COPD2014))
head(combined_COPD)
tail(combined_COPD)
Depression Scale - Predicted Probability
Description
NOTE: this is not a function.
This is categorical variable derived by Statistics Canada that predicts
the probability that a respondent would be diagnosed as having a major
depressive episode if a diagnostic interview was completed. This variable
is derived from DPSDSF
in which probabilities are assigned
to respondents based on their depression scale score. For more details on
how the variable was derived click here.
Usage
DPSDPP(DPSDPP)
Arguments
DPSDPP |
cchsflow variable name for derived depression scale predicted probability. |
Details
While this variable was considered to be categorical in CCHS documentation, the values range from 0 to 0.90 with no distinct names or metadata for each category. As such, this variable was specified as a continuous variable in cchsflow. This has no bearing on the final output of the variable as there are no recode changes. This means that a respondent who was coded with a probability of 0.50 will still have a probability value of 0.50 when the variable goes through harmonization.
This variable is present in every CCHS cycle used in cchsflow, and how it was derived remains consistent.
See Also
Examples
library(cchsflow)
?DPSDPP
Derived Depression Scale - Short Form Score
Description
NOTE: this is not a function.
This is a continuous variable derived by Statistics Canada that assesses the level of depression of respondents who have identified that they have felt depressed or loss of interest within the last two weeks. This variable is scaled from 0 to 8, with 0 indicating a respondent has not felt depressed or loss of interest, and 8 representing the highest level of depression.
Usage
DPSDSF(DPSDSF)
Arguments
DPSDSF |
cchsflow variable name for derived depression scale. |
Details
The derivation of this variable is based on the work of Kessler & Mroczek from the University of Michigan. For more details on the items used and how the variable was derived click here.
This variable is present in every CCHS cycle used in cchsflow, and how it was derived remains consistent.
See Also
Examples
library(cchsflow)
?DPSDSF
Satisfaction with life (GEN_02A/GEN_02A2)
Description
NOTE: this is not a function.
These are two variables asked in the CCHS that asks respondents to rate their satisfaction with their lives. The variable GEN_02A is a categorical variable with 5 categories:
Very satisfied
Satisfied
Neither satisfied nor unsatisfied
Dissatisfied
Very dissatisfied
The GEN_02A2 is a continuous variable from 0 to 10, where 0 represents very dissatisfied and 10 represents very satisfied.
Usage
GEN_02A2(GEN_02A, GEN_02A2)
Arguments
GEN_02A |
- categorical life satisfaction variable asked from 2003-2007 |
GEN_02A2 |
- continuous life satisfaction variable asked from 2009-2014, and derived for 2003-2007 |
Details
GEN_02A was asked to respondents in the 2003, 2005, and 2007-2008 CCHS survey cycles; while GEN_02A2 was asked to respondents in CCHS survey cycles from 2009 to 2014. To harmonize GEN_02A2 across more cycles, GEN_02A2 was derived for earlier cycles by converting GEN_02A values to match the scale used in GEN_02A2. The very satisfied category was converted to a score of 10; the satisfied category was converted to a score of 7; the neither satisfied nor unsatisfied category was converted to a score of 5; the dissatisfied category was converted to a score of 2; and the very dissatisfied category was converted to a score of 0.
When using earlier CCHS cycles (2003-2007), it is appropriate to use GEN_02A. When using multiple CCHS cycles that include cycles from 2009-2014, GEN_02A2 is recommended.
Examples
library(cchsflow)
?GEN_02A2
Occupation Group (9 categories)
Description
NOTE: this is not a function.
This is a 9 category variable (LBFA_31A) that is in the CCHS that asks which occupation group best describes a respondent. Occupation group is asked in the 2001 CCHS cycle and in CCHS cycles from 2007-2014.
Usage
LBFA_31A(LBFA_31A)
Arguments
LBFA_31A |
cchsflow variable name for Occupation Group (9 categories) |
Details
While occupation group is asked in many survey cycles, the 2001 CCHS survey cycle is the only survey that has 9 categories. The categories are as follows:
Management
Professional (including accountants)
Technologist, Technician or Tech Occupation
Administrative, Financial or Clerical
Sales or Service
Trades, Transport or Equipment Operator
Farming, Forestry, Fishing, Mining
Processing, Manufacturing, Utilities
Other
To harmonize the 2001 CCHS cycle with other survey cycles,
LBFA_31A_a
and LBFA_31A_b
were created in which
categories in the 2001 survey cycle were collapsed.
See Also
Examples
library(cchsflow)
?LBFA_31A
Occupation Group (5 categories)
Description
NOTE: this is not a function.
This is a 5 category variable (LBFA_31A_a) that is in the CCHS that asks which occupation group best describes a respondent. Occupation group is asked in the 2001 CCHS cycle and in CCHS cycles from 2007-2014.
Usage
LBFA_31A_a(LBFA_31A_a)
Arguments
LBFA_31A_a |
cchsflow variable name for Occupation Group (5 categories) |
Details
In the 2007-2014 CCHS survey cycles, occupation group has 5 categories. The categories are as follows:
Management, Health, Education, Art, Culture
Business, Finance, Admin
Sales or Service
Trades, Transport or Equipment Operator
Unique to Primary Industry/Processing/Manufacturing
In this variable, categories from the 2001 CCHS survey cycle were collapsed to harmonize with the other survey cycles. "Management, Professional (including accountants), Technologist, Technician or Tech Occupation" were combined into one category "Management, Health, Education, Art, Culture". "Farming, Forestry, Fishing, Mining" and "Processing, Manufacturing, Utilities", were combined into one category "Farming, Forestry, Fishing, Mining, Processing, Manufacturing, Utilities".
The "other" category in the 2001 CCHS survey cycle was assigned to missing
(NA(b)). This is consistent with other studies
(doi: 10.4103/IJCIIS.IJCIIS_43_18)
that group the "other" category as "missing". LBFA_31A_b
is a 6 category variable that keeps the "other" category in the 2001 survey
cycle as "other".
See Also
Examples
library(cchsflow)
?LBFA_31A_a
Occupation Group (6 categories)
Description
NOTE: this is not a function.
This is a 6 category variable (LBFA_31A_b) that is in the CCHS that asks which occupation group best describes a respondent. Occupation group is asked in the 2001 CCHS cycle and in CCHS cycles from 2007-2014.
Usage
LBFA_31A_b(LBFA_31A_b)
Arguments
LBFA_31A_b |
cchsflow variable name for Occupation Group (6 categories) |
Details
In the 2007-2014 CCHS survey cycles, occupation group has 5 categories. This variable, however, includes a sixth category to account for the "other" category asked in the 2001 CCHS survey cycle. The categories are as follows:
Management, Health, Education, Art, Culture
Business, Finance, Admin
Sales or Service
Trades, Transport or Equipment Operator
Unique to Primary Industry/Processing/Manufacturing
Other
In this variable, categories from the 2001 CCHS survey cycle were collapsed to harmonize with the other survey cycles. "Management, Professional (including accountants), Technologist, Technician or Tech Occupation" were combined into one category "Management, Health, Education, Art, Culture". "Farming, Forestry, Fishing, Mining" and "Processing, Manufacturing, Utilities", were combined into one category "Farming, Forestry, Fishing, Mining, Processing, Manufacturing, Utilities".
See Also
Examples
library(cchsflow)
?LBFA_31A_b
Participation and Activity Limitation
Description
This is a derived variable used in the CCHS (RACDPAL) to classify respondents according to the frequency with which they experience activity limitations due to disability.
Usage
RACDPAL_fun(RAC_1, RAC_2A, RAC_2B, RAC_2C)
Arguments
RAC_1 |
Has difficulty with activities due to disability |
RAC_2A |
Reduction in activities at home due to disability |
RAC_2B |
Reduction in activities at school or work due to disability |
RAC_2C |
Reduction in other activities |
Details
This derived variable is generated in CCHS cycles 2003-2014. The 2001 CCHS cycle, however, contains the same base variables used to derive this variable. To include respondents in the 2001 CCHS cycle, this custom function was created using the same derivation conditions used in later cycles.
Value
the CCHS derived variable RACDPAL with 3 categories:
Sometimes
Often
Never
Examples
# Using RACDPAL_fun() to transform RACDPAL in 2001.
# RACDPAL_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform RACDPAL, use rec_with_table() for each the 2001 cycle
# and specify RACDPAL, along with the various ADL variables.
library(cchsflow)
RACDPAL_2001 <- rec_with_table(
cchs2001_p, c(
"RAC_1", "RAC_2A", "RAC_2B", "RAC_2C", "RACDPAL"
)
)
head(RACDPAL_2001)
# Note: In other CCHS cycles you only need to specify RACDPAL as the variable
# was included in those survey cycles.
# Using RACDPAL_fun() with user inputted data.
# Let's say you're an individual that sometimes has difficulties with
# activities due to disability, sometimes has a reduction in activities at
# home, often has a reduction at school or work, and never has a reduction
# in other activities. Your participation and activity limitation can be
# determined as follows:
library(cchsflow)
RACDPAL <- RACDPAL_fun(1, 1, 2, 3)
print(RACDPAL)
Type of smokers
Description
This function creates a derived variable (SMKDSTY_A) for smoker type with 5 categories:
daily smoker
current occasional smoker (former daily)
current occasional smoker (never daily)
current nonsmoker (former daily)
current nonsmoker (never daily)
nonsmoker
Usage
SMKDSTY_fun(SMK_005, SMK_030, SMK_01A)
Arguments
SMK_005 |
type of smoker presently |
SMK_030 |
smoked daily - lifetime (occasional/former smoker) |
SMK_01A |
smoked 100 or more cigarettes in lifetime |
Details
For CCHS 2001-2014, smoker type is derived from smoking more than 100 cigarettes in lifetime, type of smoker at present time, and ever smoked daily. For CCHS 2015-2018, smoker type was derived differently with different variables and categories. A function was created for a consistent smoker status across all cycles.
Value
value for smoker type in the SMKDSTY_A variable
Examples
# Using SMKDSTY_fun() to derive smoke type values across CCHS cycles
# SMKDSTY_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform SMKDSTY_A across cycles, use rec_with_table() for each
# CCHS cycle and specify SMKDSTY_A.
# For CCHS 2001-2014, only specify SMKDSTY_A for smoker type.
# For CCHS 2015-2018, specify the parameters and SMKDSTY_A for smoker type.
library(cchsflow)
smoker_type_2009_2010 <- rec_with_table(
cchs2009_2010_p, "SMKDSTY_A")
head(smoker_type_2009_2010)
smoker_type_2017_2018 <- rec_with_table(
cchs2017_2018_p,c(
"SMK_01A", "SMK_005","SMK_030","SMKDSTY_A"
)
)
tail(smoker_type_2017_2018)
combined_smoker_type <- suppressWarnings(merge_rec_data
(smoker_type_2009_2010,smoker_type_2017_2018))
head(combined_smoker_type)
tail(combined_smoker_type)
Age started smoking daily - daily/former daily smokers
Description
This function creates a continuous derived variable (SMKG040_fun) that calculates the approximate age that a daily or former daily smoker began smoking daily.
Usage
SMKG040_fun(SMKG203_cont, SMKG207_cont)
Arguments
SMKG203_cont |
age started smoking daily. Variable asked to daily smokers. |
SMKG207_cont |
age started smoking daily. Variable asked to former daily smokers. |
Details
SMKG203 (daily smoker) and SMKG207 (former daily) are present in CCHS 2001-2014, and are separate variables. For CCHS 2015 and onward, SMKG040 (daily/former daily) combines the two previous variables. SMKG040_fun takes the continuous functions (SMKG203_cont and SMKG207_cont) to create SMKG040 for 2001-2014.
Value
value for age started smoking daily for daily/former daily smokers in the SMKG040_cont variable
Note
In previous cycles, both SMKG203 and SMKG207 included respondents who did not state their smoking status. From CCHS 2015 and onward, SMKG040 only included respondents who specified daily smoker or former daily smoker. As a result, SMKG040 has a large number of missing respondents for CCHS 2015 survey cycles and onward.
Examples
# Using SMKG040_fun() to create age values across CCHS cycles
# SMKG040_fun() is specified in variable_details.csv under SMKG040_cont.
# To create a continuous harmonized variable for SMKG040, use rec_with_table()
# for each CCHS cycle and specify SMKG040_cont.
library(cchsflow)
age_smoke_dfd_2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"SMKG203_cont", "SMKG207_cont","SMKG040_cont"
)
)
head(age_smoke_dfd_2009_2010)
age_smoke_dfd_2011_2012 <- rec_with_table(
cchs2011_2012_p,c(
"SMKG203_cont", "SMKG207_cont","SMKG040_cont"
)
)
tail(age_smoke_dfd_2011_2012)
combined_age_smoke_dfd <- suppressWarnings(merge_rec_data
(age_smoke_dfd_2009_2010,age_smoke_dfd_2011_2012))
head(combined_age_smoke_dfd)
tail(combined_age_smoke_dfd)
Age started to smoke daily - daily smoker
Description
This function creates a continuous derived variable (SMKG203_cont) for age started to smoke daily for daily smokers.
Usage
SMKG203_fun(SMK_005, SMKG040)
Arguments
SMK_005 |
type of smoker presently |
SMKG040 |
age started to smoke daily - daily/former daily smoker |
Details
For CCHS 2015-2018, age started to smoke daily was combined for daily and former daily smokers.Previous cycles had separate variables for age started to smoke daily. Type of smoker presently is used to define daily smoker.
Value
value for continuous age started to smoke daily for daily smokers in the SMKG203_cont variable
Examples
# Using SMKG203_fun() to derive age started to smoke daily values across
# CCHS cycles.
# SMKG203_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform SMKG203_A across cycles, use rec_with_table() for each
# CCHS cycle and specify SMKG203_A.
# For CCHS 2001-2014, only specify SMKG203_A.
# For CCHS 2015-2018, specify the parameters and SMKG203_A for daily smoker
# age.
library(cchsflow)
agecigd_2009_2010 <- rec_with_table(
cchs2009_2010_p, "SMKG203_A")
head(agecigd_2009_2010)
agecigd_2017_2018 <- rec_with_table(
cchs2017_2018_p,c(
"SMK_005","SMKG040","SMKG203_A"
)
)
tail(agecigd_2017_2018)
combined_agecigd <- suppressWarnings(merge_rec_data
(agecigd_2009_2010,agecigd_2017_2018))
head(combined_agecigd)
tail(combined_agecigd)
Age started to smoke daily - former daily smoker
Description
This function creates a continuous derived variable (SMKG207_cont) for age started to smoke daily for former daily smokers.
Usage
SMKG207_fun(SMK_030, SMKG040)
Arguments
SMK_030 |
smoked daily - lifetime (occasional/former smoker) |
SMKG040 |
age started to smoke daily - daily/former daily smoker |
Details
For CCHS 2015-2018, age started to smoke daily was combined for daily and former daily smokers.Previous cycles had separate variables for age started to smoke daily. Smoked daily in lifetime is used to define former daily smoker.
Value
value for continuous age started to smoke daily for former daily smokers in the SMKG207_cont variable
Examples
# Using SMKG207_fun() to derive age started to smoke daily values across
# CCHS cycles.
# SMKG207_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform SMKG207_A across cycles, use rec_with_table() for each
# CCHS cycle and specify SMKG207_A.
# For CCHS 2001-2014, only specify SMKG207_A.
# For CCHS 2015-2018, specify the parameters and SMKG207_A for former daily
# smoker age.
library(cchsflow)
agecigfd_2009_2010 <- rec_with_table(
cchs2009_2010_p, "SMKG207_A")
head(agecigfd_2009_2010)
agecigfd_2017_2018 <- rec_with_table(
cchs2017_2018_p,c(
"SMK_030","SMKG040","SMKG207_A"
)
)
tail(agecigfd_2017_2018)
combined_agecigfd <- suppressWarnings(merge_rec_data
(agecigfd_2009_2010,agecigfd_2017_2018))
head(combined_agecigfd)
tail(combined_agecigfd)
Five-item social provision scale (SPS-5)
Description
This function creates a derived variable for the five-item social provision scale (SPS_5_fun). The range is 0-20, where a higher score reflects a higher level of perceived social support.
Usage
SPS_5_fun(SPS_03, SPS_04, SPS_05, SPS_07, SPS_10)
Arguments
SPS_03 |
close relationships that provide sense of emotional security and well-being |
SPS_04 |
talk to about important decisions with someone |
SPS_05 |
relationships where competence and skill are recognized |
SPS_07 |
part of a group who share attitudes and beliefs |
SPS_10 |
strong emotional bond with a least one person |
Details
The Social Provisions Scale (SPS) is commonly used to measure social support. The ten-item social provisions scale (SPS-10) has been reduced to a five-item scale (SPS-5).Reducing the SPS-10 items by half decreases the respondent burden on surveys. SPS-5 is a valid measure of social support while maintaining adequate measurement properties (Orpana et al., 2019). Validation of SDS-5 using Canadian national survey data can be found here.
SPS-10 and their items were available in CCHS from 2011-2018.
Examples
# Using the SPS_5_fun function to create the derived SPS5_der variable
# across CCHS cycles.
# SPS_5_fun() is specified in the variable_details.csv.
# To create a harmonized SPS5_der variable across CCHS cycles, use
# rec_with_table() for each CCHS cycle and specify SPS_5_fun and the
# required base variables.
# Using merge_rec_data(), you can combine SPS5_der across cycles.
library(cchsflow)
SPS5_2011_2012 <- rec_with_table(
cchs2011_2012_p, c(
"SPS_03", "SPS_04", "SPS_05", "SPS_07", "SPS_10", "SPS5_der"
)
)
head(SPS5_2011_2012)
SPS5_2017_2018 <- rec_with_table(
cchs2017_2018_p,c(
"SPS_03", "SPS_04", "SPS_05", "SPS_07", "SPS_10", "SPS5_der"
)
)
tail(SPS5_2017_2018)
combined_SPS5 <- suppressWarnings(merge_rec_data(SPS5_2011_2012,
SPS5_2017_2018))
head(combined_SPS5)
tail(combined_SPS5)
Daily active transportation (2001-2005)
Description
This function creates a derived variable for daily time spent traveling in active ways. This includes walking and biking. This function is used for CCHS 2001-2005.
Usage
active_transport1_fun(PAC_4A_cont, PAC_4B_cont)
Arguments
PAC_4A_cont |
number of hours walk work/school in week in the past 3 months. |
PAC_4B_cont |
number of hours bike work/school in week in the past 3 months. |
Value
Continuous variable for active transportation (active_transport)
Examples
# Using active_transport1_fun() to determine daily time spent
# traveling in active ways values across CCHS 2001-2005.
# active_transport1_fun() is specified in variable_details.csv along with the CCHS
# variables and cycles included.
# To transform active_transport across cycles, use rec_with_table() for each
# CCHS cycle and specify active_transport, along with each activity variable.
# Then by using merge_rec_data(), you can combine active_transport across
# cycles
library(cchsflow)
active_transport2001 <- rec_with_table(
cchs2001_p, c(
"PAC_4A_cont", "PAC_4B_cont", "active_transport"
)
)
head(active_transport2001)
active_transport2005 <- rec_with_table(
cchs2005_p, c(
"PAC_4A_cont", "PAC_4B_cont", "active_transport"
)
)
tail(active_transport2005)
combined_active_transport <- suppressWarnings(merge_rec_data(active_transport2001,
active_transport2005))
head(combined_active_transport)
tail(combined_active_transport)
Daily active transportation (2007-2014)
Description
This function creates a derived variable for daily time spent traveling in active ways. This includes walking and biking. This function is used for CCHS 2007-2014.
Usage
active_transport2_fun(PAC_7, PAC_7A, PAC_7B_cont, PAC_8, PAC_8A, PAC_8B_cont)
Arguments
PAC_7 |
have walked to work or school in the past 3 months? |
PAC_7A |
number of times walked to work/school in the past 3 months. |
PAC_7B_cont |
number of minutes walk to work/school. |
PAC_8 |
have biked to work or school in the past 3 months? |
PAC_8A |
number of times biked to work/school in the past 3 months. |
PAC_8B_cont |
number of minutes bike to work/school. |
Value
Continuous variable for active transportation (active_transport)
Examples
# Using active_transport2_fun() to determine daily time spent
# traveling in active ways values across CCHS 2007-2014.
# active_transport2_fun() is specified in variable_details.csv along with the CCHS
# variables and cycles included.
# To transform active_transport across cycles, use rec_with_table() for each
# CCHS cycle and specify active_transport, along with each activity variable.
# Then by using merge_rec_data(), you can combine active_transport across
# cycles
library(cchsflow)
active_transport2007_2008 <- rec_with_table(
cchs2007_2008_p, c(
"PAC_7", "PAC_7A", "PAC_7B_cont", "PAC_8", "PAC_8A", "PAC_8B_cont",
"active_transport"
)
)
head(active_transport2007_2008)
active_transport2013_2014 <- rec_with_table(
cchs2013_2014_p, c(
"PAC_7", "PAC_7A", "PAC_7B_cont", "PAC_8", "PAC_8A", "PAC_8B_cont",
"active_transport"
)
)
tail(active_transport2013_2014)
combined_active_transport <- suppressWarnings(merge_rec_data(
active_transport2007_2008, active_transport2013_2014))
head(combined_active_transport)
tail(combined_active_transport)
Daily active transportation (2015-2018)
Description
This function creates a derived variable for daily time spent traveling in active ways. This includes walking and biking. This function is used for CCHS 2015-2018.
Usage
active_transport3_fun(PAYDVTTR, PAADVTRV)
Arguments
PAYDVTTR |
number of minutes of active transportation in a week for 12-17 years old. |
PAADVTRV |
number of minutes of active transportation in a week for 18+ years old. |
Value
Continuous variable for active transportation (active_transport)
Examples
# Using active_transport3_fun() to determine daily time spent
# traveling in active ways values across CCHS 2015-2018.
# active_transport3_fun() is specified in variable_details.csv along with the CCHS
# variables and cycles included.
# To transform active_transport across cycles, use rec_with_table() for each
# CCHS cycle and specify active_transport, along with each activity variable.
# Then by using merge_rec_data(), you can combine active_transport across
# cycles
library(cchsflow)
active_transport2015_2016 <- rec_with_table(
cchs2015_2016_p, c(
"PAYDVTTR", "PAADVTRV","active_transport"
)
)
head(active_transport2015_2016)
active_transport2017_2018 <- rec_with_table(
cchs2017_2018_p, c(
"PAYDVTTR", "PAADVTRV","active_transport"
)
)
tail(active_transport2017_2018)
combined_active_transport <- suppressWarnings(merge_rec_data(
active_transport2015_2016, active_transport2017_2018))
head(combined_active_transport)
tail(combined_active_transport)
Adjusted Body Mass Index (BMI) derived variable
Description
This function creates a harmonized adjusted BMI variable. A systematic review of the literature concluded that the use of self-reported data among adults underestimates weight and overestimates height, resulting in lower estimates of obesity than those obtained from measured data. Using data from the 2005 Canadian Community Health Survey (CCHS) subsample, where both measured and self-reported values were collected, correction equations have been developed (Connor Gorber et al. 2008). Differences between corrected estimates of obesity from the CCHS and measured estimates from the Canadian Health Measures Survey is monitored over time to determine if the bias in self-reported values is changing and if new correction equations need to be developed. Adjusted BMI variable is first introduced in the CCHS 2015 cycle.
adjusted_bmi_fun() creates a derived variable (HWTGCOR_der) that is harmonized across all CCHS cycles. This function takes the BMI by dividing weight by the square of height, and adds a correction value based on sex.
Usage
adjusted_bmi_fun(DHH_SEX, HWTGHTM, HWTGWTK)
Arguments
DHH_SEX |
CCHS variable for sex; 1 = male, 2 = female |
HWTGHTM |
CCHS variable for height (in meters) |
HWTGWTK |
CCHS variable for weight (in kilograms) |
Details
For HWTGCOR_der, there are no restrictions to age, height, weight, or pregnancy status. While pregnancy was consistent across all CCHS cycles, its variable (MAM_037) was not available in the PUMF CCHS datasets so it could not be harmonized and included into the function.
HWTGCOR_der uses the CCHS variables for sex, height and weight that have been transformed by cchsflow. In order to generate a value for adjusted BMI across CCHS cycles, sex, height and weight must be transformed and harmonized.
Value
numeric value for adjusted BMI in the HWTGCOR_der variable
Note
In earlier CCHS cycles (2001 and 2003), height was collected in inches; while in later CCHS cycles (2005+) it was collected in meters. To harmonize values across cycles, height was converted to meters (to 3 decimal points). Weight was collected in kilograms across all CCHS cycles, so no transformations were required in the harmonization process.
Examples
# Using adjusted_bmi_fun() to create adjusted BMI values between cycles
# adjusted_bmi_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform the derived BMI variable, use rec_with_table() for each cycle
# and specify HWTGCOR_der, along with sex (DHH_SEX), height (HWTGHTM) and
# weight (HWTGWTK).Then by using merge_rec_data(), you can combined
# HWTGBMI_der across cycles.
library(cchsflow)
adjustedbmi2001 <- rec_with_table(
cchs2001_p, c(
"HWTGHTM",
"HWTGWTK",
"DHH_SEX",
"HWTGCOR_der"
)
)
head(adjustedbmi2001)
adjustedbmi2011_2012 <- rec_with_table(
cchs2011_2012_p, c(
"HWTGHTM",
"HWTGWTK",
"DHH_SEX",
"HWTGCOR_der"
)
)
tail(adjustedbmi2011_2012)
combined_bmi <- merge_rec_data(adjustedbmi2001, adjustedbmi2011_2012)
head(combined_bmi)
tail(combined_bmi)
# adjusted_bmi_fun() can also generate a value for BMI if you input your sex,
# and a value for height and weight. Let's say your sex is male, height is
# 170cm (1.7m) and your weight is 50kg, your BMI can be calculated as follows:
library(cchsflow)
adjusted_BMI <- adjusted_bmi_fun(DHH_SEX = 1, HWTGHTM = 1.7, HWTGWTK = 50)
print(adjusted_BMI)
Derived needs help with tasks
Description
This derived variable (ADL_der) is based on the CCHS derived variable ADLF6R which flags respondents who need help with tasks based on their response to the various activities of daily living (ADL) variables.
Usage
adl_fun(ADL_01, ADL_02, ADL_03, ADL_04, ADL_05)
Arguments
ADL_01 |
Needs help preparing meals |
ADL_02 |
Needs help getting to appointments/errands |
ADL_03 |
Needs help doing housework |
ADL_04 |
Needs help doing personal care |
ADL_05 |
Needs help moving inside house |
Details
The CCHS derived variable ADLF6R uses different ADL variables across the various CCHS survey cycles. This newly derived variable (ADL_der) uses ADL variables that are consistent across CCHS cycles.
In the 2001 CCHS survey cycle, the ADLF6R variable examines the following ADL variables:
ADL_01 - Needs help preparing meals
ADL_02 - Needs help getting to appointments/errands
ADL_03 - Needs help doing housework
ADL_04 - Needs help doing personal care
ADL_05 - Needs help moving inside house
ADL_07 - Needs help doing heavy household chores
In the 2003-2005 CCHS survey cycles, the ADLF6R variable examines the following ADL variables:
ADL_01 - Needs help preparing meals
ADL_02 - Needs help getting to appointments/errands
ADL_03 - Needs help doing housework
ADL_04 - Needs help doing personal care
ADL_05 - Needs help moving inside house
ADL_06 - Needs help doing finances
ADL_07 - Needs help doing heavy household chores
In the 2007-2014 CCHS survey cycles, the ADLF6R variable examines the following ADL variables:
ADL_01 - Needs help preparing meals
ADL_02 - Needs help getting to appointments/errands
ADL_03 - Needs help doing housework
ADL_04 - Needs help doing personal care
ADL_05 - Needs help moving inside house
ADL_06 - Needs help doing finances
This newly derived variable (ADL_der) uses ADL_01 to ADL_05 which are consistent across all survey cycles. For any single CCHS survey year, it is appropriate to use ADLF6R. ADL_der is recommended when using multiple survey cycles.
Value
A derived variable (ADL_der) with 2 categories:
- Needs help with tasks
- Does not need help with tasks
Examples
# Using adl_fun() to create ADL_der values across CCHS cycles
# adl_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform ADL_der, use rec_with_table() for each CCHS cycle
# and specify ADL_der, along with the various ADL variables.
# Then by using merge_rec_data() you can combine ADL_der across cycles.
library(cchsflow)
adl2001 <- rec_with_table(
cchs2001_p, c(
"ADL_01", "ADL_02", "ADL_03", "ADL_04", "ADL_05", "ADL_der"
)
)
head(adl2001)
adl2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"ADL_01", "ADL_02", "ADL_03", "ADL_04", "ADL_05", "ADL_der"
)
)
tail(adl2009_2010)
combined_adl <- merge_rec_data(adl2001, adl2009_2010)
head(combined_adl)
tail(combined_adl)
# Using adl_fun() to generate to ADL_der based on user inputted values.
#
# Let's say you do not need help preparing meals, you need help getting to
# appointments or errands, you need help doing housework, do not need help
# doing personal care, and do not need help moving inside the house. Using
# adl_fun() we can check if you need help doing tasks
ADL_der <- adl_fun(2, 1, 1, 2, 2)
print(ADL_der)
The number of activities of daily living tasks that require help.
Description
A 6 category variable (ADL_score_5) representing the number of activities of daily living tasks that require help. This variable tallies the number of daily living tasks that a respondent requires help with based on various ADL variables that a respondent answered yes or no to. The ADL variables used are common across all CCHS cycles from 2001 to 2014.
Usage
adl_score_5_fun(ADL_01, ADL_02, ADL_03, ADL_04, ADL_05)
Arguments
ADL_01 |
Needs help preparing meals. |
ADL_02 |
Needs help getting to appointments/errands. |
ADL_03 |
Needs help doing housework. |
ADL_04 |
Needs help doing personal care. |
ADL_05 |
Needs help moving inside house. |
Value
A derived variable (ADL_score_5) with 6 categories:
0 - Needs help with 0 tasks
1 - Needs help with at least 1 task
2 - Needs help with at least 2 tasks
3 - Needs help with at least 3 tasks
4 - Needs help with at least 4 tasks
5 - Needs help with at least 5 tasks
Examples
# Use adl_score_5_fun() to create the variable ADL_score_5 across CCHS
# cycles adl_score_5_fun() is specified in variable_details.csv along with
# the CCHS variables and cycles included.
# To transform ADL_score_5, use rec_with_table() for each CCHS cycle
# and specify ADL_score_5, along with the various ADL variables.
# Then by using merge_rec_data() you can combine ADL_der across cycles.
library(cchsflow)
adl2001 <- rec_with_table(
cchs2001_p, c(
"ADL_01", "ADL_02", "ADL_03", "ADL_04", "ADL_05", "ADL_score_5"
)
)
head(adl2001)
adl2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"ADL_01", "ADL_02", "ADL_03", "ADL_04", "ADL_05", "ADL_score_5"
)
)
tail(adl2009_2010)
combined_adl <- merge_rec_data(adl2001, adl2009_2010)
head(combined_adl)
tail(combined_adl)
# Using adl_score_5_fun() to generate to ADL_score_5 based on user inputted
# values.
# Let's say you do not need help preparing meals, you need help getting to
# appointments or errands, you need help doing housework, do not need help
# doing personal care, and do not need help moving inside the house. Using
# adl_score_5_fun() we can check the number of tasks you need help with
ADL_score_5 <- adl_score_5_fun(2, 1, 1, 2, 2)
print(ADL_score_5)
Derived categorical age
Description
This is a derived categorical age variable (DHHGAGE_C) that groups various age categories across all CCHS cycles. This is based on the continuous age variable (DHHGAGE_cont) that is harmonious across all CCHS cycles.
The categories of this new age variable are as follows:
12 to 14 years
15 to 17 years
18 to 19 years
20 to 24 years
25 to 29 years
30 to 34 years
35 to 39 years
40 to 44 years
45 to 49 years
50 to 54 years
55 to 59 years
60 to 64 years
65 to 69 years
70 to 74 years
75 to 79 years
80 years or more
Usage
age_cat_fun(DHHGAGE_cont)
Arguments
DHHGAGE_cont |
continuous age variable |
Details
The categories in the grouped age variable (DHHGAGE) vary between CCHS cycles. As such, a continuous age variable (DHHGAGE_cont) was created that harmonized age across all CCHS cycles by taking the midpoint of each age category. This new age variable (DHHGAGE_C) categorizes age based on the categories used in CCHS cycles from 2007 to 2014.
Value
a categorical age variable (DHHGAGE_C)
Examples
# Using age_cat_fun() to create categorical age values from DHHGAGE_cont
# age_cat_fun() is specified in variable_details.csv along with the CCHS
# variables and cycles included.
# To generate DHHGAGE_C in a cycle, use rec_with_table() and specify
# DHHGAGE_C along with DHHGAGE_cont.
library(cchsflow)
cat_age2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"DHHGAGE_cont", "DHHGAGE_C"
)
)
Binge drinking
Description
This function creates a derived categorical variable that flags for binge drinking based on the number drinks consumed on a single day.
Usage
binge_drinker_fun(
DHH_SEX,
ALW_1,
ALW_2A1,
ALW_2A2,
ALW_2A3,
ALW_2A4,
ALW_2A5,
ALW_2A6,
ALW_2A7
)
Arguments
DHH_SEX |
sex of respondent (1 - male, 2 - female) |
ALW_1 |
Drinks in the last week (1 - yes, 2 - no) |
ALW_2A1 |
Number of drinks on Sunday |
ALW_2A2 |
Number of drinks on Monday |
ALW_2A3 |
Number of drinks on Tuesday |
ALW_2A4 |
Number of drinks on Wednesday |
ALW_2A5 |
Number of drinks on Thursday |
ALW_2A6 |
Number of drinks on Friday |
ALW_2A7 |
Number of drinks on Saturday |
Details
In health research, binge drinking is defined as having an excess amount of alcohol in a single day. For males, this is defined as having five or more drinks; and for females it is four or more drinks. In the CCHS, respondents are asked to count the number of drinks they had during each day of the last week.
Value
Categorical variable (binge_drinker) with two categories:
1 - binge drinker
2 - non-binge drinker
Examples
# Using binge_drinker_fun() to create binge_drinker values across CCHS cycles
# binge_drinker_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform binge_drinker, use rec_with_table() for each CCHS cycle
# and specify binge_drinker, along with the various alcohol and sex
# variables. Then by using bind_rows() you can combine binge_drinker
# across cycles.
library(cchsflow)
binge2001 <- rec_with_table(
cchs2001_p, c(
"ALW_1", "DHH_SEX", "ALW_2A1", "ALW_2A2", "ALW_2A3", "ALW_2A4",
"ALW_2A5", "ALW_2A6", "ALW_2A7", "binge_drinker"
)
)
head(binge2001)
binge2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"ALW_1", "DHH_SEX", "ALW_2A1", "ALW_2A2", "ALW_2A3", "ALW_2A4",
"ALW_2A5", "ALW_2A6", "ALW_2A7", "binge_drinker"
)
)
tail(binge2009_2010)
combined_binge <- bind_rows(binge2001, binge2009_2010)
head(combined_binge)
tail(combined_binge)
# Using binge_drinker_fun() to generate binge_drinker with user-inputted
# values.
#
# Let's say you are a male, and you had drinks in the last week. Let's say
# you had 3 drinks on Sunday, 1 drink on
# Monday, 6 drinks on Tuesday, 0 drinks on Wednesday, 3 drinks on Thurday,
# 8 drinks on Friday, and 2 drinks on Saturday. Using binge_drinker_fun(),
# we can check if you would be classified as a drinker.
binge <- binge_drinker_fun(DHH_SEX = 1, ALW_1 = 1, ALW_2A1 = 3, ALW_2A2 = 1,
ALW_2A3 = 6, ALW_2A4 = 0, ALW_2A5 = 3,
ALW_2A6 = 8, ALW_2A7 = 2)
print(binge)
Body Mass Index (BMI) derived variable
Description
This function creates a harmonized BMI variable. The BMI variable provided by the CCHS calculates BMI using methods that vary across cycles, leading to measurement error when using multiple CCHS cycles. In certain CCHS cycles (2001-2003, 2007+), there are age restrictions in which respondents under the age of 20 and over the age of 64 were not included. Across all CCHS cycles, female respondents who identified as being pregnant were excluded; and in certain CCHS cycles (2003-2007, 2013-2014), females who did not answer the pregnancy question were coded as NS (not stated) for HWTGBMI. As well, in certain CCHS cycles (2001-2003, 2009-2014), respondents outside certain height and weight ranges (0.914-2.108m for height, 0-260kg for weight) were excluded from HWTGBMI.
bmi_fun() creates a derived variable (HWTGBMI_der) that is harmonized across all CCHS cycles. This function divides weight by the square of height.
Usage
bmi_fun(HWTGHTM, HWTGWTK)
Arguments
HWTGHTM |
CCHS variable for height (in meters) |
HWTGWTK |
CCHS variable for weight (in kilograms) |
Details
For HWTGBMI_der, there are no restrictions to age, height, weight, or pregnancy status. While pregnancy was consistent across all CCHS cycles, its variable (MAM_037) was not available in the PUMF CCHS datasets so it could not be harmonized and included into the function.
For any single CCHS survey year, it is appropriate to use the CCHS BMI variable (HWTGBMI) that is also available on cchsflow. HWTGBMI_der is recommended when using multiple survey cycles.
HWTGBMI_der uses the CCHS variables for height and weight that have been transformed by cchsflow. In order to generate a value for BMI across CCHS cycles, height and weight must be transformed and harmonized.
Value
numeric value for BMI in the HWTGBMI_der variable
Note
In earlier CCHS cycles (2001 and 2003), height was collected in inches; while in later CCHS cycles (2005+) it was collected in meters. To harmonize values across cycles, height was converted to meters (to 3 decimal points). Weight was collected in kilograms across all CCHS cycles, so no transformations were required in the harmonization process.
Examples
# Using bmi_fun() to create BMI values between cycles
# bmi_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform the derived BMI variable, use rec_with_table() for each cycle
# and specify HWTGBMI_der, along with height (HWTGHTM) and weight (HWTGWTK).
# Then by using merge_rec_data(), you can combined HWTGBMI_der across
# cycles.
library(cchsflow)
bmi2001 <- rec_with_table(
cchs2001_p, c(
"HWTGHTM",
"HWTGWTK", "HWTGBMI_der"
)
)
head(bmi2001)
bmi2011_2012 <- rec_with_table(
cchs2011_2012_p, c(
"HWTGHTM",
"HWTGWTK", "HWTGBMI_der"
)
)
tail(bmi2011_2012)
combined_bmi <- merge_rec_data(bmi2001, bmi2011_2012)
head(combined_bmi)
tail(combined_bmi)
# Using bmi_fun() to generate a BMI value with user inputted height and
# weight values. bmi_fun() can also generate a value for BMI if you input a
# value for height and weight. Let's say your height is 170cm (1.7m) and
# your weight is 50kg, your BMI can be calculated as follows:
library(cchsflow)
BMI <- bmi_fun(HWTGHTM = 1.7, HWTGWTK = 50)
print(BMI)
Categorical BMI (international standard)
Description
This function creates a categorical derived variable (HWTGBMI_der_cat4) that categorizes derived BMI (HWTGBMI_der).
Usage
bmi_fun_cat(HWTGBMI_der)
Arguments
HWTGBMI_der |
derived variable that calculates numeric value for BMI.
See |
Details
The categories were based on international standards and are divided into four categories: underweight for BMI < 18.5 (1), normal weight for BMI between 18.5 to 25 (2), overweight for BMI between 25 to 30 (3), and obese for BMI over 30 (4).
HWTGBMI_der_cat4 uses the derived variable HWTGBMI_der. HWTGBMI_der uses height and weight that have been transformed by cchsflow. In order to categorize BMI across CCHS cycles, height and weight variables must be transformed and harmonized.
Value
value for BMI categories in the HWTGBMI_der_cat4 variable.
Examples
# Using bmi_fun_cat() to categorize BMI across CCHS cycles
# bmi_fun_cat() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform HWTGBMI_der_cat4 across all cycles, use rec_with_table() for
# each CCHS cycle.
# Since HWTGBMI_der is also a derived variable, you will have to specify
# the variables that are derived from it.
library(cchsflow)
bmi_cat_2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"HWTGHTM",
"HWTGWTK",
"HWTGBMI_der",
"HWTGBMI_der_cat4"
)
)
head(bmi_cat_2009_2010)
bmi_cat_2011_2012 <- rec_with_table(
cchs2011_2012_p,c(
"HWTGHTM",
"HWTGWTK",
"HWTGBMI_der",
"HWTGBMI_der_cat4"
)
)
tail(bmi_cat_2011_2012)
combined_bmi_cat <- suppressWarnings(merge_rec_data
(bmi_cat_2009_2010,bmi_cat_2011_2012))
head(combined_bmi_cat)
tail(combined_bmi_cat)
2001 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2001 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2001-c1-1-general-file
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2001_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=3359
Examples
data(cchs2001_p)
str(cchs2001_p)
2003 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2003 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2003-c2-1-General File
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2003_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=4995
Examples
data(cchs2003_p)
str(cchs2003_p)
2005 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2005 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2005-c3-1-main-file
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2005_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=22642
Examples
data(cchs2005_p)
str(cchs2005_p)
2007-2008 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2007-2008 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-E-2007-2008-AnnualComponent
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2007_2008_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=29539
Examples
data(cchs2007_2008_p)
str(cchs2007_2008_p)
2009-2010 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2009-2010 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: CCHS-82M0013-E-2009-2010-Annualcomponent
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2009_2010_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=67251
Examples
data(cchs2009_2010_p)
str(cchs2009_2010_p)
2009 CCHS synthetic subset data (200 respondents)
Description
This is a subset of 200 observations from the 2009 cycle of the Canadian Community Health Survey (CCHS) synthetic dataset. The CCHS survey is conducted by Statistics Canada.
Details
NOTE: this subset of respondents may also be in the 2009 synthetic subset. Please see the "CCHS datasets that overlap each other" article to see how the two datasets contain overlap.
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed May 2022. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: synthetic-CCHS-E-2009-FullSampleFile_F1
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2009_s |
a data frame |
Source
https://www.statcan.gc.ca/en/statistical-programs/document/3226_D56_T9_V1
Examples
data(cchs2009_s)
str(cchs2009_s)
2010 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2010 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
NOTE: this subset of respondents may also be in the 2009-2010 PUMF subset. Please see the "CCHS datasets that overlap each other" article to see how the two datasets contain overlap.
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: CCHS-82M0013-E-2010-AnnualComponent
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2010_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=81424
Examples
data(cchs2010_p)
str(cchs2010_p)
2010 CCHS synthetic subset data (200 respondents)
Description
This is a subset of 200 observations from the 2010 cycle of the Canadian Community Health Survey (CCHS) synthetic dataset. The CCHS survey is conducted by Statistics Canada.
Details
NOTE: this subset of respondents may also be in the 2010 synthetic subset. Please see the "CCHS datasets that overlap each other" article to see how the two datasets contain overlap.
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed May 2022. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: synthetic-CCHS-E-2010-AnnualComponent_F1
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2010_s |
a data frame |
Source
https://www.statcan.gc.ca/en/statistical-programs/document/3226_D56_T9_V1
Examples
data(cchs2010_s)
str(cchs2010_s)
2011-2012 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2011-2012 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2011-2012-Annual-component
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2011_2012_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=114112
Examples
data(cchs2011_2012_p)
str(cchs2011_2012_p)
2012 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2012 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
NOTE: this subset of respondents may also be in the 2011-2012 PUMF subset. Please see the "CCHS datasets that overlap each other" article to see how the two datasets contain overlap.
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2012-Annual-component
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2012_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=135927
Examples
data(cchs2012_p)
str(cchs2012_p)
2012 CCHS synthetic subset data (200 respondents)
Description
This is a subset of 200 observations from the 2012 cycle of the Canadian Community Health Survey (CCHS) synthetic dataset. The CCHS survey is conducted by Statistics Canada.
Details
NOTE: this subset of respondents may also be in the 2012 synthetic subset. Please see the "CCHS datasets that overlap each other" article to see how the two datasets contain overlap.
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed May 2022. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: synthetic-CCHS-E-2012-AnnualComponent_F1
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2012_s |
a data frame |
Source
https://www.statcan.gc.ca/en/statistical-programs/document/3226_D56_T9_V1
Examples
data(cchs2012_s)
str(cchs2012_s)
2013-2014 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2013-2014 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2013-2014-Annual-component
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2013_2014_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=144170
Examples
data(cchs2013_2014_p)
str(cchs2013_2014_p)
2014 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2014 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
NOTE: this subset of respondents may also be in the 2013-2014 PUMF subset. Please see the "CCHS datasets that overlap each other" article to see how the two datasets contain overlap.
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Jan 2020. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2014-Annual-component
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2014_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=164081
Examples
data(cchs2014_p)
str(cchs2014_p)
2015-2016 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2015-2016 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
NOTE: this subset of respondents may also be in the 2015-2016 PUMF subset. Please see the "CCHS datasets that overlap each other" article to see how the two datasets contain overlap.
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Oct 2021. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2015-2016-Annual-component
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2015_2016_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=238854
Examples
data(cchs2015_2016_p)
str(cchs2015_2016_p)
2017-2018 CCHS PUMF subset data (200 respondents)
Description
This is a subset of 200 observations from the 2017-2018 cycle of the Canadian Community Health Survey (CCHS) Public Use Microdata file (PUMF) dataset. The CCHS survey is conducted by Statistics Canada.
Details
NOTE: this subset of respondents may also be in the 2017-2018 PUMF subset. Please see the "CCHS datasets that overlap each other" article to see how the two datasets contain overlap.
See here for the open license. Source from Statistics Canada, Canadian Community Health Survey PUMF, accessed Oct 2021. Reproduced and distributed on an "as is" basis with the permission of Statistics Canada.
Long name: cchs-82M0013-E-2017-2018-Annual-component
Additional documentation (PDFs): https://osf.io/hkuy3/
Value
cchs2017_2018_p |
a data frame |
Source
https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&Id=329241
Examples
data(cchs2017_2018_p)
str(cchs2017_2018_p)
Compare Value Based On Interval
Description
Compare values on the scientific notation interval
Usage
compare_value_based_on_interval(
left_boundary,
right_boundary,
data,
compare_columns,
interval
)
Arguments
left_boundary |
the min value |
right_boundary |
the max value |
data |
the data that contains values being compared |
compare_columns |
The columns inside data being checked |
interval |
The scientific notation interval |
Value
a boolean vector containing true for rows where the comparison is true
Diet score
Description
This function creates a derived diet variable (diet_score) based on consumption of fruit, salad, potatoes, carrots, other vegetables and juice. 2 baseline points plus summation of total points for diet attributes. Negative overall scores are recoded to 0, resulting in a range from 0 to 10.
1 point per daily fruit and vegetable consumption, excluding fruit juice (maximum 8 points).
-2 points for high potato intake (>=7 (males), >=5 (females) times/week)
-2 points for no carrot intake
-2 points per daily frequency of fruit juice consumption greater than once/day (maximum -10 points)
Usage
diet_score_fun(FVCDFRU, FVCDSAL, FVCDPOT, FVCDCAR, FVCDVEG, FVCDJUI, DHH_SEX)
Arguments
FVCDFRU |
daily consumption of fruit |
FVCDSAL |
daily consumption of green salad |
FVCDPOT |
daily consumption of potatoes |
FVCDCAR |
daily consumption of carrots |
FVCDVEG |
daily consumption of other vegetables |
FVCDJUI |
daily consumption of fruit juice |
DHH_SEX |
sex; 1 = male, 2 = female |
Details
While diet score can be calculated for all survey respondents, in the 2005 CCHS survey cycle, fruit and vegetable consumption was an optional section in which certain provinces had opted in to be asked to respondents. In this survey cycle, fruit and vegetable consumption was asked to respondents in British Columbia, Ontario, Alberta, and Prince Edward Island. As such, diet score has a large number of missing respondents for this cycle.
Examples
# Using the diet_score_fun function to create the derived diet variable
# across CCHS cycles.
# diet_score_fun() is specified in the variable_details.csv.
# To create a harmonized diet_score variable across CCHS cycles, use
# rec_with_table() for each CCHS cycle and specify diet_score_fun and the
# required base variables.
# Using merge_rec_data(), you can combine diet_score across cycles.
library(cchsflow)
diet_score2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"FVCDFRU", "FVCDSAL", "FVCDPOT", "FVCDCAR", "FVCDVEG", "FVCDJUI",
"DHH_SEX", "diet_score"
)
)
head(diet_score2009_2010)
diet_score2011_2012 <- rec_with_table(
cchs2011_2012_p,c(
"FVCDFRU", "FVCDSAL", "FVCDPOT", "FVCDCAR", "FVCDVEG", "FVCDJUI",
"DHH_SEX", "diet_score"
)
)
tail(diet_score2011_2012)
combined_diet_score <- suppressWarnings(merge_rec_data(diet_score2009_2010,
diet_score2011_2012))
head(combined_diet_score)
tail(combined_diet_score)
Categorized diet score
Description
This function creates a categorical derived diet variable (diet_score_cat3) that categorizes derived diet score (diet_score).
Usage
diet_score_fun_cat(diet_score)
Arguments
diet_score |
derived variable that calculates diet score.
See |
Details
The diet score is based on consumption of fruit, salad, potatoes, carrots, other vegetables and juice. 2 baseline points plus summation of total points for diet attributes. Negative overall scores are recoded to 0, resulting in a range from 0 to 10.The categories were based on the Mortality Population Risk Tool (Douglas Manuel et al. 2016).
diet_score_cat3 uses the derived variable diet_score. diet_score uses sex, and fruit and vegetable variables that have been transformed by cchsflow (see documentation on diet_score). In order to categorize diet across CCHS cycles, sex, and fruit and vegetable variables must be transformed and harmonized.
Value
value for diet score categories using diet_score_cat3 variable.
Examples
# Using the diet_score_fun_cat function to categorize the derived diet
# variable across CCHS cycles.
# diet_score_fun_cat() is specified in the variable_details.csv.
# To create a harmonized diet_score_cat3 variable across CCHS cycles, use
# rec_with_table() for each CCHS cycle.
# Since diet_score is also a derived variable, you will have to specify
# the variables that are derived from it.
# Using merge_rec_data(), you can combine diet_score_cat3 across cycles.
library(cchsflow)
diet_score_cat2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"FVCDFRU", "FVCDSAL", "FVCDPOT", "FVCDCAR", "FVCDVEG", "FVCDJUI",
"DHH_SEX", "diet_score", "diet_score_cat3"
)
)
head(diet_score_cat2009_2010)
diet_score_cat2011_2012 <- rec_with_table(
cchs2011_2012_p,c(
"FVCDFRU", "FVCDSAL", "FVCDPOT", "FVCDCAR", "FVCDVEG", "FVCDJUI",
"DHH_SEX", "diet_score", "diet_score_cat3"
)
)
tail(diet_score_cat2011_2012)
combined_diet_score_cat <- suppressWarnings(merge_rec_data(
diet_score_cat2009_2010, diet_score_cat2011_2012))
head(combined_diet_score_cat)
tail(combined_diet_score_cat)
Daily energy expenditure in leisure activity
Description
This function creates a derived variable for daily leisure energy expenditure.A MET is a conceptual value that represents energy expended during physical activity. The volume of activity is calculated by multiplying the amount of minutes of activity (by level of intensity) by the MET value associated with that intensity. A MET (metabolic equivalent) is the energy cost of activity expressed as kilocalories expended per kilogram of body weight per hour of activity.
In CCHS 2001-2014, PACDEE is the variable used to determine the daily expenditure of leisure activity for all ages. In CCHS 2015-2018, ages 12-17 and 18+ years old have separate activity variables, where 12-17 year olds use PAY_XXX and 18+ year olds use PAA_XXX. Leisure activity is not directly measured. We used the derived variable, PAADVVOL, and removed active transportation in the new function. With this function, we combined leisure activity for ages 12+. We calculate the daily energy expenditure which uses the frequency and duration per session of the physical activity as well as the MET value (3 METS for leisure and 6 METS for vigorous activity).
EE (Daily Energy Expenditure) = ((N X D X METvalue) / 60)/7 Where: N = the number of times a respondent engaged in an activity over a 7 day period D = the average duration in minutes of the activity MET value = the energy cost of the activity expressed as kilocalories expended per kilogram of body weight per hour of activity (kcal/kg per hour)
Usage
energy_exp_fun(
DHHGAGE_cont,
PAA_045,
PAA_050,
PAA_075,
PAA_080,
PAADVDYS,
PAADVVIG,
PAYDVTOA,
PAYDVADL,
PAYDVVIG,
PAYDVDYS
)
Arguments
DHHGAGE_cont |
continuous age variable. |
PAA_045 |
number of hours of sports, fitness, or recreational activity that make you sweat or breathe harder for CCHS 2015-2018 for 18+ years old. |
PAA_050 |
number of minutes of sports, fitness, or recreational activity that make you sweat or breathe harder for CCHS 2015-2018 for 18+ years old. |
PAA_075 |
number of hours of other physical activity while at work, home or volunteering for CCHS 2015-2018 for 18+ years old. |
PAA_080 |
number of minutes of other physical activity while at work, home or volunteering for CCHS 2015-2018 for 18+ years old. |
PAADVDYS |
number of active days - 7 day for CCHS 2015-2018 for 18+ years old. |
PAADVVIG |
number of minutes of vigorous activity over 7 days or CCHS 2015-2018 for 18+ years old. |
PAYDVTOA |
total minutes of other activities - 7 day for CCHS 2015-2018 for 12-17 years old. |
PAYDVADL |
total minutes of physical activity - leisure - 7 day for CCHS 2015-2018 for 12-17 years old. |
PAYDVVIG |
total minutes - vigorous physical activity - 7 d for CCHS 2015-2018 for 12-17 years old. |
PAYDVDYS |
total days physically active - 7 day for CCHS 2015-2018 for 12-17 years old. |
Value
Continuous variable for energy expenditure (energy_exp)
Examples
# Using energy_exp_fun() to create energy expenditure values across CCHS
# cycles
# energy_exp_fun() is specified in variable_details.csv along with the CCHS
# variables and cycles included.
# To transform energy_exp across cycles, use rec_with_table() for each
# CCHS cycle and specify energy_exp, along with each activity variable.
# Then by using merge_rec_data(), you can combine energy_exp across
# cycles
library(cchsflow)
energy_exp2015_2016 <- rec_with_table(
cchs2015_2016_p, c(
"DHHGAGE_cont", "PAA_045", "PAA_050", "PAA_075", "PAA_080", "PAADVDYS",
"PAADVVIG", "PAYDVTOA", "PAYDVADL", "PAYDVVIG", "PAYDVDYS", "energy_exp"
)
)
head(energy_exp2015_2016)
energy_exp2017_2018 <- rec_with_table(
cchs2017_2018_p, c(
"DHHGAGE_cont", "PAA_045", "PAA_050", "PAA_075", "PAA_080", "PAADVDYS",
"PAADVVIG", "PAYDVTOA", "PAYDVADL", "PAYDVVIG", "PAYDVDYS", "energy_exp"
)
)
tail(energy_exp2015_2016)
combined_energy_exp <- suppressWarnings(merge_rec_data(energy_exp2015_2016,
energy_exp2017_2018))
head(combined_energy_exp)
tail(combined_energy_exp)
Food insecurity
Description
NOTE: this is not a function.
This is a derived variable that uses the different food insecurity variables from all CCHS cycles to generate food_insecurity_der that is harmonized across all cycles. food_insecurity_der is a categorical variable with two categories:
no food insecurity in the last 12 months
food insecurity in the last 12 months
Usage
food_insecurity_der(FINF1, FSCDHFS, FSCDHFS2)
Arguments
FINF1 |
variable used in 2001 and 2003 survey cycles indicating food insecurity in the past 12 months |
FSCDHFS |
variable used in the 2005 survey cycle measuring food insecurity & hunger in the last 12 months |
FSCDHFS2 |
variable used in 2007-2014 survey cycles measuring household food insecurity in the last 12 months |
Details
Food insecurity is measured differently across CCHS cycles. In 2001 and 2003, FINF1 is used; in 2005, FSCDHFS is used; and in 2007 to 2014, FSCDHFS2 is used. Each variable examines food insecurity in the household over the past 12 months, but use different base variables to derive food insecurity.
If you are using cchsflow for CCHS survey years that use consistent food insecurity variables, it is appropriate to use FINF1, FSCDHFS, or FSCDHFS2 that are available on cchsflow. If you are using cchsflow for only the 2001 and 2003 cycles, it is appropriate to use FINF1. If you are using cchsflow for only the 2005 cycle, FSCDHFS is appropriate. If you are using cchsflow for cycles between 2007 and 2014, FSCDHFS2 is appropriate. For multiple CCHS survey years that do not use the same food insecurity variables (i.e. using cchsflow for years 2001 to 2007), food_insecurity_der is recommended.
Examples
library(cchsflow)
?food_insecurity_der
Get Data Variable Name
Description
Retrieves the name of the column inside data to use for calculations
Usage
get_data_variable_name(
data_name,
data,
row_being_checked,
variable_being_checked
)
Arguments
data_name |
name of the database being checked |
data |
database being checked |
row_being_checked |
the row from variable details that contains information on this variable |
variable_being_checked |
the name of the recoded variable |
Value
the data equivalent of variable_being_checked
if_else2
Description
Custom ifelse function that evaluates missing (NA) values. If the logical argument (x) compares to a value that is 'NA', it is set to 'FALSE'
Usage
if_else2(x, a, b)
Arguments
x |
A logical argument |
a |
value if 'x' is 'TRUE' |
b |
value if 'x' is 'FALSE' |
Details
unlike the base ifelse() function, if_else2() is able to evaluate NA as either a or b. In base ifelse(), anything compared to NA will produce NA, which can break a function. When dealing with large datasets like the CCHS, there are many missing (NA) values. That means a special ifelse function like if_else2() is needed in order for other functions to not break
Value
a or b based on the evaluation of x
Examples
age <- 12
status <- if_else2((age < 18), "child", "invalid age")
print(status)
age <- NA
status <- if_else2((age < 18), "child", "invalid age")
print(status)
Immigration by ethnicity and settlement
Description
This function creates a categorical variable based on immigrant status (SDCFIMM), country of birth (SDCGCBG), ethnicity (SDCGCGT), and time in Canada (SDCGRES).
Usage
immigration_fun(SDCFIMM, SDCGCBG, SDCGCGT, SDCGRES)
Arguments
SDCFIMM |
Immigrant status (1-immigrant, 2-non-immigrant) |
SDCGCBG |
Country of birth (1-Canada, 2-Outside of Canada) |
SDCGCGT |
Cultural or racial origin (1-white, 2-visible minority) |
SDCGRES |
Length/time in Canada since immigration (1- 0-9 years, 2- 10+ years) |
Details
immigration_der uses the CCHS variables that have been transformed by cchsflow. In order to generate a value for BMI across CCHS cycles, the following SDC variables must be transformed and harmonized.
Value
Categorical variable (immigration_der) with six categories:
1 - White Canada-born
2 - Non-white Canadian born
3 - White immigrant born outside of Canada (0-9 years in Canada)
4 - Non-white immigrant born outside of Canada (0-9 years in Canada)
5 - White immigrant born outside of Canada (10+ years in Canada)
6 - Non-white immigrant born outside of Canada (10+ years in Canada)
Examples
# Using immigration_fun() to create immigration_der values across CCHS cycles
# immigration_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform immigration_der, use rec_with_table() for each CCHS cycle
# and specify immigration_der, along with the various SDC variables.
# Then by using merge_rec_data() you can combine immigration_der across cycles.
library(cchsflow)
immigration2001 <- rec_with_table(
cchs2001_p, c(
"SDCFIMM", "SDCGCBG", "SDCGCGT", "SDCGRES", "immigration_der"
)
)
head(immigration2001)
immigration2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"SDCFIMM", "SDCGCBG", "SDCGCGT", "SDCGRES", "immigration_der"
)
)
tail(immigration2009_2010)
combined_immigration <- merge_rec_data(immigration2001, immigration2009_2010)
head(combined_immigration)
tail(combined_immigration)
is equal
Description
Function to compare even with NA present This function returns TRUE wherever elements are the same, including NA's, and false everywhere else.
Usage
is_equal(v1, v2)
Arguments
v1 |
variable 1 |
v2 |
variable 2 |
Value
boolean value of whether or not v1 and v2 are equal
Examples
library(cchsflow)
is_equal(1,2)
# FALSE
is_equal(1,1)
# TRUE
1==NA
# NA
is_equal(1,NA)
# FALSE
NA==NA
# NA
is_equal(NA,NA)
# TRUE
label_data
Description
Attaches labels to the DataToLabel to preserve metadata
Usage
label_data(label_list, data_to_label)
Arguments
label_list |
the label list object that contains extracted labels from variable details |
data_to_label |
The data that is to be labeled |
Value
Returns labeled data
Long term risks due to drinking
Description
This function creates a categorical variable that flags for increased long term health risks due to their drinking habits, according to Canada's Low-Risk Alcohol Drinking Guideline.
Usage
low_drink_long_fun(
DHH_SEX,
ALWDWKY,
ALC_1,
ALW_1,
ALW_2A1,
ALW_2A2,
ALW_2A3,
ALW_2A4,
ALW_2A5,
ALW_2A6,
ALW_2A7
)
Arguments
DHH_SEX |
Sex of respondent (1 - male, 2 - female) |
ALWDWKY |
Number of drinks consumed in the past week |
ALC_1 |
Drinks in the past year (1 - yes, 2 - no) |
ALW_1 |
Drinks in the last week (1 - yes, 2 - no) |
ALW_2A1 |
Number of drinks on Sunday |
ALW_2A2 |
Number of drinks on Monday |
ALW_2A3 |
Number of drinks on Tuesday |
ALW_2A4 |
Number of drinks on Wednesday |
ALW_2A5 |
Number of drinks on Thursday |
ALW_2A6 |
Number of drinks on Friday |
ALW_2A7 |
Number of drinks on Saturday |
Details
The classification of drinkers according to their long term health risks comes from guidelines in Alcohol and Health in Canada: A Summary of Evidence and Guidelines for Low-risk Drinking, and is based on the alcohol consumption reported over the past week. Short-term or acute risks include injury and overdose.
Categories are based on CCHS 2015-2016's variable (ALWDVLTR) where long term health risk are increased when drinking more than 10 drinks a week for women, with no more than 2 drinks a day most days, and more than 15 drinks a week for men, with no more than 3 drinks a day most days.
See https://osf.io/ykau5/ for more details on the guideline. See https://osf.io/ycxaq/ for more details on the derivation of the function on page 8.
Value
Categorical variable (ALWDVLTR_der) with two categories:
1 - Increased long term health risk
2 - No increased long term health risk
Examples
# Using low_drink_long_fun() to create ALWDVLTR_der values across CCHS cycles
# low_drink_long_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform ALWDVLTR_der, use rec_with_table() for each CCHS cycle
# and specify ALWDVLTR_der, along with the various alcohol and sex
# variables.
# Using merge_rec_data(), you can combine ALWDVLTR_der across cycles.
library(cchsflow)
long_low_drink2001 <- rec_with_table(
cchs2001_p, c(
"ALW_1", "DHH_SEX", "ALW_2A1", "ALW_2A2", "ALW_2A3", "ALW_2A4",
"ALW_2A5", "ALW_2A6", "ALW_2A7", "ALWDWKY", "ALC_1","ALWDVLTR_der"
)
)
head(long_low_drink2001)
long_low_drink2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"ALW_1", "DHH_SEX", "ALW_2A1", "ALW_2A2", "ALW_2A3", "ALW_2A4",
"ALW_2A5", "ALW_2A6", "ALW_2A7", "ALWDWKY", "ALC_1","ALWDVLTR_der"
)
)
tail(long_low_drink2009_2010)
combined_long_low_drink <- bind_rows(long_low_drink2001,
long_low_drink2009_2010)
head(combined_long_low_drink)
tail(combined_long_low_drink)
# Using low_drink_long_fun() to generate ALWDVLTR_der with user-inputted
# values.
#
# Let's say you are a male, you had drinks in the last week and in the last
# year. Let's say you had 5 drinks on Sunday, 1 drink on Monday, 6 drinks on
# Tuesday, 4 drinks on Wednesday, 4 drinks on Thursday, 8 drinks on Friday,
# and 2 drinks on Saturday with a total of 30 drinks in a week.
# Using low_drink_long_fun(), we can check if you would be classified as
# having an increased long term health risk due to drinking.
long_term_drink <- low_drink_long_fun(DHH_SEX = 1, ALWDWKY = 30, ALC_1 = 1,
ALW_1 = 1, ALW_2A1 = 5, ALW_2A2 = 1, ALW_2A3 = 6, ALW_2A4 = 4, ALW_2A5 = 4,
ALW_2A6 = 8, ALW_2A7 = 2)
print(long_term_drink)
Low drinking score (all cycles)
Description
This function creates a derived variable based on their drinking habits and flags for health and social problems from their pattern of alcohol use according to Canada's Low-Risk Alcohol Drinking Guideline.
Usage
low_drink_score_fun(DHH_SEX, ALWDWKY)
Arguments
DHH_SEX |
Sex of respondent (1 - male, 2 - female) |
ALWDWKY |
Number of drinks consumed in the past week |
Details
The low risk drinking score is based on the scoring system in Canada's Low-Risk Alcohol Drinking Guideline. The score is divided into two steps. Step 1 allocates points based on sex and the number of drinks that you usually have each week. In step 2, one point will be awarded for each item that is true related to drinking habits. The total score is obtained from adding the points in step 1 and step 2.
Value
Low risk drinking score (low_drink_score) with four categories:
1 - Low risk (0 points)
2 - Marginal risk (1-2 points)
3 - Medium risk (3-4 points)
4 - High risk (5-9 points)
Note
Step 2 is not included in this function because the questions in step 2 are not asked in any of the CCHS cycles. The score is only based on step 1.
See https://osf.io/eprg7/ for more details on the guideline and score.
Examples
# Using low_drink_score_fun() to create low_drink_score values across
# CCHS cycles low_drink_score_fun() is specified in variable_details.csv
# along with the CCHS variables and cycles included.
# To transform low_drink_score, use rec_with_table() for each CCHS cycle
# and specify low_drink_score, along with the various alcohol and sex
# variables.
# Using merge_rec_data(), you can combine low_drink_score across cycles.
library(cchsflow)
low_drink2001 <- rec_with_table(
cchs2001_p, c(
"DHH_SEX", "ALWDWKY", "low_risk_score"
)
)
head(low_drink2001)
low_drink2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"DHH_SEX", "ALWDWKY", "low_risk_score"
)
)
tail(low_drink2009_2010)
combined_low_drink <- bind_rows(low_drink2001,
low_drink2009_2010)
head(combined_low_drink)
tail(combined_low_drink)
Low drinking score (select cycles)
Description
This function creates a derived variable based on their drinking habits and flags for health and social problems from their pattern of alcohol use according to Canada's Low-Risk Alcohol Drinking Guideline.
Usage
low_drink_score_fun1(DHH_SEX, ALWDWKY, ALC_005, ALC_1)
Arguments
DHH_SEX |
Sex of respondent (1 - male, 2 - female) |
ALWDWKY |
Number of drinks consumed in the past week |
ALC_005 |
In lifetime, ever had a drink? (1 - yes, 2 - no) |
ALC_1 |
Past year, have you drank alcohol? (1 - yes, 2 - no) |
Details
The low risk drinking score is based on the scoring system in Canada's Low-Risk Alcohol Drinking Guideline. The score is divided into two steps. Step 1 allocates points based on sex and the number of drinks that you usually have each week. In step 2, one point will be awarded for each item that is true related to drinking habits. The total score is obtained from adding the points in step 1 and step 2.
This score has two 0 point categories: low risk (never drank) and low risk (former drinker). The two drinking groups are derived from 'ever had a drink in lifetime'. 'Ever had a drink in lifetime' is only available in CCHS 2001-2008 and 2015-2018.
Value
Low risk drinking score (low_drink_score1) with four categories:
1 - Low risk - never drank (0 points)
2 - Low risk - former drinker (0 points)
3 - Marginal risk (1-2 points)
4 - Medium risk (3-4 points)
5 - High risk (5-9 points)
Note
Step 2 is not included in this function because the questions in step 2 are not asked in any of the CCHS cycles. The score is only based on step 1.
See https://osf.io/eprg7/ for more details on the guideline and score.
Examples
# Using low_drink_score_fun1() to create low_drink_score values across
# CCHS cycles low_drink_score_fun1() is specified in variable_details.csv
# along with the CCHS variables and cycles included.
# To transform low_drink_score1, use rec_with_table() for each CCHS cycle
# and specify low_drink_score1, along with the various alcohol and sex
# variables.
# Using merge_rec_data(), you can combine low_drink_score1 across cycles.
library(cchsflow)
low_drink2001 <- rec_with_table(
cchs2001_p, c(
"DHH_SEX", "ALWDWKY", "ALC_005", "ALC_1", "low_risk_score"
)
)
head(low_drink2001)
low_drink2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"DHH_SEX", "ALWDWKY", "ALC_005", "ALC_1", "low_risk_score"
)
)
tail(low_drink2009_2010)
combined_low_drink1 <- bind_rows(low_drink2001,
low_drink2009_2010)
head(combined_low_drink1)
tail(combined_low_drink1)
Short term risks due to drinking
Description
This function creates a categorical variable that flags for increased short term health risks due to their drinking habits, according to Canada's Low-Risk Alcohol Drinking Guideline.
Usage
low_drink_short_fun(
DHH_SEX,
ALWDWKY,
ALC_1,
ALW_1,
ALW_2A1,
ALW_2A2,
ALW_2A3,
ALW_2A4,
ALW_2A5,
ALW_2A6,
ALW_2A7
)
Arguments
DHH_SEX |
Sex of respondent (1 - male, 2 - female) |
ALWDWKY |
Number of drinks consumed in the past week |
ALC_1 |
Drinks in the past year (1 - yes, 2 - no) |
ALW_1 |
Drinks in the last week (1 - yes, 2 - no) |
ALW_2A1 |
Number of drinks on Sunday |
ALW_2A2 |
Number of drinks on Monday |
ALW_2A3 |
Number of drinks on Tuesday |
ALW_2A4 |
Number of drinks on Wednesday |
ALW_2A5 |
Number of drinks on Thursday |
ALW_2A6 |
Number of drinks on Friday |
ALW_2A7 |
Number of drinks on Saturday |
Details
The classification of drinkers according to their short term health risks comes from guidelines in Alcohol and Health in Canada: A Summary of Evidence and Guidelines for Low-risk Drinking, and is based on the alcohol consumption reported over the past week. Short-term or acute risks include injury and overdose.
Categories are based on CCHS 2015-2016's variable (ALWDVSTR) where short term health risk are increased when drinking more than 3 drinks (for women) or 4 drinks (for men) on any single occasion.
See https://osf.io/ykau5/ for more details on the guideline. See https://osf.io/ycxaq/ for more details on derivation of the function on page 9.
Value
Categorical variable (ALWDVSTR_der) with two categories:
1 - Increased short term health risk
2 - No increased short term health risk
Examples
# Using low_drink_short_fun() to create ALWDVSTR_der values across CCHS cycles
# low_drink_short_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform ALWDVSTR_der, use rec_with_table() for each CCHS cycle
# and specify ALWDVSTR_der, along with the various alcohol and sex
# variables.
# Using merge_rec_data(), you can combine ALWDVSTR_der across cycles.
library(cchsflow)
short_low_drink2001 <- rec_with_table(
cchs2001_p, c(
"ALW_1", "DHH_SEX", "ALW_2A1", "ALW_2A2", "ALW_2A3", "ALW_2A4",
"ALW_2A5", "ALW_2A6", "ALW_2A7", "ALWDWKY", "ALC_1","ALWDVSTR_der"
)
)
head(short_low_drink2001)
short_low_drink2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"ALW_1", "DHH_SEX", "ALW_2A1", "ALW_2A2", "ALW_2A3", "ALW_2A4",
"ALW_2A5", "ALW_2A6", "ALW_2A7", "ALWDWKY", "ALC_1","ALWDVSTR_der"
)
)
tail(short_low_drink2009_2010)
combined_short_low_drink <- bind_rows(short_low_drink2001,
short_low_drink2009_2010)
head(combined_short_low_drink)
tail(combined_short_low_drink)
# Using low_drink_short_fun() to generate ALWDVSTR_der with user-inputted
# values.
#
# Let's say you are a male, you had drinks in the last week and in the last
# year. Let's say you had 5 drinks on Sunday, 1 drink on Monday, 6 drinks on
# Tuesday, 4 drinks on Wednesday, 4 drinks on Thursday, 8 drinks on Friday,
# and 2 drinks on Saturday with a total of 30 drinks in a week.
# Using low_drink_short_fun(), we can check if you would be classified as
# having an increased short term health risk due to drinking.
short_term_drink <- low_drink_short_fun(DHH_SEX = 1, ALWDWKY = 30, ALC_1 = 1,
ALW_1 = 1, ALW_2A1 = 5, ALW_2A2 = 1, ALW_2A3 = 6, ALW_2A4 = 4, ALW_2A5 = 4,
ALW_2A6 = 8, ALW_2A7 = 2)
print(short_term_drink)
Merge recoded data
Description
This function allows users to merge CCHS data transformed by the
rec_with_table
function. This function generates a labelled
merged data frame with multiple transformed CCHS cycles.
Usage
merge_rec_data(...)
Arguments
... |
recoded data frames to be merged. |
Details
When merging recoded CCHS data, there are variables that are missing in certain CCHS cycles. This function tags missing variable observations as NA(c), indicating that the variable was not asked or included in the CCHS cycle of the respondent.
Click here for more details on how NA's are treated in cchsflow.
Value
a merged data frame consisting of multiple recoded CCHS cycles with labels for variable names and tags for variables not included in particular CCHS cycles.
Examples
# Merging two CCHS cycles with variables missing in each cycle.
# INCGHH_A is a cchsflow variable available for the 2001 CCHS cycle, while
# INCGHH_B is a cchsflow variable available for the 2003 CCHS cycle.
# Using merge_rec_data(), datasets containing INCGHH_A & INCGHH_B can be
# merged and tagged.
library(cchsflow)
income2001 <- rec_with_table(cchs2001_p, "INCGHH_A")
income2003 <- rec_with_table(cchs2001_p, "INCGHH_B")
income_merged <- merge_rec_data(income2001, income2003)
head(income_merged)
tail(income_merged)
Number of chronic conditions (5 chronic conditions)
Description
This function generates a derived variable (number_conditions) that counts the number of chronic conditions a respondent has. This function takes 5 CCHS-defined conditions (heart disease, cancer, stroke, bowel disorder, and arthritis), and well one derived variable (respiratory condition) to count the number of conditions a respondent has.
Usage
multiple_conditions_fun1(
CCC_121,
CCC_131,
CCC_151,
CCC_171,
resp_condition_der,
CCC_051
)
Arguments
CCC_121 |
variable indicating if respondent has heart disease (1 = respondent has heart disease, 2 = respondent does not have heart disease) |
CCC_131 |
variable indicating if respondent has active cancer (1 = respondent has active cancer, 2 = respondent does not have active cancer) |
CCC_151 |
variable indicating if respondent suffers from the effects of a stroke (1 = respondent suffers from stroke effects, 2 = respondent does not suffer from stroke effects) |
CCC_171 |
variable indicating if respondent has a bowel disorder (1 = respondent has bowel disorder, 2 = respondent does not have a bowel disorder) |
resp_condition_der |
derived variable indicating if respondent has a
respiratory condition (1 = respondent is over the age of 35 and has
a respiratory condition, 2 = respondent is under the age of 35 and has a
respiratory conditions, 3 = respondent does not have a respiratory
condition). See |
CCC_051 |
variable indicating if respondent has arthritis or rheumatism (1 = respondent has arthritis or rheumatism, 2 = respondent does not have arthritis or rheumatism) |
Details
mood disorder (CCC_280) was not asked to respondents in the 2001
CCHS survey cycle. This mean respondents in this cycle will only be able to
have a maximum of 6 chronic conditions as opposed to 7 for respondents in
other cycles. multiple_conditions_fun2
is used for CCHS cycles
from 2003 to 2014.
Value
A categorical variable indicating the number of chronic conditions a respondent has. Respondents with 5 or more conditions are grouped in the "5+" category.
See Also
Examples
# Using rec_with_table() to generate multiple_conditions in a CCHS
# cycle.
# multiple_conditions_fun1() is specified in variable_details.csv along with
# the CCHS variables and cycles included.
# To generate multiple_conditions, use rec_with_table() and specify the
# multiple_conditions, along with the variables that are derived from it.
# Since resp_condition_der is also a derived variable, you will have to
# specify the variables that are derived from it. In this example, data
# from the 2001 CCHS will be used, so DHHGAGE_cont, CCC_091, and CCC_91A,
# and CCC_031 will be specified along with resp_condition_der.
library(cchsflow)
conditions_2001 <- suppressWarnings(rec_with_table(cchs2001_p,
c("DHHGAGE_cont", "CCC_091",
"CCC_91A", "CCC_031", "CCC_121","CCC_131","CCC_151", "CCC_171","CCC_280",
"resp_condition_der","CCC_051", "number_conditions")))
head(conditions_2001)
# Generating multiple_conditions with user inputted values
# Let's say you are an individual that has heart disease, bowel disorder,
# and arthritis. multiple_conditions_fun1() can be used to count the number
# of chronic conditions you have
library(cchsflow)
num_conditions <- multiple_conditions_fun1(CCC_121 = 1, CCC_131 = 2,
CCC_151 = 2, CCC_171 = 1, resp_condition_der = 3, CCC_051 = 1)
print(num_conditions)
Number of chronic conditions (6 chronic conditions)
Description
This function generates a derived variable (number_conditions) that counts the number of chronic conditions a respondent has. This function takes 6 CCHS-defined conditions (heart disease, cancer, stroke, bowel disorder, mood disorder and arthritis), and well one derived variable (respiratory condition) to count the number of conditions a respondent has.
Usage
multiple_conditions_fun2(
CCC_121,
CCC_131,
CCC_151,
CCC_171,
CCC_280,
resp_condition_der,
CCC_051
)
Arguments
CCC_121 |
variable indicating if respondent has heart disease (1 = respondent has heart disease, 2 = respondent does not have heart disease) |
CCC_131 |
variable indicating if respondent has active cancer (1 = respondent has active cancer, 2 = respondent does not have active cancer) |
CCC_151 |
variable indicating if respondent suffers from the effects of a stroke (1 = respondent suffers from stroke effects, 2 = respondent does not suffer from stroke effects) |
CCC_171 |
variable indicating if respondent has a bowel disorder (1 = respondent has bowel disorder, 2 = respondent does not have a bowel disorder) |
CCC_280 |
variable indicating if respondent has a mood disorder (1 = respondent has a mood disorder, 2 = respondent does not have a mood disorder. Note, variable was not asked to respondents in the 2001 CCHS survey cycle. |
resp_condition_der |
derived variable indicating if respondent has a
respiratory condition. (1 = respondent is over the age of 35 and has
a respiratory condition, 2 = respondent is under the age of 35 and has a
respiratory conditions, 3 = respondent does not have a respiratory
condition). See |
CCC_051 |
variable indicating if respondent has arthritis or rheumatism (1 = respondent has arthritis or rheumatism, 2 = respondent does not have arthritis or rheumatism) |
Details
mood disorder (CCC_280) was not asked to respondents in the 2001
CCHS survey cycle. This mean respondents in this cycle will only be able to
have a maximum of 6 chronic conditions as opposed to 7 for respondents in
other cycles. multiple_conditions_fun1
is used for CCHS cycles
from 2003 to 2014.
Value
A categorical variable indicating the number of chronic conditions a respondent has. Respondents with 5 or more conditions are grouped in the "5+" category.
See Also
Examples
# Using rec_with_table() to generate multiple_conditions in a CCHS
# cycle.
# multiple_conditions_fun2() is specified in variable_details.csv along with
# the CCHS variables and cycles included.
# To generate multiple_conditions, use rec_with_table() and specify the
# multiple_conditions, along with the variables that are derived from it.
# Since resp_condition_der is also a derived variable, you will have to
# specify the variables that are derived from it. In this example, data
# from the 2010 CCHS will be used, so DHHGAGE_cont, CCC_091, and CCC_031
# will be specified along with resp_condition_der.
library(cchsflow)
conditions_2009_2010 <- suppressWarnings(rec_with_table(cchs2009_2010_p,
c("DHHGAGE_cont", "CCC_091",
"CCC_031", "CCC_121","CCC_131","CCC_151", "CCC_171","CCC_280",
"resp_condition_der","CCC_051", "number_conditions")))
head(conditions_2009_2010)
# Generating multiple_conditions with user inputted values
# Let's say you are an individual that has heart disease, bowel disorder,
# and arthritis. multiple_conditions_fun2() can be used to count the number
# of chronic conditions you have
library(cchsflow)
num_conditions <- multiple_conditions_fun2(CCC_121 = 1, CCC_131 = 2,
CCC_151 = 2, CCC_171 = 1, CCC_280 = 2, resp_condition_der = 3, CCC_051 = 1)
print(num_conditions)
Smoking pack-years
Description
This function creates a derived variable (pack_years_der) that measures an individual's smoking pack-years based on various CCHS smoking variables. This is a popular variable used by researchers to quantify lifetime exposure to cigarette use.
Usage
pack_years_fun(
SMKDSTY_A,
DHHGAGE_cont,
time_quit_smoking,
SMKG203_cont,
SMKG207_cont,
SMK_204,
SMK_05B,
SMK_208,
SMK_05C,
SMKG01C_cont,
SMK_01A
)
Arguments
SMKDSTY_A |
variable used in CCHS cycles 2001-2014 that classifies an individual's smoking status. |
DHHGAGE_cont |
continuous age variable. |
time_quit_smoking |
derived variable that calculates the approximate
time a former smoker has quit smoking.
See |
SMKG203_cont |
age started smoking daily. Variable asked to daily smokers. |
SMKG207_cont |
age started smoking daily. Variable asked to former daily smokers. |
SMK_204 |
number of cigarettes smoked per day. Variable asked to daily smokers. |
SMK_05B |
number of cigarettes smoked per day. Variable asked to occasional smokers |
SMK_208 |
number of cigarettes smoked per day. Variable asked to former daily smokers |
SMK_05C |
number of days smoked at least one cigarette |
SMKG01C_cont |
age smoked first cigarette |
SMK_01A |
smoked 100 cigarettes in lifetime (y/n) |
Details
pack-years is calculated by multiplying the number of cigarette packs per day (20 cigarettes per pack) by the number of years. Example 1: a respondent who is a current smoker who smokes 1 package of cigarettes for the last 10 years has smoked 10 pack-years. Pack-years is also calculated for former smokers. Example 2: a respondent who started smoking at age 20 years and smoked half a pack of cigarettes until age 40 years smoked for 10 pack-years.
Value
value for smoking pack-years in the pack_years_der variable
Examples
# Using pack_years_fun() to create pack-years values across CCHS cycles
# pack_years_fun() is specified in variable_details.csv along with the CCHS
# variables and cycles included.
# To transform pack_years_der across cycles, use rec_with_table() for each
# CCHS cycle and specify pack_years_der, along with each smoking variable.
# Since time_quit_smoking_der is also a derived
# variable, you will have to specify the variables that are derived from it.
# Then by using merge_rec_data(), you can combine pack_years_der across
# cycles
library(cchsflow)
pack_years2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"SMKDSTY_A", "DHHGAGE_cont", "SMK_09A_B", "SMKG09C", "time_quit_smoking",
"SMKG203_cont", "SMKG207_cont", "SMK_204", "SMK_05B", "SMK_208",
"SMK_05C", "SMK_01A", "SMKG01C_cont", "pack_years_der"
)
)
head(pack_years2009_2010)
pack_years2011_2012 <- rec_with_table(
cchs2011_2012_p,c(
"SMKDSTY_A", "DHHGAGE_cont", "SMK_09A_B", "SMKG09C", "time_quit_smoking",
"SMKG203_cont", "SMKG207_cont", "SMK_204", "SMK_05B", "SMK_208",
"SMK_05C", "SMK_01A", "SMKG01C_cont", "pack_years_der"
)
)
tail(pack_years2011_2012)
combined_pack_years <- suppressWarnings(merge_rec_data(pack_years2009_2010,
pack_years2011_2012))
head(combined_pack_years)
tail(combined_pack_years)
Categorical smoking pack-years
Description
This function creates a categorical derived variable (pack_years_cat) that categorizes smoking pack-years (pack_years_der).
Usage
pack_years_fun_cat(pack_years_der)
Arguments
pack_years_der |
derived variable that calculates smoking pack-years
See |
Details
pack-years is calculated by multiplying the number of cigarette packs per day (20 cigarettes per pack) by the number of years.The categories were based on the Cardiovascular Disease Population Risk Tool (Douglas Manuel et al. 2018).
pack_years_cat uses the derived variable pack_years_der. Pack_years_der uses age and various smoking variables that have been transformed by cchsflow (see documentation on pack_year_der). In order to categorize pack years across CCHS cycles, age and smoking variables must be transformed and harmonized.
Value
value for pack year categories in the pack_years_cat variable.
Examples
# Using pack_years_fun_cat() to categorize pack year values across CCHS cycles
# pack_years_fun_cat() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform pack_years_cat across cycles, use rec_with_table() for each
# CCHS cycle and specify pack_years_cat.
# Since pack_year_der is also also derived variable, you will have to specify
# the variables that are derived from it.
# Since time_quit_smoking_der is also a derived variable in pack_year_der,
# you will have to specify the variables that are derived from it.
# Then by using merge_rec_data(), you can combine pack_years_cat across
# cycles.
library(cchsflow)
pack_years_cat_2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"SMKDSTY_A", "DHHGAGE_cont", "SMK_09A_B", "SMKG09C", "time_quit_smoking",
"SMKG203_cont", "SMKG207_cont", "SMK_204", "SMK_05B", "SMK_208",
"SMK_05C", "SMK_01A", "SMKG01C_cont", "pack_years_der", "pack_years_cat"
)
)
head(pack_years_cat_2009_2010)
pack_years_cat_2011_2012 <- rec_with_table(
cchs2011_2012_p,c(
"SMKDSTY_A", "DHHGAGE_cont", "SMK_09A_B", "SMKG09C", "time_quit_smoking",
"SMKG203_cont", "SMKG207_cont", "SMK_204", "SMK_05B", "SMK_208",
"SMK_05C", "SMK_01A", "SMKG01C_cont", "pack_years_der", "pack_years_cat"
)
)
tail(pack_years_cat_2011_2012)
combined_pack_years_cat <- suppressWarnings(merge_rec_data
(pack_years_cat_2009_2010,pack_years_cat_2011_2012))
head(combined_pack_years_cat)
tail(combined_pack_years_cat)
Percent time in Canada
Description
This function creates a derived variable (pct_time_der) that provides an estimated percentage of the time a person's life was spent in Canada.
Usage
pct_time_fun(DHHGAGE_cont, SDCGCBG, SDCGRES)
Arguments
DHHGAGE_cont |
continuous age variable. |
SDCGCBG |
whether or not someone was born in Canada (1 - born in Canada, 2 - born outside Canada) |
SDCGRES |
how long someone has lived in Canada. Note: in the PUMF CCHS datasets, this is a categorical variable with two categories (1 - 0-9 years; 2 - 10+ years). |
Value
Numeric value between 0 and 100 that represents percentage of a respondent's time in Canada
Note
Since SDCGRES is a categorical variable measuring length of time, we've set midpoints in the function. A respondent identified as being in Canada for 0-9 years is assigned a value of 4.5 years, and someone who has been in Canada for over 10 years is assigned a value of 15 years.
Examples
# Using pct_time_fun() to create percent time values between CCHS cycles
# pct_time_fun() is specified in variable_details.csv along with the CCHS
# variables and cycles included.
# To transform pct_time_der across cycles, use rec_with_table() for each CCHS
# cycle and specify pct_time_der, along with age (DHHGAGE_cont), whether or
# not someone was born in Canada (SDCGCBG), how long someone has lived in
# Canada (SDCGRES). Then by using merge_rec_data(), you can combine
# pct_time_der across cycles
library(cchsflow)
pct_time2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"DHHGAGE_cont", "SDCGCBG",
"SDCGRES", "pct_time_der"
)
)
head(pct_time2009_2010)
pct_time2011_2012 <- rec_with_table(
cchs2011_2012_p, c(
"DHHGAGE_cont", "SDCGCBG",
"SDCGRES", "pct_time_der"
)
)
tail(pct_time2011_2012)
combined_pct_time <- merge_rec_data(pct_time2009_2010, pct_time2011_2012)
head(combined_pct_time)
tail(combined_pct_time)
# Using pct_time_fun() to generate a value for percent time spent in Canada
# with user inputted values Let's say you are 27 years old who was born
# outside of Canada and have been living in Canada for less than 10 years.
# Your estimated percent time spent in Canada can be calculated as follows:
pct_time <- pct_time_fun(DHHGAGE_cont = 27, SDCGCBG = 2, SDCGRES = 1)
print(pct_time)
Categorical percent time in Canada
Description
This function creates a categorical derived variable (pct_time_der_cat10) that categorizes the derived percent time in Canada variable (pct_time_der).
Usage
pct_time_fun_cat(pct_time_der)
Arguments
pct_time_der |
derived continuous percent time in Canada.
See |
Details
The percent time in Canada provides an estimated percentage of the time a person's life was spent in Canada.The categorical percent time in Canada divides the continuous value into 10 percent intervals.
pct_time_der_cat10 uses the derived variable pct_time_der. pct_time_der uses various variables that have been transformed by cchsflow (see documentation on pct_time_der). In order to categorize percent time in Canada across CCHS cycles, the variables must be transformed and harmonized.
Value
value for categorical percent time in Canada using pct_time_der variable.
Examples
# Using pct_time_fun_cat() to create categorical percent time values
# between CCHS cycles.
# pct_time_fun_cat() is specified in variable_details.csv along with the CCHS
# variables and cycles included.
# To transform pct_time_der_cat10 across cycles, use rec_with_table() for
# each CCHS cycle.
# Since pct_time_der is a derived variable, you will have to specify the
# variables that are derived from it.
# Then by using merge_rec_data(), you can combine pct_time_der_cat10 across
# cycles.
library(cchsflow)
pct_time_cat2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"DHHGAGE_cont", "SDCGCBG",
"SDCGRES", "pct_time_der", "pct_time_der_cat10"
)
)
head(pct_time_cat2009_2010)
pct_time_cat2011_2012 <- rec_with_table(
cchs2011_2012_p, c(
"DHHGAGE_cont", "SDCGCBG",
"SDCGRES", "pct_time_der", "pct_time_der_cat10"
)
)
tail(pct_time_cat2011_2012)
combined_pct_time_cat <- merge_rec_data(pct_time_cat2009_2010,
pct_time_cat2011_2012)
head(combined_pct_time_cat)
tail(combined_pct_time_cat)
Recode with Table
Description
Recode with Table is responsible for recoding values of a dataset based on the specifications in variable_details.
Usage
rec_with_table(
data,
variables = NULL,
database_name = NULL,
variable_details = NULL,
else_value = NA,
append_to_data = FALSE,
log = FALSE,
notes = TRUE,
var_labels = NULL,
custom_function_path = NULL,
attach_data_name = FALSE
)
Arguments
data |
A dataframe containing the variables to be recoded. Can also be a list of dataframes |
variables |
character vector containing variable names to recode or a variables csv containing additional variable info |
database_name |
String, the name of the dataset containing the variables to be recoded. Can also be a vector of strings if data is a list |
variable_details |
A dataframe containing the specifications (rules) for recoding. |
else_value |
Value (string, number, integer, logical or NA) that is used to replace any values that are outside the specified ranges (no rules for recoding). |
append_to_data |
Logical, if |
log |
Logical, if |
notes |
Logical, if |
var_labels |
labels vector to attach to variables in variables |
custom_function_path |
path to location of the function to load |
attach_data_name |
to attach name of database to end table |
Details
The variable_details dataframe needs the following variables to function:
- variable
name of new (mutated) variable that is recoded
- toType
type the variable is being recoded to cat = categorical, cont = continuous
- databaseStart
name of dataframe with original variables to be recoded
- variableStart
name of variable to be recoded
- fromType
variable type of start variable. cat = categorical or factor variable cont = continuous variable (real number or integer)
- recTo
Value to recode to
- recFrom
Value/range being recoded from
Each row in variable_details comprises one category in a newly transformed variable. The rules for each category the new variable are a string in recFrom and value in recTo. These recode pairs are the same syntax as sjmisc::rec(), except in sjmisc::rec() the pairs are a string for the function attribute rec =, separated by '='. For example in rec_w_table variable_details$recFrom = 2; variable_details$recTo = 4 is the same as sjmisc::rec(rec = "2=4"). the pairs are obtained from the RecFrom and RecTo columns
- recode pairs
each recode pair is row. see above example or PBC-variableDetails.csv
- multiple values
multiple old values that should be recoded into a new single value may be separated with comma, e.g. recFrom = "1,2"; recTo = 1
- value range
a value range is indicated by a colon, e.g. recFrom= "1:4"; recTo = 1 (recodes all values from 1 to 4 into 1)
- value range for doubles
for double vectors (with fractional part), all values within the specified range are recoded; e.g. recFrom = "1:2.5'; recTo = 1 recodes 1 to 2.5 into 1, but 2.55 would not be recoded (since it's not included in the specified range)
- "min" and "max"
minimum and maximum values are indicates by min (or lo) and max (or hi), e.g. recFrom = "min:4"; recTo = 1 (recodes all values from minimum values of x to 4 into 1)
- "else"
all other values, which have not been specified yet, are indicated by else, e.g. recFrom = "else"; recTo = NA (recode all other values (not specified in other rows) to "NA")
- "copy"
the "else"-token can be combined with copy, indicating that all remaining, not yet recoded values should stay the same (are copied from the original value), e.g. recFrom = "else"; recTo = "copy"
- NA's
NA values are allowed both as old and new value, e.g. recFrom "NA"; recTo = 1. or "recFrom = "3:5"; recTo = "NA" (recodes all NA into 1, and all values from 3 to 5 into NA in the new variable)
Value
a dataframe that is recoded according to rules in variable_details.
Examples
library(cchsflow)
bmi2001 <- rec_with_table(
data = cchs2001_p, c(
"HWTGHTM",
"HWTGWTK", "HWTGBMI_der"
)
)
head(bmi2001)
bmi2011_2012 <- rec_with_table(
data = cchs2011_2012_p, c(
"HWTGHTM",
"HWTGWTK", "HWTGBMI_der"
)
)
tail(bmi2011_2012)
combined_bmi <- bind_rows(bmi2001, bmi2011_2012)
head(combined_bmi)
tail(combined_bmi)
recode_columns
Description
Recodes columns from passed row and returns just table with those columns and same rows as the data
Usage
recode_columns(
data,
variables_to_process,
data_name,
log,
print_note,
else_default
)
Arguments
data |
The source database |
variables_to_process |
rows from variable details that are applicable to this DB |
data_name |
Name of the database being passed |
log |
The option of printing log |
print_note |
the option of printing the note columns |
else_default |
default else value to use if no else is present |
Value
Returns recoded and labeled data
Recode NA formatting
Description
Recodes the NA depending on the var type
Usage
recode_variable_NA_formating(cell_value, var_type)
Arguments
cell_value |
The value inside the recTo column |
var_type |
the toType of a variable |
Value
an appropriately coded tagged NA
resp_condition_fun1
Description
This is one of 3 functions used to create a derived variable (resp_condition_der) that determines if a respondents has a respiratory condition. 3 different functions have been created to account for the fact that different respiratory variables are used across CCHS cycles. This function is for CCHS cycles (2009-2014) that only use COPD and Emphysema as a combined variable. Asthma is used across CCHS cycles as a separate variable.
Usage
resp_condition_fun1(DHHGAGE_cont, CCC_091, CCC_031)
Arguments
DHHGAGE_cont |
continuous age variable. |
CCC_091 |
variable indicating if respondent has either COPD or Emphysema |
CCC_031 |
variable indicating if respondent has asthma |
Value
a categorical variable (resp_condition_der) with 3 levels:
respondent is over the age of 35 and has a respiratory condition
respondent is under the age of 35 and has a respiratory condition
respondent does not have a respiratory condition
See Also
resp_condition_fun2
, resp_condition_fun3
Examples
# Using resp_condition_fun1() to create values across CCHS cycles
# (2009-2014) resp_condition_fun1() is specified in
# variable_details.csv along with the CCHS variables and cycles included.
# To transform resp_condition_der, use rec_with_table() for each CCHS cycle
# and specify resp_condition_der, along with the various respiratory
# variables. Then by using merge_rec_data() you can combine
# resp_condition_der across cycles.
library(cchsflow)
resp2009_2010 <- suppressWarnings(rec_with_table(
cchs2009_2010_p, c(
"DHHGAGE_cont", "CCC_091", "CCC_031",
"resp_condition_der"
)
))
head(resp2009_2010)
resp2011_2012 <- suppressWarnings(rec_with_table(
cchs2011_2012_p, c(
"DHHGAGE_cont", "CCC_091", "CCC_031",
"resp_condition_der"
)
))
tail(resp2011_2012)
combined_resp <-
suppressWarnings(merge_rec_data(resp2009_2010, resp2011_2012))
head(combined_resp)
tail(combined_resp)
resp_condition_fun2
Description
This is one of 3 functions used to create a derived variable (resp_condition_der) that determines if a respondents has a respiratory condition. This function is for CCHS cycles (2005-2007) that use COPD & Emphysema as separate variables, as well as Bronchitis. Asthma is used across CCHS cycles as a separate variable.
Usage
resp_condition_fun2(DHHGAGE_cont, CCC_91E, CCC_91F, CCC_91A, CCC_031)
Arguments
DHHGAGE_cont |
continuous age variable. |
CCC_91E |
variable indicating if respondent has emphysema |
CCC_91F |
variable indicating if respondent has COPD |
CCC_91A |
variable indicating if respondent has chronic bronchitis |
CCC_031 |
variable indicating if respondent has asthma |
Value
a categorical variable (resp_condition_der) with 3 levels:
respondent is over the age of 35 and has a respiratory condition
respondent is under the age of 35 and has a respiratory condition
respondent does not have a respiratory condition
See Also
resp_condition_fun1
, resp_condition_fun3
Examples
# Using resp_condition_fun2() to create values across CCHS cycles
# (2005-2007) resp_condition_fun2() is specified in
# variable_details.csv along with the CCHS variables and cycles included.
# To transform resp_condition_der, use rec_with_table() for each CCHS cycle
# and specify resp_condition_der, along with the various respiratory
# variables. Then by using merge_rec_data() you can combine
# resp_condition_der across cycles.
library(cchsflow)
resp2005 <- suppressWarnings(rec_with_table(
cchs2005_p, c(
"DHHGAGE_cont", "CCC_91E", "CCC_91F", "CCC_91A", "CCC_031",
"resp_condition_der"
)
))
head(resp2005)
resp2007_2008 <- suppressWarnings(rec_with_table(
cchs2007_2008_p, c(
"DHHGAGE_cont", "CCC_91E", "CCC_91F", "CCC_91A", "CCC_031",
"resp_condition_der"
)
))
tail(resp2007_2008)
combined_resp <- suppressWarnings(merge_rec_data(resp2005, resp2007_2008))
head(combined_resp)
tail(combined_resp)
resp_condition_fun3
Description
This is one of 3 functions used to create a derived variable (resp_condition_der) that determines if a respondents has a respiratory condition. This function for CCHS cycles (2001-2003) that use COPD and Emphysema as a combined variable, as well as Bronchitis. Asthma is used across CCHS cycles as a separate variable.
Usage
resp_condition_fun3(DHHGAGE_cont, CCC_091, CCC_91A, CCC_031)
Arguments
DHHGAGE_cont |
continuous age variable. |
CCC_091 |
variable indicating if respondent has either COPD or Emphysema |
CCC_91A |
variable indicating if respondent has chronic bronchitis |
CCC_031 |
variable indicating if respondent has asthma |
Value
a categorical variable (resp_condition_der) with 3 levels:
respondent is over the age of 35 and has a respiratory condition
respondent is under the age of 35 and has a respiratory condition
respondent does not have a respiratory condition
See Also
resp_condition_fun1
, resp_condition_fun2
Examples
# Using resp_condition_fun3() to create values across CCHS cycles
# (2001-2003) resp_condition_fun3() is specified in
# variable_details.csv along with the CCHS variables and cycles included.
# To transform resp_condition_der, use rec_with_table() for each CCHS cycle
# and specify resp_condition_der, along with the various respiratory
# variables. Then by using merge_rec_data() you can combine
# resp_condition_der across cycles.
library(cchsflow)
resp2001 <- suppressWarnings(rec_with_table(
cchs2001_p, c(
"DHHGAGE_cont", "CCC_091", "CCC_91A", "CCC_031",
"resp_condition_der"
)
))
head(resp2001)
resp2003 <- suppressWarnings(rec_with_table(
cchs2003_p,c(
"DHHGAGE_cont", "CCC_091", "CCC_91A", "CCC_031",
"resp_condition_der"
)
))
tail(resp2003)
combined_resp <- suppressWarnings(merge_rec_data(resp2001, resp2003))
head(combined_resp)
tail(combined_resp)
Set Data Labels
Description
sets labels for passed database, Uses the names of final variables in variable_details/variables_sheet as well as the labels contained in the passed dataframes
Usage
set_data_labels(data_to_label, variable_details, variables_sheet = NULL)
Arguments
data_to_label |
newly transformed dataset |
variable_details |
variable_details.csv |
variables_sheet |
variables.csv |
Value
labeled data_to_label
Examples
library(cchsflow)
library(sjlabelled)
bmi2001 <- rec_with_table(
cchs2001_p, c(
"HWTGHTM",
"HWTGWTK", "HWTGBMI_der"
)
)
bmi2003 <- rec_with_table(
cchs2003_p, c(
"HWTGHTM",
"HWTGWTK", "HWTGBMI_der"
)
)
combined_bmi <- bind_rows(bmi2001, bmi2003)
get_label(combined_bmi)
labeled_combined_data <- set_data_labels(combined_bmi,
variable_details,
variables)
get_label(labeled_combined_data)
Simple smoking status
Description
This function creates a derived smoking variable (smoke_simple) with four categories:
non-smoker (never smoked)
current smoker (daily and occasional?)
former daily smoker quit =<5 years or former occasional smoker
former daily smoker quit >5 years
Usage
smoke_simple_fun(SMKDSTY_cat5, time_quit_smoking)
Arguments
SMKDSTY_cat5 |
derived variable that classifies an individual's smoking status. This variable captures cycles 2001-2018. |
time_quit_smoking |
derived variable that calculates the approximate
time a former smoker has quit smoking.
See |
Examples
# Using the 'smoke_simple_fun' function to create the derived smoking
# variable across CCHS cycles.
# smoke_simple_fun() is specified in the variable_details.csv
# To create a harmonized smoke_simple variable across CCHS cycles, use
# rec_with_table() for each CCHS cycle and specify smoke_simple_fun and
# the required base variables. Since time_quit_smoking_der is also a derived
# variable, you will have to specify the variables that are derived from it.
# Using merge_rec_data(), you can combine smoke_simple across cycles.
library(cchsflow)
smoke_simple2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"SMKDSTY", "SMK_09A_B", "SMKG09C", "time_quit_smoking",
"smoke_simple"
)
)
head(smoke_simple2009_2010)
smoke_simple2011_2012 <- rec_with_table(
cchs2011_2012_p,c(
"SMKDSTY", "SMK_09A_B", "SMKG09C", "time_quit_smoking",
"smoke_simple"
)
)
tail(smoke_simple2011_2012)
combined_smoke_simple <-
suppressWarnings(merge_rec_data(smoke_simple2009_2010,smoke_simple2011_2012))
head(combined_smoke_simple)
tail(combined_smoke_simple)
Time since quit smoking
Description
This function creates a derived variable (time_quit_smoking_der) that calculates the approximate time a former smoker has quit smoking based on various CCHS smoking variables. This variable is for CCHS respondents in CCHS surveys 2003-2014.
Usage
time_quit_smoking_fun(SMK_09A_B, SMKG09C)
Arguments
SMK_09A_B |
number of years since quitting smoking. Variable asked to former daily smokers who quit <3 years ago. |
SMKG09C |
number of years since quitting smoking. Variable asked to former daily smokers who quit >=3 years ago. |
Value
value for time since quit smoking in time_quit_smoking_der.
Examples
# Using time_quit_smoking_fun() to create pack-years values across CCHS
# cycles.
# time_quit_smoking_fun() is specified in variable_details.csv along with the
# CCHS variables and cycles included.
# To transform time_quit_smoking across cycles, use rec_with_table() for each
# CCHS cycle and specify time_quit_smoking, along with each smoking variable.
# Then by using merge_rec_data(), you can combine time_quit_smoking across
# cycles.
library(cchsflow)
time_quit2009_2010 <- rec_with_table(
cchs2009_2010_p, c(
"SMK_09A_B", "SMKG09C", "time_quit_smoking"
)
)
head(time_quit2009_2010)
time_quit2011_2012 <- rec_with_table(
cchs2011_2012_p, c(
"SMK_09A_B", "SMKG09C", "time_quit_smoking"
)
)
tail(time_quit2011_2012)
combined_time_quit <- suppressWarnings(merge_rec_data(time_quit2009_2010,
time_quit2011_2012))
head(combined_time_quit)
tail(combined_time_quit)
variable_details.csv
Description
This dataset provides details on how variables are recoded in cchsflow.
Details
See the below link for more details about how the worksheet is structured https://big-life-lab.github.io/cchsflow/articles/variable_details.html
Value
variable_details |
a data frame |
Examples
data(variable_details)
str(variable_details)
variables.csv
Description
This dataset lists all the variables that are present in cchsflow.
Details
See the below link for more details about how the worksheet is structured https://big-life-lab.github.io/cchsflow/articles/variables_sheet.html
Value
variables |
a data frame |
Examples
data(variables)
str(variables)