
cepumd makes it easier to work with Consumer Expenditure Surveys (CE) Public-Use Microdata (PUMD) when calculating mean, weighted, annual expenditures (henceforth “mean expenditures”). The challenges it addresses deal primarily with pulling together the data needed for that calculation. Some of the overarching ideas underlying the package are as follows:
Use a Tidyverse framework for most operations and be (hopefully) generally Tidyverse friendly.
Make the end user’s experience with CE PUMD easier while staying flexible enough for that user to perform any analysis they wish with the data.
Help users calculate only mean expenditures on and of the consumer unit (CU), i.e., not income, assets, liabilities, or gifts.
cepumd seeks to address challenges in three categories: data gathering/organization; managing data inconsistencies; and calculating weighted, annual metrics.
The functions involved are ce_hg(), ce_hg() and ce_uccs() together, ce_prepdata(), and ce_mean() (or ce_quantiles() for an expenditure quantile).
Install the production version with install.packages("cepumd").
You can install the development version of cepumd from GitHub, but you’ll first need the devtools package:
    # Install devtools first if it is not already available
    if (!"devtools" %in% installed.packages()[, "Package"]) {
      install.packages("devtools", dependencies = TRUE)
    }

    # Install the development version of cepumd from GitHub
    devtools::install_github("arcenis-r/cepumd")

The workhorse of cepumd is ce_prepdata(). For a specified year, it merges the household characteristics file (FMLI/FMLD) with the corresponding expenditure tabulation file (MTBI/EXPD), adjusts weights for months-in-scope and the number of collection quarters, and adjusts some cost values by their periodicity factor (some cost categories are represented as annual figures and others as quarterly). With the recent update, it requires only its first three arguments: the year, the survey type, and one or more valid UCCs. If the other necessary objects are not provided, ce_prepdata() now creates them within the function.
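To make the merge-and-adjust idea more concrete, here is a minimal base-R sketch on made-up data. It is not the internals of ce_prepdata(), and it does not call cepumd at all. NEWID, FINLWT21, MO_SCOPE, UCC, and COST follow CE PUMD variable naming, but every value below is invented, the UCCs are placeholders, and the months-in-scope scaling is a simplifying assumption for illustration only.

```r
# Toy household file (FMLI-like): one row per consumer unit (CU).
fmli <- data.frame(
  NEWID    = c(1, 2, 3),
  FINLWT21 = c(1000, 2000, 1500),  # made-up sampling weights
  MO_SCOPE = c(3, 3, 1)            # made-up months in scope
)

# Toy expenditure file (MTBI-like): one row per reported expenditure.
mtbi <- data.frame(
  NEWID = c(1, 1, 2, 3),
  UCC   = c("190901", "190902", "190901", "790240"),  # placeholder codes
  COST  = c(50, 25, 40, 10)
)

uccs <- c("190901", "190902")  # hypothetical UCCs of interest

# Keep only the requested UCCs and total the cost per CU.
expd <- mtbi[mtbi$UCC %in% uccs, ]
cost_by_cu <- aggregate(COST ~ NEWID, data = expd, FUN = sum)

# Left-join onto the household file so that CUs without a matching
# expenditure are kept, with a cost of zero.
prepped <- merge(fmli, cost_by_cu, by = "NEWID", all.x = TRUE)
prepped$COST[is.na(prepped$COST)] <- 0

# Assumed adjustment: scale each weight by the share of the quarter in scope.
prepped$adj_wt <- prepped$FINLWT21 * prepped$MO_SCOPE / 3

print(prepped)
```

Keeping the CUs that report no expenditure in the chosen category matters: dropping them would turn the result into a mean among purchasers rather than a mean across all consumer units.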
There are two functions for wrangling hierarchical grouping (HG) data into more usable formats:
ce_hg() pulls the requested type of HG file (Interview, Diary, or Integrated) for a specified year.
ce_uccs() filters the HG file for the specified expenditure category and returns either a data frame with only that section of the HG file or the Universal Classification Codes (UCCs) that make up that expenditure category.
There are two functions that the user can use to calculate CE summary statistics:
ce_mean() calculates a mean expenditure, the standard error of the mean, the coefficient of variation, and an aggregate expenditure.
ce_quantiles() calculates weighted expenditure quantiles. Note that calculating medians for integrated expenditures is not recommended because the calculation involves using weights from both the Diary and Interview survey instruments.
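As a rough illustration of what "weighted" means for these statistics, the base-R sketch below computes a weighted mean, an aggregate expenditure, and a weighted median on invented numbers. It does not use cepumd and does not reproduce how ce_mean() or ce_quantiles() work internally; the weighted_quantile() helper is hypothetical and uses one simple definition among several possible ones. The standard error and coefficient of variation are left out.

```r
# Made-up annualized expenditures and weights for five consumer units.
cost   <- c(0, 120, 300, 80, 500)
weight <- c(1000, 2000, 1500, 500, 1000)

# Weighted mean: each CU's cost counts in proportion to its weight.
wt_mean <- sum(cost * weight) / sum(weight)

# Aggregate expenditure: the weighted total across all CUs.
agg_exp <- sum(cost * weight)

# Hypothetical helper (an assumption, not the package's method): return the
# smallest cost whose cumulative share of total weight reaches p.
weighted_quantile <- function(x, w, p) {
  ord <- order(x)
  cum_share <- cumsum(w[ord]) / sum(w)
  x[ord][which(cum_share >= p)[1]]
}

wt_median <- weighted_quantile(cost, weight, 0.5)

print(c(mean = wt_mean, aggregate = agg_exp, median = wt_median))
```

Because the quantile depends on a single, consistent set of weights, the sketch also hints at why a median is harder to define when weights come from two different survey instruments.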