Big-Life-Lab/MockData
Generate mock testing data from recodeflow metadata (variables.csv and variable-details.csv).
MockData creates realistic mock data for testing harmonisation workflows across recodeflow projects (CHMS, CCHS, etc.). It reads variable specifications from metadata files and generates appropriate categorical and continuous variables with correct value ranges, tagged NAs, and reproducible seeds.
- Metadata-driven: Uses existing `variables.csv` and `variable-details.csv` - no duplicate specifications needed
- Recodeflow-standard: Supports all recodeflow notation formats (database-prefixed, bracket, mixed)
- Metadata validation: Tools to check metadata quality
- Universal: Works across CHMS, CCHS, and future recodeflow projects
- Test availability: 224 tests covering parsers, helpers, and generators
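
To make the notation formats in the list above concrete, here is a minimal base R sketch of how a `variableStart` string might be resolved for one cycle. It is an illustration only, not the package's `parse_variable_start()`: the function name `sketch_parse_start()`, the variable names `var_a` and `var_b`, and the rule that a bracket entry acts as the fallback are all assumptions.

```r
# Hypothetical helper for illustration; not the package's parse_variable_start().
# Returns the raw variable name for `cycle`, or NA if no entry applies.
sketch_parse_start <- function(variable_start, cycle) {
  # Split the comma-separated entries and trim surrounding whitespace
  parts <- trimws(strsplit(variable_start, ",", fixed = TRUE)[[1]])

  # Database-prefixed entries look like "cycle1::rawname"
  prefix <- paste0(cycle, "::")
  hit <- parts[startsWith(parts, prefix)]
  if (length(hit) > 0) {
    return(substring(hit[1], nchar(prefix) + 1))
  }

  # Bracket entries look like "[rawname]"; assumed here to be the fallback
  bracket <- parts[grepl("^\\[.*\\]$", parts)]
  if (length(bracket) > 0) {
    return(gsub("^\\[|\\]$", "", bracket[1]))
  }

  NA_character_
}

sketch_parse_start("cycle1::var_a, [var_b]", "cycle1")  # mixed: prefixed entry matches
sketch_parse_start("cycle1::var_a, [var_b]", "cycle2")  # mixed: falls back to bracket
sketch_parse_start("[var_b]", "cycle1")                 # bracket only
```
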
```r
# Install from local directory
devtools::install_local("~/github/mock-data")

# Or install from GitHub (once published)
# devtools::install_github("your-org/MockData")
```
Note: Package vignettes are in Quarto format (.qmd). To build vignettes locally, you need Quarto installed. For team use, this is our standard going forward.
```r
library(MockData)

# Load metadata (CHMS example with sample data)
variables <- read.csv(
  system.file("extdata/chms/chmsflow_sample_variables.csv", package = "MockData"),
  stringsAsFactors = FALSE
)
variable_details <- read.csv(
  system.file("extdata/chms/chmsflow_sample_variable_details.csv", package = "MockData"),
  stringsAsFactors = FALSE
)

# Get variables for a specific cycle
cycle1_vars <- get_cycle_variables("cycle1", variables, variable_details)

# Get unique raw variables to generate
raw_vars <- get_raw_variables("cycle1", variables, variable_details)

# Create empty data frame
df_mock <- data.frame(id = 1:1000)

# Generate a categorical variable
result <- create_cat_var("alc_11", "cycle1", variable_details, variables,
                         length = 1000, df_mock = df_mock, seed = 123)
if (!is.null(result)) {
  df_mock <- cbind(df_mock, result)
}

# Generate a continuous variable
result <- create_con_var("alcdwky", "cycle1", variable_details, variables,
                         length = 1000, df_mock = df_mock, seed = 123)
if (!is.null(result)) {
  df_mock <- cbind(df_mock, result)
}
```
Located in `mockdata-tools/`:
```sh
# Validate metadata quality
Rscript mockdata-tools/validate-metadata.R

# Test all cycles
Rscript mockdata-tools/test-all-cycles.R

# Compare different approaches
Rscript mockdata-tools/create-comparison.R
```
See `mockdata-tools/README.md` for detailed documentation.
Parsers (`R/mockdata-parsers.R`):
- `parse_variable_start()`: Extracts raw variable names from variableStart
- `parse_range_notation()`: Handles range syntax like `[7,9]`, `[18.5,25)`, `else`
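
The range syntax listed above follows interval notation: a square bracket includes the bound and a round bracket excludes it. The base R sketch below shows one way such a string could be turned into bounds and then used to test values. It is not the package's `parse_range_notation()`: the names `sketch_parse_range()` and `in_range()`, the returned list structure, and returning `NULL` for non-interval input such as `else` are assumptions made for illustration.

```r
# Hypothetical helper for illustration; not the package's parse_range_notation().
# Returns a list of bounds, or NULL when `x` is not interval notation.
sketch_parse_range <- function(x) {
  x <- trimws(x)
  n <- nchar(x)
  open <- substr(x, 1, 1)
  close <- substr(x, n, n)

  # Anything without interval brackets (for example "else") is not a range
  if (!(open %in% c("[", "(")) || !(close %in% c("]", ")"))) {
    return(NULL)
  }

  # Split the text between the brackets into the two numeric bounds
  inner <- strsplit(substr(x, 2, n - 1), ",", fixed = TRUE)[[1]]
  bounds <- suppressWarnings(as.numeric(inner))
  if (length(bounds) != 2 || anyNA(bounds)) {
    return(NULL)
  }

  list(
    min = bounds[1],
    max = bounds[2],
    min_inclusive = open == "[",
    max_inclusive = close == "]"
  )
}

# Vectorised membership test that honours inclusive and exclusive bounds
in_range <- function(value, r) {
  lower_ok <- if (r$min_inclusive) value >= r$min else value > r$min
  upper_ok <- if (r$max_inclusive) value <= r$max else value < r$max
  lower_ok & upper_ok
}

r <- sketch_parse_range("[18.5,25)")
in_range(c(18.5, 24.9, 25), r)  # TRUE TRUE FALSE: 25 is excluded by ")"
```
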
Helpers (`R/mockdata-helpers.R`):
- `get_cycle_variables()`: Filters metadata by cycle
- `get_raw_variables()`: Returns unique raw variables with harmonisation groupings
- `get_variable_details_for_raw()`: Retrieves category specifications
- `get_variable_categories()`: Extracts valid category codes
Generators:
- `create_cat_var()` (`R/create_cat_var.R`): Generates categorical variables with tagged NA support
- `create_con_var()` (`R/create_con_var.R`): Generates continuous variables with realistic distributions
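
As a rough picture of what seeded generation means for the generators above, the base R sketch below draws one categorical and one continuous column from a fixed seed and checks that a second call reproduces them. It is not how `create_cat_var()` or `create_con_var()` are implemented: `sketch_mock_columns()` and its arguments are hypothetical, the continuous values are drawn uniformly as a simplifying assumption, and tagged NAs are left out.

```r
# Hypothetical sketch for illustration; not the package's generators.
# `codes` are the valid category codes, `range` is c(min, max) for the
# continuous column. Note that set.seed() resets the global RNG state.
sketch_mock_columns <- function(n, codes, range, seed) {
  set.seed(seed)
  data.frame(
    cat_var = sample(codes, size = n, replace = TRUE),
    con_var = runif(n, min = range[1], max = range[2])
  )
}

a <- sketch_mock_columns(100, codes = c(1, 2, 6), range = c(0, 84), seed = 123)
b <- sketch_mock_columns(100, codes = c(1, 2, 6), range = c(0, 84), seed = 123)

identical(a, b)                                # TRUE: same seed, same data
all(a$cat_var %in% c(1, 2, 6))                 # TRUE: only valid codes appear
all(a$con_var >= 0 & a$con_var <= 84)          # TRUE: values stay within range
```

Resetting the seed inside the function keeps each call reproducible on its own, at the cost of overwriting any RNG state the caller had set.
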
```r
# Run all tests
devtools::test()

# Run specific test file
testthat::test_file("tests/testthat/test-mockdata.R")
```
This package is part of the recodeflow ecosystem. See `CONTRIBUTING.md` for details.
MIT License - see `LICENSE` file for details.