Big-Life-Lab/MockData


Generate mock testing data from recodeflow metadata (variables.csv and variable-details.csv).

Overview

MockData creates realistic mock data for testing harmonisation workflows across recodeflow projects (CHMS, CCHS, etc.). It reads variable specifications from metadata files and generates appropriate categorical and continuous variables with correct value ranges, tagged NAs, and reproducible seeds.
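As a rough illustration of what "reproducible seeds" means in practice, the base-R sketch below draws one categorical and one bounded continuous column under a fixed seed. The helper name `sketch_mock_columns`, the category codes, and the value range are hypothetical and are not part of the MockData API; the package itself derives these from the metadata files.

```r
# Hypothetical helper, not part of MockData: draws one categorical and one
# continuous column under a fixed seed so repeated calls give identical data.
sketch_mock_columns <- function(n, codes, lower, upper, seed) {
  set.seed(seed)
  data.frame(
    cat_var = sample(codes, size = n, replace = TRUE),  # assumed category codes
    con_var = runif(n, min = lower, max = upper)        # assumed valid range
  )
}

a <- sketch_mock_columns(100, codes = c(1, 2, 6), lower = 0, upper = 50, seed = 123)
b <- sketch_mock_columns(100, codes = c(1, 2, 6), lower = 0, upper = 50, seed = 123)
identical(a, b)  # same seed, same data
```

Fixing the seed inside the generator, rather than once at the top of a script, keeps each column reproducible even when the order of calls changes.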

Features

  • Metadata-driven: Uses existing variables.csv and variable-details.csv, so no duplicate specifications are needed
  • Recodeflow-standard: Supports all recodeflow notation formats (database-prefixed, bracket, mixed)
  • Metadata validation: Tools to check metadata quality
  • Universal: Works across CHMS, CCHS, and future recodeflow projects
  • Test suite: 224 tests covering parsers, helpers, and generators
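To make the three notation formats more concrete, here is a minimal base-R sketch of resolving a raw variable name for one cycle. The function name `sketch_parse_variable_start` and the example strings (`cycle1::amsdmva1`, `[ammdmva1]`) are assumptions made for illustration; the package's own parse_variable_start() may behave differently, and this sketch does not handle derived-variable entries.

```r
# Hypothetical sketch, not the package's parse_variable_start().
# Assumed forms: "cycle1::name" (database-prefixed), "[name]" (bracket),
# and a comma-separated mix of both.
sketch_parse_variable_start <- function(variable_start, cycle) {
  parts <- trimws(strsplit(variable_start, ",", fixed = TRUE)[[1]])

  # Prefer an entry that is prefixed with the requested cycle
  prefix <- paste0(cycle, "::")
  prefixed <- parts[startsWith(parts, prefix)]
  if (length(prefixed) > 0) {
    return(substring(prefixed[1], nchar(prefix) + 1))
  }

  # Otherwise fall back to a bracketed default, if one is present
  bracketed <- parts[grepl("^\\[.+\\]$", parts)]
  if (length(bracketed) > 0) {
    return(gsub("^\\[|\\]$", "", bracketed[1]))
  }

  NULL  # nothing applies to this cycle
}

sketch_parse_variable_start("cycle1::amsdmva1, [ammdmva1]", "cycle1")
sketch_parse_variable_start("cycle1::amsdmva1, [ammdmva1]", "cycle2")
```

In this sketch a cycle-specific entry wins over the bracketed default, which is one plausible reading of the "mixed" format.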

Installation

    # Install from local directory
    devtools::install_local("~/github/mock-data")

    # Or install from GitHub (once published)
    # devtools::install_github("your-org/MockData")

Note: Package vignettes are in Quarto format (.qmd). To build vignettes locally, you need Quarto installed. For team use, this is our standard going forward.

Quick start

    library(MockData)

    # Load metadata (CHMS example with sample data)
    variables <- read.csv(
      system.file("extdata/chms/chmsflow_sample_variables.csv", package = "MockData"),
      stringsAsFactors = FALSE
    )
    variable_details <- read.csv(
      system.file("extdata/chms/chmsflow_sample_variable_details.csv", package = "MockData"),
      stringsAsFactors = FALSE
    )

    # Get variables for a specific cycle
    cycle1_vars <- get_cycle_variables("cycle1", variables, variable_details)

    # Get unique raw variables to generate
    raw_vars <- get_raw_variables("cycle1", variables, variable_details)

    # Create empty data frame
    df_mock <- data.frame(id = 1:1000)

    # Generate a categorical variable
    result <- create_cat_var("alc_11", "cycle1", variable_details, variables,
                             length = 1000, df_mock = df_mock, seed = 123)
    if (!is.null(result)) {
      df_mock <- cbind(df_mock, result)
    }

    # Generate a continuous variable
    result <- create_con_var("alcdwky", "cycle1", variable_details, variables,
                             length = 1000, df_mock = df_mock, seed = 123)
    if (!is.null(result)) {
      df_mock <- cbind(df_mock, result)
    }
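The quick start repeats the same "bind the result unless it is NULL" step after each generator call. One way to factor that out is sketched below in base R. The helper name `append_result` is hypothetical and not part of MockData, and the sketch assumes that a generator returns either NULL or a data frame with one row per mock record, which is inferred from the `is.null()` checks above rather than confirmed by the package documentation.

```r
# Hypothetical helper, not part of MockData: appends a generator result to the
# mock data frame and leaves the data frame unchanged when the result is NULL.
append_result <- function(df_mock, result) {
  if (is.null(result)) {
    return(df_mock)
  }
  cbind(df_mock, result)
}

# Stand-in values so the sketch runs without the package installed
df_mock <- data.frame(id = 1:5)
df_mock <- append_result(df_mock, data.frame(alc_11 = c(1, 2, 1, 2, 6)))
df_mock <- append_result(df_mock, NULL)  # no-op
names(df_mock)
```

With the package loaded, the stand-in data frames would be replaced by calls such as create_cat_var() and create_con_var().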

Validation tools

Located in mockdata-tools/:

    # Validate metadata quality
    Rscript mockdata-tools/validate-metadata.R

    # Test all cycles
    Rscript mockdata-tools/test-all-cycles.R

    # Compare different approaches
    Rscript mockdata-tools/create-comparison.R

See mockdata-tools/README.md for detailed documentation.

Architecture

Core modules

  1. Parsers (R/mockdata-parsers.R):

    • parse_variable_start(): Extracts raw variable names from variableStart
    • parse_range_notation(): Handles range syntax like [7,9], [18.5,25), else
  2. Helpers (R/mockdata-helpers.R):

    • get_cycle_variables(): Filters metadata by cycle
    • get_raw_variables(): Returns unique raw variables with harmonisation groupings
    • get_variable_details_for_raw(): Retrieves category specifications
    • get_variable_categories(): Extracts valid category codes
  3. Generators:

    • create_cat_var() (R/create_cat_var.R): Generates categorical variables with tagged NA support
    • create_con_var() (R/create_con_var.R): Generates continuous variables with realistic distributions
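The range syntax handled by the parser module can be illustrated with a small base-R sketch. The function name `sketch_parse_range` and its return shape are hypothetical and are not the package's parse_range_notation(); the sketch assumes the usual interval convention in which square brackets are inclusive and round brackets are exclusive, and it returns NULL for anything that is not a two-sided interval, such as `else`.

```r
# Hypothetical sketch, not the package's parse_range_notation().
# Parses "[7,9]" or "[18.5,25)" into bounds plus inclusivity flags.
sketch_parse_range <- function(x) {
  x <- trimws(x)
  n <- nchar(x)
  open <- substr(x, 1, 1)
  close <- substr(x, n, n)

  # Not an interval (for example "else" or a single category code)
  if (!(open %in% c("[", "(")) || !(close %in% c("]", ")"))) {
    return(NULL)
  }

  inner <- strsplit(substr(x, 2, n - 1), ",", fixed = TRUE)[[1]]
  bounds <- suppressWarnings(as.numeric(trimws(inner)))
  if (length(bounds) != 2 || anyNA(bounds)) {
    return(NULL)
  }

  list(
    min = bounds[1],
    max = bounds[2],
    min_inclusive = open == "[",   # assumed: "[" means the bound is included
    max_inclusive = close == "]"   # assumed: ")" means the bound is excluded
  )
}

r <- sketch_parse_range("[18.5,25)")
str(r)
```

A continuous generator could then draw values between `min` and `max` and drop or redraw any value that lands on an excluded bound.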

Testing

    # Run all tests
    devtools::test()

    # Run specific test file
    testthat::test_file("tests/testthat/test-mockdata.R")

Contributing

This package is part of the recodeflow ecosystem. See CONTRIBUTING.md for details.

License

MIT License. See the LICENSE file for details.

Related Projects
