Movatterモバイル変換

Title:

Impact Measurement Toolkit

Version:

1.0.0

Description:

A toolkit for causal inference in experimental and observational studies. Implements various simple Bayesian models including linear, negative binomial, and logistic regression for impact estimation. Provides functionality for randomization and checking baseline equivalence in experimental designs. The package aims to simplify the process of impact measurement for researchers and analysts across different fields. Examples and detailed usage instructions are available athttps://book.martinez.fyi.

License:

Apache License (≥ 2.0)

Encoding:

UTF-8

RoxygenNote:

7.3.2

Biarch:

true

Depends:

R (≥ 3.4.0)

Imports:

R6, bayesplot, caret, dplyr, ggplot2, glue, magrittr, methods,Rcpp, RcppParallel, rlang, rstan, rstantools, tibble, tidyr,vizdraws, purrr, scales, tidyselect

LinkingTo:

BH (≥ 1.66.0), Rcpp (≥ 0.12.0), RcppEigen (≥ 0.3.3.3.0),RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), StanHeaders (≥2.18.0)

Suggests:

knitr, rmarkdown, roxygen2, testthat

SystemRequirements:

GNU make

URL:

https://github.com/google/imt

BugReports:

https://github.com/google/imt/issues

NeedsCompilation:

yes

Packaged:

2024-08-28 13:19:01 UTC; ignacio

Author:

Ignacio Martinez

[aut, cre]

Maintainer:

Ignacio Martinez <martinezig@google.com>

Repository:

CRAN

Date/Publication:

2024-09-02 21:40:10 UTC

The 'imt' package

Description

A DESCRIPTION OF THE PACKAGE

Author(s)

Maintainer: Ignacio Martinezmartinezig@google.com (ORCID)

References

Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.32.6. https://mc-stan.org

Pipe operator

Description

Seemagrittr::%>% for details.

Usage

lhs %>% rhs

Arguments

lhs

A value or the magrittr placeholder.

rhs

A function call using the magrittr semantics.

Value

The result of callingrhs(lhs).

Combine and Unite Columns

Description

This function takes a data frame, identifies columns that are not specifiedin the exclusion list, and combines them into a new column called 'group'.The original columns used for the combination are then removed.Finally, it returns a data frame with only the 'group','variables', 'std_diff', and 'balance' columns.

Usage

.combineColumns(df)

Arguments

df

A data frame containing the columns to be combined.

Value

A data frame with the combined 'group' column and specified columns.

Add a Random Treatment Indicator Column to a Data Frame

Description

This function takes a data frame and adds a new column named "treated" withrandomly assigned TRUE/FALSE values. Randomization can be done either onthe entire data frame or stratified by specified columns. The probability ofbeing assigned to the treatment group can be specified, with adefault of 0.5.

Usage

.randomize_internal(data, group_by = NULL, seed = NULL, pr_treated = 0.5)

Arguments

data

The input data frame.

group_by

(Optional) A character vector of column names to stratify therandomization. If provided, the randomization will be done withind each groupefined by the specified columns.

seed

(Optional) An integer to set the random seed for reproducibility.

pr_treated

(Optional) The probability of a row being assigned to thetreatment group (TRUE). Default is 0.5.

Value

A new data frame with the added "treated" column.

Create a Baseline Balance Plot.

Description

Create a Baseline Balance Plot.

Usage

balancePlot(data)

Arguments

data

tibble produced withcheckBaseline.

Value

ggplot2 baseline balance plot.

Examples

library(imt)set.seed(123)  # for reproducibilityN <- 1000fake_data <- tibble::tibble(x1 = rnorm(N), x2 = rnorm(N), t = rbinom(N, 1, 0.5))baseline <- checkBaseline(data = fake_data, variables = c("x1", "x2"), treatment = "t")balancePlot(data = baseline)

Bayesian Linear Model Factory

Description

Bayesian Linear Model Factory

Active bindings

version: im package version used to fit model
eta_draws: Posterior draws for the treatment effect
mcmChecks: MCMC diagnostics
credible_interval: Credible interval for the treatment effect

Methods

Public methods

Method`new()`

Get the package version

Get the posterior draws for eta

Get the MCMC diagnostics

Get the credible interval

Create a new Bayesian Linear Model object.

Usage

blm$new(  data,  y,  x,  treatment,  eta_mean,  eta_sd,  generate_fake_data = 0,  seed = 1982,  ...)

Arguments

data: Data frame to be used
y: Name of the outcome variable in the data frame
x: Vector of names of all covariates in the data frame
treatment: Name of the treatment indicator variable in the data frame
eta_mean: Prior mean for the treatment effect estimation
eta_sd: Prior standard deviation for the treatment effect estimation
generate_fake_data: Flag for generating fake data
seed: Seed for Stan fitting
...: Additional arguments for Stan

Returns

invisible

Method`ppcDensOverlay()`

This method compares the empirical distribution of the data 'y'to the distributions of simulated/replicated data 'yrep' from theposterior predictive distribution. This is done by creating a densityoverlay plot using thebayesplot::ppc_dens_overlay function.

Usage

blm$ppcDensOverlay(n = 50)

Arguments

n: Number of posterior draws to use for the overlay

Returns

ggplot2 visualization

Method`tracePlot()`

Plot MCMC trace for the eta and sigma parameters.

Usage

blm$tracePlot(...)

Arguments

...: Additional arguments for Stan

Returns

A ggplot object.

Method`posteriorProb()`

Calculate posterior probability of effect exceeding a threshold

This function calculates the posterior probability of the effectbeing larger or smaller than a specified threshold.

Usage

blm$posteriorProb(threshold = 0, gt = TRUE)

Arguments

threshold: A numeric value specifying the threshold.
gt: A logical value indicating whether to calculate the probabilityof the effect being greater than (TRUE) or less than (FALSE)the threshold.

Details

This function uses the private$..eta_draws internal variableto obtain draws from the posterior distribution of the effect size.Based on the specified arguments, the function calculates theproportion of draws exceeding/falling below the threshold andreturns a formatted statement describing the estimated probability.

Calculate point estimate of the effect

This R6 method calculates the point estimate of the effect sizebased on the posterior draws of the eta parameter.

Returns

A character string summarizing the estimated probability.

Method`pointEstimate()`

Usage

blm$pointEstimate(median = TRUE)

Arguments

median: Logical value. If TRUE (default), the median ofthe eta draws is returned. If FALSE, the mean is returned.

Details

This method uses the private$..eta_draws internal variablewhich contains MCMC draws of the eta parameter representing theeffect size. Based on the specified median argument, the methodcalculates and returns either the median or the mean of the draws.Calculates credible interval for the effect of the intervention

This R6 method calculates and returns a formatted statement summarizingthe credible interval of a specified width for the effect of the intervention.

Returns

A numeric value representing the point estimate.

Method`credibleInterval()`

Usage

blm$credibleInterval(width = 0.75, round = 0)

Arguments

width: Numeric value between 0 and 1 representing the desiredwidth of the credible interval (e.g., 0.95 for a 95% credible interval).
round: Integer value indicating the number of decimal places to roundthe lower and upper bounds of the credible interval.

Details

This method uses the private$..eta_draws internal variablecontaining MCMC draws of the eta parameter representing the effectsize. It calculates the credible interval, stores it internally, andreturns a formatted statement summarizing the findings.

Calculate and format probability statement based on prior

This method calculates the probability that the effect is greater than orless than a given threshold based on the prior distribution of the effect.The results are formatted into a statement suitable for reporting.

Returns

A character string with the following information:

The probability associated with the specified width
The lower and upper bounds of the credible interval, rounded to thespecified number of decimal places

Method`priorProb()`

Usage

blm$priorProb(threshold = 0, gt = TRUE)

Arguments

threshold: Numerical threshold for comparison.
gt: Logical indicating whether to calculate probability greater thanor less than the threshold.

Returns

A character string containing the formatted probability statement.Calculates the prior for the effedt of the intervention based onthe hyperpriors.

This method computes and formats a statement about the probability intervalof the effect based on the prior distribution.

Method`priorInterval()`

Usage

blm$priorInterval(width = 0.75, round = 0)

Arguments

width: Desired probability width of the interval (default: 0.75).
round: Number of decimal places for rounding the bounds (default: 0).

Details

This method calculates the lower and upper bounds of the interval basedon the specified probability width and the prior distribution of the effect.It then formats the results into a clear and informative statement.

Note that the probability is checked to be within valid range (0-1) withconsideration of machine precision using .Machine$double.eps.

Returns

A character string containing the formatted probability intervalstatement.

Method`vizdraws()`

Plots impact's prior and posterior distributions.

For more details seevizdraws::vizdraws().

Usage

blm$vizdraws(...)

Arguments

...: other arguments passed to vizdraws.

Returns

An interactive plot of the prior and posterior distributions.

Method`clone()`

The objects of this class are cloneable with this method.

Usage

blm$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Calculate Probability of Posterior Draws Falling Within a Range

Description

This function estimates the probability that a vector of posterior draws,represented by the parametereta, falls within a specified range.It provides flexibility to use either prior distributions or posterior draws,and to specify one-sided or two-sided probability calculations.

Usage

calcProb(x, a = NULL, b = NULL, prior = FALSE, group_name = "group average")

Arguments

x

A numeric vector containing either posterior draws (default)or prior samples of the treatment effect parameter (eta).

a

(Optional) The lower bound of the range (as a proportion, not percentage).

b

(Optional) The upper bound of the range (as a proportion, not percentage).

prior

A logical value indicating whether to use prior samples (TRUE)or posterior draws (FALSE, default) for calculation.

group_name

A string describing the group for which the probabilityis being calculated (default: "group average").

Details

This function checks the following cases:

If botha andb areNULL, it returns an empty string.
Ifb is less than or equal toa, it throws an error.

The calculated probability and range are presented in a human-readable stringusing theglue package for formatting.

Value

A formatted string stating the calculated probability and thespecified range. The probability is the proportion of samples(either prior or posterior) that fall within the defined range.

Calculate Effect Sizes for Treatment vs. Control

Description

This function calculates effect sizes for each variable in a data frame, comparing treatment and control groups. It handles continuous, binary, and categorical variables using appropriate effect size measures.

Usage

calculateEffectSizes(data, treatment_column, to_check = NULL)

Arguments

data

A data frame containing the variables and treatment/control indicator.

treatment_column

The name of the column (as a string) in the data frame that indicates whether an observation is in the treatment or control group. This column must be a factor with exactly two levels.

to_check

(optional) A character vector specifying the names of thevariables for which effect sizes should be calculated. If NULL (default),all variables (except the treatment column) are processed.

Details

- For continuous variables, Hedges' g effect size is calculated. - For binary variables, Cox's Proportional Hazards Index (Cox's C) is calculated. - For categorical variables, the variable is converted into multiple indicator (dummy) variables, and the average Cox's C across these indicators is reported.

Value

A data frame with two columns: *Variable: The name of each variable in the original data frame. *EffectSize: The calculated effect size for each variable.

Check Baseline Equivalency.

Description

Check Baseline Equivalency.

Usage

checkBaseline(data, variables, treatment)

Arguments

data

dataframe with the pre-intervention variables,and the treatment indicator.

variables

vector of with the names of the pre-intervention variables.

treatment

name of the treatment indicator.

Value

tibble with the standardized difference of the pre-interventionvariables. The tibble includesvariables: the variable name,std_diff: the standardized difference for that variable as a numberbalance: the standardized difference for that variable as a factorvariable.For more details about this methodology checkhttps://ies.ed.gov/ncee/wwc/Docs/OnlineTraining/wwc_training_m3.pdf.

Cleans and prepares data for analysis

Description

This function performs a series of data cleaning and preprocessing stepsto ensure the data is suitable for analysis. This includes:

Missing data handling
Variable type checks
Collinearity and zero-variance feature removal

Usage

cleanData(data, y, treatment, x = NULL, binary = FALSE)

Arguments

data

A data.frame containing the data to be cleaned.

y

Name of the dependent variable (character).

treatment

Name of the treatment variable (character, should be logical).

x

Names of the covariates to include in the model (character vector, optional).

binary

Should the dependent variable be treated as binary? Default is FALSE

Value

A list containing the cleaned dataset and relevant metadata:

N: The number of observations after cleaning.
K The number of covariates after cleaning.
X The cleaned covariate matrix.
treat_vec: Treatment vector as integers (1 for TRUE, 0 for FALSE).
Y: The dependent variable vector.

Count missing values (NA) in a dataframe

Description

This function takes a dataframe as input and returns a tibblesummarizing the number of missing values (NA) in each column andthe number of rows with at least one missing value.

Usage

countMissing(df)

Arguments

df

A dataframe to analyze.

Value

A tibble with the following columns:

NA_<column_name>: Number of NAs in each original column
missing_rows: Number of rows with at least one NA across all columns

Cox's Proportional Hazards Index (Cox's C)

Description

Calculates Cox's C, a standardized effect size measure for comparing hazard ratesbetween two groups in survival analysis.

Usage

coxsIndex(p_t, p_c, n_t, n_c)

Arguments

p_t

Numeric value representing the proportion of events (e.g., failures, deaths) in the treatment group.

p_c

Numeric value representing the proportion of events in the control group.

n_t

Numeric value representing the sample size of the treatment group.

n_c

Numeric value representing the sample size of the control group.

Details

Cox's C is a useful effect size for survival analysis when hazard ratios are not constant over time.It's calculated based on the log odds ratio of events and includes a small sample size correction.The value 1.65 is used to approximate a conversion to a Cohen's d-like scale.

Value

The calculated Cox's C effect size.

References

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society: Series B (Methodological), 34(2), 187-202.

Converts a dataframe into a named list to provide data to a Stan model

Description

Converts a dataframe into a named list to provide data to a Stan model

Usage

createData(  data,  y,  treatment,  x = NULL,  eta_mean = 0,  eta_sd = 1,  run_estimation = 1)

Arguments

data

The data frame to be used

y

The name of the outcome variable in the data frame

treatment

The name of the treatment indicator variable in the data frame

x

A vector of names of all covariates to be used in the data frame

eta_mean

The prior mean to be used for estimating the treatment effect

eta_sd

The prior standard deviation to be used for estimating the treatment effect

run_estimation

Whether to run the estimation, or merely draw data from the priors

Value

Returns
stan_data a named list providing the data for the stan model

Calculate credible interval from MCMC draws

Description

This function calculates a credible interval of the specifiedwidth from a vector of MCMC draws.

Usage

credibleInterval(draws, width)

Arguments

draws

A numeric vector containing MCMC draws.

width

A numeric value between 0 and 1 specifying the desiredwidth of the credible interval.

Details

The function calculates the lower and upper bounds of thecredible interval using the quantile function based on thespecified width.

Value

A named list containing three elements:

width: The specified width of the credible interval.
lower_bound: The lower bound of the credible interval.
upper_bound: The upper bound of the credible interval.

Examples

# Generate example drawsdraws <- rnorm(1000)# Calculate 95% credible intervalcredibleInterval(draws, width = 0.95)

Fits Stan model.

Description

Fits Stan model.

Usage

fit(stan_data, model = "blm", ...)

Arguments

stan_data

A named list providing the data for the Stan model.

model

A character string specifying the model type.Must be either "blm" (Bayesian linear model) or "bnb" (Bayesian negative binomial model).Defaults to "blm".

...

Additional options to be passed through torstan::sampling.

Value

The complete StanFit object from the fitted Stan model.

Extracts parameter from Stan model.

Description

Extracts parameter from Stan model.

Usage

getStanParameter(fit, par)

Arguments

fit

A Stan fit model.

par

Name of the parameter you want to extract.

Value

A long tibble with draws for the pamete.

Hedges' g Effect Size with Pooled Standard Deviation

Description

Calculates Hedges' g, a standardized effect size for comparing means.This version includes a small-sample correction factor (omega)and uses the pooled standard deviation.

Usage

hedgesG(n_t, n_c, y_t, y_c, s_t, s_c)

Arguments

n_t

Numeric value representing the sample size of the treatment group.

n_c

Numeric value representing the sample size of the control group.

y_t

Numeric value representing the mean of the treatment group.

y_c

Numeric value representing the mean of the control group.

s_t

Numeric value representing the standard deviation of the treatment group.

s_c

Numeric value representing the standard deviation of the control group.

Details

Hedges' g is a variation of Cohen's d that adjusts for small-sample bias.It is calculated as the difference in means divided by the pooled standarddeviation, then multiplied by a correction factor.

Value

The calculated Hedges' g effect size.

References

Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related estimators.Journal of Educational Statistics, 6(2), 107-128.

Bayesian Logit Model Factory

Description

A class for creating and managing Bayesian Logit Models

Active bindings

version: im package version used to fit model
tau_draws: Posterior draws for the treatment effect
mcmChecks: MCMC diagnostics
credible_interval: Credible interval for the treatment effect
prior_eta: Prior distribution for eta
prior_tau: Prior distribution for tau
prior_mean_y: Prior distribution for mean y
eta_draws: Posterior draws for eta
predict_list: List of predictions

Methods

Public methods

Method`new()`

Get the package version

Get the posterior draws for tau

Get the MCMC diagnostics

Get the credible interval

Get the prior for eta

Get the prior for tau

Get the prior for mean y

Get the posterior draws for eta

Get the list of predictions

Create a new Bayesian Logit Model object.

Usage

logit$new(  data,  y,  x,  treatment,  mean_alpha,  sd_alpha,  mean_beta,  sd_beta,  tau_mean,  tau_sd,  seed = 1982,  fit = TRUE,  ...)

Arguments

data: Data frame to be used
y: Name of the outcome variable in the data frame
x: Vector of names of all covariates in the data frame
treatment: Name of the treatment indicator variable in the data frame
mean_alpha: Prior mean for alpha
sd_alpha: Prior standard deviation for alpha
mean_beta: Prior mean for beta
sd_beta: Prior standard deviation for beta
tau_mean: Prior mean for the treatment effect estimation
tau_sd: Prior standard deviation for the treatment effect estimation
seed: Seed for Stan fitting
fit: Flag for fitting the data to the model or not
...: Additional arguments for Stan

Returns

invisible

Method`tracePlot()`

Plot MCMC trace for the eta and sigma parameters.

Usage

logit$tracePlot(...)

Arguments

...: Additional arguments for Stan

Returns

A ggplot object.

Method`calcProb()`

Calculates the posterior of an effect being greater than, less than,or within a range defined by thresholds.

Usage

logit$calcProb(a = 0, b = NULL, prior = FALSE)

Arguments

a: Optional. Lower bound for the threshold.
b: Optional. Upper bound for the threshold.
prior: Logical. If TRUE, calculates probabilities based onthe prior distribution.If FALSE (default), uses the posterior distribution.
x: A numeric vector containing draws from the posterior

Returns

A character string summarizing the estimated probability

Calculate point estimate of the effect

This R6 method calculates the point estimate of the effect sizebased on the posterior draws of the eta parameter.

Method`pointEstimate()`

Usage

logit$pointEstimate(median = TRUE)

Arguments

median: Logical value. If TRUE (default), the median ofthe eta draws is returned. If FALSE, the mean is returned.

Details

This R6 method calculates and returns a formatted statement summarizingthe credible interval of a specified width for the effect of the intervention.

Returns

A numeric value representing the point estimate.

Method`credibleInterval()`

Usage

logit$credibleInterval(width = 0.75, round = 0)

Arguments

width: Numeric value between 0 and 1 representing the desiredwidth of the credible interval (e.g., 0.95 for a 95% credible interval).
round: Integer value indicating the number of decimal places to roundthe lower and upper bounds of the credible interval.

Details

Returns

A character string with the following information:

The probability associated with the specified width
The lower and upper bounds of the credible interval, rounded to thespecified number of decimal places

Method`vizdraws()`

Plots impact's prior and posterior distributions.

Usage

logit$vizdraws(tau = FALSE, ...)

Arguments

tau: Logical. If TRUE, plot tau instead of eta
...: other arguments passed to vizdraws.

Returns

An interactive plot of the prior and posterior distributions.

Method`lollipop()`

Plots lollipop chart for the prior and posterior of the impact beinggreater or less than a threshold.

For more details seevizdraws::lollipops().

Usage

logit$lollipop(threshold = 0, ...)

Arguments

threshold: cutoff used to calculate the probability. Defaults tozero percent points
...: other arguments passed to vizdraws.

Returns

A lollipop chart with the prior and posterior probability ofthe impact being above or below a threshold.

Method`plotPrior()`

Plots draws from the prior distribution of the outcome, tau, and impactin percentage points.

Usage

logit$plotPrior()

Method`predict()`

Predict new data

Usage

logit$predict(new_data, name = NULL, M = NULL, ...)

Arguments

new_data: Data frame to be predicted
name: Group name of the prediction
M: Number of posterior draws to sample from
...: Additional arguments

Returns

invisible(self)

Method`getPred()`

Get posterior predictive draws

Usage

logit$getPred(name = NULL, ...)

Arguments

name: Group name of the prediction
...: Additional arguments (not used)

Returns

Matrix of posterior predictive draws

Method`predSummary()`

Get point estimate, credible interval and prob summary of predictive draws

Usage

logit$predSummary(  name = NULL,  subgroup = NULL,  median = TRUE,  width = 0.75,  round = 0,  a = NULL,  b = NULL,  ...)

Arguments

name: Optional. Group name of the prediction
subgroup: Optional. A boolean vector to get summary on the conditional group average
median: Optional. Logical value for using median or mean
width: Optional. Numeric value for credible interval width
round: Optional. Integer value for rounding
a: Optional. Lower bound threshold
b: Optional. Upper bound threshold
...: Additional arguments

Returns

A character string with summary information

Method`predCompare()`

Compare the average of the posterior draws of two groups

Usage

logit$predCompare(  name1,  name2,  subgroup1 = NULL,  subgroup2 = NULL,  median = TRUE,  width = 0.75,  round = 0,  a = NULL,  b = NULL,  ...)

Arguments

name1: Group name of the first prediction to be compared
name2: Group name of the second prediction to be compared
subgroup1: Optional. A boolean vector for the first group
subgroup2: Optional. A boolean vector for the second group
median: Optional. Logical value for using median or mean
width: Optional. Numeric value for credible interval width
round: Optional. Integer value for rounding
a: Optional. Lower bound threshold
b: Optional. Upper bound threshold
...: Additional arguments

Returns

A character string with comparison summary

Method`clone()`

The objects of this class are cloneable with this method.

Usage

logit$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Calculate logit link and sample from binomial distribution

Description

This function takes prior / posterior draws of the Bayesian logit model,applies theinverse logit (logistic) transformation to obtain probabilities,and then generates random samples from binomial distributions.

Usage

logitRng(alpha, tau, beta, treat, X, N)

Arguments

alpha

Numeric. A draw of alpha param from the logit model

tau

Numeric. A draw of tau param from the logit model

beta

Vector. A draw of beta params from the logit model

treat

A 0 / 1 vector of treatment indicator

X

Data to be predicted

N

Numeric. Size of the sample, should depend on size of the data.

Value

A vector of N samples of preditive draws

MCMC Checks

Description

Checks convergence, mixing, effective sample size, and divergent transitions

Methods

$new(fit, pars): Runs diagnostics on the suppliedstanfitobject, restricted to parameters identified by the character vectorpars.
Tests include:
Share of specified parameters with an Rhat less than 1.1. If anyhave an Rhat > 1.1,everything_looks_fine is set toFALSE.
Share of specified parameters with an n_eff at least 0.1% of thetotal number of posterior draws. If any have n_eff < 0.001 * N,everything_looks_fine is set toFALSE.
Share of specified parameters with an n_eff of at least 100. If anyhave n_eff < 100,everything_looks_fine is set toFALSE.
Number of divergent transitions during posterior sampling. If thereare any whatsoever,everything_looks_fine is set toFALSE.
Share of posterior iterations where the sampler reached themaximum treedepth. If more than 25\everything_looks_fine is set toFALSE.

Active bindings

everything_looks_fine: logical indicating whether all MCMC tests passed.
diagnostics: list of the outcome of each MCMC test
warnings: list of the warning messages from failed MCMC tests

Methods

Public methods

Method`new()`

Initialize a new mcmcChecks object and run diagnostics

Usage

mcmcChecks$new(fit, pars)

Arguments

fit: A stanfit object to check
pars: A character vector of parameter names to check

Method`clone()`

The objects of this class are cloneable with this method.

Usage

mcmcChecks$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Create a Meta-Analysis Object Using Data From Previous Studies

Description

Create a Meta-Analysis Object Using Data From Previous Studies

Details

A meta analysis has raw data and draws from the lift's posteriordistribution. This is represented by an R6 Class.

Active bindings

PosteriorATE: Draws from the posterior distribution of theaverage treatment effect.
checks: MCMC diagnostics
CredibleInterval: Lower and upper bounds of the credible interval
PointEstimate: Point estimate of the average treatment effect
fitted: Stan fit object

Methods

Public methods

Method`new()`

Create a new meta analysis object.

Usage

metaAnalysis$new(  data,  point_estimates,  standard_errors,  id,  mean_mu = 0,  sd_mu = 0.05,  ci_width = 0.75,  X = NULL,  run_estimation = 1,  ...)

Arguments

data: Data frame with data point estimates and standarderrors from studies.
point_estimates: Name of the variable in the data frame thatcontains the point estimates.
standard_errors: Name of the variable in the data frame thatcontains the standard errors of the point estimates.
id: Name of the variable in the data frame that contains theid of the studies.
mean_mu: Prior mean for the true lift in the population.
sd_mu: Prior mean for the standard deviation of thetrue lift in the population.
ci_width: Credible interval's width.
X: Covariates matrix.
run_estimation: Integer flag to control whether estimation is run (1) or not (0).
...: other arguments passed torstan::sampling()

Returns

A newmeta_analysis object.

Method`PlotRawData()`

Plots the raw data.

Usage

metaAnalysis$PlotRawData()

Returns

A plot with point estimates and 95% confidence intervals.

Method`PlotLift()`

Plots lift's prior and posterior distributions.

For more details seevizdraws::vizdraws().

Usage

metaAnalysis$PlotLift(...)

Arguments

...: other arguments passed to vizdraws.

Returns

An interactive plot of the prior and posterior distributions.

Method`UpdateCI()`

Update the width of the credible interval.

Usage

metaAnalysis$UpdateCI(ci_width)

Arguments

ci_width: New width for the credible interval. This number in the(0,1) interval.

Method`probability()`

Calculates that probability that lift is between a and b.

Usage

metaAnalysis$probability(a = -Inf, b = Inf, percent = TRUE)

Arguments

a: Lower bound. By default -Inf.
b: Upper bound. By default Inf.
percent: A logical that indicates that a and b shouldbe converted to percentage.

Returns

A string with the probability.

Method`findings()`

Calculates the point estimate a credible interval for the meta analysis.

Usage

metaAnalysis$findings(percent = TRUE)

Arguments

percent: A logical that indicates that the point estimate should beconverted to percent.

Returns

A string with the findings

Method`clone()`

The objects of this class are cloneable with this method.

Usage

metaAnalysis$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Bayesian Negative Binomial Model Factory

Description

Bayesian Negative Binomial Model Factory

Active bindings

version: Package version used to fit the model
mcmChecks: MCMC diagnostics
credible_interval: Credible interval for the treatment effect
tau_draws: Posterior draws for the treatment effect

Methods

Public methods

Method`new()`

Create a new Bayesian Negative Binomial Model object.

Usage

negativeBinomial$new(  data,  y,  x,  treatment,  tau_mean,  tau_sd,  run_estimation = 1,  seed = 1982,  ...)

Arguments

data: Data frame to be used
y: Name of the outcome variable in the data frame
x: Vector of names of all covariates in the data frame
treatment: Name of the treatment indicator variable in the data frame
tau_mean: Prior mean for the treatment effect estimation
tau_sd: Prior standard deviation for the treatment effect estimation
run_estimation: Integer flag to control whether estimation is run (1) or not (0)
seed: Seed for Stan fitting
...: Additional arguments for Stan

Returns

invisible

Method`tracePlot()`

Plot MCMC trace for the eta and sigma parameters.

Usage

negativeBinomial$tracePlot(...)

Arguments

...: Additional arguments for Stan

Returns

A ggplot object.

Method`posteriorProb()`

Calculate posterior probability of effect exceeding a threshold

This function calculates the posterior probability of the effectbeing larger or smaller than a specified threshold.

Usage

negativeBinomial$posteriorProb(threshold = 0, gt = TRUE)

Arguments

threshold: A numeric value specifying the threshold.
gt: A logical value indicating whether to calculate the probabilityof the effect being greater than (TRUE) or less than (FALSE)the threshold.

Details

This function uses the private$..tau_draws internal variableto obtain draws from the posterior distribution of the effect size.Based on the specified arguments, the function calculates theproportion of draws exceeding/falling below the threshold andreturns a formatted statement describing the estimated probability.

Calculate point estimate of the effect

This R6 method calculates the point estimate of the effect sizebased on the posterior draws of the tau parameter.

Returns

A character string summarizing the estimated probability.

Method`pointEstimate()`

Usage

negativeBinomial$pointEstimate(median = TRUE)

Arguments

median: Logical value. If TRUE (default), the median ofthe tau draws is returned. If FALSE, the mean is returned.

Details

This method uses the private$..tau_draws internal variablewhich contains MCMC draws of the tau parameter representing theeffect size. Based on the specified median argument, the methodcalculates and returns either the median or the mean of the draws.Calculates credible interval for the effect of the intervention

This R6 method calculates and returns a formatted statement summarizingthe credible interval of a specified width for the effect of the intervention.

Returns

A numeric value representing the point estimate.

Method`credibleInterval()`

Usage

negativeBinomial$credibleInterval(width = 0.75, round = 0)

Arguments

width: Numeric value between 0 and 1 representing the desiredwidth of the credible interval (e.g., 0.95 for a 95% credible interval).
round: Integer value indicating the number of decimal places to roundthe lower and upper bounds of the credible interval.

Details

This method uses the private$..tau_draws internal variablecontaining MCMC draws of the tau parameter representing the effectsize. It calculates the credible interval, stores it internally, andreturns a formatted statement summarizing the findings.

Returns

A character string with the following information:

The probability associated with the specified width
The lower and upper bounds of the credible interval, rounded to thespecified number of decimal places

Method`vizdraws()`

Plots impact's prior and posterior distributions.

For more details seevizdraws::vizdraws().

Usage

negativeBinomial$vizdraws(...)

Arguments

...: other arguments passed to vizdraws.

Returns

An interactive plot of the prior and posterior distributions.

Method`clone()`

The objects of this class are cloneable with this method.

Usage

negativeBinomial$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Calculate Point Estimate (Median or Mean) as Percentage

Description

This function computes a point estimate from a numeric vector,returning either the median or the mean as a percentage.

Usage

pointEstimate(x, median = TRUE)

Arguments

x

A numeric vector containing the data from which to calculate the point estimate.

median

A logical value indicating whether to use the median (default:TRUE)or the mean (FALSE) as the point estimate.

Details

This function provides a simple way to obtain either the median or meanof a numeric vector as a percentage. The choice between these two measuresof central tendency can be controlled by themedian argument.

Value

A numeric value representing the chosen point estimate (median or mean)of the input vectorx, multiplied by 100 to express it as a percentage.

Randomly Assign Treatment While Controlling for Baseline Equivalency

Description

This function repeatedly randomizes treatmentassignment (using.randomizer) untilbaseline equivalency is achieved across specifiedvariables, as measured by thecheckBaseline function fromtheim package. It can optionally stratify the randomization byspecified groups.

Usage

randomize(  data,  variables,  standard = "Not Concerned",  seed = NULL,  max_attempts = 100,  pr_treated = 0.5,  group_by = NULL)

Arguments

data

The input data frame containing pre-intervention variables.

variables

A vector of the names of the pre-interventionvariables to checkfor baseline equivalency.

standard

The desired level of baseline equivalence.Must be one of "Not Concerned","Concerned", or "Very Concerned". Default is "Not Concerned".("Not Concerned", "Concerned", or "Very Concerned").Must be one of "Not Concerned", "Concerned", or "Very Concerned".

seed

(Optional) An integer to set the randomseed for reproducibility of theinitial randomization attempt. Subsequent attempts will use new random seeds.

max_attempts

The maximum number of randomizationattempts to make before stopping and returning an error.

pr_treated

(Optional) The probability of a row being assigned to thetreatment group (TRUE). Default is 0.5.

group_by

(Optional) A character vector of column names to stratify therandomization. If provided, the randomization will be done within each groupdefined by the specified columns.

Value

A new data frame with the added "treated"column, if baseline equivalency is achieved within the specified number ofattempts. Otherwise, an error is thrown.

Randomization Class for Treatment Assignment

Description

This class provides methods to randomly assign treatmentsto a dataset while ensuring baseline covariate balance. It canhandle both simple and stratified randomization.

Active bindings

version: The version of theim package used for randomization.
data: The data frame with the assigned treatment.
seed: The random seed used for reproducibility.
balance_summary: A summary (or list of summaries) of the balanceassessment after randomization.
balance_plot: A plot (or list of plots) of the balance assessmentafter randomization.

Methods

Public methods

Method`new()`

Initialize a new Randomizer object.

Usage

randomizer$new(  data,  variables,  standard = "Not Concerned",  seed = NULL,  max_attempts = 100,  group_by = NULL)

Arguments

data: The input data frame.
variables: A vector of covariate names to check for balance.
standard: The desired level of baseline equivalence.Must be one of "Not Concerned","Concerned", or "Very Concerned". Default is "Not Concerned".("Not Concerned", "Concerned", or "Very Concerned").
seed: (Optional) An integer to set the random seed.
max_attempts: (Optional) Maximum number of randomization attempts.
group_by: (Optional) A character vector of column names tostratify randomization.

Returns

A newrandomizer object.

Method`clone()`

The objects of this class are cloneable with this method.

Usage

randomizer$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Validate a Logical Subgroup Vector

Description

This function checks if a given vector is a logical vector (TRUE/FALSE)and whether its length matches the number of rows in a specified matrix.It is designed to validate subgroup vectors used for subsettin data.

Usage

validate_logical_vector(subgroup, N, name = NULL)

Arguments

subgroup

A logical vector representing the subgroup to be validated.

N

Length the subgroup should have.

name

(Optional) A string indicating the name of group.

Details

This function performs two key validations:

Checks if thesubgroup vector is logical.
Checks if the length of thesubgroup vector matches the N.

Value

The originalsubgroup vector if it passes all validation checks.

Movatterモバイル変換

The 'imt' package

Description

Author(s)

References

See Also

Pipe operator

Description

Usage

Arguments

Value

Combine and Unite Columns

Description

Usage

Arguments

Value

Add a Random Treatment Indicator Column to a Data Frame

Description

Usage

Arguments

Value

Create a Baseline Balance Plot.

Description

Usage

Arguments

Value

Examples

Bayesian Linear Model Factory

Description

Active bindings

Methods

Public methods

Methodnew()

Usage

Arguments

Returns

MethodppcDensOverlay()

Usage

Arguments

Returns

MethodtracePlot()

Usage

Arguments

Returns

MethodposteriorProb()

Usage

Arguments

Details

Returns

MethodpointEstimate()

Usage

Arguments

Details

Returns

MethodcredibleInterval()

Usage

Arguments

Details

Returns

MethodpriorProb()

Usage

Arguments

Returns

MethodpriorInterval()

Usage

Arguments

Details

Returns

Methodvizdraws()

Usage

Arguments

Returns

Methodclone()

Usage

Arguments

Calculate Probability of Posterior Draws Falling Within a Range

Description

Usage

Arguments

Details

Method`new()`

Method`ppcDensOverlay()`

Method`tracePlot()`

Method`posteriorProb()`

Method`pointEstimate()`

Method`credibleInterval()`

Method`priorProb()`

Method`priorInterval()`

Method`vizdraws()`

Method`clone()`

Method`new()`

Method`tracePlot()`