| Title: | Bayesian Evaluation, Analysis, and Simulation Software Tools forTrials |
| Version: | 0.0.3 |
| Description: | Bayesian dynamic borrowing with covariate adjustment via inverse probability weighting for simulations and data analyses in clinical trials. This makes it easy to use propensity score methods to balance covariate distributions between external and internal data. This methodology based on Psioda et al (2025) <doi:10.1080/10543406.2025.2489285>. |
| License: | GPL (≥ 3) |
| URL: | https://gsk-biostatistics.github.io/beastt/,https://github.com/GSK-Biostatistics/beastt |
| BugReports: | https://github.com/GSK-Biostatistics/beastt/issues |
| Suggests: | knitr, mvtnorm, rmarkdown, spelling, testthat (≥ 3.0.0),tibble, vdiffr, survival |
| Config/testthat/edition: | 3 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Imports: | cli, cobalt, distributional, dplyr, generics, ggdist, ggplot2,methods, mixtools, purrr, Rcpp (≥ 0.12.0), RcppParallel (≥5.0.1), rlang, rstan (≥ 2.18.1), rstantools (≥ 2.4.0),stringr, tidyr |
| VignetteBuilder: | knitr |
| Depends: | R (≥ 4.1.0) |
| LazyData: | true |
| Language: | en-US |
| LinkingTo: | BH (≥ 1.66.0), Rcpp (≥ 0.12.0), RcppEigen (≥ 0.3.3.3.0),RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), StanHeaders (≥2.18.0) |
| SystemRequirements: | GNU make |
| NeedsCompilation: | yes |
| Packaged: | 2025-05-15 07:08:48 UTC; christinafillmore |
| Author: | Christina Fillmore |
| Maintainer: | Christina Fillmore <christina.e.fillmore@gsk.com> |
| Repository: | CRAN |
| Date/Publication: | 2025-05-15 11:40:06 UTC |
The 'beastt' package.
Description
Inverse Probability Weighted Bayesian Dynamic Borrowing
References
Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.32.3. https://mc-stan.org
Approximate Multivariate Normal Distribution as Beta at a Specific Time
Description
Converts a multivariate normal distribution for Weibull parameters (or a mixtureof these distributions) into an approximate beta distribution for the survivalprobability at a specific time point. This is particularly useful for visualizingsurvival probabilities in sweet spot plots
Usage
approx_mvn_at_time(x, time)Arguments
x | A vector of distributional objects that must be either multivariate normaldistributions or mixtures of multivariate normal distributions. For Weibull models,these represent distributions of the log(shape) and log(scale) parameters. |
time | A numeric value specifying survival time at which to calculate thesurvival probability. |
Details
The function performs the following steps:
For each multivariate normal distribution, it generates 10,000 samples of theWeibull parameters
Calculates the corresponding survival probabilities at the specified timeusing the Weibull survival function
Fits a beta distribution to match the mean and variance of these survivalprobabilities
For mixture distributions, it performs this approximation for each componentand creates a new mixture with the same weights
The conversion uses the relationship between Weibull parameters and survivalprobability: S(t) = exp(-(t*exp(log_scale))^exp(log_shape)).
Value
A vector of beta distributional (or mixture of beta distributional)objects approximating the survival probabilities at the specified timepoint. If the input is a mixture distribution, the output will be a mixtureof beta distributions with the same weights.
See Also
Examples
library(distributional)# Create a multivariate normal distribution for Weibull parameters# (log(shape), log(scale))mvn_dist <- dist_multivariate_normal( mu = list(c(0, -1)), # log(shape) = 0, log(scale) = -1 sigma = list(matrix(c(0.1, 0, 0, 0.1), nrow = 2)))# Approximate as beta distribution for survival at time t=12beta_approx <- approx_mvn_at_time(mvn_dist, time = 12)Calculate Average Distribution from Multiple Distributional Objects
Description
Compute a single "average" distribution from a vector of distributionalobjects. This function calculates the mean of each hyperparameter across allinput distributions and returns a new distributional object of the samefamily with these averaged hyperparameters.
Usage
avg_dist(x)Arguments
x | A vector of distributional objects of the same family (beta,normal, multivariate normal, or mixture). |
Details
The function supports four distribution families:
Beta distributions: Averages the shape1 and shape2 hyperparameters
Normal distributions: Averages the mean and standard deviationhyperparameters
Multivariate normal distributions: Averages the location vectorsand covariance matrices
Mixture distributions: Same as above for each distribution type,where averaging is done by component. Also averages the mixture weight.
For multivariate normal distributions, both the location vector andcovariance matrix are averaged element-wise.
Value
A single distributional object of the same family as the input,with hyperparameters set equal to the average of all input distributionhyperparameters.
Examples
library(distributional)# Beta distributionsbeta_dists <- c( dist_beta(shape1 = 2, shape2 = 5), dist_beta(shape1 = 3, shape2 = 3), dist_beta(shape1 = 4, shape2 = 2))avg_dist(beta_dists) |> parameters()# Normal distributionsnorm_dists <- c( dist_normal(mu = 0, sigma = 1), dist_normal(mu = 2, sigma = 2), dist_normal(mu = 4, sigma = 3))avg_dist(norm_dists) |> parameters()# Multivariate normal distributionsmvn_dists <- c( dist_multivariate_normal(mu = list(c(0, 0)), sigma = list(matrix(c(1, 0, 0, 1), nrow = 2))), dist_multivariate_normal(mu = list(c(1, 1)), sigma = list(matrix(c(2, 0, 0, 2), nrow = 2))))avg_dist(mvn_dists) |> parameters()Binary Simulation Data
Description
This is an example of output from a simulation study that investigates theoperating characteristics of inverse probability weighted Bayesian dynamicborrowing for the case with a binary outcome. This output was generatedbased on the binary simulation template. For this simulation study, only thedegree of covariate imbalance (as indicated bypopulation) and themarginal treatment effect were varied.
Usage
binary_sim_dfFormat
binary_sim_df A data frame with 255 rows and 6 columns:
- population
Populations defined by different covariate imbalances
- marg_trt_eff
Marginal treatment effect
- true_control_RR
True control response rate on the marginal scale
- reject_H0_yes
Probability of rejecting the null hypothesis in the case with borrowing
- no_borrowing_reject_H0_yes
Probability of rejecting the null hypothesis without borrowing
- pwr_prior
Vector of power priors (or some other informativeprior distribution for the control marginal parameter of interestbased on the external data) as distributional objects
Bootstrap Covariate Data
Description
Bootstrap Covariate Data
Usage
bootstrap_cov( external_dat, n, imbal_var = NULL, imbal_prop = NULL, ref_val = 0)Arguments
external_dat | Data frame of the external data from which to bootstrapcovariate vectors |
n | Number of rows in the output dataset |
imbal_var | Optional variable indicating which covariate's distributionshould be altered to incorporate imbalance compared to the external data.If left |
imbal_prop | Optional imbalance proportion, required if an imbalancevariable is specified. This defines the proportion of individuals with thereference value of the imbalance variable in the returned dataset. Thiscan either be a single proportion or a vector of proportions, in which casea list of datasets is returned. |
ref_val | Optional value corresponding to the reference level of thebinary imbalance variable, if specified |
Details
Covariate data can be generated forn individuals enrolled inthe internal trial by bootstrap sampling entire covariate vectors from theexternal data, thus preserving the correlation between the covariates. Ifbothimbal_var =NULL andimbal_prop =NULL, the function returnsa single data frame in which the distributions of each covariate alignwith the covariate distributions from the external data (i.e., balancedcovariate distributions across the two trials). Alternatively, covariateimbalance can be incorporated into the generated sample with respect to abinary covariate (imbal_var) such that a specified proportion(imbal_prop) of individuals in the resulting sample will have thereference level (ref_val) of this imbalance covariate. In this case,stratified bootstrap sampling is employed with the imbalance covariate asthe stratification factor.
Multiple samples with varying degrees of imbalance can be generatedsimultaneously by definingimbal_prop to be a vector of values. Thefunction then returns a list of data frames with a length equal to thenumber of specified imbalance proportions.
Value
Data frame with the same number of columns as the external data frameand n number of rows (if the length ofimbal_prop is 0 or 1); otherwise,a list of data frames with a length equal to that ofimbal_prop
Examples
# Return one data frame with covariate distributions similar to external datasamp_balance <- bootstrap_cov(ex_binary_df, n = 1000)# Return a list of two data frames that incorporate imbalance w.r.t. covariate 2samp_imbalance <- bootstrap_cov(ex_binary_df, n = 1000, imbal_var = cov2, imbal_prop = c(0.25, 0.5), ref_val = 0)Calculate Conditional Drift and Treatment Effect for Binary Outcome Models
Description
In order to properly generate binary response data for the internal trial aspart of a simulation study that investigates inverse probability weighting,we need to translate the desired marginal drift and treatment effect to thecorresponding conditional drift and treatment effect that can then be addedinto a binary outcome model (e.g., logistic regression model) used tosimulate response data.
Usage
calc_cond_binary(population, glm, marg_drift, marg_trt_eff)Arguments
population | A very large data frame (e.g., number of rows |
glm | Logistic regression model object fit using the external data |
marg_drift | Vector of marginal drift values |
marg_trt_eff | Vector of marginal treatment effect values |
Details
In simulation studies that investigate the properties of inverseprobability weighted Bayesian dynamic borrowing, scenarios should beconsidered in which the underlying response rates for the internal andexternal control populations differ by varying amounts due to unmeasuredconfounding (i.e., drift, where positive values indicate a higher responserate for the internal population). While values of drift and treatmenteffect (i.e., risk difference) can be defined on the marginal scale forsimulation studies, we must first convert these values to the conditionalscale and then include these terms, along with covariates, in a logisticregression outcome model when generating response data for the internalarms. Doing so allows us to assume a relationship between the covariatesand the response variable while properly accounting for drift and treatmenteffect.
To identify the conditional drift and treatment effect that correspond tospecified values of marginal drift and treatment effect, we first bootstrapcovariate vectors from the external data (e.g.,N \ge 100,000) toconstruct a "population" that represents both the internal trial(possibly incorporating intentional covariate imbalance) and the externaltrialafter standardizing it to match the covariate distributionsof the internal trial (allowing us to control for measured confoundingfrom potential imbalance in the covariate distributions). Measuredconfounding can be incorporated into the data generation by bootstrappinga very large data frame (population) in which the distribution of atleast one covariate is intentionally varied from that of the external data;additionalunmeasured drift can be incorporated through thetranslation of specified marginal values (marg_drift) to conditionalvalues.
Let\Delta and\delta denote the marginal and conditional drift,respectively. For a specified value of\Delta, we can identify thecorresponding\delta as the value that, when added as an additionalterm in the logistic regression model (i.e., change in the intercept) foreach individual in the population, increases/decreases thepopulation-averaged conditional probabilities of response by an amountapproximately equal to\Delta. That is, the optimal\deltaminimizes
\left| \left( \frac{1}{N} \sum_{i=1}^N \frac{\exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right)}{1 + \exp\left(\boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right)} - \frac{1}{N} \sum_{i=1}^N \frac{\exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} \right)}{1 + \exp \left(\boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} \right)} \right) - \Delta \right|,
where\boldsymbol{\beta}_{EC} is the vector of regression coefficientsfrom the logistic regression model (glm) fit to the external control data(assumed here to be the "true" covariate effects when generating responsedata) and\boldsymbol{x}_i is a vector of covariates (including anintercept term) from the bootstrapped population of sizeN. In theformula above, the first and second terms correspond to thepopulation-averaged conditional probabilities (i.e., the marginal responserates) of the internal control population with drift and the externalcontrol population (with covariate distributions standardized to match theinternal trial), respectively.
If we now denote the marginal and conditional treatment effect by\Gamma and\gamma, respectively, we can use a similar processto identify the optimal\gamma that approximately corresponds to thespecified value of\Gamma, which is done by minimizing the following:
\left| \left( \frac{1}{N} \sum_{i=1}^N \frac{\exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta + \gamma \right)}{1 + \exp\left(\boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta + \gamma \right)} - \frac{1}{N} \sum_{i=1}^N \frac{\exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right)}{1 + \exp \left(\boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right)} \right) - \Gamma \right|,
where the first term is the average of the conditional probabilitiesof response (i.e., the marginal response rate) of the internaltreated population.
Seeherefor a simulation example with a binary outcome.
Value
tibble of all combinations of the marginal drift and treatmenteffect. For each row the conditional drift and treatment effect has beencalculated as well as the true response rates for the control and treatedpopulations.
Examples
library(dplyr)# Model "true" regression coefficients using the external datalogit_mod <- glm(y ~ cov1 + cov2 + cov3 + cov4, data = ex_binary_df, family = binomial)# Bootstrap internal control "population" with imbalance w.r.t. covariate 2pop_int_ctrl <- bootstrap_cov(ex_binary_df, n = 100000, imbal_var = cov2, imbal_prop = 0.25, ref_val = 0) |> select(-subjid, -y) # keep only covariate columns# Convert the marginal drift and treatment effects to conditionalcalc_cond_binary(population = pop_int_ctrl, glm = logit_mod, marg_drift = c(-.1, 0, .1), marg_trt_eff = c(0, .15))Calculate Conditional Drift and Treatment Effect for Time-to-Event Outcome Models
Description
In order to properly generate time-to-event (TTE) outcome data for theinternal trial as part of a simulation study that investigates inverseprobability weighting, we need to translate the desired marginal drift andtreatment effect to the corresponding conditional drift and treatment effectthat can then be added into a TTE outcome model (e.g., Weibull proportionalhazards regression model) used to simulate response data.
Usage
calc_cond_weibull( population, weibull_ph_mod, marg_drift, marg_trt_eff, analysis_time)Arguments
population | A very large data frame (e.g., number of rows |
weibull_ph_mod |
|
marg_drift | Vector of marginal drift values |
marg_trt_eff | Vector of marginal treatment effect values |
analysis_time | A single time point when survival probabilities will becalculated |
Details
In simulation studies that investigate the properties of inverseprobability weighted Bayesian dynamic borrowing, scenarios should beconsidered in which the underlying survival probabilities at someprespecified timet (analysis_time) for the internal and externalcontrol populations differ by varying amounts due to unmeasured confounding(i.e., drift, where positive values indicate a higher survival probabilityfor the internal population). While values of drift and treatment effect(i.e., difference between the survival probabilities at timet forthe treated and control populations) can be defined on the marginal scalefor simulation studies, we must first convert these values to theconditional scale and then include these terms, along with covariates, in aWeibull proportional hazards (PH) regression outcome model when generatingtime-to-event (TTE) data for the internal arms. Doing so allows us toassume a relationship between the covariates and the response variablewhile properly accounting for drift and treatment effect.
To identify the conditional drift and treatment effect that correspond tospecified values of marginal drift and treatment effect, we first bootstrapcovariate vectors from the external data (e.g.,N \ge 100,000) toconstruct a "population" that represents both the internal trial(possibly incorporating intentional covariate imbalance) and the externaltrialafter standardizing it to match the covariate distributionsof the internal trial (allowing us to control for measured confoundingfrom potential imbalance in the covariate distributions). Measuredconfounding can be incorporated into the data generation by bootstrappinga very large data frame (population) in which the distribution of atleast one covariate is intentionally varied from that of the external data;additionalunmeasured drift can be incorporated through thetranslation of specified marginal values (marg_drift) to conditionalvalues.
Let\Delta and\delta denote the marginal and conditional drift,respectively. For a specified value of\Delta, we can identify thecorresponding\delta as the value that, when added as an additionalterm in the Weibull PH model survival function (i.e., additive change inthe intercept) for each individual in the population, increases/decreasesthe population-averaged conditional probabilities of survival at timet by an amount approximately equal to\Delta. That is, theoptimal\delta minimizes
\left| \left( \frac{1}{N} \sum_{i=1}^N \exp \left( -\left\{ \exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right) \times t \right\}^{\alpha_{EC}} \right) - \frac{1}{N} \sum_{i=1}^N \exp \left( -\left\{ \exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} \right) \times t \right\}^{\alpha_{EC}} \right) \right) - \Delta \right|,
where\alpha_{EC} is the Weibull shape parameter,\boldsymbol{\beta}_{EC} is a vector of regression coefficients, and\boldsymbol{x}_i is a vector of covariates (including an interceptterm) from the bootstrapped population of sizeN. We note that\alpha_{EC} = 1/\sigma_{EC} and\boldsymbol{\beta}_{EC} = -\boldsymbol{\xi}_{EC} are calculated as functions of the scale parameter(\sigma_{EC}) and coefficients (\boldsymbol{\xi}_{EC})estimated by thesurvreg object that was fit to the external data, and weassume here that these estimates are the "true" shape and covariate effectswhen generating response data. In the formula above, the first and secondterms correspond to the population-averaged conditional survival functions(i.e., the marginal survival probabilities) at timet for theinternal control population with drift and the external control population(with covariate distributions standardized to match the internal trial),respectively.
If we now denote the marginal and conditional treatment effect by\Gamma and\gamma, respectively, we can use a similar processto identify the optimal\gamma that approximately corresponds to thespecified value of\Gamma, which is done by minimizing the following:
\left| \left( \frac{1}{N} \sum_{i=1}^N \exp \left( -\left\{ \exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta + \gamma \right) \times t \right\}^{\alpha_{EC}} \right) - \frac{1}{N} \sum_{i=1}^N \exp \left( -\left\{ \exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right) \times t \right\}^{\alpha_{EC}} \right) \right) - \Gamma \right|,
where the first term is the average of the conditional survival functions(i.e., the marginal survival probabilities) at timet for theinternal treated population.
Seeherefor a simulation example with a time-to-event outcome.
Value
tibble of all combinations of the marginal drift and treatmenteffect. For each row the conditional drift and treatment effect has beencalculated as well as the true marginal survival probabilities at timetfor the control and treatment populations.
Examples
library(dplyr)library(survival)# Model "true" regression coefficients using the external dataweibull_ph_mod <- survreg(Surv(y, event) ~ cov1 + cov2 + cov3 + cov4, data = ex_tte_df, dist = "weibull")# Bootstrap internal control "population" with imbalance w.r.t. covariate 2pop_int_ctrl <- bootstrap_cov(ex_tte_df, n = 100000, imbal_var = cov2, imbal_prop = 0.25, ref_val = 0) |> select(c(cov1, cov2, cov3, cov4)) # keep only covariate columns# Convert the marginal drift and treatment effects to conditionalcalc_cond_weibull(population = pop_int_ctrl, weibull_ph_mod, marg_drift = c(-.1, 0, .1), marg_trt_eff = c(0, .10), analysis_time = 12)Calculate Posterior Beta
Description
Calculate a posterior distribution that is beta (or a mixtureof beta components). Only the relevant treatment arms from the internaldataset should be read in (e.g., only the control arm if constructing aposterior distribution for the control response rate).
Usage
calc_post_beta(internal_data, response, prior)Arguments
internal_data | A tibble of the internal data. |
response | Name of response variable |
prior | A distributional object corresponding to a beta distributionor a mixture distribution of beta components |
Details
For a given arm of an internal trial (e.g., the control arm or anactive treatment arm) of sizeN_I, suppose the response data are binarysuch thatY_i \sim \mbox{Bernoulli}(\theta),i=1,\ldots,N_I. Theposterior distribution for\theta is written as
\pi( \theta \mid \boldsymbol{y} ) \propto \mathcal{L}(\theta \mid \boldsymbol{y}) \; \pi(\theta),
where\mathcal{L}(\theta \mid \boldsymbol{y}) is the likelihood of theresponse data from the internal arm and\pi(\theta) is a priordistribution on\theta (either a beta distribution or a mixturedistribution with an arbitrary number of beta components). The posteriordistribution for\theta is either a beta distribution or a mixture ofbeta components depending on whether the prior is a single betadistribution or a mixture distribution.
Value
distributional object
Examples
library(dplyr)library(distributional)calc_post_beta(internal_data = filter(int_binary_df, trt == 1), response = y, prior = dist_beta(0.5, 0.5))Calculate Posterior Normal
Description
Calculate a posterior distribution that is normal (or a mixtureof normal components). Only the relevant treatment arms from the internaldataset should be read in (e.g., only the control arm if constructing aposterior distribution for the control mean).
Usage
calc_post_norm(internal_data, response, prior, internal_sd = NULL)Arguments
internal_data | A tibble of the internal data. |
response | Name of response variable |
prior | A distributional object corresponding to a normal distribution,a t distribution, or a mixture distribution of normal and/or t components |
internal_sd | Standard deviation of internal response data ifassumed known. It can be left as |
Details
For a given arm of an internal trial (e.g., the control arm or anactive treatment arm) of sizeN_I, suppose the response data are normallydistributed such thatY_i \sim N(\theta, \sigma_I^2),i=1,\ldots,N_I.If\sigma_I^2 is assumed known, the posterior distribution for\thetais written as
\pi( \theta \mid \boldsymbol{y}, \sigma_{I}^2 ) \propto \mathcal{L}(\theta \mid \boldsymbol{y}, \sigma_{I}^2) \; \pi(\theta),
where\mathcal{L}(\theta \mid \boldsymbol{y}, \sigma_{I}^2) is thelikelihood of the response data from the internal arm and\pi(\theta)is a prior distribution on\theta (either a normal distribution, at distribution, or a mixture distribution with an arbitrary number ofnormal and/ort components). Anyt components of the prior for\theta are approximated with a mixture of two normal distributions.
If\sigma_I^2 is unknown, the marginal posterior distribution for\theta is instead written as
\pi( \theta \mid \boldsymbol{y} ) \propto \left\{ \int_0^\infty \mathcal{L}(\theta, \sigma_{I}^2 \mid \boldsymbol{y}) \; \pi(\sigma_{I}^2) \; d\sigma_{I}^2 \right\} \times \pi(\theta).
In this case, the prior for\sigma_I^2 is chosen to be\pi(\sigma_{I}^2) = (\sigma_I^2)^{-1} such that\left\{ \int_0^\infty \mathcal{L}(\theta, \sigma_{I}^2 \mid \boldsymbol{y}) \; \pi(\sigma_{I}^2) \; d\sigma_{I}^2 \right\}becomes a non-standardizedt distribution. This integrated likelihoodis then approximated with a mixture of two normal distributions.
Ifinternal_sd is supplied a positive value andprior corresponds to asingle normal distribution, then the posterior distribution for\thetais a normal distribution. Ifinternal_sd = NULL or if other types of priordistributions are specified (e.g., mixture or t distribution), then theposterior distribution is a mixture of normal distributions.
Value
distributional object
Examples
library(distributional)library(dplyr)post_treated <- calc_post_norm(internal_data = filter(int_norm_df, trt == 1), response = y, prior = dist_normal(0.5, 10), internal_sd = 0.15)Calculate Posterior Weibull
Description
Calculate a posterior distribution for the probability ofsurviving past a given analysis time(s) for time-to-event data with aWeibull likelihood. Only the relevant treatment arms from the internaldataset should be read in (e.g., only the control arm if constructing aposterior distribution for the control survival probability).
Usage
calc_post_weibull(internal_data, response, event, prior, analysis_time, ...)Arguments
internal_data | This can either be a propensity score object or a tibbleof the internal data. |
response | Name of response variable |
event | Name of event indicator variable (1: event; 0: censored) |
prior | A distributional object corresponding to a multivariate normaldistribution or a mixture of 2 multivariate normals. The first elementof the mean vector and the first row/column of covariance matrix correspondto the log-shape parameter, and the second element corresponds to the intercept(i.e., log-inverse-scale) parameter. |
analysis_time | Vector of time(s) when survival probabilities will becalculated |
... | rstan sampling option. This overrides any default options. For moreinformation, see |
Details
For a given arm of an internal trial (e.g., the control arm or anactive treatment arm) of sizeN_I, suppose the response data are time to eventsuch thatY_i \sim \mbox{Weibull}(\alpha, \sigma), where
f(y_i \mid \alpha, \sigma) = \left( \frac{\alpha}{\sigma} \right) \left( \frac{y_i}{\sigma} \right)^{\alpha - 1} \exp \left( -\left( \frac{y_i}{\sigma} \right)^\alpha \right),
i=1,\ldots,N_I. Define\boldsymbol{\theta} = \{\log(\alpha), \beta\}where\beta = -\log(\sigma) is the intercept parameter of a Weibullproportional hazards regression model. The posterior distribution for\boldsymbol{\theta} is written as
\pi( \boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\nu} ) \propto \mathcal{L}(\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\nu}) \; \pi(\boldsymbol{\theta}),
where\mathcal{L}(\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\nu}) = \prod_{i=1}^{N_I} f(y_i \mid \boldsymbol{\theta})^{\nu_i} S(y_i \mid \boldsymbol{\theta})^{1 - \nu_i}is the likelihood of the response data from the internal arm with event indicator\boldsymbol{\nu} and survival functionS(y_i \mid \boldsymbol{\theta}) = 1 - F(y_i \mid \boldsymbol{\theta}), and\pi(\boldsymbol{\theta}) is a priordistribution on\boldsymbol{\theta} (either a multivariate normal distribution or amixture of two multivariate normal distributions). Note that the posterior distributionfor\boldsymbol{\theta} does not have a closed form, and thus MCMC samplesfor\log(\alpha) and\beta are drawn from the posterior. These MCMCsamples are used to construct samples from the posterior distributionfor the probability of surviving past the specified analysis time(s) for thegiven arm.
To construct a posterior distribution for the treatment difference (i.e., thedifference in survival probabilities at the specified analysis time), obtain MCMCsamples from the posterior distributions for the survival probabilities undereach arm and then subtract the control-arm samples from the treatment-armsamples.
Value
stan posterior object
Examples
library(distributional)library(dplyr)library(rstan)mvn_prior <- dist_multivariate_normal( mu = list(c(0.3, -2.6)), sigma = list(matrix(c(1.5, 0.3, 0.3, 1.1), nrow = 2)))post_treated <- calc_post_weibull(filter(int_tte_df, trt == 1), response = y, event = event, prior = mvn_prior, analysis_time = 12, warmup = 5000, iter = 15000)# Extract MCMC samples of survival probabilities at time t=12surv_prob_treated <- as.data.frame(extract(post_treated, pars = c("survProb")))[,1]Calculate Power Prior Beta
Description
Calculate a (potentially inverse probability weighted) betapower prior for the control response rate using external control data.
Usage
calc_power_prior_beta(external_data, response, prior)Arguments
external_data | This can either be a |
response | Name of response variable |
prior | A beta distributional object that is the initial prior for thecontrol response rate before the external control data are observed |
Details
Weighted participant-level response data from the control arm of anexternal study are incorporated into an inverse probability weighted (IPW)power prior for the control response rate\theta_C. When borrowinginformation from an external control arm of sizeN_{EC}, the componentsof the IPW power prior for\theta_C are defined as follows:
- Initial prior:
\theta_C \sim \mbox{Beta}(\nu_0, \phi_0)- IPW likelihood of the external response data
\boldsymbol{y}_Ewithweights\hat{\boldsymbol{a}}_0: \mathcal{L}_E(\theta_C \mid \boldsymbol{y}_E, \hat{\boldsymbol{a}}_0) \propto \exp \left( \sum_{i=1}^{N_{EC}} \hat{a}_{0i} \left[ y_i \log(\theta_C) + (1 - y_i) \log(1 - \theta_C) \right] \right)- IPW power prior:
\theta_C \mid \boldsymbol{y}_E, \hat{\boldsymbol{a}}_0 \sim \mbox{Beta} \left( \sum_{i=1}^{N_{EC}} \hat{a}_{0i} y_i + \nu_0, \sum_{i=1}^{N_{EC}} \hat{a}_{0i} (1 - y_i) + \phi_0 \right)
Defining the weights\hat{\boldsymbol{a}}_0 to equal 1 results in aconventional beta power prior.
Value
Beta power prior object
See Also
Other power prior:calc_power_prior_norm(),calc_power_prior_weibull()
Examples
library(distributional)library(dplyr)# This function can be used directly on the datacalc_power_prior_beta(external_data = ex_binary_df, response = y, prior = dist_beta(0.5, 0.5))# Or this function can be used with a propensity score objectps_obj <- calc_prop_scr(internal_df = filter(int_binary_df, trt == 0), external_df = ex_binary_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)calc_power_prior_beta(ps_obj, response = y, prior = dist_beta(0.5, 0.5))Calculate Power Prior Normal
Description
Calculate a (potentially inverse probability weighted) normalpower prior using external data.
Usage
calc_power_prior_norm( external_data, response, prior = NULL, external_sd = NULL)Arguments
external_data | This can either be a |
response | Name of response variable |
prior | Either |
external_sd | Standard deviation of external response data if assumedknown. It can be left as |
Details
Weighted participant-level response data from an external study areincorporated into an inverse probability weighted (IPW) power prior for theparameter of interest\theta (e.g., the control mean if borrowingfrom an external control arm). When borrowing information from an externaldataset of sizeN_{E}, the IPW likelihood of the external responsedata\boldsymbol{y}_E with weights\hat{\boldsymbol{a}}_0 is defined as
\mathcal{L}_E(\theta \mid \boldsymbol{y}_E, \hat{\boldsymbol{a}}_0, \sigma_{E}^2) \propto \exp \left( -\frac{1}{2 \sigma_{E}^2} \sum_{i=1}^{N_{E}} \hat{a}_{0i} (y_i - \theta)^2 \right).
Theprior argument should be either a distributional object with a familytype ofnormal orNULL, corresponding to the use of a normal initialprior or an improper uniform initial prior (i.e.,\pi(\theta) \propto 1), respectively.
Theexternal_sd argument can be a positive value if the external standarddeviation is assumed known or left asNULL otherwise. Ifexternal_sd = NULL, thenprior must beNULL to indicate the use of an improperuniform initial prior for\theta, and an improper prior is definedfor the unknown external standard deviation such that\pi(\sigma_E^2) \propto (\sigma_E^2)^{-1}. The details of the IPW power prior for eachcase are as follows:
external_sd = positive value(\sigma_E^2known):Witheither a proper normal or an improper uniform initial prior, the IPWweighted power prior for
\thetais a normal distribution.external_sd = NULL(\sigma_E^2unknown):With improperpriors for both
\thetaand\sigma_E^2, the marginal IPW weightedpower prior for\thetaafter integrating over\sigma_E^2isa non-standardizedtdistribution.
Defining the weights\hat{\boldsymbol{a}}_0 to equal 1 results in aconventional normal (ort) power prior if the external standarddeviation is known (unknown).
Value
Normal power prior object
See Also
Other power prior:calc_power_prior_beta(),calc_power_prior_weibull()
Examples
library(distributional)library(dplyr)# This function can be used directly on the datacalc_power_prior_norm(ex_norm_df, response = y, prior = dist_normal(0.5, 10), external_sd = 0.15)# Or this function can be used with a propensity score objectps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0), external_df = ex_norm_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)calc_power_prior_norm(ps_obj, response = y, prior = dist_normal(0.5, 10), external_sd = 0.15)Calculate Power Prior Weibull
Description
Calculate an approximate (potentially inverse probability weighted)multivariate normal power prior for the log-shape and log-inverse-scale parametersof a Weibull likelihood for external time-to-event control data.
Usage
calc_power_prior_weibull( external_data, response, event, intercept, shape, approximation = c("Laplace", "MCMC"), ...)Arguments
external_data | This can either be a |
response | Name of response variable |
event | Name of event indicator variable (1: event; 0: censored) |
intercept | Normal distributional object that is the initial prior for theintercept (i.e., log-inverse-scale) parameter |
shape | Integer value that is the scale of the half-normal priorfor the shape parameter |
approximation | Type of approximation to be used. Either |
... | Arguments passed to |
Details
Weighted participant-level response data from the control arm of anexternal study are incorporated into an approximated inverse probabilityweighted (IPW) power prior for the parameter vector\boldsymbol{\theta}_C = \{\log(\alpha), \beta\}, where\beta = -\log(\sigma)is the intercept parameter of a Weibull proportional hazards regression modeland\alpha and\sigma are the Weibull shape and scale parameters,respectively. When borrowing information from an external dataset of sizeN_{E},the IPW likelihood of the external response data\boldsymbol{y}_E withevent indicators\boldsymbol{\nu}_E and weights\hat{\boldsymbol{a}}_0is defined as
\mathcal{L}_E(\alpha, \sigma \mid \boldsymbol{y}_E, \boldsymbol{\nu}_E, \hat{\boldsymbol{a}}_0) \propto \prod_{i=1}^{N_E} \left\{ \left( \frac{\alpha}{\sigma} \right) \left( \frac{y_i}{\sigma} \right)^{\alpha - 1} \exp \left( -\left( \frac{y_i}{\sigma} \right)^\alpha \right) \right\}^{\hat{a}_{0i} \nu_i} \left\{ \exp \left( -\left( \frac{y_i}{\sigma} \right)^\alpha \right) \right\}^{\hat{a}_{0i}(1 - \nu_i)}.
The initial priors for the intercept parameter\beta and the shape parameter\alpha are assumed to be normal and half-normal, respectively, which canbe defined using theintercept andshape arguments.
The power prior for\boldsymbol{\theta}_C does not have a closed form, andthus we approximate it via a bivariate normal distribution; i.e.,
\boldsymbol{\theta}_C \mid \boldsymbol{y}_E, \boldsymbol{\nu}_E, \hat{\boldsymbol{a}}_0 \; \dot\sim \; \mbox{MVN} \left( \tilde{\boldsymbol{\mu}}_0, \tilde{\boldsymbol{\Sigma}}_0 \right).
Ifapproximation = Laplace, then\tilde{\boldsymbol{\mu}}_0 is the mode vectorof the IPW power prior and\tilde{\boldsymbol{\Sigma}}_0 is the negativeinverse of the Hessian of the log IPW power prior evaluated at the mode. Ifapproximation = MCMC, then MCMC samples are obtained from the IPW power prior,and\tilde{\boldsymbol{\mu}}_0 and\tilde{\boldsymbol{\Sigma}}_0are the estimated mean vector and covariance matrix of these MCMC samples.Note that the Laplace approximation method is faster due to its use ofoptimization instead of MCMC sampling.
The first element of the mean vector and the first row/column of covariancematrix correspond to the log-shape parameter (\log(\alpha)), and thesecond element corresponds to the intercept (\beta, the log-inverse-scale)parameter.
Value
Multivariate Normal Distributional Object
See Also
Other power prior:calc_power_prior_beta(),calc_power_prior_norm()
Examples
library(distributional)library(dplyr)# This function can be used directly on the datacalc_power_prior_weibull(ex_tte_df, response = y, event = event, intercept = dist_normal(0, 10), shape = 50, approximation = "Laplace")# Or this function can be used with a propensity score objectps_obj <- calc_prop_scr(internal_df = filter(int_tte_df, trt == 0), external_df = ex_tte_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)calc_power_prior_weibull(ps_obj, response = y, event = event, intercept = dist_normal(0, 10), shape = 50, approximation = "Laplace")Create a Propensity Score Object
Description
Calculate the propensity scores and ATT inverse probabilityweights for participants from internal and external datasets. Only therelevant treatment arms from each dataset should be read in (e.g., onlythe control arm from each dataset if creating a hybrid control arm).
Usage
calc_prop_scr(internal_df, external_df, id_col, model, ...)Arguments
internal_df | Internal dataset with one row per subject and all thevariables needed to run the model |
external_df | External dataset with one row per subject and all thevariables needed to run the model |
id_col | Name of the column in both datasets used to identify eachsubject. It must be the same across datasets |
model | Model used to calculate propensity scores |
... | Optional arguments |
Details
For the subset of participants in both the external and internalstudies for which we want to balance the covariate distributions (e.g.,external control and internal control participants if constructing ahybrid control arm), we define a study-inclusion propensity score foreach participant as
e(x_i) = P(S_i = 1 \mid x_i),
wherex_i denotes a vector of baseline covariates for theithparticipant andS_i denotes the indicator that the participant isenrolled in the internal trial (S_i = 1 if internal,S_i = 0if external). The estimated propensity score\hat{e}(x_i) is obtainedusing logistic regression.
An ATT inverse probability weight is calculated for each individual as
\hat{a}_{0i} = \frac{\hat{e}(x_i)}{\hat{P}(S_i = s_i | x_i)} = s_i + (1 - s_i ) \frac{\hat{e}(x_i)}{1 - \hat{e}(x_i)}.
In a weighted estimator, data from participants in the external studyare given a weight of\hat{e}(x_i)⁄(1 - \hat{e}(x_i)) whereas datafrom participants in the internal trial are given a weight of 1.
Value
prop_scr_obj object, with the internal and the external data andthe propensity score and inverse probability weight calculated for eachsubject.
Examples
# This can be used for both continuous and binary datalibrary(dplyr)# Continuouscalc_prop_scr(internal_df = filter(int_norm_df, trt == 0), external_df = ex_norm_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)# Binarycalc_prop_scr(internal_df = filter(int_binary_df, trt == 0), external_df = ex_binary_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)Calculate the Analysis Time Based on a Target Number of Events and/or TargetFollow-up Time
Description
Calculate the Analysis Time Based on a Target Number of Events and/or TargetFollow-up Time
Usage
calc_study_duration( study_time, observed_time, event_indicator, target_events = NULL, target_follow_up = NULL)Arguments
study_time | Vector of study times (accrual time + observed time) |
observed_time | Vector of observed times (event time or censoring time) |
event_indicator | Vector of boolean values (TRUE/FALSE or 1/0)indicating if the observed time value is an event or censoring time |
target_events | Target number of events, where the analysis time isdetermined once this number of events is reached. Default is |
target_follow_up | Target follow-up for each participant, where theanalysis time is determined once each participant in the risk set isfollowed up for this amount of time (i.e., minimum follow-up time). Defaultis |
Details
This function calculates the analysis time for a study with atime-to-event endpoint for which the target number of events(target_events) and/or target follow-up time (target_follow_up) arespecified. If onlytarget_events is specified, the analysis will occurat the time when the target number of events has been reached. If onlytarget_follow_up is specified, the analysis will occur once thelast-enrolled participant who is still in the risk set has been followed upfor this amount of time. If bothtarget_events andtarget_follow_up arespecified, the analysis time will be based on whichever occurs first.
Value
Time of analysis
Examples
library(dplyr)# Determining analysis time by reaching a target number of eventsex_tte_df |> mutate( analysis_time = calc_study_duration(study_time = total_time, observed_time = y, event_indicator = event, target_events = 30))# Determining analysis time by a target follow-up timeex_tte_df |> mutate( analysis_time = calc_study_duration(study_time = total_time, observed_time = y, event_indicator = event, target_follow_up = 12))# Or use both (whichever happens first)ex_tte_df |> mutate( analysis_time = calc_study_duration(study_time = total_time, observed_time = y, event_indicator = event, target_events = 30, target_follow_up = 12))External Binary Control Data for Propensity Score Balancing
Description
This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a binary endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.
Usage
ex_binary_dfFormat
ex_binary_df
A data frame with 150 rows and 6 columns:
- subjid
Unique subject ID
- cov1
Covariate 1, which is normally distributed around 65 with a SD of 10
- cov2
Covariate 2, which is binary (0 vs. 1) with about 30% of participants having level 1
- cov3
Covariate 3, which is binary (0 vs. 1) with about 40% of participants having level 1
- cov4
Covariate 4, which is binary (0 vs. 1) with about 50% of participants having level 1
- y
Response, which is binary (0 vs. 1)
External Normal Control Data for Propensity Score Balancing
Description
This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a normal endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.
Usage
ex_norm_dfFormat
ex_norm_df
A data frame with 150 rows and 6 columns:
- subjid
Unique subject ID
- cov1
Covariate 1, which is normally distributed around 50 with a SD of 10
- cov2
Covariate 2, which is binary (0 vs. 1) with about 20% of participants having level 1
- cov3
Covariate 3, which is binary (0 vs. 1) with about 60% of participants having level 1
- cov4
Covariate 4, which is binary (0 vs. 1) with about 30% of participants having level 1
- y
Response, which is normally distributed with a SD of 0.15
External Time-to-Event Control Data for Propensity Score Balancing
Description
This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a time-to-event endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.
Usage
ex_tte_dfFormat
ex_tte_df
A data frame with 150 rows and 9 columns:
- subjid
Unique subject ID
- y
Response (observed time at which the participant either had an event or was censored)
- enr_time
Enrollment time
- total_time
Time from study start
- event
Event indicator (1: event; 0: censored)
- cov1
Covariate 1, which is normally distributed around 65 with a SD of 10
- cov2
Covariate 2, which is binary (0 vs. 1) with about 30% of participants having level 1
- cov3
Covariate 3, which is binary (0 vs. 1) with about 40% of participants having level 1
- cov4
Covariate 4, which is binary (0 vs. 1) with about 50% of participants having level 1
Internal Binary Data for Propensity Score Balancing
Description
This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a binary endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.
Usage
int_binary_dfFormat
int_binary_df
A data frame with 160 rows and 7 columns:
- subjid
Unique subject ID
- cov1
Covariate 1, which is normally distributed around 62 with an sd of 8
- cov2
Covariate 2, which is binary (0 vs. 1) with about 40% of participants having level 1
- cov3
Covariate 3, which is binary (0 vs. 1) with about 40% of participants having level 1
- cov4
Covariate 4, which is binary (0 vs. 1) with about 60% of participants having level 1
- trt
Treatment indicator, where 0 = control and 1 = active treatment
- y
Response, which is binary (0 vs. 1)
Internal Normal Data for Propensity Score Balancing
Description
This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a normal endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.
Usage
int_norm_dfFormat
int_norm_df
A data frame with 120 rows and 7 columns:
- subjid
Unique subject ID
- cov1
Covariate 1, which is normally distributed around 55 with a SD of 8
- cov2
Covariate 2, which is binary (0 vs. 1) with about 30% of participants having level 1
- cov3
Covariate 3, which is binary (0 vs. 1) with about 50% of participants having level 1
- cov4
Covariate 4, which is binary (0 vs. 1) with about 30% of participants having level 1
- trt
Treatment indicator, where 0 = control and 1 = active treatment
- y
Response, which is normally distributed with a SD of 0.15
Internal Time-to-Event Control Data for Propensity Score Balancing
Description
This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a time-to-event endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.
Usage
int_tte_dfFormat
int_tte_df
A data frame with 160 rows and 10 columns:
- subjid
Unique subject ID
- y
Response (observed time at which the participant either had an event or was censored)
- enr_time
Enrollment time
- total_time
Time from study start
- event
Event indicator (1: event; 0: censored)
- trt
Treatment indicator, where 0 = control and 1 = active treatment
- cov1
Covariate 1, which is normally distributed around 62 with a SD of 8
- cov2
Covariate 2, which is binary (0 vs. 1) with about 40% of participants having level 1
- cov3
Covariate 3, which is binary (0 vs. 1) with about 40% of participants having level 1
- cov4
Covariate 4, which is binary (0 vs. 1) with about 60% of participants having level 1
Inverse Logit Function
Description
Inverse Logit Function
Usage
inv_logit(x)Arguments
x | Real number(s) to take the inverse logit of |
Details
This function is a short hand to\exp(x)/(1 + \exp(x)).
Value
Vector of probabilities between 0 and 1
Examples
inv_logit(0.5)Test If Propensity Score Object
Description
Test If Propensity Score Object
Usage
is_prop_scr(x)Arguments
x | Object to test |
Value
Boolean
Examples
library(dplyr)x <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0), external_df = ex_norm_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)is_prop_scr(x)Extract Means of Mixture Components
Description
Extract Means of Mixture Components
Usage
mix_means(x)Arguments
x | A mixture distributional object |
Details
If a distributional object that is a mixture of two or more normaldistributions is read in, the function will return a numeric object withthe means of each normal component. If the distributional object is amixture of two or more multivariate normal distributions, the functionwill return a list with the mean vectors of each multivariate normalcomponent.
Value
numeric or list object
Examples
library(distributional)mix_norm <- dist_mixture(comp1 = dist_normal(1, 10), comp2 = dist_normal(1.5, 12), weights = c(.5, .5))mix_means(mix_norm)Extract Standard Deviations of Mixture Components
Description
Extract Standard Deviations of Mixture Components
Usage
mix_sigmas(x)Arguments
x | A mixture distributional object |
Details
If a distributional object that is a mixture of two or more normaldistributions is read in, the function will return a numeric object withthe standard deviations of each normal component. If the distributionalobject is a mixture of two or more multivariate normal distributions, thefunction will return a list with the covariance matrices of each multivariatenormal component.
Value
numeric or list object
Examples
library(distributional)mix_norm <- dist_mixture(comp1 = dist_normal(1, 10), comp2 = dist_normal(1.5, 12), weights = c(.5, .5))mix_sigmas(mix_norm)Plot Distribution
Description
Plot Distribution
Usage
plot_dist(...)Arguments
... | Distributional object(s) to plot. When passing multiple objectsnaming them will change the labels in the plot, else they will use thedistributional format |
Value
ggplot object that is the density of the provided distribution
Examples
library(distributional)plot_dist(dist_normal(0, 1))plot_dist(dist_multivariate_normal(mu = list(c(1, 2)), sigma = list(matrix(c(4, 2, 2, 3), ncol=2))))#Plotting Multipleplot_dist(dist_normal(0, 1), dist_normal(10, 5))plot_dist('Prior' = dist_normal(0, 1), 'Posterior' = dist_normal(10, 5))Propensity Score Cloud Plot
Description
Propensity Score Cloud Plot
Usage
prop_scr_cloud(x, trimmed_prop_scr = NULL)Arguments
x | A |
trimmed_prop_scr | A trimmed |
Value
ggplot object
Examples
library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0), external_df = ex_norm_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)ps_obj_trimmed <- trim_ps(ps_obj, low = 0.1, high = 0.6)# Plotting the Propensity Scoresprop_scr_cloud(ps_obj, trimmed_prop_scr = ps_obj_trimmed)Density of the Propensity Score Object
Description
Plot overlapping density curves of the propensity scores forboth the internal and external participants, or plot external IPWs.
Usage
prop_scr_dens( x, variable = c("propensity score", "ps", "inverse probability weight", "ipw"), ...)Arguments
x | Propensity score object |
variable | Variable to plot. It must be either a propensity score("ps" or "propensity score") or inverse probability weight ("ipw" or"inverse probability weight") |
... | Optional arguments for |
Value
ggplot object
Examples
library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0), external_df = ex_norm_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)# Plotting the Propensity Scoresprop_scr_dens(ps_obj)# Or plotting the inverse probability weightsprop_scr_dens(ps_obj, variable = "ipw")Histogram of the Propensity Score Object
Description
Plot overlapping histograms of the propensity scores for boththe internal and external participants, or plot external IPWs.
Usage
prop_scr_hist( x, variable = c("propensity score", "ps", "inverse probability weight", "ipw"), ...)Arguments
x | Propensity score object |
variable | Variable to plot. It must be either a propensity score("ps" or "propensity score") or inverse probability weight ("ipw" or"inverse probability weight") |
... | Optional arguments for |
Value
ggplot object
Examples
library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0), external_df = ex_norm_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)# Plotting the Propensity Scoresprop_scr_hist(ps_obj)# Or plotting the inverse probability weightsprop_scr_hist(ps_obj, variable = "ipw")Love Plot of the Absolute Standardized Mean Differences
Description
Plot the unadjusted and IPW-adjusted absolute standardized meandifferences for each covariate.
Usage
prop_scr_love(x, reference_line = NULL, ...)Arguments
x | Propensity score object |
reference_line | Numeric value of where along the x-axis the verticalreference line should be placed |
... | Optional options for |
Value
ggplot object
Examples
library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0), external_df = ex_norm_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)# Plotting the Propensity Scoresprop_scr_love(ps_obj, reference_line = 0.1)Objects exported from other packages
Description
These objects are imported from other packages. Follow the linksbelow to see their documentation.
Rescale aprop_scr object
Description
Rescale aprop_scr object
Usage
rescale_ps(x, n = NULL, scale_factor = NULL)Arguments
x | a |
n | Desired sample size that the external data should effectivelycontribute to the analysis of the internal trial data. This will be used toscale the external weights if |
scale_factor | Value to multiple all weights by. This will be used toscale the external weights if |
Value
aprop_scr object with rescaled weights
Examples
library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_binary_df, trt == 0), external_df = ex_binary_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)# weights in a propensity score object can be rescaled to achieve a desired# effective sample size (i.e., sum of weights)rescale_ps(ps_obj, n = 75)# Or by a predetermined factorrescale_ps(ps_obj, scale_factor = 1.5)Robustify Multivariate Normal Distributions
Description
Adds vague normal component, where the level of vagueness is controlled bythen parameter
Usage
robustify_mvnorm(prior, n, weights = c(0.5, 0.5))Arguments
prior | Multivariate Normal distributional object |
n | Number of theoretical participants (or events, for time-to-event data) |
weights | Vector of weights, where the first number corresponds to theinformative component and the second is the vague |
Details
In cases with a time-to-event endpoint, a robust mixture prior can becreated by adding a vague multivariate normal component to any multivariatenormal prior with mean vector\boldsymbol{\mu} and covariance matrix\boldsymbol{\Sigma}. The vague component is calculated to have thesame mean vector\boldsymbol{\mu} and covariance matrix equal to\boldsymbol{\Sigma} \times n, wheren is the specified number oftheoretical events.
Value
mixture distribution
Examples
library(distributional)robustify_mvnorm( dist_multivariate_normal(mu = list(c(1, 0)), sigma = list(c(10, 5))), n = 15)Robustify Normal Distributions
Description
Adds vague normal component, where the level of vagueness is controlled bythen parameter
Usage
robustify_norm(prior, n, weights = c(0.5, 0.5))Arguments
prior | Normal or Multivariate Normal distributional object |
n | Number of theoretical participants (or events, for time-to-event data) |
weights | Vector of weights, where the first number corresponds to theinformative component and the second is the vague |
Details
In cases with a normal endpoint, a robust mixture prior can be created byadding a vague normal component to any normal prior with mean\thetaand variance\sigma^2.The vague component is calculated to have thesame mean\theta and variance equal to\sigma^2 \times n, wheren is the specified number of theoretical participants. If robustifying a normalpower prior that was calculated from external control data andn is defined asthe number of external control participants, and the vague component wouldthen correspond to one external control participant's worth of data.
Value
mixture distribution
Examples
library(distributional)robustify_norm(dist_normal(0,1), n = 15)Simulate Participant Accrual Times
Description
Simulate Participant Accrual Times
Usage
sim_accrual(n, accrual_periods, accrual_props)Arguments
n | Number of participants |
accrual_periods | Vector of right endpoints defining the time periods ofaccrual, e.g., c(6,8) defines 0<=x<6, 6<=x<8. |
accrual_props | Vector indicating the proportion of participants that areexpected to be enrolled during each of the defined accrual periods. Shouldsum to 1, otherwise these proportions will be normalized. |
Details
Simulate the accrual times for each participant, whereaccrual_periods defines the right time points for each accrual period(with the last element corresponding to the end of accrual), andaccrual_props defines the proportion of study participants who areenrolled during each of these periods. The simulated accrual times forparticipants within a given accrual period are assumed to be uniformlydistributed.
Value
Vector of accrual times corresponding to when each participant entersthe study
Examples
at <- sim_accrual(n = 100000, accrual_periods = c(6, 8), accrual_props = c(.5, .5))hist(at, breaks = 100, main = "Histogram of Enrollment Times", xlab = "Enrollment Time")Simulate Event Times for Each Individual from a Piecewise Constant Hazard Model
Description
Simulate Event Times for Each Individual from a Piecewise Constant Hazard Model
Usage
sim_pw_const_haz(n, hazard_periods = NULL, hazard_values)Arguments
n | Number of individuals |
hazard_periods | Vector of break points between time periods with separate constanthazards, e.g., c(6,8) defines [0,6), [6,8), [8, infinity). Leave as NULL ifdefining only one hazard period. |
hazard_values | Vector of constant hazard values associated with the time intervals |
Details
Simulate the event or censoring times for each participant using apiecewise constant hazard model where each time period is defined to havea different constant hazard. Here,hazard_periods defines the righttime points for each time period (with the exception of the last timeperiod which extends to infinity), andhazard_values defines the constanthazards for each time period.
Value
Vector of simulated times from the time-to-event distribution
Examples
tte_dat <- sim_pw_const_haz(n = 100000, hazard_periods = c(6, 8), hazard_values = c(0.1, 0.1, 0.1))hist(tte_dat, breaks = 100, main = "Event Time Distribution", xlab = "Event Time")Simulate Event Times for Each Participant from a Weibull Proportional HazardsRegression Model
Description
Simulate Event Times for Each Participant from a Weibull Proportional HazardsRegression Model
Usage
sim_weib_ph(weibull_ph_mod, samp_df, cond_drift = 0, cond_trt_effect = 0)Arguments
weibull_ph_mod |
|
samp_df | Data frame of covariates corresponding to the sample arm(control or treated) for which event times should be simulated. The columnnames should correspond to the covariate names in the |
cond_drift | Optional value of the conditional drift by which theintercept in the Weibull proportional hazards regression model should beincreased/decreased to incorporate the impact of unmeasurable sources ofdrift. Default is 0. |
cond_trt_effect | Optional value of the conditional treatment effect bywhich the intercept in the Weibull proportional hazards regression modelshould be increased/decreased if simulating event data for a treated arm.Default is 0. |
Details
Simulate the event times for each participant using a Weibullproportional hazards (PH) regression model. The "true" parameter valuesfor the Weibull shape\alpha and the regression coefficients\boldsymbol{\beta} are assumed to be equal to the parameter estimatesfrom asurvreg object (weibull_ph_mod) fit using external data (notethat the Weibull shape parameter\alpha is defined as the inverse ofthe scale parameter reported bysurvreg).
For participanti, lety_i denote the time-to-event randomvariable and\boldsymbol{x}_i = \{x_{i,1}, \ldots, x_{i,p}\} thevector ofp covariates (rowi ofsamp_df) that correspondto the(p+1)-dimensional vector of regression coefficients\boldsymbol{\beta}. The density function of the Weibull PH regressionmodel is
f(y_i \mid \boldsymbol{x}_i, \alpha, \boldsymbol{\beta}, \delta, \gamma) = \left( \frac{\alpha}{\sigma_i} \right) \left( \frac{y_i}{\sigma_i} \right)^{\alpha - 1} \exp \left( -\left( \frac{y_i}{\sigma_i} \right)^\alpha \right),
where-\log(\sigma_i) = \beta_0 + \beta_1 x_{i,1} + \ldots + \beta_p x_{i,p} + \delta + \gamma. Here,\delta and\gammadenote the conditional drift (cond_drift) and conditional treatmenteffect (cond_trt_effect), respectively, that can be calculated usingcalc_cond_weibull() for desired values of the marginal drift and marginaltreatment effect.
Value
Vector of simulated event times from a Weibull proportional hazardsregression model
Examples
library(dplyr)library(survival)# Model "true" regression coefficients and shape parameter using the external dataweibull_ph_mod <- survreg(Surv(y, event) ~ cov1 + cov2 + cov3 + cov4, data = ex_tte_df, dist = "weibull")# Sample covariates for internal control arm via bootstrap from external datasamp_int_ctrl <- bootstrap_cov(ex_tte_df, n = 100) |> select(c(cov1, cov2, cov3, cov4)) # keep only covariate columnstte_dat <- sim_weib_ph(weibull_ph_mod, samp_df = samp_int_ctrl)Create Sweet Spot Plots for Multiple Simulation Scenarios
Description
Create visualization plots to help identify the "sweet spot" in borrowingstrategies across different simulation scenarios. For each unique scenariodefined by the combination of variables inscenario_vars, the functionproduces a plot showing power, type I error, and the distribution of thedesign prior for the control marginal parameter for approaches with andwithout borrowing.
Usage
sweet_spot_plot( .data, scenario_vars, trt_diff, control_marg_param, h0_prob, h0_prob_no_borrowing, design_prior = NULL, highlight = TRUE)Arguments
.data | A data frame containing iteration-level simulation results. |
scenario_vars | A vector of quoted column names correspondingto variables used to define unique simulation scenarios. Each uniquecombination of values in these columns will generate a separate plot. |
trt_diff | An unquoted column name representing the treatmentdifference. Used to identify scenarios with null effect (trt_diff = 0) fortype I error calculation. |
control_marg_param | An unquoted column name to be used as the x-axis inthe plots. This is typically the control endpoint of interest on themarginal scale (e.g., control response rate). |
h0_prob | An unquoted column name containing the probability ofrejecting the null hypothesis when when borrowing external data. |
h0_prob_no_borrowing | An unquoted column name containing theprobability of rejecting the null hypothesis when not borrowing externaldata. |
design_prior | An unquoted column name containing distributional objectsthat represent the design prior distribution for the control marginalparameter (e.g., posterior distribution using the external control data).Used to aid visualization of which values of the control marginal parameterare assumed to be plausible. Default is |
highlight | Logical value to indicate if you want sweet spothighlighting or not. If |
Details
The function calculates power and type I error rates for BDB approachesthat borrow from external data (e.g., use of a robust mixture prior with positiveweight on the informative component) and an approach that does notborrow from external data (e.g., use of a vague prior) under each scenarioand visualizes them together as a function of the underlying control marginalparameter of interest (e.g., control response rate for binary outcomes) thatmay vary as a result of drift. This helps identify the "sweet spot" where borrowingresults in higher power and lower type I error rates compared to not borrowing.Type I error is calculated using scenarios wheretrt_diff equals 0, and poweris calculated for all scenarios with positive values oftrt_diff.
Ifdesign_prior is non-NULL, the design prior distribution is includedin the plot to provide insight into which values of the control marginalparameter are plausible given this assumed design prior. We note thatdesign_prior can represent any informative prior that potentiallyincorporates the external control data (e.g., the posterior distribution ofthe control marginal parameter constructed using the external data and avague prior). Each element of the vector corresponding todesign_prior mustbe a distributional object with a family equal to "beta", "normal", or"mixture" (where each component is either "beta" or "normal"). For thetime-to-event case in which a multivariate normal prior is assumed for thecontrol log-shape and intercept of a Weibull proportional hazards model,this distribution must first be translated into a univariate beta designprior for the control survival probability at some prespecified time.This approximation can be done usingapprox_mvn_at_time(). If thedesign priors in the vector indicated bydesign_prior differ acrossiterations within a given scenario (e.g., using the IPW power prior as theiteration-specific design prior), then the average distribution will beplotted (i.e., a distribution of the same family with the hyperparametersaveraged across iterations).
Value
A list of ggplot objects, one for each unique scenario defined byscenario_vars. Each plot shows:
Power curves for the cases with and without borrowing
Type I error rates for the cases with and without borrowing
Distribution of the design prior (if
design_prioris specified)
References
Best, N., Ajimi, M., Neuenschwander, B., Saint-Hilary, G., & Wandel, S.(2024). Beyond the Classical Type I Error: Bayesian Metrics for BayesianDesigns Using Informative Priors.Statistics in Biopharmaceutical Research,17(2), 183–196.doi:10.1080/19466315.2024.2342817
Examples
library(dplyr)# Assuming binary_sim_df is a data frame with simulation results in the shape# of binary template codeplots <- sweet_spot_plot( .data = binary_sim_df, scenario_vars = c("population", "marg_trt_eff"), trt_diff = marg_trt_eff, control_marg_param = true_control_RR, h0_prob = reject_H0_yes, h0_prob_no_borrowing = no_borrowing_reject_H0_yes, design_prior = pwr_prior)# Display the first plotplots[[1]]tte_plots <- tte_sim_df |> mutate(beta_appox = approx_mvn_at_time(mix_prior, time = 12)) |> sweet_spot_plot( scenario_vars = c("population", "marg_trt_eff"), trt_diff = marg_trt_eff, control_marg_param = true_control_surv_prob, h0_prob = reject_H0_yes, h0_prob_no_borrowing = no_borrowing_reject_H0_yes, design_prior = beta_appox )tte_plots[[1]]Tidy a(n) prop_scr object
Description
Tidy a(n) prop_scr object
Usage
## S3 method for class 'prop_scr'tidy(x, ...)Arguments
x | a |
... | Unused, included for generic consistency only. |
Value
A tidytibble::tibble() summarizing the results of the propensityscore weighting. The tibble will have the id column of the external data,aninternal column to indicate all the data is external, aps columnwith the propensity scores and aweight column with the inverseprobability weights
Examples
library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_binary_df, trt == 0), external_df = ex_binary_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)tidy(ps_obj)Trim aprop_scr object
Description
Trim aprop_scr object
Usage
trim_ps(x, low = NULL, high = NULL, quantile = FALSE)Arguments
x | A |
low | Low cut-off such that all participants with propensity scores lessthan this value (or quantile if |
high | High cut-off such that all participants with propensity scoresgreater than this value (or quantile if |
quantile | True/False value to determine if the cut-off values are baseddirectly on the propensity scores (false) or their quantiles (true). By default this isfalse. |
Details
This function uses R's default method of quantile calculation (type7)
Value
aprop_scr object with a trimmed propensity score distribution
Examples
library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_binary_df, trt == 0), external_df = ex_binary_df, id_col = subjid, model = ~ cov1 + cov2 + cov3 + cov4)trim_ps(ps_obj, low = 0.3, high = 0.7)Time-to-Event Simulation Data
Description
This is an example of output from a simulation study that investigates theoperating characteristics of inverse probability weighted Bayesian dynamicborrowing for the case with a time-to-event outcome. This output was generatedbased on the time-to-event simulation template. For this simulation study, only thedegree of covariate imbalance (as indicated bypopulation) and themarginal treatment effect were varied.
Usage
tte_sim_dfFormat
tte_sim_df A data frame with 18 rows and 7 columns:
- population
Populations defined by different covariate imbalances
- marg_trt_eff
Marginal treatment effect
- true_control_surv_prop
True control survival probability at time t=12 months on the marginal scale
- reject_H0_yes
Probability of rejecting the null hypothesis in the case with borrowing
- no_borrowing_reject_H0_yes
Probability of rejecting the null hypothesis without borrowing
- pwr_prior
Vector of IPW power priors as distributional objects
- mix_prior
Vector of mixture priors (i.e., the robustified IPW power priors) as distributional objects