Movatterモバイル変換

Title:

Bayesian Evaluation, Analysis, and Simulation Software Tools forTrials

Version:

0.0.3

Description:

Bayesian dynamic borrowing with covariate adjustment via inverse probability weighting for simulations and data analyses in clinical trials. This makes it easy to use propensity score methods to balance covariate distributions between external and internal data. This methodology based on Psioda et al (2025) <doi:10.1080/10543406.2025.2489285>.

License:

GPL (≥ 3)

URL:

https://gsk-biostatistics.github.io/beastt/,https://github.com/GSK-Biostatistics/beastt

BugReports:

https://github.com/GSK-Biostatistics/beastt/issues

Suggests:

knitr, mvtnorm, rmarkdown, spelling, testthat (≥ 3.0.0),tibble, vdiffr, survival

Config/testthat/edition:

Encoding:

UTF-8

RoxygenNote:

7.3.2

Imports:

cli, cobalt, distributional, dplyr, generics, ggdist, ggplot2,methods, mixtools, purrr, Rcpp (≥ 0.12.0), RcppParallel (≥5.0.1), rlang, rstan (≥ 2.18.1), rstantools (≥ 2.4.0),stringr, tidyr

VignetteBuilder:

knitr

Depends:

R (≥ 4.1.0)

LazyData:

true

Language:

en-US

LinkingTo:

BH (≥ 1.66.0), Rcpp (≥ 0.12.0), RcppEigen (≥ 0.3.3.3.0),RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), StanHeaders (≥2.18.0)

SystemRequirements:

GNU make

NeedsCompilation:

yes

Packaged:

2025-05-15 07:08:48 UTC; christinafillmore

Author:

Christina Fillmore

[aut, cre], Nate Bean

[aut], Abi Terry [aut], Ben Arancibia [aut], GlaxoSmithKline Research & Development Limited [cph, fnd], Trustees of Columbia University [cph] (R/stanmodels.R, configure, configure.win)

Maintainer:

Christina Fillmore <christina.e.fillmore@gsk.com>

Repository:

CRAN

Date/Publication:

2025-05-15 11:40:06 UTC

The 'beastt' package.

Description

Inverse Probability Weighted Bayesian Dynamic Borrowing

References

Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.32.3. https://mc-stan.org

Approximate Multivariate Normal Distribution as Beta at a Specific Time

Description

Converts a multivariate normal distribution for Weibull parameters (or a mixtureof these distributions) into an approximate beta distribution for the survivalprobability at a specific time point. This is particularly useful for visualizingsurvival probabilities in sweet spot plots

Usage

approx_mvn_at_time(x, time)

Arguments

x

A vector of distributional objects that must be either multivariate normaldistributions or mixtures of multivariate normal distributions. For Weibull models,these represent distributions of the log(shape) and log(scale) parameters.

time

A numeric value specifying survival time at which to calculate thesurvival probability.

Details

The function performs the following steps:

For each multivariate normal distribution, it generates 10,000 samples of theWeibull parameters
Calculates the corresponding survival probabilities at the specified timeusing the Weibull survival function
Fits a beta distribution to match the mean and variance of these survivalprobabilities
For mixture distributions, it performs this approximation for each componentand creates a new mixture with the same weights

The conversion uses the relationship between Weibull parameters and survivalprobability: S(t) = exp(-(t*exp(log_scale))^exp(log_shape)).

Value

A vector of beta distributional (or mixture of beta distributional)objects approximating the survival probabilities at the specified timepoint. If the input is a mixture distribution, the output will be a mixtureof beta distributions with the same weights.

Examples

library(distributional)# Create a multivariate normal distribution for Weibull parameters# (log(shape), log(scale))mvn_dist <- dist_multivariate_normal(  mu = list(c(0, -1)),  # log(shape) = 0, log(scale) = -1  sigma = list(matrix(c(0.1, 0, 0, 0.1), nrow = 2)))# Approximate as beta distribution for survival at time t=12beta_approx <- approx_mvn_at_time(mvn_dist, time = 12)

Calculate Average Distribution from Multiple Distributional Objects

Description

Compute a single "average" distribution from a vector of distributionalobjects. This function calculates the mean of each hyperparameter across allinput distributions and returns a new distributional object of the samefamily with these averaged hyperparameters.

Usage

avg_dist(x)

Arguments

x

A vector of distributional objects of the same family (beta,normal, multivariate normal, or mixture).

Details

The function supports four distribution families:

Beta distributions: Averages the shape1 and shape2 hyperparameters
Normal distributions: Averages the mean and standard deviationhyperparameters
Multivariate normal distributions: Averages the location vectorsand covariance matrices
Mixture distributions: Same as above for each distribution type,where averaging is done by component. Also averages the mixture weight.

For multivariate normal distributions, both the location vector andcovariance matrix are averaged element-wise.

Value

A single distributional object of the same family as the input,with hyperparameters set equal to the average of all input distributionhyperparameters.

Examples

library(distributional)# Beta distributionsbeta_dists <- c(  dist_beta(shape1 = 2, shape2 = 5),  dist_beta(shape1 = 3, shape2 = 3),  dist_beta(shape1 = 4, shape2 = 2))avg_dist(beta_dists) |> parameters()# Normal distributionsnorm_dists <- c(  dist_normal(mu = 0, sigma = 1),  dist_normal(mu = 2, sigma = 2),  dist_normal(mu = 4, sigma = 3))avg_dist(norm_dists) |> parameters()# Multivariate normal distributionsmvn_dists <- c(  dist_multivariate_normal(mu = list(c(0, 0)), sigma = list(matrix(c(1, 0, 0, 1), nrow = 2))),  dist_multivariate_normal(mu = list(c(1, 1)), sigma = list(matrix(c(2, 0, 0, 2), nrow = 2))))avg_dist(mvn_dists) |> parameters()

Binary Simulation Data

Description

This is an example of output from a simulation study that investigates theoperating characteristics of inverse probability weighted Bayesian dynamicborrowing for the case with a binary outcome. This output was generatedbased on the binary simulation template. For this simulation study, only thedegree of covariate imbalance (as indicated bypopulation) and themarginal treatment effect were varied.

Usage

binary_sim_df

Format

`binary_sim_df` A data frame with 255 rows and 6 columns:

population: Populations defined by different covariate imbalances
marg_trt_eff: Marginal treatment effect
true_control_RR: True control response rate on the marginal scale
reject_H0_yes: Probability of rejecting the null hypothesis in the case with borrowing
no_borrowing_reject_H0_yes: Probability of rejecting the null hypothesis without borrowing
pwr_prior: Vector of power priors (or some other informativeprior distribution for the control marginal parameter of interestbased on the external data) as distributional objects

Bootstrap Covariate Data

Description

Bootstrap Covariate Data

Usage

bootstrap_cov(  external_dat,  n,  imbal_var = NULL,  imbal_prop = NULL,  ref_val = 0)

Arguments

external_dat

Data frame of the external data from which to bootstrapcovariate vectors

n

Number of rows in the output dataset

imbal_var

Optional variable indicating which covariate's distributionshould be altered to incorporate imbalance compared to the external data.If leftNULL, the distributions of all covariates in the output datasetwill match the distributions in the external dataset. The imbalancevariable must be binary.

imbal_prop

Optional imbalance proportion, required if an imbalancevariable is specified. This defines the proportion of individuals with thereference value of the imbalance variable in the returned dataset. Thiscan either be a single proportion or a vector of proportions, in which casea list of datasets is returned.

ref_val

Optional value corresponding to the reference level of thebinary imbalance variable, if specified

Details

Covariate data can be generated forn individuals enrolled inthe internal trial by bootstrap sampling entire covariate vectors from theexternal data, thus preserving the correlation between the covariates. Ifbothimbal_var =NULL andimbal_prop =NULL, the function returnsa single data frame in which the distributions of each covariate alignwith the covariate distributions from the external data (i.e., balancedcovariate distributions across the two trials). Alternatively, covariateimbalance can be incorporated into the generated sample with respect to abinary covariate (imbal_var) such that a specified proportion(imbal_prop) of individuals in the resulting sample will have thereference level (ref_val) of this imbalance covariate. In this case,stratified bootstrap sampling is employed with the imbalance covariate asthe stratification factor.

Multiple samples with varying degrees of imbalance can be generatedsimultaneously by definingimbal_prop to be a vector of values. Thefunction then returns a list of data frames with a length equal to thenumber of specified imbalance proportions.

Value

Data frame with the same number of columns as the external data frameand n number of rows (if the length ofimbal_prop is 0 or 1); otherwise,a list of data frames with a length equal to that ofimbal_prop

Examples

# Return one data frame with covariate distributions similar to external datasamp_balance <- bootstrap_cov(ex_binary_df, n = 1000)# Return a list of two data frames that incorporate imbalance w.r.t. covariate 2samp_imbalance <- bootstrap_cov(ex_binary_df, n = 1000, imbal_var = cov2,                                imbal_prop = c(0.25, 0.5), ref_val = 0)

Calculate Conditional Drift and Treatment Effect for Binary Outcome Models

Description

In order to properly generate binary response data for the internal trial aspart of a simulation study that investigates inverse probability weighting,we need to translate the desired marginal drift and treatment effect to thecorresponding conditional drift and treatment effect that can then be addedinto a binary outcome model (e.g., logistic regression model) used tosimulate response data.

Usage

calc_cond_binary(population, glm, marg_drift, marg_trt_eff)

Arguments

population

A very large data frame (e.g., number of rows\ge100,000) where the columns correspond to the covariates defined in thelogistic regression model object. This data frame should be constructed torepresent the population of the internal trial according to the assumedcovariate distributions (possibly imbalanced from the external data).

glm

Logistic regression model object fit using the external data

marg_drift

Vector of marginal drift values

marg_trt_eff

Vector of marginal treatment effect values

Details

In simulation studies that investigate the properties of inverseprobability weighted Bayesian dynamic borrowing, scenarios should beconsidered in which the underlying response rates for the internal andexternal control populations differ by varying amounts due to unmeasuredconfounding (i.e., drift, where positive values indicate a higher responserate for the internal population). While values of drift and treatmenteffect (i.e., risk difference) can be defined on the marginal scale forsimulation studies, we must first convert these values to the conditionalscale and then include these terms, along with covariates, in a logisticregression outcome model when generating response data for the internalarms. Doing so allows us to assume a relationship between the covariatesand the response variable while properly accounting for drift and treatmenteffect.

To identify the conditional drift and treatment effect that correspond tospecified values of marginal drift and treatment effect, we first bootstrapcovariate vectors from the external data (e.g.,N \ge 100,000) toconstruct a "population" that represents both the internal trial(possibly incorporating intentional covariate imbalance) and the externaltrialafter standardizing it to match the covariate distributionsof the internal trial (allowing us to control for measured confoundingfrom potential imbalance in the covariate distributions). Measuredconfounding can be incorporated into the data generation by bootstrappinga very large data frame (population) in which the distribution of atleast one covariate is intentionally varied from that of the external data;additionalunmeasured drift can be incorporated through thetranslation of specified marginal values (marg_drift) to conditionalvalues.

Let\Delta and\delta denote the marginal and conditional drift,respectively. For a specified value of\Delta, we can identify thecorresponding\delta as the value that, when added as an additionalterm in the logistic regression model (i.e., change in the intercept) foreach individual in the population, increases/decreases thepopulation-averaged conditional probabilities of response by an amountapproximately equal to\Delta. That is, the optimal\deltaminimizes

\left| \left( \frac{1}{N} \sum_{i=1}^N \frac{\exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right)}{1 + \exp\left(\boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right)} - \frac{1}{N} \sum_{i=1}^N \frac{\exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} \right)}{1 + \exp \left(\boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} \right)} \right) - \Delta \right|,

where\boldsymbol{\beta}_{EC} is the vector of regression coefficientsfrom the logistic regression model (glm) fit to the external control data(assumed here to be the "true" covariate effects when generating responsedata) and\boldsymbol{x}_i is a vector of covariates (including anintercept term) from the bootstrapped population of sizeN. In theformula above, the first and second terms correspond to thepopulation-averaged conditional probabilities (i.e., the marginal responserates) of the internal control population with drift and the externalcontrol population (with covariate distributions standardized to match theinternal trial), respectively.

If we now denote the marginal and conditional treatment effect by\Gamma and\gamma, respectively, we can use a similar processto identify the optimal\gamma that approximately corresponds to thespecified value of\Gamma, which is done by minimizing the following:

\left| \left( \frac{1}{N} \sum_{i=1}^N \frac{\exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta + \gamma \right)}{1 + \exp\left(\boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta + \gamma \right)} - \frac{1}{N} \sum_{i=1}^N \frac{\exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right)}{1 + \exp \left(\boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right)} \right) - \Gamma \right|,

where the first term is the average of the conditional probabilitiesof response (i.e., the marginal response rate) of the internaltreated population.

Seeherefor a simulation example with a binary outcome.

Value

tibble of all combinations of the marginal drift and treatmenteffect. For each row the conditional drift and treatment effect has beencalculated as well as the true response rates for the control and treatedpopulations.

Examples

library(dplyr)# Model "true" regression coefficients using the external datalogit_mod <- glm(y ~ cov1 + cov2 + cov3 + cov4, data = ex_binary_df, family = binomial)# Bootstrap internal control "population" with imbalance w.r.t. covariate 2pop_int_ctrl <- bootstrap_cov(ex_binary_df, n = 100000, imbal_var = cov2,                              imbal_prop = 0.25, ref_val = 0) |>                              select(-subjid, -y)  # keep only covariate columns# Convert the marginal drift and treatment effects to conditionalcalc_cond_binary(population = pop_int_ctrl, glm = logit_mod,                 marg_drift = c(-.1, 0, .1), marg_trt_eff = c(0, .15))

Calculate Conditional Drift and Treatment Effect for Time-to-Event Outcome Models

Description

In order to properly generate time-to-event (TTE) outcome data for theinternal trial as part of a simulation study that investigates inverseprobability weighting, we need to translate the desired marginal drift andtreatment effect to the corresponding conditional drift and treatment effectthat can then be added into a TTE outcome model (e.g., Weibull proportionalhazards regression model) used to simulate response data.

Usage

calc_cond_weibull(  population,  weibull_ph_mod,  marg_drift,  marg_trt_eff,  analysis_time)

Arguments

population

A very large data frame (e.g., number of rows\ge100,000) where the columns correspond to the covariates defined in thesurvreg object for the Weibull proportional hazards model. This dataframe should be constructed to represent the population of the internaltrial according to the assumed covariate distributions (possibly imbalancedfrom the external data).

weibull_ph_mod

survreg object corresponding to a Weibullproportional hazards model fit using the external data

marg_drift

Vector of marginal drift values

marg_trt_eff

Vector of marginal treatment effect values

analysis_time

A single time point when survival probabilities will becalculated

Details

In simulation studies that investigate the properties of inverseprobability weighted Bayesian dynamic borrowing, scenarios should beconsidered in which the underlying survival probabilities at someprespecified timet (analysis_time) for the internal and externalcontrol populations differ by varying amounts due to unmeasured confounding(i.e., drift, where positive values indicate a higher survival probabilityfor the internal population). While values of drift and treatment effect(i.e., difference between the survival probabilities at timet forthe treated and control populations) can be defined on the marginal scalefor simulation studies, we must first convert these values to theconditional scale and then include these terms, along with covariates, in aWeibull proportional hazards (PH) regression outcome model when generatingtime-to-event (TTE) data for the internal arms. Doing so allows us toassume a relationship between the covariates and the response variablewhile properly accounting for drift and treatment effect.

Let\Delta and\delta denote the marginal and conditional drift,respectively. For a specified value of\Delta, we can identify thecorresponding\delta as the value that, when added as an additionalterm in the Weibull PH model survival function (i.e., additive change inthe intercept) for each individual in the population, increases/decreasesthe population-averaged conditional probabilities of survival at timet by an amount approximately equal to\Delta. That is, theoptimal\delta minimizes

\left| \left( \frac{1}{N} \sum_{i=1}^N \exp \left( -\left\{ \exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right) \times t \right\}^{\alpha_{EC}} \right) - \frac{1}{N} \sum_{i=1}^N \exp \left( -\left\{ \exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} \right) \times t \right\}^{\alpha_{EC}} \right) \right) - \Delta \right|,

where\alpha_{EC} is the Weibull shape parameter,\boldsymbol{\beta}_{EC} is a vector of regression coefficients, and\boldsymbol{x}_i is a vector of covariates (including an interceptterm) from the bootstrapped population of sizeN. We note that\alpha_{EC} = 1/\sigma_{EC} and\boldsymbol{\beta}_{EC} = -\boldsymbol{\xi}_{EC} are calculated as functions of the scale parameter(\sigma_{EC}) and coefficients (\boldsymbol{\xi}_{EC})estimated by thesurvreg object that was fit to the external data, and weassume here that these estimates are the "true" shape and covariate effectswhen generating response data. In the formula above, the first and secondterms correspond to the population-averaged conditional survival functions(i.e., the marginal survival probabilities) at timet for theinternal control population with drift and the external control population(with covariate distributions standardized to match the internal trial),respectively.

\left| \left( \frac{1}{N} \sum_{i=1}^N \exp \left( -\left\{ \exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta + \gamma \right) \times t \right\}^{\alpha_{EC}} \right) - \frac{1}{N} \sum_{i=1}^N \exp \left( -\left\{ \exp \left( \boldsymbol{x}_i^\prime \boldsymbol{\beta}_{EC} + \delta \right) \times t \right\}^{\alpha_{EC}} \right) \right) - \Gamma \right|,

where the first term is the average of the conditional survival functions(i.e., the marginal survival probabilities) at timet for theinternal treated population.

Seeherefor a simulation example with a time-to-event outcome.

Value

tibble of all combinations of the marginal drift and treatmenteffect. For each row the conditional drift and treatment effect has beencalculated as well as the true marginal survival probabilities at timetfor the control and treatment populations.

Examples

library(dplyr)library(survival)# Model "true" regression coefficients using the external dataweibull_ph_mod <- survreg(Surv(y, event) ~ cov1 + cov2 + cov3 + cov4, data = ex_tte_df,                          dist = "weibull")# Bootstrap internal control "population" with imbalance w.r.t. covariate 2pop_int_ctrl <- bootstrap_cov(ex_tte_df, n = 100000, imbal_var = cov2,                              imbal_prop = 0.25, ref_val = 0) |>  select(c(cov1, cov2, cov3, cov4))     # keep only covariate columns# Convert the marginal drift and treatment effects to conditionalcalc_cond_weibull(population = pop_int_ctrl, weibull_ph_mod,                  marg_drift = c(-.1, 0, .1), marg_trt_eff = c(0, .10),                  analysis_time = 12)

Calculate Posterior Beta

Description

Calculate a posterior distribution that is beta (or a mixtureof beta components). Only the relevant treatment arms from the internaldataset should be read in (e.g., only the control arm if constructing aposterior distribution for the control response rate).

Usage

calc_post_beta(internal_data, response, prior)

Arguments

internal_data

A tibble of the internal data.

response

Name of response variable

prior

A distributional object corresponding to a beta distributionor a mixture distribution of beta components

Details

For a given arm of an internal trial (e.g., the control arm or anactive treatment arm) of sizeN_I, suppose the response data are binarysuch thatY_i \sim \mbox{Bernoulli}(\theta),i=1,\ldots,N_I. Theposterior distribution for\theta is written as

\pi( \theta \mid \boldsymbol{y} ) \propto \mathcal{L}(\theta \mid \boldsymbol{y}) \; \pi(\theta),

where\mathcal{L}(\theta \mid \boldsymbol{y}) is the likelihood of theresponse data from the internal arm and\pi(\theta) is a priordistribution on\theta (either a beta distribution or a mixturedistribution with an arbitrary number of beta components). The posteriordistribution for\theta is either a beta distribution or a mixture ofbeta components depending on whether the prior is a single betadistribution or a mixture distribution.

Value

distributional object

Examples

library(dplyr)library(distributional)calc_post_beta(internal_data = filter(int_binary_df, trt == 1),               response = y,               prior = dist_beta(0.5, 0.5))

Calculate Posterior Normal

Description

Calculate a posterior distribution that is normal (or a mixtureof normal components). Only the relevant treatment arms from the internaldataset should be read in (e.g., only the control arm if constructing aposterior distribution for the control mean).

Usage

calc_post_norm(internal_data, response, prior, internal_sd = NULL)

Arguments

internal_data

A tibble of the internal data.

response

Name of response variable

prior

A distributional object corresponding to a normal distribution,a t distribution, or a mixture distribution of normal and/or t components

internal_sd

Standard deviation of internal response data ifassumed known. It can be left asNULL if assumed unknown

Details

For a given arm of an internal trial (e.g., the control arm or anactive treatment arm) of sizeN_I, suppose the response data are normallydistributed such thatY_i \sim N(\theta, \sigma_I^2),i=1,\ldots,N_I.If\sigma_I^2 is assumed known, the posterior distribution for\thetais written as

\pi( \theta \mid \boldsymbol{y}, \sigma_{I}^2 ) \propto \mathcal{L}(\theta \mid \boldsymbol{y}, \sigma_{I}^2) \; \pi(\theta),

where\mathcal{L}(\theta \mid \boldsymbol{y}, \sigma_{I}^2) is thelikelihood of the response data from the internal arm and\pi(\theta)is a prior distribution on\theta (either a normal distribution, at distribution, or a mixture distribution with an arbitrary number ofnormal and/ort components). Anyt components of the prior for\theta are approximated with a mixture of two normal distributions.

If\sigma_I^2 is unknown, the marginal posterior distribution for\theta is instead written as

\pi( \theta \mid \boldsymbol{y} ) \propto \left\{ \int_0^\infty \mathcal{L}(\theta, \sigma_{I}^2 \mid \boldsymbol{y}) \; \pi(\sigma_{I}^2) \; d\sigma_{I}^2 \right\} \times \pi(\theta).

In this case, the prior for\sigma_I^2 is chosen to be\pi(\sigma_{I}^2) = (\sigma_I^2)^{-1} such that\left\{ \int_0^\infty \mathcal{L}(\theta, \sigma_{I}^2 \mid \boldsymbol{y}) \; \pi(\sigma_{I}^2) \; d\sigma_{I}^2 \right\}becomes a non-standardizedt distribution. This integrated likelihoodis then approximated with a mixture of two normal distributions.

Ifinternal_sd is supplied a positive value andprior corresponds to asingle normal distribution, then the posterior distribution for\thetais a normal distribution. Ifinternal_sd = NULL or if other types of priordistributions are specified (e.g., mixture or t distribution), then theposterior distribution is a mixture of normal distributions.

Value

distributional object

Examples

library(distributional)library(dplyr)post_treated <- calc_post_norm(internal_data = filter(int_norm_df, trt == 1),                               response = y,                               prior = dist_normal(0.5, 10),                               internal_sd = 0.15)

Calculate Posterior Weibull

Description

Calculate a posterior distribution for the probability ofsurviving past a given analysis time(s) for time-to-event data with aWeibull likelihood. Only the relevant treatment arms from the internaldataset should be read in (e.g., only the control arm if constructing aposterior distribution for the control survival probability).

Usage

calc_post_weibull(internal_data, response, event, prior, analysis_time, ...)

Arguments

internal_data

This can either be a propensity score object or a tibbleof the internal data.

response

Name of response variable

event

Name of event indicator variable (1: event; 0: censored)

prior

A distributional object corresponding to a multivariate normaldistribution or a mixture of 2 multivariate normals. The first elementof the mean vector and the first row/column of covariance matrix correspondto the log-shape parameter, and the second element corresponds to the intercept(i.e., log-inverse-scale) parameter.

analysis_time

Vector of time(s) when survival probabilities will becalculated

...

rstan sampling option. This overrides any default options. For moreinformation, seerstan::sampling()

Details

For a given arm of an internal trial (e.g., the control arm or anactive treatment arm) of sizeN_I, suppose the response data are time to eventsuch thatY_i \sim \mbox{Weibull}(\alpha, \sigma), where

f(y_i \mid \alpha, \sigma) = \left( \frac{\alpha}{\sigma} \right) \left( \frac{y_i}{\sigma} \right)^{\alpha - 1} \exp \left( -\left( \frac{y_i}{\sigma} \right)^\alpha \right),

i=1,\ldots,N_I. Define\boldsymbol{\theta} = \{\log(\alpha), \beta\}where\beta = -\log(\sigma) is the intercept parameter of a Weibullproportional hazards regression model. The posterior distribution for\boldsymbol{\theta} is written as

\pi( \boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\nu} ) \propto \mathcal{L}(\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\nu}) \; \pi(\boldsymbol{\theta}),

where\mathcal{L}(\boldsymbol{\theta} \mid \boldsymbol{y}, \boldsymbol{\nu}) = \prod_{i=1}^{N_I} f(y_i \mid \boldsymbol{\theta})^{\nu_i} S(y_i \mid \boldsymbol{\theta})^{1 - \nu_i}is the likelihood of the response data from the internal arm with event indicator\boldsymbol{\nu} and survival functionS(y_i \mid \boldsymbol{\theta}) = 1 - F(y_i \mid \boldsymbol{\theta}), and\pi(\boldsymbol{\theta}) is a priordistribution on\boldsymbol{\theta} (either a multivariate normal distribution or amixture of two multivariate normal distributions). Note that the posterior distributionfor\boldsymbol{\theta} does not have a closed form, and thus MCMC samplesfor\log(\alpha) and\beta are drawn from the posterior. These MCMCsamples are used to construct samples from the posterior distributionfor the probability of surviving past the specified analysis time(s) for thegiven arm.

To construct a posterior distribution for the treatment difference (i.e., thedifference in survival probabilities at the specified analysis time), obtain MCMCsamples from the posterior distributions for the survival probabilities undereach arm and then subtract the control-arm samples from the treatment-armsamples.

Value

stan posterior object

Examples

library(distributional)library(dplyr)library(rstan)mvn_prior <- dist_multivariate_normal(   mu = list(c(0.3, -2.6)),   sigma = list(matrix(c(1.5, 0.3, 0.3, 1.1), nrow = 2)))post_treated <- calc_post_weibull(filter(int_tte_df, trt == 1),                                  response = y,                                  event = event,                                  prior = mvn_prior,                                  analysis_time = 12,                                  warmup = 5000,                                  iter = 15000)# Extract MCMC samples of survival probabilities at time t=12surv_prob_treated <- as.data.frame(extract(post_treated,                                   pars = c("survProb")))[,1]

Calculate Power Prior Beta

Description

Calculate a (potentially inverse probability weighted) betapower prior for the control response rate using external control data.

Usage

calc_power_prior_beta(external_data, response, prior)

Arguments

external_data

This can either be aprop_scr_obj created by callingcreate_prop_scr() or a tibble of the external data. If it is just atibble the weights will be assumed to be 1.

response

Name of response variable

prior

A beta distributional object that is the initial prior for thecontrol response rate before the external control data are observed

Details

Weighted participant-level response data from the control arm of anexternal study are incorporated into an inverse probability weighted (IPW)power prior for the control response rate\theta_C. When borrowinginformation from an external control arm of sizeN_{EC}, the componentsof the IPW power prior for\theta_C are defined as follows:

Initial prior:: \theta_C \sim \mbox{Beta}(\nu_0, \phi_0)
IPW likelihood of the external response data\boldsymbol{y}_E withweights\hat{\boldsymbol{a}}_0:: \mathcal{L}_E(\theta_C \mid \boldsymbol{y}_E, \hat{\boldsymbol{a}}_0) \propto \exp \left( \sum_{i=1}^{N_{EC}} \hat{a}_{0i} \left[ y_i \log(\theta_C) + (1 - y_i) \log(1 - \theta_C) \right] \right)
IPW power prior:: \theta_C \mid \boldsymbol{y}_E, \hat{\boldsymbol{a}}_0 \sim \mbox{Beta} \left( \sum_{i=1}^{N_{EC}} \hat{a}_{0i} y_i + \nu_0, \sum_{i=1}^{N_{EC}} \hat{a}_{0i} (1 - y_i) + \phi_0 \right)

Defining the weights\hat{\boldsymbol{a}}_0 to equal 1 results in aconventional beta power prior.

Value

Beta power prior object

Examples

library(distributional)library(dplyr)# This function can be used directly on the datacalc_power_prior_beta(external_data = ex_binary_df,                      response = y,                      prior = dist_beta(0.5, 0.5))# Or this function can be used with a propensity score objectps_obj <- calc_prop_scr(internal_df = filter(int_binary_df, trt == 0),                        external_df = ex_binary_df,                        id_col = subjid,                        model = ~ cov1 + cov2 + cov3 + cov4)calc_power_prior_beta(ps_obj,                      response = y,                      prior = dist_beta(0.5, 0.5))

Calculate Power Prior Normal

Description

Calculate a (potentially inverse probability weighted) normalpower prior using external data.

Usage

calc_power_prior_norm(  external_data,  response,  prior = NULL,  external_sd = NULL)

Arguments

external_data

This can either be aprop_scr_obj created by callingcreate_prop_scr() or a tibble of the external data. If it is just atibble the weights will be assumed to be 1. Only the external data for thearm(s) of interest should be included in this object (e.g., externalcontrol data if creating a power prior for the control mean)

response

Name of response variable

prior

EitherNULL or a normal distributional object that is theinitial prior for the parameter of interest (e.g., control mean) before theexternal data are observed

external_sd

Standard deviation of external response data if assumedknown. It can be left asNULL if assumed unknown

Details

Weighted participant-level response data from an external study areincorporated into an inverse probability weighted (IPW) power prior for theparameter of interest\theta (e.g., the control mean if borrowingfrom an external control arm). When borrowing information from an externaldataset of sizeN_{E}, the IPW likelihood of the external responsedata\boldsymbol{y}_E with weights\hat{\boldsymbol{a}}_0 is defined as

\mathcal{L}_E(\theta \mid \boldsymbol{y}_E, \hat{\boldsymbol{a}}_0, \sigma_{E}^2) \propto \exp \left( -\frac{1}{2 \sigma_{E}^2} \sum_{i=1}^{N_{E}} \hat{a}_{0i} (y_i - \theta)^2 \right).

Theprior argument should be either a distributional object with a familytype ofnormal orNULL, corresponding to the use of a normal initialprior or an improper uniform initial prior (i.e.,\pi(\theta) \propto 1), respectively.

Theexternal_sd argument can be a positive value if the external standarddeviation is assumed known or left asNULL otherwise. Ifexternal_sd = NULL, thenprior must beNULL to indicate the use of an improperuniform initial prior for\theta, and an improper prior is definedfor the unknown external standard deviation such that\pi(\sigma_E^2) \propto (\sigma_E^2)^{-1}. The details of the IPW power prior for eachcase are as follows:

⁠external_sd = positive value⁠ (\sigma_E^2 known):: Witheither a proper normal or an improper uniform initial prior, the IPWweighted power prior for\theta is a normal distribution.
external_sd = NULL (\sigma_E^2 unknown):: With improperpriors for both\theta and\sigma_E^2, the marginal IPW weightedpower prior for\theta after integrating over\sigma_E^2 isa non-standardizedt distribution.

Defining the weights\hat{\boldsymbol{a}}_0 to equal 1 results in aconventional normal (ort) power prior if the external standarddeviation is known (unknown).

Value

Normal power prior object

Examples

library(distributional)library(dplyr)# This function can be used directly on the datacalc_power_prior_norm(ex_norm_df,                      response = y,                      prior = dist_normal(0.5, 10),                      external_sd = 0.15)# Or this function can be used with a propensity score objectps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0),                        external_df = ex_norm_df,                        id_col = subjid,                        model = ~ cov1 + cov2 + cov3 + cov4)calc_power_prior_norm(ps_obj,                      response = y,                      prior = dist_normal(0.5, 10),                      external_sd = 0.15)

Calculate Power Prior Weibull

Description

Calculate an approximate (potentially inverse probability weighted)multivariate normal power prior for the log-shape and log-inverse-scale parametersof a Weibull likelihood for external time-to-event control data.

Usage

calc_power_prior_weibull(  external_data,  response,  event,  intercept,  shape,  approximation = c("Laplace", "MCMC"),  ...)

Arguments

external_data

response

Name of response variable

event

Name of event indicator variable (1: event; 0: censored)

intercept

Normal distributional object that is the initial prior for theintercept (i.e., log-inverse-scale) parameter

shape

Integer value that is the scale of the half-normal priorfor the shape parameter

approximation

Type of approximation to be used. EitherLaplace orMCMC.Laplace is used by default because it is faster thanMCMC.

...

Arguments passed torstan::sampling (e.g. iter, chains).

Details

Weighted participant-level response data from the control arm of anexternal study are incorporated into an approximated inverse probabilityweighted (IPW) power prior for the parameter vector\boldsymbol{\theta}_C = \{\log(\alpha), \beta\}, where\beta = -\log(\sigma)is the intercept parameter of a Weibull proportional hazards regression modeland\alpha and\sigma are the Weibull shape and scale parameters,respectively. When borrowing information from an external dataset of sizeN_{E},the IPW likelihood of the external response data\boldsymbol{y}_E withevent indicators\boldsymbol{\nu}_E and weights\hat{\boldsymbol{a}}_0is defined as

\mathcal{L}_E(\alpha, \sigma \mid \boldsymbol{y}_E, \boldsymbol{\nu}_E, \hat{\boldsymbol{a}}_0) \propto \prod_{i=1}^{N_E} \left\{ \left( \frac{\alpha}{\sigma} \right) \left( \frac{y_i}{\sigma} \right)^{\alpha - 1} \exp \left( -\left( \frac{y_i}{\sigma} \right)^\alpha \right) \right\}^{\hat{a}_{0i} \nu_i} \left\{ \exp \left( -\left( \frac{y_i}{\sigma} \right)^\alpha \right) \right\}^{\hat{a}_{0i}(1 - \nu_i)}.

The initial priors for the intercept parameter\beta and the shape parameter\alpha are assumed to be normal and half-normal, respectively, which canbe defined using theintercept andshape arguments.

The power prior for\boldsymbol{\theta}_C does not have a closed form, andthus we approximate it via a bivariate normal distribution; i.e.,

\boldsymbol{\theta}_C \mid \boldsymbol{y}_E, \boldsymbol{\nu}_E, \hat{\boldsymbol{a}}_0 \; \dot\sim \; \mbox{MVN} \left( \tilde{\boldsymbol{\mu}}_0, \tilde{\boldsymbol{\Sigma}}_0 \right).

Ifapproximation = Laplace, then\tilde{\boldsymbol{\mu}}_0 is the mode vectorof the IPW power prior and\tilde{\boldsymbol{\Sigma}}_0 is the negativeinverse of the Hessian of the log IPW power prior evaluated at the mode. Ifapproximation = MCMC, then MCMC samples are obtained from the IPW power prior,and\tilde{\boldsymbol{\mu}}_0 and\tilde{\boldsymbol{\Sigma}}_0are the estimated mean vector and covariance matrix of these MCMC samples.Note that the Laplace approximation method is faster due to its use ofoptimization instead of MCMC sampling.

The first element of the mean vector and the first row/column of covariancematrix correspond to the log-shape parameter (\log(\alpha)), and thesecond element corresponds to the intercept (\beta, the log-inverse-scale)parameter.

Value

Multivariate Normal Distributional Object

Examples

library(distributional)library(dplyr)# This function can be used directly on the datacalc_power_prior_weibull(ex_tte_df,                         response = y,                         event = event,                         intercept = dist_normal(0, 10),                         shape = 50,                         approximation = "Laplace")# Or this function can be used with a propensity score objectps_obj <- calc_prop_scr(internal_df = filter(int_tte_df, trt == 0),                        external_df = ex_tte_df,                        id_col = subjid,                        model = ~ cov1 + cov2 + cov3 + cov4)calc_power_prior_weibull(ps_obj,                         response = y,                         event = event,                         intercept = dist_normal(0, 10),                         shape = 50,                         approximation = "Laplace")

Create a Propensity Score Object

Description

Calculate the propensity scores and ATT inverse probabilityweights for participants from internal and external datasets. Only therelevant treatment arms from each dataset should be read in (e.g., onlythe control arm from each dataset if creating a hybrid control arm).

Usage

calc_prop_scr(internal_df, external_df, id_col, model, ...)

Arguments

internal_df

Internal dataset with one row per subject and all thevariables needed to run the model

external_df

External dataset with one row per subject and all thevariables needed to run the model

id_col

Name of the column in both datasets used to identify eachsubject. It must be the same across datasets

model

Model used to calculate propensity scores

...

Optional arguments

Details

For the subset of participants in both the external and internalstudies for which we want to balance the covariate distributions (e.g.,external control and internal control participants if constructing ahybrid control arm), we define a study-inclusion propensity score foreach participant as

e(x_i) = P(S_i = 1 \mid x_i),

wherex_i denotes a vector of baseline covariates for theithparticipant andS_i denotes the indicator that the participant isenrolled in the internal trial (S_i = 1 if internal,S_i = 0if external). The estimated propensity score\hat{e}(x_i) is obtainedusing logistic regression.

An ATT inverse probability weight is calculated for each individual as

\hat{a}_{0i} = \frac{\hat{e}(x_i)}{\hat{P}(S_i = s_i | x_i)} = s_i + (1 - s_i ) \frac{\hat{e}(x_i)}{1 - \hat{e}(x_i)}.

In a weighted estimator, data from participants in the external studyare given a weight of\hat{e}(x_i)⁄(1 - \hat{e}(x_i)) whereas datafrom participants in the internal trial are given a weight of 1.

Value

prop_scr_obj object, with the internal and the external data andthe propensity score and inverse probability weight calculated for eachsubject.

Examples

# This can be used for both continuous and binary datalibrary(dplyr)# Continuouscalc_prop_scr(internal_df = filter(int_norm_df, trt == 0),                       external_df = ex_norm_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)# Binarycalc_prop_scr(internal_df = filter(int_binary_df, trt == 0),                       external_df = ex_binary_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)

Calculate the Analysis Time Based on a Target Number of Events and/or TargetFollow-up Time

Description

Calculate the Analysis Time Based on a Target Number of Events and/or TargetFollow-up Time

Usage

calc_study_duration(  study_time,  observed_time,  event_indicator,  target_events = NULL,  target_follow_up = NULL)

Arguments

study_time

Vector of study times (accrual time + observed time)

observed_time

Vector of observed times (event time or censoring time)

event_indicator

Vector of boolean values (TRUE/FALSE or 1/0)indicating if the observed time value is an event or censoring time

target_events

Target number of events, where the analysis time isdetermined once this number of events is reached. Default isNULL, inwhich casetarget_follow_up must be specified.

target_follow_up

Target follow-up for each participant, where theanalysis time is determined once each participant in the risk set isfollowed up for this amount of time (i.e., minimum follow-up time). DefaultisNULL, in which casetarget_events must be specified.

Details

This function calculates the analysis time for a study with atime-to-event endpoint for which the target number of events(target_events) and/or target follow-up time (target_follow_up) arespecified. If onlytarget_events is specified, the analysis will occurat the time when the target number of events has been reached. If onlytarget_follow_up is specified, the analysis will occur once thelast-enrolled participant who is still in the risk set has been followed upfor this amount of time. If bothtarget_events andtarget_follow_up arespecified, the analysis time will be based on whichever occurs first.

Value

Time of analysis

Examples

library(dplyr)# Determining analysis time by reaching a target number of eventsex_tte_df |> mutate(  analysis_time = calc_study_duration(study_time = total_time, observed_time = y,                                      event_indicator = event, target_events = 30))# Determining analysis time by a target follow-up timeex_tte_df |> mutate(  analysis_time = calc_study_duration(study_time = total_time, observed_time = y,                                      event_indicator = event, target_follow_up = 12))# Or use both (whichever happens first)ex_tte_df |> mutate(  analysis_time = calc_study_duration(study_time = total_time, observed_time = y,                                      event_indicator = event,                                      target_events = 30, target_follow_up = 12))

External Binary Control Data for Propensity Score Balancing

Description

This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a binary endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.

Usage

ex_binary_df

Format

`ex_binary_df`

A data frame with 150 rows and 6 columns:

subjid: Unique subject ID
cov1: Covariate 1, which is normally distributed around 65 with a SD of 10
cov2: Covariate 2, which is binary (0 vs. 1) with about 30% of participants having level 1
cov3: Covariate 3, which is binary (0 vs. 1) with about 40% of participants having level 1
cov4: Covariate 4, which is binary (0 vs. 1) with about 50% of participants having level 1
y: Response, which is binary (0 vs. 1)

External Normal Control Data for Propensity Score Balancing

Description

This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a normal endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.

Usage

ex_norm_df

Format

`ex_norm_df`

A data frame with 150 rows and 6 columns:

subjid: Unique subject ID
cov1: Covariate 1, which is normally distributed around 50 with a SD of 10
cov2: Covariate 2, which is binary (0 vs. 1) with about 20% of participants having level 1
cov3: Covariate 3, which is binary (0 vs. 1) with about 60% of participants having level 1
cov4: Covariate 4, which is binary (0 vs. 1) with about 30% of participants having level 1
y: Response, which is normally distributed with a SD of 0.15

External Time-to-Event Control Data for Propensity Score Balancing

Description

This is a simulated dataset used to illustrate Bayesian dynamic borrowing inthe case when borrowing from an external control arm with a time-to-event endpoint,where the baseline covariate distributions of the internal and external dataare balanced via inverse probability weighting.

Usage

ex_tte_df

Format

`ex_tte_df`

A data frame with 150 rows and 9 columns:

subjid: Unique subject ID
y: Response (observed time at which the participant either had an event or was censored)
enr_time: Enrollment time
total_time: Time from study start
event: Event indicator (1: event; 0: censored)
cov1: Covariate 1, which is normally distributed around 65 with a SD of 10
cov2: Covariate 2, which is binary (0 vs. 1) with about 30% of participants having level 1
cov3: Covariate 3, which is binary (0 vs. 1) with about 40% of participants having level 1
cov4: Covariate 4, which is binary (0 vs. 1) with about 50% of participants having level 1

Internal Binary Data for Propensity Score Balancing

Description

Usage

int_binary_df

Format

`int_binary_df`

A data frame with 160 rows and 7 columns:

subjid: Unique subject ID
cov1: Covariate 1, which is normally distributed around 62 with an sd of 8
cov2: Covariate 2, which is binary (0 vs. 1) with about 40% of participants having level 1
cov3: Covariate 3, which is binary (0 vs. 1) with about 40% of participants having level 1
cov4: Covariate 4, which is binary (0 vs. 1) with about 60% of participants having level 1
trt: Treatment indicator, where 0 = control and 1 = active treatment
y: Response, which is binary (0 vs. 1)

Internal Normal Data for Propensity Score Balancing

Description

Usage

int_norm_df

Format

`int_norm_df`

A data frame with 120 rows and 7 columns:

subjid: Unique subject ID
cov1: Covariate 1, which is normally distributed around 55 with a SD of 8
cov2: Covariate 2, which is binary (0 vs. 1) with about 30% of participants having level 1
cov3: Covariate 3, which is binary (0 vs. 1) with about 50% of participants having level 1
cov4: Covariate 4, which is binary (0 vs. 1) with about 30% of participants having level 1
trt: Treatment indicator, where 0 = control and 1 = active treatment
y: Response, which is normally distributed with a SD of 0.15

Internal Time-to-Event Control Data for Propensity Score Balancing

Description

Usage

int_tte_df

Format

`int_tte_df`

A data frame with 160 rows and 10 columns:

subjid: Unique subject ID
y: Response (observed time at which the participant either had an event or was censored)
enr_time: Enrollment time
total_time: Time from study start
event: Event indicator (1: event; 0: censored)
trt: Treatment indicator, where 0 = control and 1 = active treatment
cov1: Covariate 1, which is normally distributed around 62 with a SD of 8
cov2: Covariate 2, which is binary (0 vs. 1) with about 40% of participants having level 1
cov3: Covariate 3, which is binary (0 vs. 1) with about 40% of participants having level 1
cov4: Covariate 4, which is binary (0 vs. 1) with about 60% of participants having level 1

Inverse Logit Function

Description

Inverse Logit Function

Usage

inv_logit(x)

Arguments

x

Real number(s) to take the inverse logit of

Details

This function is a short hand to\exp(x)/(1 + \exp(x)).

Value

Vector of probabilities between 0 and 1

Examples

inv_logit(0.5)

Test If Propensity Score Object

Description

Test If Propensity Score Object

Usage

is_prop_scr(x)

Arguments

x

Object to test

Value

Boolean

Examples

library(dplyr)x <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0),                       external_df = ex_norm_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)is_prop_scr(x)

Extract Means of Mixture Components

Description

Extract Means of Mixture Components

Usage

mix_means(x)

Arguments

x

A mixture distributional object

Details

If a distributional object that is a mixture of two or more normaldistributions is read in, the function will return a numeric object withthe means of each normal component. If the distributional object is amixture of two or more multivariate normal distributions, the functionwill return a list with the mean vectors of each multivariate normalcomponent.

Value

numeric or list object

Examples

library(distributional)mix_norm <- dist_mixture(comp1 = dist_normal(1, 10),                         comp2 = dist_normal(1.5, 12),                         weights = c(.5, .5))mix_means(mix_norm)

Extract Standard Deviations of Mixture Components

Description

Extract Standard Deviations of Mixture Components

Usage

mix_sigmas(x)

Arguments

x

A mixture distributional object

Details

If a distributional object that is a mixture of two or more normaldistributions is read in, the function will return a numeric object withthe standard deviations of each normal component. If the distributionalobject is a mixture of two or more multivariate normal distributions, thefunction will return a list with the covariance matrices of each multivariatenormal component.

Value

numeric or list object

Examples

library(distributional)mix_norm <- dist_mixture(comp1 = dist_normal(1, 10),                         comp2 = dist_normal(1.5, 12),                         weights = c(.5, .5))mix_sigmas(mix_norm)

Plot Distribution

Description

Plot Distribution

Usage

plot_dist(...)

Arguments

...

Distributional object(s) to plot. When passing multiple objectsnaming them will change the labels in the plot, else they will use thedistributional format

Value

ggplot object that is the density of the provided distribution

Examples

library(distributional)plot_dist(dist_normal(0, 1))plot_dist(dist_multivariate_normal(mu = list(c(1, 2)), sigma = list(matrix(c(4, 2, 2, 3), ncol=2))))#Plotting Multipleplot_dist(dist_normal(0, 1), dist_normal(10, 5))plot_dist('Prior' = dist_normal(0, 1), 'Posterior' = dist_normal(10, 5))

Propensity Score Cloud Plot

Description

Propensity Score Cloud Plot

Usage

prop_scr_cloud(x, trimmed_prop_scr = NULL)

Arguments

x

Aprop_scr object

trimmed_prop_scr

A trimmedprop_scr object

Value

ggplot object

Examples

library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0),                        external_df = ex_norm_df,                        id_col = subjid,                        model = ~ cov1 + cov2 + cov3 + cov4)ps_obj_trimmed <- trim_ps(ps_obj, low = 0.1, high = 0.6)# Plotting the Propensity Scoresprop_scr_cloud(ps_obj, trimmed_prop_scr = ps_obj_trimmed)

Density of the Propensity Score Object

Description

Plot overlapping density curves of the propensity scores forboth the internal and external participants, or plot external IPWs.

Usage

prop_scr_dens(  x,  variable = c("propensity score", "ps", "inverse probability weight", "ipw"),  ...)

Arguments

x

Propensity score object

variable

Variable to plot. It must be either a propensity score("ps" or "propensity score") or inverse probability weight ("ipw" or"inverse probability weight")

...

Optional arguments forgeom_density

Value

ggplot object

Examples

library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0),                       external_df = ex_norm_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)# Plotting the Propensity Scoresprop_scr_dens(ps_obj)# Or plotting the inverse probability weightsprop_scr_dens(ps_obj, variable = "ipw")

Histogram of the Propensity Score Object

Description

Plot overlapping histograms of the propensity scores for boththe internal and external participants, or plot external IPWs.

Usage

prop_scr_hist(  x,  variable = c("propensity score", "ps", "inverse probability weight", "ipw"),  ...)

Arguments

x

Propensity score object

variable

Variable to plot. It must be either a propensity score("ps" or "propensity score") or inverse probability weight ("ipw" or"inverse probability weight")

...

Optional arguments forgeom_histogram

Value

ggplot object

Examples

library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0),                       external_df = ex_norm_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)# Plotting the Propensity Scoresprop_scr_hist(ps_obj)# Or plotting the inverse probability weightsprop_scr_hist(ps_obj, variable = "ipw")

Love Plot of the Absolute Standardized Mean Differences

Description

Plot the unadjusted and IPW-adjusted absolute standardized meandifferences for each covariate.

Usage

prop_scr_love(x, reference_line = NULL, ...)

Arguments

x

Propensity score object

reference_line

Numeric value of where along the x-axis the verticalreference line should be placed

...

Optional options forgeom_point

Value

ggplot object

Examples

library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_norm_df, trt == 0),                       external_df = ex_norm_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)# Plotting the Propensity Scoresprop_scr_love(ps_obj, reference_line = 0.1)

Objects exported from other packages

Description

These objects are imported from other packages. Follow the linksbelow to see their documentation.

generics: glance,tidy

Rescale a`prop_scr` object

Description

Rescale aprop_scr object

Usage

rescale_ps(x, n = NULL, scale_factor = NULL)

Arguments

x

aprop_scr obj

n

Desired sample size that the external data should effectivelycontribute to the analysis of the internal trial data. This will be used toscale the external weights ifscale_factor is not specified

scale_factor

Value to multiple all weights by. This will be used toscale the external weights ifn is not specified

Value

aprop_scr object with rescaled weights

Examples

library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_binary_df, trt == 0),                       external_df = ex_binary_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)# weights in a propensity score object can be rescaled to achieve a desired# effective sample size (i.e., sum of weights)rescale_ps(ps_obj, n = 75)# Or by a predetermined factorrescale_ps(ps_obj, scale_factor = 1.5)

Robustify Multivariate Normal Distributions

Description

Adds vague normal component, where the level of vagueness is controlled bythen parameter

Usage

robustify_mvnorm(prior, n, weights = c(0.5, 0.5))

Arguments

prior

Multivariate Normal distributional object

n

Number of theoretical participants (or events, for time-to-event data)

weights

Vector of weights, where the first number corresponds to theinformative component and the second is the vague

Details

In cases with a time-to-event endpoint, a robust mixture prior can becreated by adding a vague multivariate normal component to any multivariatenormal prior with mean vector\boldsymbol{\mu} and covariance matrix\boldsymbol{\Sigma}. The vague component is calculated to have thesame mean vector\boldsymbol{\mu} and covariance matrix equal to\boldsymbol{\Sigma} \times n, wheren is the specified number oftheoretical events.

Value

mixture distribution

Examples

library(distributional)robustify_mvnorm(      dist_multivariate_normal(mu = list(c(1, 0)), sigma = list(c(10, 5))),       n = 15)

Robustify Normal Distributions

Description

Adds vague normal component, where the level of vagueness is controlled bythen parameter

Usage

robustify_norm(prior, n, weights = c(0.5, 0.5))

Arguments

prior

Normal or Multivariate Normal distributional object

n

Number of theoretical participants (or events, for time-to-event data)

weights

Vector of weights, where the first number corresponds to theinformative component and the second is the vague

Details

In cases with a normal endpoint, a robust mixture prior can be created byadding a vague normal component to any normal prior with mean\thetaand variance\sigma^2.The vague component is calculated to have thesame mean\theta and variance equal to\sigma^2 \times n, wheren is the specified number of theoretical participants. If robustifying a normalpower prior that was calculated from external control data andn is defined asthe number of external control participants, and the vague component wouldthen correspond to one external control participant's worth of data.

Value

mixture distribution

Examples

library(distributional)robustify_norm(dist_normal(0,1), n = 15)

Simulate Participant Accrual Times

Description

Simulate Participant Accrual Times

Usage

sim_accrual(n, accrual_periods, accrual_props)

Arguments

n

Number of participants

accrual_periods

Vector of right endpoints defining the time periods ofaccrual, e.g., c(6,8) defines 0<=x<6, 6<=x<8.

accrual_props

Vector indicating the proportion of participants that areexpected to be enrolled during each of the defined accrual periods. Shouldsum to 1, otherwise these proportions will be normalized.

Details

Simulate the accrual times for each participant, whereaccrual_periods defines the right time points for each accrual period(with the last element corresponding to the end of accrual), andaccrual_props defines the proportion of study participants who areenrolled during each of these periods. The simulated accrual times forparticipants within a given accrual period are assumed to be uniformlydistributed.

Value

Vector of accrual times corresponding to when each participant entersthe study

Examples

at <- sim_accrual(n = 100000, accrual_periods = c(6, 8), accrual_props = c(.5, .5))hist(at, breaks = 100, main = "Histogram of Enrollment Times", xlab = "Enrollment Time")

Simulate Event Times for Each Individual from a Piecewise Constant Hazard Model

Description

Simulate Event Times for Each Individual from a Piecewise Constant Hazard Model

Usage

sim_pw_const_haz(n, hazard_periods = NULL, hazard_values)

Arguments

n

Number of individuals

hazard_periods

Vector of break points between time periods with separate constanthazards, e.g., c(6,8) defines [0,6), [6,8), [8, infinity). Leave as NULL ifdefining only one hazard period.

hazard_values

Vector of constant hazard values associated with the time intervals

Details

Simulate the event or censoring times for each participant using apiecewise constant hazard model where each time period is defined to havea different constant hazard. Here,hazard_periods defines the righttime points for each time period (with the exception of the last timeperiod which extends to infinity), andhazard_values defines the constanthazards for each time period.

Value

Vector of simulated times from the time-to-event distribution

Examples

tte_dat <- sim_pw_const_haz(n = 100000, hazard_periods = c(6, 8), hazard_values = c(0.1, 0.1, 0.1))hist(tte_dat, breaks = 100, main = "Event Time Distribution", xlab = "Event Time")

Simulate Event Times for Each Participant from a Weibull Proportional HazardsRegression Model

Description

Simulate Event Times for Each Participant from a Weibull Proportional HazardsRegression Model

Usage

sim_weib_ph(weibull_ph_mod, samp_df, cond_drift = 0, cond_trt_effect = 0)

Arguments

weibull_ph_mod

survreg object corresponding to a Weibullproportional hazards model fit using the external data

samp_df

Data frame of covariates corresponding to the sample arm(control or treated) for which event times should be simulated. The columnnames should correspond to the covariate names in thesurvreg object.

cond_drift

Optional value of the conditional drift by which theintercept in the Weibull proportional hazards regression model should beincreased/decreased to incorporate the impact of unmeasurable sources ofdrift. Default is 0.

cond_trt_effect

Optional value of the conditional treatment effect bywhich the intercept in the Weibull proportional hazards regression modelshould be increased/decreased if simulating event data for a treated arm.Default is 0.

Details

Simulate the event times for each participant using a Weibullproportional hazards (PH) regression model. The "true" parameter valuesfor the Weibull shape\alpha and the regression coefficients\boldsymbol{\beta} are assumed to be equal to the parameter estimatesfrom asurvreg object (weibull_ph_mod) fit using external data (notethat the Weibull shape parameter\alpha is defined as the inverse ofthe scale parameter reported bysurvreg).

For participanti, lety_i denote the time-to-event randomvariable and\boldsymbol{x}_i = \{x_{i,1}, \ldots, x_{i,p}\} thevector ofp covariates (rowi ofsamp_df) that correspondto the(p+1)-dimensional vector of regression coefficients\boldsymbol{\beta}. The density function of the Weibull PH regressionmodel is

f(y_i \mid \boldsymbol{x}_i, \alpha, \boldsymbol{\beta}, \delta, \gamma) = \left( \frac{\alpha}{\sigma_i} \right) \left( \frac{y_i}{\sigma_i} \right)^{\alpha - 1} \exp \left( -\left( \frac{y_i}{\sigma_i} \right)^\alpha \right),

where-\log(\sigma_i) = \beta_0 + \beta_1 x_{i,1} + \ldots + \beta_p x_{i,p} + \delta + \gamma. Here,\delta and\gammadenote the conditional drift (cond_drift) and conditional treatmenteffect (cond_trt_effect), respectively, that can be calculated usingcalc_cond_weibull() for desired values of the marginal drift and marginaltreatment effect.

Value

Vector of simulated event times from a Weibull proportional hazardsregression model

Examples

library(dplyr)library(survival)# Model "true" regression coefficients and shape parameter using the external dataweibull_ph_mod <- survreg(Surv(y, event) ~ cov1 + cov2 + cov3 + cov4, data = ex_tte_df,                          dist = "weibull")# Sample covariates for internal control arm via bootstrap from external datasamp_int_ctrl <- bootstrap_cov(ex_tte_df, n = 100) |>  select(c(cov1, cov2, cov3, cov4))     # keep only covariate columnstte_dat <- sim_weib_ph(weibull_ph_mod, samp_df = samp_int_ctrl)

Create Sweet Spot Plots for Multiple Simulation Scenarios

Description

Create visualization plots to help identify the "sweet spot" in borrowingstrategies across different simulation scenarios. For each unique scenariodefined by the combination of variables inscenario_vars, the functionproduces a plot showing power, type I error, and the distribution of thedesign prior for the control marginal parameter for approaches with andwithout borrowing.

Usage

sweet_spot_plot(  .data,  scenario_vars,  trt_diff,  control_marg_param,  h0_prob,  h0_prob_no_borrowing,  design_prior = NULL,  highlight = TRUE)

Arguments

.data

A data frame containing iteration-level simulation results.

scenario_vars

A vector of quoted column names correspondingto variables used to define unique simulation scenarios. Each uniquecombination of values in these columns will generate a separate plot.

trt_diff

An unquoted column name representing the treatmentdifference. Used to identify scenarios with null effect (trt_diff = 0) fortype I error calculation.

control_marg_param

An unquoted column name to be used as the x-axis inthe plots. This is typically the control endpoint of interest on themarginal scale (e.g., control response rate).

h0_prob

An unquoted column name containing the probability ofrejecting the null hypothesis when when borrowing external data.

h0_prob_no_borrowing

An unquoted column name containing theprobability of rejecting the null hypothesis when not borrowing externaldata.

design_prior

An unquoted column name containing distributional objectsthat represent the design prior distribution for the control marginalparameter (e.g., posterior distribution using the external control data).Used to aid visualization of which values of the control marginal parameterare assumed to be plausible. Default isNULL, in which case no designprior is plotted. See Details for more information.

highlight

Logical value to indicate if you want sweet spothighlighting or not. IfTRUE the sweet spot (where borrowing increasepower and reduces type 1 error) will be highlighted.

Details

The function calculates power and type I error rates for BDB approachesthat borrow from external data (e.g., use of a robust mixture prior with positiveweight on the informative component) and an approach that does notborrow from external data (e.g., use of a vague prior) under each scenarioand visualizes them together as a function of the underlying control marginalparameter of interest (e.g., control response rate for binary outcomes) thatmay vary as a result of drift. This helps identify the "sweet spot" where borrowingresults in higher power and lower type I error rates compared to not borrowing.Type I error is calculated using scenarios wheretrt_diff equals 0, and poweris calculated for all scenarios with positive values oftrt_diff.

Ifdesign_prior is non-NULL, the design prior distribution is includedin the plot to provide insight into which values of the control marginalparameter are plausible given this assumed design prior. We note thatdesign_prior can represent any informative prior that potentiallyincorporates the external control data (e.g., the posterior distribution ofthe control marginal parameter constructed using the external data and avague prior). Each element of the vector corresponding todesign_prior mustbe a distributional object with a family equal to "beta", "normal", or"mixture" (where each component is either "beta" or "normal"). For thetime-to-event case in which a multivariate normal prior is assumed for thecontrol log-shape and intercept of a Weibull proportional hazards model,this distribution must first be translated into a univariate beta designprior for the control survival probability at some prespecified time.This approximation can be done usingapprox_mvn_at_time(). If thedesign priors in the vector indicated bydesign_prior differ acrossiterations within a given scenario (e.g., using the IPW power prior as theiteration-specific design prior), then the average distribution will beplotted (i.e., a distribution of the same family with the hyperparametersaveraged across iterations).

Value

A list of ggplot objects, one for each unique scenario defined byscenario_vars. Each plot shows:

Power curves for the cases with and without borrowing
Type I error rates for the cases with and without borrowing
Distribution of the design prior (ifdesign_prior is specified)

References

Best, N., Ajimi, M., Neuenschwander, B., Saint-Hilary, G., & Wandel, S.(2024). Beyond the Classical Type I Error: Bayesian Metrics for BayesianDesigns Using Informative Priors.Statistics in Biopharmaceutical Research,17(2), 183–196.doi:10.1080/19466315.2024.2342817

Examples

library(dplyr)# Assuming binary_sim_df is a data frame with simulation results in the shape# of binary template codeplots <- sweet_spot_plot(  .data = binary_sim_df,  scenario_vars = c("population", "marg_trt_eff"),  trt_diff = marg_trt_eff,  control_marg_param = true_control_RR,  h0_prob = reject_H0_yes,  h0_prob_no_borrowing = no_borrowing_reject_H0_yes,  design_prior = pwr_prior)# Display the first plotplots[[1]]tte_plots <- tte_sim_df |> mutate(beta_appox = approx_mvn_at_time(mix_prior, time = 12)) |> sweet_spot_plot(   scenario_vars = c("population", "marg_trt_eff"),   trt_diff = marg_trt_eff,   control_marg_param = true_control_surv_prob,   h0_prob = reject_H0_yes,   h0_prob_no_borrowing = no_borrowing_reject_H0_yes,   design_prior = beta_appox )tte_plots[[1]]

Tidy a(n) prop_scr object

Description

Tidy a(n) prop_scr object

Usage

## S3 method for class 'prop_scr'tidy(x, ...)

Arguments

x

aprop_scr obj

...

Unused, included for generic consistency only.

Value

A tidytibble::tibble() summarizing the results of the propensityscore weighting. The tibble will have the id column of the external data,aninternal column to indicate all the data is external, aps columnwith the propensity scores and aweight column with the inverseprobability weights

Examples

library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_binary_df, trt == 0),                       external_df = ex_binary_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)tidy(ps_obj)

Trim a`prop_scr` object

Description

Trim aprop_scr object

Usage

trim_ps(x, low = NULL, high = NULL, quantile = FALSE)

Arguments

x

Aprop_scr object

low

Low cut-off such that all participants with propensity scores lessthan this value (or quantile ifquantile = TRUE) are removed. If leftNULL no lower bound will be used

high

High cut-off such that all participants with propensity scoresgreater than this value (or quantile ifquantile = TRUE) are removed. IfleftNULL no upper bound will be used

quantile

True/False value to determine if the cut-off values are baseddirectly on the propensity scores (false) or their quantiles (true). By default this isfalse.

Details

This function uses R's default method of quantile calculation (type7)

Value

aprop_scr object with a trimmed propensity score distribution

Examples

library(dplyr)ps_obj <- calc_prop_scr(internal_df = filter(int_binary_df, trt == 0),                       external_df = ex_binary_df,                       id_col = subjid,                       model = ~ cov1 + cov2 + cov3 + cov4)trim_ps(ps_obj, low = 0.3, high = 0.7)

Time-to-Event Simulation Data

Description

This is an example of output from a simulation study that investigates theoperating characteristics of inverse probability weighted Bayesian dynamicborrowing for the case with a time-to-event outcome. This output was generatedbased on the time-to-event simulation template. For this simulation study, only thedegree of covariate imbalance (as indicated bypopulation) and themarginal treatment effect were varied.

Usage

tte_sim_df

Format

`tte_sim_df` A data frame with 18 rows and 7 columns:

population: Populations defined by different covariate imbalances
marg_trt_eff: Marginal treatment effect
true_control_surv_prop: True control survival probability at time t=12 months on the marginal scale
reject_H0_yes: Probability of rejecting the null hypothesis in the case with borrowing
no_borrowing_reject_H0_yes: Probability of rejecting the null hypothesis without borrowing
pwr_prior: Vector of IPW power priors as distributional objects
mix_prior: Vector of mixture priors (i.e., the robustified IPW power priors) as distributional objects

Movatterモバイル変換

The 'beastt' package.

Description

References

Approximate Multivariate Normal Distribution as Beta at a Specific Time

Description

Usage

Arguments

Details

Value

See Also

Examples

Calculate Average Distribution from Multiple Distributional Objects

Description

Usage

Arguments

Details

Value

Examples

Binary Simulation Data

Description

Usage

Format

binary_sim_df A data frame with 255 rows and 6 columns:

Bootstrap Covariate Data

Description

Usage

Arguments

Details

Value

Examples

Calculate Conditional Drift and Treatment Effect for Binary Outcome Models

Description

Usage

Arguments

Details

Value

Examples

Calculate Conditional Drift and Treatment Effect for Time-to-Event Outcome Models

Description

Usage

Arguments

Details

Value

Examples

Calculate Posterior Beta

Description

Usage

Arguments

Details

Value

Examples

Calculate Posterior Normal

Description

Usage

Arguments

Details

Value

Examples

Calculate Posterior Weibull

Description

Usage

Arguments

Details

Value

Examples

Calculate Power Prior Beta

Description

Usage

Arguments

Details

Value

See Also

Examples

Calculate Power Prior Normal

Description

Usage

Arguments

Details

Value

`binary_sim_df` A data frame with 255 rows and 6 columns:

`ex_binary_df`

`ex_norm_df`

`ex_tte_df`

`int_binary_df`

`int_norm_df`

`int_tte_df`