Movatterモバイル変換


[0]ホーム

URL:


Type:Package
Title:Prediction Intervals
Version:2.3.0
Description:An implementation of prediction intervals for overdispersed count data, for overdispersed binomial data and for linear random effects models.
License:GPL-2 |GPL-3 [expanded from: GPL (≥ 2)]
Encoding:UTF-8
LazyData:true
Imports:stats, graphics, methods
RoxygenNote:7.3.2
Suggests:rmarkdown, knitr, testthat (≥ 3.0.0)
Config/testthat/edition:3
Depends:R (≥ 3.5.0), ggplot2, lme4, MASS
URL:https://github.com/MaxMenssen/predint
BugReports:https://github.com/MaxMenssen/predint/issues
NeedsCompilation:no
Packaged:2025-07-22 10:33:24 UTC; maxmenssen
Author:Max Menssen [aut, cre]
Maintainer:Max Menssen <menssen@cell.uni-hannover.de>
Repository:CRAN
Date/Publication:2025-07-22 11:01:31 UTC

Historical numbers of revertant colonies in the Ames test (OECD 471)

Description

This data set contains artificial historical control data that was sampled inorder to mimic the number of revertant colonies based on two or three petri dishes.

Usage

ames_HCD

Format

Adata.frame with 2 rows and 10 columns:

rev_col

no. of revertant colonies

no_dish

no. of petri dishes in the control group


Store prediction intervals or limits as adata.frame

Description

Get the prediction intervals or limits of an object of classpredint andsave them as adata.frame.

Usage

## S3 method for class 'predint'as.data.frame(x, ...)

Arguments

x

object of classpredint

...

additional arguments to be passed tobase::as.data.frame()

Value

This function returns the prediction intervals or limits stored in anobject of class"predint" as adata.frame

Examples

### PI for quasi-Poisson datapred_int <- quasi_pois_pi(histdat=ames_HCD,                          newoffset=3,                          nboot=100,                          traceplot = FALSE)# Return the prediction intervals as a data.frameas.data.frame(pred_int)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

Beta-binomial data (example 1)

Description

This data set contains sampled beta-binomial data from 10 clusterseach of size 50. The data set was sampled withrbbinom(n=10, size=50, prob=0.1, rho=0.06).

Usage

bb_dat1

Format

Adata.frame with 10 rows and 2 columns:

succ

number of successes

fail

number of failures


Beta-binomial data (example 2)

Description

This data set contains sampled beta-binomial data from 3 clusters each ofdifferent size. The data set was sampled withrbbinom(n=3, size=c(40, 50, 60), prob=0.1, rho=0.06).

Usage

bb_dat2

Format

Adata.frame with 3 rows and 2 columns:

succ

number of successes

fail

number of failures


Simple uncalibrated prediction intervals for beta-binomial data

Description

bb_pi() is a helper function that is internally called bybeta_bin_pi(). Itcalculates simple uncalibrated prediction intervals for binarydata with overdispersion changing between the clusters (beta-binomial).

Usage

bb_pi(  newsize,  histsize,  pi,  rho,  q = qnorm(1 - 0.05/2),  alternative = "both",  newdat = NULL,  histdat = NULL,  algorithm = NULL)

Arguments

newsize

number of experimental units in the historical clusters

histsize

number of experimental units in the future clusters

pi

binomial proportion

rho

intra class correlation

q

quantile used for interval calculation

alternative

either "both", "upper" or "lower"alternative specifies, if a prediction interval oran upper or a lower prediction limit should be computed

newdat

additional argument to specify the current data set

histdat

additional argument to specify the historical data set

algorithm

used to define the algorithm for calibration if called viabeta_bin_pi(). This argument is not of interest for the calculationof simple uncalibrated intervals

Details

This function returns a simple uncalibrated prediction interval

[l,u]_m = n^*_m \hat{\pi} \pm q \sqrt{n^*_m \hat{\pi} (1- \hat{\pi}) [1 + (n^*_m -1) \hat{\rho}] +[\frac{n^{*2}_m \hat{\pi} (1- \hat{\pi})}{\sum_h n_h} + \frac{\sum_h n_h -1}{\sum_h n_h} n^{*2}_m \hat{\pi} (1- \hat{\pi}) \hat{\rho}]}

withn^*_m as the number of experimental units in them=1, 2, ... , M future clusters,\hat{\pi} as the estimate for the binomial proportion obtained from thehistorical data,\hat{\rho} as the estimate for the intra class correlationandn_h as the number of experimental units per historical cluster.

The direct application of this uncalibrated prediction interval to real life datais not recommended. Please usebeta_bin_pi() for real life applications.

Value

bb_pi() returns an object of classc("predint", "betaBinomialPI")with prediction intervals or limits in the first entry ($prediction).

Examples

# Pointwise uncalibrated PIbb_pred <- bb_pi(newsize=c(50), pi=0.3, rho=0.05, histsize=rep(50, 20), q=qnorm(1-0.05/2))summary(bb_pred)

Prediction intervals for beta-binomial data

Description

beta_bin_pi() calculates bootstrap calibrated prediction intervals forbeta-binomial data

Usage

beta_bin_pi(  histdat,  newdat = NULL,  newsize = NULL,  alternative = "both",  alpha = 0.05,  nboot = 10000,  delta_min = 0.01,  delta_max = 10,  tolerance = 0.001,  traceplot = TRUE,  n_bisec = 30,  algorithm = "MS22mod")

Arguments

histdat

adata.frame with two columns (number of successes andnumber of failures) containing the historical data

newdat

adata.frame with two columns (number of successes andnumber of failures) containing the future data

newsize

a vector containing the future cluster sizes

alternative

either "both", "upper" or "lower".alternativespecifies if a prediction interval or an upper or a lower prediction limitshould be computed

alpha

defines the level of confidence (1-alpha)

nboot

number of bootstraps

delta_min

lower start value for bisection

delta_max

upper start value for bisection

tolerance

tolerance for the coverage probability in the bisection

traceplot

ifTRUE: Plot for visualization of the bisection process

n_bisec

maximal number of bisection steps

algorithm

either "MS22" or "MS22mod" (see details)

Details

This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.

Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas

[l,u]_m = n^*_m \hat{\pi} \pm q^{calib} \hat{se}(Y_m - y^*_m)

where

\hat{se}(Y_m - y^*_m) = \sqrt{n^*_m \hat{\pi} (1- \hat{\pi}) [1 + (n^*_m -1) \hat{\rho}] +[\frac{n^{*2}_m \hat{\pi} (1- \hat{\pi})}{\sum_h n_h} + \frac{\sum_h n_h -1}{\sum_h n_h} n^{*2}_m \hat{\pi} (1- \hat{\pi}) \hat{\rho}]}

withn^*_m as the number of experimental units in the future clusters,\hat{\pi} as the estimate for the binomial proportion obtained from thehistorical data,q^{calib} as the bootstrap-calibrated coefficient,\hat{\rho} as the estimate for the intra class correlation (Lui et al. 2000)andn_h as the number of experimental units per historical cluster.

Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by

[l,u]_m = \big[n^*_m \hat{\pi} - q^{calib}_l \hat{se}(Y_m - y^*_m), \quad n^*_m \hat{\pi} + q^{calib}_u \hat{se}(Y_m - y^*_m) \big]

Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.

Value

beta_bin_pi returns an object of classc("predint", "betaBinomialPI")with prediction intervals or limits in the first entry ($prediction).

References

Lui et al. (2000): Confidence intervals for the risk ratiounder cluster sampling based on the beta-binomial model. Statistics in Medicine.
doi:10.1002/1097-0258(20001115)19:21<2933::AID-SIM591>3.0.CO;2-Q

Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica.doi:10.1111/stan.12260

Examples

# Prediction intervalpred_int <- beta_bin_pi(histdat=mortality_HCD, newsize=40, nboot=100)summary(pred_int)# Upper prediction boundpred_u <- beta_bin_pi(histdat=mortality_HCD, newsize=40, alternative="upper", nboot=100)summary(pred_u)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

Bisection algorithm for bootstrap calibration of prediction intervals

Description

This helper function returns a bootstrap calibrated coefficient for the calculationof prediction intervals (and limits).

Usage

bisection(  y_star_hat,  pred_se,  y_star,  alternative,  quant_min,  quant_max,  n_bisec,  tol,  alpha,  traceplot = TRUE)

Arguments

y_star_hat

a list of lengthB that contains the expected future observations.Each entry in this list has to be a numeric vector of lengthM.

pred_se

a list of lengthB that contains the standard errors of the prediction.Each entry in this list has to be a numeric vector of lengthM.

y_star

a list of lengthB that contains the future observations.Each entry in this list has to be a numeric vector of lengthM.

alternative

either "both", "upper" or "lower".alternative specifies if a prediction interval oran upper or a lower prediction limit should be computed

quant_min

lower start value for bisection

quant_max

upper start value for bisection

n_bisec

maximal number of bisection steps

tol

tolerance for the coverage probability in the bisection

alpha

defines the level of confidence (1-\alpha)

traceplot

ifTRUE: Plot for visualization of the bisection process

Details

This function is an implementation of the bisection algorithm of Menssenand Schaarschmidt 2022. It returns a calibrated coefficientq^{calib} for thecalculation of pointwise and simultaneous prediction intervals

[l,u] = \hat{y}^*_m \pm q^{calib} \hat{se}(Y_m - y^*_m),

lower prediction limits

l = \hat{y}^*_m - q^{calib} \hat{se}(Y_m - y^*_m)

or upper prediction limits

u = \hat{y}^*_m + q^{calib} \hat{se}(Y_m - y^*_m)

that cover all ofm=1, ... , M future observations.

In this notation,\hat{y}^*_m are the expected future observations for each ofthem future clusters,q^{calib} is thecalibrated coefficient and\hat{se}(Y_m - y^*_m)are the standard errors of the prediction.

Value

This function returnsq^{calib} in the equation above.

References

Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica.
doi:10.1111/stan.12260


Bootstrap new data from uncalibrated prediction intervals

Description

boot_predint() is a helper function to bootstrap new data from the simpleuncalibrated prediction intervals implemented in predint.

Usage

boot_predint(pred_int, nboot, adjust = "within")

Arguments

pred_int

object of classc("quasiPoissonPI", "betaBinomialPI","quasiBinomialPI", negativeBinomialPI)

nboot

number of bootstraps

adjust

specifies if simultaneous prediction should be done for severalcontrol groups of different studies (between), or for the outcome ofthe current control and some treatment groupswithin the same trial

Details

This function only works for binomial and Poisson type data. For the samplingof new data from random effects models seelmer_bs.

Value

boot_predint returns an object of classc("predint", "bootstrap")which is a list with two entries: One for bootstrapped historical observationsand one for bootstrapped future observations.

Examples

# Simple quasi-Poisson PItest_pi <- qp_pi(histoffset=c(3,3,3,4,5), newoffset=3, lambda=10, phi=3, q=1.96)# Draw 5 bootstrap samplestest_boot <- boot_predint(pred_int = test_pi, nboot=50)str(test_boot)summary(test_boot)# Please note that the low number of bootstrap samples was chosen in order to# decrease computing time. For valid analysis draw at least 10000 bootstrap samples.

Cross-classified data (example 1)

Description

c2_dat1 contains data that is sampled from a balanced cross-classified design.This data set is used in order to demonstrate the functionality of thelmer_pi_...() functions.

Usage

c2_dat1

Format

Adata.frame with 27 rows and 3 columns:

y_ijk

observations

a

treatment a

b

treatment b


Cross-classified data (example 2)

Description

c2_dat2 contains data that was sampled from an unbalanced cross-classified design.This data set is used in order to demonstrate the functionality of thelmer_pi_...() functions.

Usage

c2_dat2

Format

Adata.frame with 21 rows and 3 columns:

y_ijk

observations

a

treatment a

b

treatment b


Cross-classified data (example 3)

Description

c2_dat3 contains data that was sampled from a balanced cross-classified design.This data set is used in order to demonstrate the functionality of thelmer_pi_...() functions.

Usage

c2_dat3

Format

Adata.frame with 8 rows and 3 columns:

y_ijk

observations

a

treatment a

b

treatment b


Cross-classified data (example 4)

Description

c2_dat4 contains data that was sampled from an unbalanced cross-classified design.This data set is used in order to demonstrate the functionality of thelmer_pi_...() functions.

Usage

c2_dat4

Format

Adata.frame with 6 rows and 3 columns:

y_ijk

observations

a

treatment a

b

treatment b


Sampling of bootstrap data from a given random effects model

Description

lmer_bs() draws bootstrap samples based on the estimates for the meanand the variance components drawn from a random effects model fit withlme4::lmer().Contrary tolme4::bootMer(), the number of observations for each random factorcan vary between the original data set and the bootstrapped data. Random effectsinmodel have to be specified as(1|random effect).

Usage

lmer_bs(model, newdat = NULL, futmat_list = NULL, nboot)

Arguments

model

a random effects model of class lmerMod

newdat

adata.frame with the same column names as the historical dataon whichmodel depends

futmat_list

a list that contains design matrices for each random factor

nboot

number of bootstrap samples

Details

The data sampling is based on a list of design matrices (one for each random factor)that can be obtained ifnewdat and the model formula are provided tolme4::lFormula(). Hence, each random factor that is part of the initialmodel must have at least two replicates innewdat.
If a random factor in the future data set does not have any replicate, a listthat contains design matrices (one for each random factor) can be provided viafutmat_list.

Value

A list of lengthnboot containing the bootstrapped observations.

Examples

# loading lme4library(lme4)# Fitting a random effects model based on c2_dat1fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)summary(fit)#----------------------------------------------------------------------------### Using c2_dat2 as newdatc2_dat2lmer_bs(model=fit, newdat=c2_dat2, nboot=100)#----------------------------------------------------------------------------### Using futmat_list# c2_dat4 has no replication for b. Hence the list of design matrices can not be# generated by lme4::lFormula() and have to be provided by hand via futmat_list.c2_dat4# Build a list containing the design matricesfml <- vector(length=4, "list")names(fml) <- c("a:b", "b", "a", "Residual")fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1))fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))fml[["Residual"]] <- diag(6)fmllmer_bs(model=fit, futmat_list=fml, nboot=100)

Prediction intervals for future observations based on linear random effects models (DEPRECATED)

Description

This function is deprecated. Please uselmer_pi_unstruc(),lmer_pi_futvec() orlmer_pi_futmat().

Usage

lmer_pi(  model,  newdat = NULL,  m = NULL,  alternative = "both",  alpha = 0.05,  nboot = 10000,  lambda_min = 0.01,  lambda_max = 10,  traceplot = TRUE,  n_bisec = 30)

Arguments

model

a random effects model of class"lmerMod"

newdat

adata.frame with the same column names as the historical dataon which the model depends

m

number of future observations

alternative

either "both", "upper" or "lower".alternative specifiesif a prediction interval or an upper or a lower prediction limit should be computed

alpha

defines the level of confidence (1-alpha)

nboot

number of bootstraps

lambda_min

lower start value for bisection

lambda_max

upper start value for bisection

traceplot

ifTRUE: plot for visualization of the bisection process

n_bisec

maximal number of bisection steps

Details

This function returns a bootstrap calibrated prediction interval

[l,u] = \hat{y} \pm q \sqrt{\hat{var}(\hat{y} - y)}

with\hat{y} as the predicted future observation,y as the observed future observations,\sqrt{\hat{var}(\hat{y} - y)}as the prediction standard error andq as the bootstrap calibrated coefficient thatapproximates a quantile of the multivariate t-distribution.
Please note that this function relies on linear random effects models that arefitted withlmer() from the lme4 package. Random effects have to be specified as(1|random_effect).

Value

Ifnewdat is specified: Adata.frame that contains the future data,the historical mean (hist_mean), the calibrated coefficient (quant_calib),the prediction standard error (pred_se), the prediction interval (lower and upper)and a statement if the prediction interval covers the future observation (cover).

Ifm is specified: Adata.frame that contains the number of future observations (m)the historical mean (hist_mean), the calibrated coefficient (quant_calib),the prediction standard error (pred_se) and the prediction interval (lower and upper).

Ifalternative is set to "lower": Lower prediction limits are computed insteadof a prediction interval.

Ifalternative is set to "upper": Upper prediction limits are computed insteadof a prediction interval.

Iftraceplot=TRUE, a graphical overview about the bisection process is given.

Examples

# This function is deprecated.# Please use lmer_pi_unstruc() if you want exactly the same functionality.# Please use lmer_pi_futmat() or lmer_pi_futvec() if you want to take care# of the future experimental design

Prediction intervals for future observations based on linear random effects models

Description

lmer_pi_futmat() calculates a bootstrap calibrated prediction interval for one or morefuture observation(s) based on linear random effects models. With this approach,the experimental design of the future data is taken into account (see below).

Usage

lmer_pi_futmat(  model,  newdat = NULL,  futmat_list = NULL,  alternative = "both",  alpha = 0.05,  nboot = 10000,  delta_min = 0.01,  delta_max = 10,  tolerance = 0.001,  traceplot = TRUE,  n_bisec = 30,  algorithm = "MS22")

Arguments

model

a random effects model of class"lmerMod"

newdat

either 1 or adata.frame with the same column names as the historical dataon whichmodel depends

futmat_list

a list that contains design matrices for each random factor

alternative

either "both", "upper" or "lower".alternative specifiesif a prediction interval or an upper or a lower prediction limit should be computed

alpha

defines the level of confidence (1-alpha)

nboot

number of bootstraps

delta_min

lower start value for bisection

delta_max

upper start value for bisection

tolerance

tolerance for the coverage probability in the bisection

traceplot

ifTRUE: Plot for visualization of the bisection process

n_bisec

maximal number of bisection steps

algorithm

either "MS22" or "MS22mod" (see details)

Details

This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.

Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas

[l,u] = \hat{\mu} \pm q^{calib} \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1}\hat{\sigma}^2_c}

with\hat{\mu} as the expected future observation (historical mean) and\hat{\sigma}^2_c as thec=1, 2, ..., C variance components and\hat{\sigma}^2_{C+1}as the residual variance obtained from the randomeffects model fitted withlme4::lmer() andq^{calib} as the as the bootstrap-calibratedcoefficient used for interval calculation.

Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by

[l,u] = \Big[\hat{\mu} - q^{calib}_l \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}, \quad\hat{\mu} + q^{calib}_u \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c} \Big].

Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.

Ifnewdat is defined, the bootstrapped future observations used for the calibrationprocess mimic the structure of the data set provided vianewdat. Thedata sampling is based on a list of design matrices (one for each random factor)that can be obtained ifnewdat and the model formula are provided tolme4::lFormula(). Hence, each random factor that is part of the initialmodel must have at least two replicates innewdat.
If a random factor in the future data set does not have any replicate, a listthat contains design matrices (one for each random factor) can be provided viafutmat_list.

This function is an implementation of the PI given in Menssen and Schaarschmidt 2022section 3.2.4, except, that the bootstrap calibration values are drawn frombootstrap samples that mimic the future data as described above.

Value

lmer_pi_futmat() returns an object of classc("predint", "normalPI")with prediction intervals or limits in the first entry ($prediction).

References

Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260

Examples

# loading lme4library(lme4)# Fitting a random effects model based on c2_dat1fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)summary(fit)#----------------------------------------------------------------------------### Using newdat# Prediction interval using c2_dat2 as future datapred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100)summary(pred_int)# Upper prediction limit for m=1 future observationspred_u <- lmer_pi_futmat(model=fit, newdat=1, alternative="upper", nboot=100)summary(pred_u)#----------------------------------------------------------------------------### Using futmat_list# c2_dat4 has no replication for b. Hence the list of design matrices can not be# generated by lme4::lFormula() and have to be provided by hand via futmat_list.c2_dat4# Build a list containing the design matricesfml <- vector(length=4, "list")names(fml) <- c("a:b", "b", "a", "Residual")fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1))fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))fml[["Residual"]] <- diag(6)fml# Please note, that the design matrix for the interaction term a:b is also# provided even there is no replication for b, since it is assumed that# both, the historical and the future data descent from the same data generating# process.# Calculate the PIpred_fml <- lmer_pi_futmat(model=fit, futmat_list=fml, alternative="both", nboot=100)summary(pred_fml)#----------------------------------------------------------------------------# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

Prediction intervals for future observations based on linear random effects models

Description

lmer_pi_futvec() calculates a bootstrap calibrated prediction interval for one or morefuture observation(s) based on linear random effects models. With this approach,the experimental design of the future data is taken into account (see below).

Usage

lmer_pi_futvec(  model,  futvec,  newdat = NULL,  alternative = "both",  alpha = 0.05,  nboot = 10000,  delta_min = 0.01,  delta_max = 10,  tolerance = 0.001,  traceplot = TRUE,  n_bisec = 30,  algorithm = "MS22")

Arguments

model

a random effects model of class lmerMod

futvec

an integer vector that defines the structure of the future data based on therow numbers of the historical data. Iflength(futvec) is one, a PIfor one future observation is computed

newdat

adata.frame with the same column names as the historical dataon whichmodel depends

alternative

either "both", "upper" or "lower".alternative specifiesif a prediction interval or an upper or a lower prediction limit should be computed

alpha

defines the level of confidence (1-alpha)

nboot

number of bootstraps

delta_min

lower start value for bisection

delta_max

upper start value for bisection

tolerance

tolerance for the coverage probability in the bisection

traceplot

ifTRUE: Plot for visualization of the bisection process

n_bisec

maximal number of bisection steps

algorithm

either "MS22" or "MS22mod" (see details)

Details

This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.

Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas

[l,u] = \hat{\mu} \pm q^{calib} \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1}\hat{\sigma}^2_c}

with\hat{\mu} as the expected future observation (historical mean) and\hat{\sigma}^2_c as thec=1, 2, ..., C variance components and\hat{\sigma}^2_{C+1}as the residual variance obtained from the randomeffects model fitted withlme4::lmer() andq^{calib} as the as the bootstrap-calibratedcoefficient used for interval calculation.

Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by

[l,u] = \Big[\hat{\mu} - q^{calib}_l \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}, \quad\hat{\mu} + q^{calib}_u \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c} \Big].

Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.

Be aware that the sampling structure of the historical data must contain the structure of thefuture data. This means that the observations per random factor must be less orequal in the future data compared to the historical data.

This function is an implementation of the PI given in Menssen and Schaarschmidt 2022 section 3.2.4except that the bootstrap calibration values are drawn from bootstrap samples thatmimic the future data.

Value

lmer_pi_futvec() returns an object of classc("predint", "normalPI")with prediction intervals or limits in the first entry ($prediction).

References

Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260

Examples

# loading lme4library(lme4)# Fitting a random effects model based on c2_dat1fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)summary(fit)#----------------------------------------------------------------------------### Prediction interval using c2_dat3 as future data# without printing c2_dat3 in the output# Row numbers of the historical data c2_dat1 that define the structure of# the future data c2_dat3futvec <- c(1, 2, 4, 5, 10, 11, 13, 14)# Calculating the PIpred_int <- lmer_pi_futvec(model=fit, futvec=futvec, nboot=100)summary(pred_int)#----------------------------------------------------------------------------### Calculating the PI with c2_dat3 printed in the outputpred_int_new <- lmer_pi_futvec(model=fit, futvec=futvec, newdat=c2_dat3, nboot=100)summary(pred_int_new)#----------------------------------------------------------------------------### Upper prediction limit for m=1 future observationpred_u <- lmer_pi_futvec(model=fit, futvec=1, alternative="upper", nboot=100)summary(pred_u)#----------------------------------------------------------------------------# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

Prediction intervals for future observations based on linear random effects models

Description

lmer_pi_unstruc() calculates a bootstrap calibrated prediction interval for one or morefuture observation(s) based on linear random effects models as described in section3.2.4. of Menssen and Schaarschmidt (2022).Please note, that the bootstrap calibration used here does not consider the samplingstructure of the future data, since the calibration values are drawn randomly frombootstrap data sets that have the same structure as the historical data.

Usage

lmer_pi_unstruc(  model,  newdat = NULL,  m = NULL,  alternative = "both",  alpha = 0.05,  nboot = 10000,  delta_min = 0.01,  delta_max = 10,  tolerance = 0.001,  traceplot = TRUE,  n_bisec = 30,  algorithm = "MS22")

Arguments

model

a random effects model of class lmerMod

newdat

adata.frame with the same column names as the historical dataon which the model depends

m

number of future observations

alternative

either "both", "upper" or "lower".alternative specifiesif a prediction interval or an upper or a lower prediction limit should be computed

alpha

defines the level of confidence (1-alpha)

nboot

number of bootstraps

delta_min

lower start value for bisection

delta_max

upper start value for bisection

tolerance

tolerance for the coverage probability in the bisection

traceplot

ifTRUE: Plot for visualization of the bisection process

n_bisec

maximal number of bisection steps

algorithm

either "MS22" or "MS22mod" (see details)

Details

This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.

Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas

[l,u] = \hat{\mu} \pm q^{calib} \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1}\hat{\sigma}^2_c}

with\hat{\mu} as the expected future observation (historical mean) and\hat{\sigma}^2_c as thec=1, 2, ..., C variance components and\hat{\sigma}^2_{C+1}as the residual variance obtained from the randomeffects model fitted withlme4::lmer() andq^{calib} as the as the bootstrap-calibratedcoefficient used for interval calculation.

Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by

[l,u] = \Big[\hat{\mu} - q^{calib}_l \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}, \quad\hat{\mu} + q^{calib}_u \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c} \Big].

Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.

This function is an direct implementation of the PI given in Menssen and Schaarschmidt2022 section 3.2.4.

Value

lmer_pi_futvec() returns an object of classc("predint", "normalPI")with prediction intervals or limits in the first entry ($prediction).

References

Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260

Examples

# loading lme4library(lme4)# Fitting a random effects model based on c2_dat1fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)summary(fit)# Prediction interval using c2_dat2 as future datapred_int <- lmer_pi_unstruc(model=fit, newdat=c2_dat2, alternative="both", nboot=100)summary(pred_int)# Upper prediction limit for m=3 future observationspred_u <- lmer_pi_unstruc(model=fit, m=3, alternative="upper", nboot=100)summary(pred_u)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

Historical mortality of male B6C3F1-mice

Description

This data set contains historical control data about the mortality of male B6C3F1-miceobtained in long term carcinogenicity studies at the National Toxicology Programpresented in NTP Historical Control Reports from 2013 to 2016.It was used in Menssen and Schaarschmidt 2019 as a real life example.

Usage

mortality_HCD

Format

Adata.frame with 2 rows and 10 columns:

dead

no. of dead mice

alive

no. of living mice

References

Menssen and Schaarschmidt (2019): Prediction intervals for overdispersed binomialdata with application to historical controls. Statistics in Medicine.doi:10.1002/sim.8124
NTP Historical Control Reports:https://ntp.niehs.nih.gov/data/controls


Simple uncalibrated prediction intervals for negative-binomial data

Description

nb_pi() is a helper function that is internally called byneg_bin_pi(). Itcalculates simple uncalibrated prediction intervals for negative-binomial datawith offsets.

Usage

nb_pi(  newoffset,  histoffset,  lambda,  kappa,  q = qnorm(1 - 0.05/2),  alternative = "both",  newdat = NULL,  histdat = NULL,  algorithm = NULL)

Arguments

newoffset

number of experimental units in the future clusters

histoffset

number of experimental units in the historical clusters

lambda

overall Poisson mean

kappa

dispersion parameter

q

quantile used for interval calculation

alternative

either "both", "upper" or "lower".alternative specifies, if a prediction interval oran upper or a lower prediction limit should be computed

newdat

additional argument to specify the current data set

histdat

additional argument to specify the historical data set

algorithm

used to define the algorithm for calibration if called viaquasi_pois_pi(). This argument is not of interest for the calculationof simple uncalibrated intervals

Details

This function returns a simple uncalibrated prediction interval

[l,u]_m = n^*_m \hat{\lambda} \pm q \sqrt{n^*_m\frac{\hat{\lambda} + \hat{\kappa} \bar{n} \hat{\lambda}}{\bar{n} H} +(n^*_m \hat{\lambda} + \hat{\kappa} n^{*2}_m \hat{\lambda}^2)}

withn^*_m as the number of experimental units inm=1, 2, ... , M future clusters,\hat{\lambda} as the estimate for the Poisson mean obtained from thehistorical data,\hat{\kappa} as the estimate for the dispersion parameter,n_h as the number of experimental units per historical cluster and\bar{n}=\sum_h^{n_h} n_h / H.

The direct application of this uncalibrated prediction interval to real life datais not recommended. Please use theneg_bin_pi() function for real life applications.

Value

np_pi returns an object of classc("predint", "negativeBinomialPI").

Examples

# Prediction intervalnb_pred <- nb_pi(newoffset=3, lambda=3, kappa=0.04, histoffset=1:9, q=qnorm(1-0.05/2))summary(nb_pred)

Prediction intervals for negative-binomial data

Description

neg_bin_pi() calculates bootstrap calibrated prediction intervals fornegative-binomial data.

Usage

neg_bin_pi(  histdat,  newdat = NULL,  newoffset = NULL,  alternative = "both",  adjust = "within",  alpha = 0.05,  nboot = 10000,  delta_min = 0.01,  delta_max = 10,  tolerance = 0.001,  traceplot = TRUE,  n_bisec = 30,  algorithm = "MS22mod")

Arguments

histdat

adata.frame with two columns. The first has to containthe historical observations. The second has to contain the number of experimentalunits per study (offsets).

newdat

data.frame with two columns. The first has to containthe future observations. The second has to contain the number of experimentalunits per study (offsets).

newoffset

vector with future number of experimental units per historicalstudy.

alternative

either "both", "upper" or "lower".alternative specifies if a prediction interval oran upper or a lower prediction limit should be computed

adjust

specifies if simultaneous prediction should be done for severalcontrol groups of different studies (between), or for the outcome ofthe current control and some treatment groupswithin the same trial

alpha

defines the level of confidence (1-\alpha)

nboot

number of bootstraps

delta_min

lower start value for bisection

delta_max

upper start value for bisection

tolerance

tolerance for the coverage probability in the bisection

traceplot

ifTRUE: Plot for visualization of the bisection process

n_bisec

maximal number of bisection steps

algorithm

either "MS22" or "MS22mod" (see details)

Details

This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.

Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas

[l,u]_m = n^*_m \hat{\lambda} \pm q \sqrt{n^*_m\frac{\hat{\lambda} + \hat{\kappa} \bar{n} \hat{\lambda}}{\bar{n} H} +(n^*_m \hat{\lambda} + \hat{\kappa} n^{*2}_m \hat{\lambda}^2)}

withn^*_m as the number of experimental units in the future clusters,\hat{\lambda} as the estimate for the Poisson mean obtained from thehistorical data,\hat{\kappa} as the estimate for the dispersion parameter,n_h as the number of experimental units per historical cluster and\bar{n}=\sum_h^{n_h} n_h / H.

Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by

[l,u] = \Big[n^*_m \hat{\lambda} - q^{calib}_l \sqrt{n^*_m\frac{\hat{\lambda} + \hat{\kappa} \bar{n} \hat{\lambda}}{\bar{n} H} +(n^*_m \hat{\lambda} + \hat{\kappa} n^{*2}_m \hat{\lambda}^2)}, \quad n^*_m \hat{\lambda} + q^{calib}_u \sqrt{n^*_m\frac{\hat{\lambda} + \hat{\kappa} \bar{n} \hat{\lambda}}{\bar{n} H} +(n^*_m \hat{\lambda} + \hat{\kappa} n^{*2}_m \hat{\lambda}^2)} \Big]

Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.

Value

neg_bin_pi() returns an object of classc("predint", "negativeBinomialPI")with prediction intervals or limits in the first entry ($prediction).

References

Menssen et al. (2025): Prediction Intervals for Overdispersed PoissonData and Their Application in Medical and Pre-Clinical Quality Control.Pharmaceutical Statisticsdoi:10.1002/pst.2447

Examples

# HCD from the Ames testames_HCD# Pointwise prediction interval for one future number of revertant colonies# obtained in three petridishespred_int <- neg_bin_pi(histdat=ames_HCD, newoffset=3, nboot=100)summary(pred_int)# Simultaneous prediction interval for the numbers of revertant colonies obtained in# the control and three treatment groups of a future trialpred_int_w <- neg_bin_pi(histdat=ames_HCD, newoffset=c(3, 3, 3, 3), adjust="within", nboot=100)summary(pred_int_w)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

Simple uncalibrated prediction intervals for normal distributed data

Description

normal_pi() is a helper function that is internally called by thelmer_pi_...() functions.It calculates simple uncalibrated prediction intervals for normal distributedobservations.

Usage

normal_pi(  mu,  pred_se,  m = 1,  q = qnorm(1 - 0.05/2),  alternative = "both",  futmat_list = NULL,  futvec = NULL,  newdat = NULL,  histdat = NULL,  algorithm = NULL)

Arguments

mu

overall mean

pred_se

standard error of the prediction

m

number of future observations

q

quantile used for interval calculation

alternative

either "both", "upper" or "lower"alternative specifies, if a prediction interval oran upper or a lower prediction limit should be computed

futmat_list

used to add the list of future design matrices to the outputif called vialmer_pi_futmat()

futvec

used to add the vector of the historical row numbers that definethe future experimental design to the output if called vialmer_pi_futmat()

newdat

additional argument to specify the current data set

histdat

additional argument to specify the historical data set

algorithm

used to define the algorithm for calibration if called vialmer_pi_...(). This argument is not of interest for the calculationof simple uncalibrated intervals

Details

This function returns a simple uncalibrated prediction interval asgiven in Menssen and Schaarschmidt 2022

[l,u] = \hat{\mu} \pm q \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}

with\hat{\mu} as the expected future observation (historical mean) and\hat{\sigma}^2_c as thec=1, 2, ..., C variance components and\hat{\sigma}^2_{C+1}as the residual variance andq as the quantile used for interval calculation.

The direct application of this uncalibrated prediction interval to real life datais not recommended. Please use thelmer_pi_...() functions for real life applications.

Value

normal_pi() returns an object of classc("predint", "normalPI")with prediction intervals or limits in the first entry ($prediction).

References

Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260

Examples

# simple PInorm_pred <- normal_pi(mu=10, pred_se=3, m=1)summary(norm_pred)

Estimation of the binomial proportion and the intra class correlation.

Description

pi_rho_est() estimates the overall binomial proportion\hat{\pi} and the intraclass correlation\hat{\rho} of data that is assumed to follow the beta-binomialdistribution. The estimation of\hat{\pi} and\hat{\rho} is done followingthe approach of Lui et al. 2000.

Usage

pi_rho_est(dat)

Arguments

dat

adata.frame with two columns (successes and failures)

Value

a vector containing estimates for\pi and\rho

References

Lui, K.-J., Mayer, J.A. and Eckhardt, L: Confidence intervals for the risk ratiounder cluster sampling based on the beta-binomial model. Statistics in Medicine.2000;19:2933-2942.doi:10.1002/1097-0258(20001115)19:21<2933::AID-SIM591>3.0.CO;2-Q

Examples

# Estimates for bb_dat1pi_rho_est(bb_dat1)

Plots ofpredint objects

Description

This function provides methodology for plotting the prediction intervals orlimits that are calculated using the functionality of thepredint package.

Usage

## S3 method for class 'predint'plot(x, ..., size = 4, width = 0.05, alpha = 0.5)

Arguments

x

object of classpredint

...

arguments handed over toggplot2::aes()

size

size of the dots

width

margin of jittering

alpha

opacity of dot colors

Value

Sinceplot.predint() is based onggplot2::ggplot, it returnsan object of classc("gg", "ggplot").

Examples

### PI for quasi-Poisson datapred_int <- quasi_pois_pi(histdat=ames_HCD,                          newoffset=3,                          nboot=100,                          traceplot = FALSE)### Plot the PIplot(pred_int)### Since plot.predint is based on ggplot, the grafic can be altered using# the methodology provided via ggplot2plot(pred_int)+     theme_classic()

Print objects of classpredint

Description

Print objects of classpredint

Usage

## S3 method for class 'predint'print(x, ...)

Arguments

x

an object of classpredint

...

additional arguments passed over tobase::cbind() andbase::data.frame()

Value

prints output to the console


Quasi-binomial data (example 1)

Description

This data set contains sampled quasi-binomial data from 10 clusterseach of size 50. The data set was sampled withrqbinom(n=10, size=50, prob=0.1, phi=3).

Usage

qb_dat1

Format

Adata.frame with 3 rows and 2 columns:

succ

numbers of success

fail

numbers of failures


Quasi-binomial data (example 2)

Description

This data set contains sampled quasi binomial data from 3 clusters withdifferent size.The data set was sampled withrqbinom(n=3, size=c(40, 50, 60), prob=0.1, phi=3).

Usage

qb_dat2

Format

Adata.frame with 3 rows and 2 columns:

succ

numbers of success

fail

numbers of failures


Simple uncalibrated prediction intervals for quasi-binomial data

Description

qb_pi() is a helper function that is internally called byquasi_bin_pi(). Itcalculates simple uncalibrated prediction intervals for binarydata with constant overdispersion (quasi-binomial assumption).

Usage

qb_pi(  newsize,  histsize,  pi,  phi,  q = qnorm(1 - 0.05/2),  alternative = "both",  newdat = NULL,  histdat = NULL,  algorithm = NULL)

Arguments

newsize

number of experimental units in the historical clusters.

histsize

number of experimental units in the future clusters.

pi

binomial proportion

phi

dispersion parameter

q

quantile used for interval calculation

alternative

either "both", "upper" or "lower"alternative specifies, if a prediction interval oran upper or a lower prediction limit should be computed

newdat

additional argument to specify the current data set

histdat

additional argument to specify the historical data set

algorithm

used to define the algorithm for calibration if called viaquasi_bin_pi. This argument is not of interest for the calculationof simple uncalibrated intervals

Details

This function returns a simple uncalibrated prediction interval

[l,u]_m = n^*_m \hat{\pi} \pm q \sqrt{\hat{\phi} n^*_m \hat{\pi} (1- \hat{\pi}) +\frac{\hat{\phi} n^{*2}_m \hat{\pi} (1- \hat{\pi})}{\sum_h n_h}}

withn^*_m as the number of experimental units in them=1, 2, ... , M future clusters,\hat{\pi} as the estimate for the binomial proportion obtained from thehistorical data,\hat{\phi} as the estimate for the dispersion parameterandn_h as the number of experimental units per historical cluster.

The direct application of this uncalibrated prediction interval to real life datais not recommended. Please use thebeta_bin_pi() functions for real life applications.

Value

qb_pi returns an object of classc("predint", "quasiBinomailPI").

Examples

qb_pred <- qb_pi(newsize=50, pi=0.3, phi=3, histsize=c(50, 50, 30), q=qnorm(1-0.05/2))summary(qb_pred)

Quasi-Poisson data (example 1)

Description

This data set contains sampled quasi-Poisson data for 10 clusters.The data set was sampled withrqpois(n=10, lambda=50, phi=3).

Usage

qp_dat1

Format

A data.frame with two columns

y

numbers of eventzs

offset

size of experimental units


Quasi-Poisson data (example 2)

Description

This data set contains sampled quasi-Poisson data for 3 clusters.The data set was sampled withrqpois(n=3, lambda=50, phi=3).

Usage

qp_dat2

Format

A data.frame with two columns

y

numbers of eventzs

offset

size of experimental units


Simple uncalibrated prediction intervals for quasi-Poisson data

Description

qp_pi() is a helper function that is internally called byquasi_pois_pi(). Itcalculates simple uncalibrated prediction intervals for Poissondata with constant overdispersion (quasi-Poisson assumption).

Usage

qp_pi(  newoffset,  histoffset,  lambda,  phi,  q = qnorm(1 - 0.05/2),  alternative = "both",  adjust = NULL,  newdat = NULL,  histdat = NULL,  algorithm = NULL)

Arguments

newoffset

number of experimental units in the future clusters

histoffset

number of experimental units in the historical clusters

lambda

overall Poisson mean

phi

dispersion parameter

q

quantile used for interval calculation

alternative

either "both", "upper" or "lower"alternative specifies, if a prediction interval oran upper or a lower prediction limit should be computed

adjust

only important if called viaquasi_pois_pi()

newdat

additional argument to specify the current data set

histdat

additional argument to specify the historical data set

algorithm

used to define the algorithm for calibration if called viaquasi_pois_pi(). This argument is not of interest for the calculationof simple uncalibrated intervals

Details

This function returns a simple uncalibrated prediction interval

[l,u]_m = n^*_m \hat{\lambda} \pm q \sqrt{n^*_m \hat{\phi} \hat{\lambda} + \frac{n^{*2}_m \hat{\phi} \hat{\lambda}}{\sum_h n_h}}

withn^*_m as the number of experimental units in them=1, 2, ... , M future clusters,\hat{\lambda} as the estimate for the Poisson mean obtained from thehistorical data,\hat{\phi} as the estimate for the dispersion parameterandn_h as the number of experimental units per historical cluster.

The direct application of this uncalibrated prediction interval to real life datais not recommended. Please use thequasi_pois_pi_pi() functions for real life applications.

Value

qp_pi returns an object of classc("predint", "quasiPoissonPI").

Examples

# Prediction intervalqp_pred <- qp_pi(newoffset=3, lambda=3, phi=3, histoffset=1:9, q=qnorm(1-0.05/2))summary(qp_pred)

Prediction intervals for quasi-binomial data

Description

quasi_bin_pi() calculates bootstrap calibrated prediction intervals for binomialdata with constant overdispersion (quasi-binomial assumption).

Usage

quasi_bin_pi(  histdat,  newdat = NULL,  newsize = NULL,  alternative = "both",  alpha = 0.05,  nboot = 10000,  delta_min = 0.01,  delta_max = 10,  tolerance = 0.001,  traceplot = TRUE,  n_bisec = 30,  algorithm = "MS22mod")

Arguments

histdat

adata.frame with two columns (success and failures) containing the historical data

newdat

adata.frame with two columns (success and failures) containing the future data

newsize

a vector containing the future cluster sizes

alternative

either "both", "upper" or "lower".alternativespecifies if a prediction interval or an upper or a lower prediction limitshould be computed

alpha

defines the level of confidence (1-alpha)

nboot

number of bootstraps

delta_min

lower start value for bisection

delta_max

upper start value for bisection

tolerance

tolerance for the coverage probability in the bisection

traceplot

ifTRUE: Plot for visualization of the bisection process

n_bisec

maximal number of bisection steps

algorithm

either "MS22" or "MS22mod" (see details)

Details

This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.

Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas

[l,u]_m = n^*_m \hat{\pi} \pm q^{calib} \hat{se}(Y_m - y^*_m)

where

\hat{se}(Y_m - y^*_m) = \sqrt{\hat{\phi} n^*_m \hat{\pi} (1- \hat{\pi}) +\frac{\hat{\phi} n^{*2}_m \hat{\pi} (1- \hat{\pi})}{\sum_h n_h}}

withn^*_m as the number of experimental units in the future clusters,\hat{\pi} as the estimate for the binomial proportion obtained from thehistorical data,q^{calib} as the bootstrap-calibrated coefficient,\hat{\phi} as the estimate for the dispersion parameterandn_h as the number of experimental units per historical cluster.

Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by

[l,u] = \Big[n^*_m \hat{\pi} - q^{calib}_l \hat{se}(Y_m - y^*_m), \quadn^*_m \hat{\pi} + q^{calib}_u \hat{se}(Y_m - y^*_m) \Big]

Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.

Value

quasi_bin_pi returns an object of classc("predint", "quasiBinomialPI")with prediction intervals or limits in the first entry ($prediction).

References

Menssen and Schaarschmidt (2019): Prediction intervals for overdispersed binomialdata with application to historical controls. Statistics in Medicine.doi:10.1002/sim.8124
Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260

Examples

# Pointwise prediction intervalpred_int <- quasi_bin_pi(histdat=mortality_HCD, newsize=40, nboot=100)summary(pred_int)# Pointwise upper prediction limitpred_u <- quasi_bin_pi(histdat=mortality_HCD, newsize=40, alternative="upper", nboot=100)summary(pred_u)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

Prediction intervals for quasi-Poisson data

Description

quasi_pois_pi() calculates bootstrap calibrated prediction intervals for Poissondata with constant overdispersion (quasi-Poisson).

Usage

quasi_pois_pi(  histdat,  newdat = NULL,  newoffset = NULL,  alternative = "both",  adjust = "within",  alpha = 0.05,  nboot = 10000,  delta_min = 0.01,  delta_max = 10,  tolerance = 0.001,  traceplot = TRUE,  n_bisec = 30,  algorithm = "MS22mod")

Arguments

histdat

adata.frame with two columns. The first has to containthe historical observations. The second has to contain the number of experimentalunits per study (offsets).

newdat

adata.frame with two columns. The first has to containthe future observations. The second has to contain the number of experimentalunits per study (offsets).

newoffset

vector with future number of experimental units per historicalstudy.

alternative

either "both", "upper" or "lower".alternative specifies if a prediction interval oran upper or a lower prediction limit should be computed

adjust

specifies if simultaneous prediction should be done for severalcontrol groups of different studies (between), or for the outcome ofthe current control and some treatment groupswithin the same trial

alpha

defines the level of confidence (1-\alpha)

nboot

number of bootstraps

delta_min

lower start value for bisection

delta_max

upper start value for bisection

tolerance

tolerance for the coverage probability in the bisection

traceplot

ifTRUE: Plot for visualization of the bisection process

n_bisec

maximal number of bisection steps

algorithm

either "MS22" or "MS22mod" (see details)

Details

This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.

Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas

[l,u]_m = n^*_m \hat{\lambda} \pm q^{calib} \sqrt{n^*_m \hat{\phi} \hat{\lambda} + \frac{n^{*2}_m \hat{\phi} \hat{\lambda}}{\sum_h n_h}}

withn^*_m as the number of experimental units in the future clusters,\hat{\lambda} as the estimate for the Poisson mean obtained from thehistorical data,q^{calib} as the bootstrap-calibrated coefficient,\hat{\phi} as the estimate for the dispersion parameterandn_h as the number of experimental units per historical cluster.

Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by

[l,u] = \Big[n^*_m \hat{\lambda} - q^{calib}_l \sqrt{n^*_m \hat{\phi} \hat{\lambda} + \frac{n^{*2}_m \hat{\phi} \hat{\lambda}}{\sum_h n_h}}, \quad n^*_m \hat{\lambda} + q^{calib}_u \sqrt{n^*_m \hat{\phi} \hat{\lambda} + \frac{n^{*2}_m \hat{\phi} \hat{\lambda}}{\sum_h n_h}} \Big]

Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.

Value

quasi_pois_pi returns an object of classc("predint", "quasiPoissonPI")with prediction intervals or limits in the first entry ($prediction).

References

Menssen et al. (2025): Prediction Intervals for Overdispersed PoissonData and Their Application in Medical and Pre-Clinical Quality Control.Pharmaceutical Statisticsdoi:10.1002/pst.2447

Examples

#' # Historical dataqp_dat1# Future dataqp_dat2# Pointwise prediction intervalpred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100)summary(pred_int)# Pointwise upper predictionpred_u <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, alternative="upper", nboot=100)summary(pred_u)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

Sampling of beta-binomial data

Description

rbbinom() samples beta-binomial data according to Menssen and Schaarschmidt (2019).

Usage

rbbinom(n, size, prob, rho)

Arguments

n

defines the number of clusters (i)

size

integer vector defining the number of trials per cluster (n_i)

prob

probability of success on each trial (\pi)

rho

intra class correlation (\rho)

Details

For beta binomial data withi=1, ... I clusters, the variance is

var(y_i)= n_i \pi (1-\pi) [1+ (n_i - 1) \rho]

with\rho as the intra class correlation coefficient

\rho = 1 / (1+a+b).

For the sampling(a+b) is defined as

(a+b)=(1-\rho)/\rho

wherea=\pi (a+b) andb=(a+b)-a. Then, the binomial proportionsfor each cluster are sampled from the beta distribution

\pi_i \sim Beta(a, b)

and the number of successes for each cluster are sampled to be

y_i \sim Bin(n_i, \pi_i).

In this parametrizationE(\pi_i)=\pi=a/(a+b) andE(y_i)=n_i \pi.Please note, that1+ (n_i-1) \rho is a constant if all cluster sizes arethe same and hence, in this special case, also the quasi-binomial assumption isfulfilled.

Value

adata.frame with two columns (succ, fail)

References

Menssen M, Schaarschmidt F.: Prediction intervals for overdispersed binomial datawith application to historical controls. Statistics in Medicine. 2019;38:2652-2663.doi:10.1002/sim.8124

Examples

# Sampling of example dataset.seed(234)bb_dat1 <- rbbinom(n=10, size=50, prob=0.1, rho=0.06)bb_dat1set.seed(234)bb_dat2 <- rbbinom(n=3, size=c(40, 50, 60), prob=0.1, rho=0.06)bb_dat2

Sampling of negative binomial data

Description

rnbinom() samples negative-binomial data.The following description of the sampling process is based on the parametrizationused by Gsteiger et al. 2013.

Usage

rnbinom(n, lambda, kappa, offset = NULL)

Arguments

n

defines the number of clusters (I)

lambda

defines the overall Poisson mean (\lambda)

kappa

dispersion parameter (\kappa)

offset

defines the number of experimental units per cluster (n_i)

Details

The variance of the negative-binomial distribution is

var(Y_i) = n_i \lambda (1+ \kappa n_i \lambda).

Negative-biomial observations can be sampled based on predefined values of\kappa,\lambda andn_i:
Define the parameters of the gamma distribution asa=\frac{1}{\kappa} andb_i=\frac{1}{\kappa n_i \lambda}. Then, sample the Poisson means for each cluster

\lambda_i \sim Gamma(a, b_i).

Finally, the observationsy_i are sampled from the Poisson distribution

y_i \sim Pois(\lambda_i)

Value

rnbinom() returns adata.frame with two columns:y as the observations andoffset as the number of offsets perobservation.

References

Gsteiger, S., Neuenschwander, B., Mercier, F. and Schmidli, H. (2013):Using historical control information for the design and analysis of clinicaltrials with overdispersed count data. Statistics in Medicine, 32: 3609-3622.doi:10.1002/sim.5851

Examples

# Sampling of negative-binomial observations# with different offsetsset.seed(123)rnbinom(n=5, lambda=5, kappa=0.13, offset=c(3,3,2,3,2))

Sampling of overdispersed binomial data with constant overdispersion

Description

rqbinom samples overdispersed binomial data with constant overdispersion fromthe beta-binomial distribution such that the quasi-binomial assumption is fulfilled.

Usage

rqbinom(n, size, prob, phi)

Arguments

n

defines the number of clusters (i)

size

integer vector defining the number of trials per cluster (n_i)

prob

probability of success on each trial (\pi)

phi

dispersion parameter (\Phi)

Details

It is assumed that the dispersion parameter (\Phi)is constant for alli=1, ... I clusters, such that the variance becomes

var(y_i)=\Phi n_i \pi (1-\pi).

For the sampling(a+b)_i is defined as

(a+b)_i=(\Phi-n_i)/(1-\Phi)

wherea_i=\pi (a+b)_i andb_i=(a+b)_i-a_i. Then, the binomial proportionsfor each cluster are sampled from the beta distribution

\pi_i \sim Beta(a_i, b_i)

and the numbers of success for each cluster are sampled to be

y_i \sim Bin(n_i, \pi_i).

In this parametrizationE(\pi_i)=\pi andE(y_i)=n_i \pi.Please note, the quasi-binomial assumption is not in contradiction withthe beta-binomial distribution if all cluster sizes are the same.

Value

adata.frame with two columns (succ, fail)

Examples

# Sampling of example dataset.seed(456)qb_dat1 <- rqbinom(n=10, size=50, prob=0.1, phi=3)qb_dat1set.seed(456)qb_dat2 <- rqbinom(n=3, size=c(40, 50, 60), prob=0.1, phi=3)qb_dat2

Sampling of overdispersed Poisson data with constant overdispersion

Description

rqpois() samples overdispersed Poisson data with constant overdispersion fromthe negative-binomial distribution such that the quasi-Poisson assumption is fulfilled.The following description of the sampling process is based on the parametrizationused by Gsteiger et al. 2013.

Usage

rqpois(n, lambda, phi, offset = NULL)

Arguments

n

defines the number of clusters (I)

lambda

defines the overall Poisson mean (\lambda)

phi

dispersion parameter (\Phi)

offset

defines the number of experimental units per cluster (n_i)

Details

It is assumed that the dispersion parameter (\Phi)is constant for alli=1, ... I clusters, such that the variance becomes

var(y_i) = \Phi n_i \lambda

For the sampling\kappa_i is defined as

\kappa_i=(\Phi-1)/(n_i \lambda)

wherea_i=1/\kappa_i andb_i=1/(\kappa_i n_i \lambda). Then, the Poisson meansfor each cluster are sampled from the gamma distribution

\lambda_i \sim Gamma(a_i, b_i)

and the observations per cluster are sampled to be

y_i \sim Pois(\lambda_i).

Please note, that the quasi-Poisson assumption is not in contradiction with thenegative-binomial distribution, if the data structure is defined by the numberof clusters only (which is the case here) and the offsets are all the samen_h = n_{h´} = n.

Value

a data.frame containing the sampled observations and the offsets

References

Gsteiger, S., Neuenschwander, B., Mercier, F. and Schmidli, H. (2013):Using historical control information for the design and analysis of clinicaltrials with overdispersed count data. Statistics in Medicine, 32: 3609-3622.doi:10.1002/sim.5851

Examples

# set.seed(123)qp_dat1 <- rqpois(n=10, lambda=50, phi=3)qp_dat1# set.seed(123)qp_dat2 <- rqpois(n=3, lambda=50, phi=3)qp_dat2

Summarizing objects of classpredint

Description

This function gives a summary about the prediction intervals (and limits)computed withpredint.

Usage

## S3 method for class 'predint'summary(object, ...)

Arguments

object

object of classpredint

...

further arguments passed over tobase::cbind() andbase::data.frame()

Value

Adata.frame containing the current data (if provided vianewdat),the prediction interval (or limit), the expected value for the future observation,the bootstrap calibrated coefficient(s), the prediction standard error anda statement about the coverage for each future observation, if new observationswere provided vianewdat.

Examples

# Fitting a random effects model based on c2_dat1fit <- lme4::lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)# Prediction interval using c2_dat2 as future datapred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100)summary(pred_int)#----------------------------------------------------------------------------# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.

[8]ページ先頭

©2009-2025 Movatter.jp