| Type: | Package |
| Title: | Prediction Intervals |
| Version: | 2.3.0 |
| Description: | An implementation of prediction intervals for overdispersed count data, for overdispersed binomial data and for linear random effects models. |
| License: | GPL-2 |GPL-3 [expanded from: GPL (≥ 2)] |
| Encoding: | UTF-8 |
| LazyData: | true |
| Imports: | stats, graphics, methods |
| RoxygenNote: | 7.3.2 |
| Suggests: | rmarkdown, knitr, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| Depends: | R (≥ 3.5.0), ggplot2, lme4, MASS |
| URL: | https://github.com/MaxMenssen/predint |
| BugReports: | https://github.com/MaxMenssen/predint/issues |
| NeedsCompilation: | no |
| Packaged: | 2025-07-22 10:33:24 UTC; maxmenssen |
| Author: | Max Menssen [aut, cre] |
| Maintainer: | Max Menssen <menssen@cell.uni-hannover.de> |
| Repository: | CRAN |
| Date/Publication: | 2025-07-22 11:01:31 UTC |
Historical numbers of revertant colonies in the Ames test (OECD 471)
Description
This data set contains artificial historical control data that was sampled inorder to mimic the number of revertant colonies based on two or three petri dishes.
Usage
ames_HCDFormat
Adata.frame with 2 rows and 10 columns:
- rev_col
no. of revertant colonies
- no_dish
no. of petri dishes in the control group
Store prediction intervals or limits as adata.frame
Description
Get the prediction intervals or limits of an object of classpredint andsave them as adata.frame.
Usage
## S3 method for class 'predint'as.data.frame(x, ...)Arguments
x | object of class |
... | additional arguments to be passed to |
Value
This function returns the prediction intervals or limits stored in anobject of class"predint" as adata.frame
Examples
### PI for quasi-Poisson datapred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100, traceplot = FALSE)# Return the prediction intervals as a data.frameas.data.frame(pred_int)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.Beta-binomial data (example 1)
Description
This data set contains sampled beta-binomial data from 10 clusterseach of size 50. The data set was sampled withrbbinom(n=10, size=50, prob=0.1, rho=0.06).
Usage
bb_dat1Format
Adata.frame with 10 rows and 2 columns:
- succ
number of successes
- fail
number of failures
Beta-binomial data (example 2)
Description
This data set contains sampled beta-binomial data from 3 clusters each ofdifferent size. The data set was sampled withrbbinom(n=3, size=c(40, 50, 60), prob=0.1, rho=0.06).
Usage
bb_dat2Format
Adata.frame with 3 rows and 2 columns:
- succ
number of successes
- fail
number of failures
Simple uncalibrated prediction intervals for beta-binomial data
Description
bb_pi() is a helper function that is internally called bybeta_bin_pi(). Itcalculates simple uncalibrated prediction intervals for binarydata with overdispersion changing between the clusters (beta-binomial).
Usage
bb_pi( newsize, histsize, pi, rho, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL)Arguments
newsize | number of experimental units in the historical clusters |
histsize | number of experimental units in the future clusters |
pi | binomial proportion |
rho | intra class correlation |
q | quantile used for interval calculation |
alternative | either "both", "upper" or "lower" |
newdat | additional argument to specify the current data set |
histdat | additional argument to specify the historical data set |
algorithm | used to define the algorithm for calibration if called via |
Details
This function returns a simple uncalibrated prediction interval
[l,u]_m = n^*_m \hat{\pi} \pm q \sqrt{n^*_m \hat{\pi} (1- \hat{\pi}) [1 + (n^*_m -1) \hat{\rho}] +[\frac{n^{*2}_m \hat{\pi} (1- \hat{\pi})}{\sum_h n_h} + \frac{\sum_h n_h -1}{\sum_h n_h} n^{*2}_m \hat{\pi} (1- \hat{\pi}) \hat{\rho}]}
withn^*_m as the number of experimental units in them=1, 2, ... , M future clusters,\hat{\pi} as the estimate for the binomial proportion obtained from thehistorical data,\hat{\rho} as the estimate for the intra class correlationandn_h as the number of experimental units per historical cluster.
The direct application of this uncalibrated prediction interval to real life datais not recommended. Please usebeta_bin_pi() for real life applications.
Value
bb_pi() returns an object of classc("predint", "betaBinomialPI")with prediction intervals or limits in the first entry ($prediction).
Examples
# Pointwise uncalibrated PIbb_pred <- bb_pi(newsize=c(50), pi=0.3, rho=0.05, histsize=rep(50, 20), q=qnorm(1-0.05/2))summary(bb_pred)Prediction intervals for beta-binomial data
Description
beta_bin_pi() calculates bootstrap calibrated prediction intervals forbeta-binomial data
Usage
beta_bin_pi( histdat, newdat = NULL, newsize = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod")Arguments
histdat | a |
newdat | a |
newsize | a vector containing the future cluster sizes |
alternative | either "both", "upper" or "lower". |
alpha | defines the level of confidence (1- |
nboot | number of bootstraps |
delta_min | lower start value for bisection |
delta_max | upper start value for bisection |
tolerance | tolerance for the coverage probability in the bisection |
traceplot | if |
n_bisec | maximal number of bisection steps |
algorithm | either "MS22" or "MS22mod" (see details) |
Details
This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.
Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas
[l,u]_m = n^*_m \hat{\pi} \pm q^{calib} \hat{se}(Y_m - y^*_m)
where
\hat{se}(Y_m - y^*_m) = \sqrt{n^*_m \hat{\pi} (1- \hat{\pi}) [1 + (n^*_m -1) \hat{\rho}] +[\frac{n^{*2}_m \hat{\pi} (1- \hat{\pi})}{\sum_h n_h} + \frac{\sum_h n_h -1}{\sum_h n_h} n^{*2}_m \hat{\pi} (1- \hat{\pi}) \hat{\rho}]}
withn^*_m as the number of experimental units in the future clusters,\hat{\pi} as the estimate for the binomial proportion obtained from thehistorical data,q^{calib} as the bootstrap-calibrated coefficient,\hat{\rho} as the estimate for the intra class correlation (Lui et al. 2000)andn_h as the number of experimental units per historical cluster.
Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by
[l,u]_m = \big[n^*_m \hat{\pi} - q^{calib}_l \hat{se}(Y_m - y^*_m), \quad n^*_m \hat{\pi} + q^{calib}_u \hat{se}(Y_m - y^*_m) \big]
Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.
Value
beta_bin_pi returns an object of classc("predint", "betaBinomialPI")with prediction intervals or limits in the first entry ($prediction).
References
Lui et al. (2000): Confidence intervals for the risk ratiounder cluster sampling based on the beta-binomial model. Statistics in Medicine.
doi:10.1002/1097-0258(20001115)19:21<2933::AID-SIM591>3.0.CO;2-Q
Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica.doi:10.1111/stan.12260
Examples
# Prediction intervalpred_int <- beta_bin_pi(histdat=mortality_HCD, newsize=40, nboot=100)summary(pred_int)# Upper prediction boundpred_u <- beta_bin_pi(histdat=mortality_HCD, newsize=40, alternative="upper", nboot=100)summary(pred_u)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.Bisection algorithm for bootstrap calibration of prediction intervals
Description
This helper function returns a bootstrap calibrated coefficient for the calculationof prediction intervals (and limits).
Usage
bisection( y_star_hat, pred_se, y_star, alternative, quant_min, quant_max, n_bisec, tol, alpha, traceplot = TRUE)Arguments
y_star_hat | a list of length |
pred_se | a list of length |
y_star | a list of length |
alternative | either "both", "upper" or "lower". |
quant_min | lower start value for bisection |
quant_max | upper start value for bisection |
n_bisec | maximal number of bisection steps |
tol | tolerance for the coverage probability in the bisection |
alpha | defines the level of confidence ( |
traceplot | if |
Details
This function is an implementation of the bisection algorithm of Menssenand Schaarschmidt 2022. It returns a calibrated coefficientq^{calib} for thecalculation of pointwise and simultaneous prediction intervals
[l,u] = \hat{y}^*_m \pm q^{calib} \hat{se}(Y_m - y^*_m),
lower prediction limits
l = \hat{y}^*_m - q^{calib} \hat{se}(Y_m - y^*_m)
or upper prediction limits
u = \hat{y}^*_m + q^{calib} \hat{se}(Y_m - y^*_m)
that cover all ofm=1, ... , M future observations.
In this notation,\hat{y}^*_m are the expected future observations for each ofthem future clusters,q^{calib} is thecalibrated coefficient and\hat{se}(Y_m - y^*_m)are the standard errors of the prediction.
Value
This function returnsq^{calib} in the equation above.
References
Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica.
doi:10.1111/stan.12260
Bootstrap new data from uncalibrated prediction intervals
Description
boot_predint() is a helper function to bootstrap new data from the simpleuncalibrated prediction intervals implemented in predint.
Usage
boot_predint(pred_int, nboot, adjust = "within")Arguments
pred_int | object of class |
nboot | number of bootstraps |
adjust | specifies if simultaneous prediction should be done for severalcontrol groups of different studies ( |
Details
This function only works for binomial and Poisson type data. For the samplingof new data from random effects models seelmer_bs.
Value
boot_predint returns an object of classc("predint", "bootstrap")which is a list with two entries: One for bootstrapped historical observationsand one for bootstrapped future observations.
Examples
# Simple quasi-Poisson PItest_pi <- qp_pi(histoffset=c(3,3,3,4,5), newoffset=3, lambda=10, phi=3, q=1.96)# Draw 5 bootstrap samplestest_boot <- boot_predint(pred_int = test_pi, nboot=50)str(test_boot)summary(test_boot)# Please note that the low number of bootstrap samples was chosen in order to# decrease computing time. For valid analysis draw at least 10000 bootstrap samples.Cross-classified data (example 1)
Description
c2_dat1 contains data that is sampled from a balanced cross-classified design.This data set is used in order to demonstrate the functionality of thelmer_pi_...() functions.
Usage
c2_dat1Format
Adata.frame with 27 rows and 3 columns:
- y_ijk
observations
- a
treatment a
- b
treatment b
Cross-classified data (example 2)
Description
c2_dat2 contains data that was sampled from an unbalanced cross-classified design.This data set is used in order to demonstrate the functionality of thelmer_pi_...() functions.
Usage
c2_dat2Format
Adata.frame with 21 rows and 3 columns:
- y_ijk
observations
- a
treatment a
- b
treatment b
Cross-classified data (example 3)
Description
c2_dat3 contains data that was sampled from a balanced cross-classified design.This data set is used in order to demonstrate the functionality of thelmer_pi_...() functions.
Usage
c2_dat3Format
Adata.frame with 8 rows and 3 columns:
- y_ijk
observations
- a
treatment a
- b
treatment b
Cross-classified data (example 4)
Description
c2_dat4 contains data that was sampled from an unbalanced cross-classified design.This data set is used in order to demonstrate the functionality of thelmer_pi_...() functions.
Usage
c2_dat4Format
Adata.frame with 6 rows and 3 columns:
- y_ijk
observations
- a
treatment a
- b
treatment b
Sampling of bootstrap data from a given random effects model
Description
lmer_bs() draws bootstrap samples based on the estimates for the meanand the variance components drawn from a random effects model fit withlme4::lmer().Contrary tolme4::bootMer(), the number of observations for each random factorcan vary between the original data set and the bootstrapped data. Random effectsinmodel have to be specified as(1|random effect).
Usage
lmer_bs(model, newdat = NULL, futmat_list = NULL, nboot)Arguments
model | a random effects model of class lmerMod |
newdat | a |
futmat_list | a list that contains design matrices for each random factor |
nboot | number of bootstrap samples |
Details
The data sampling is based on a list of design matrices (one for each random factor)that can be obtained ifnewdat and the model formula are provided tolme4::lFormula(). Hence, each random factor that is part of the initialmodel must have at least two replicates innewdat.
If a random factor in the future data set does not have any replicate, a listthat contains design matrices (one for each random factor) can be provided viafutmat_list.
Value
A list of lengthnboot containing the bootstrapped observations.
Examples
# loading lme4library(lme4)# Fitting a random effects model based on c2_dat1fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)summary(fit)#----------------------------------------------------------------------------### Using c2_dat2 as newdatc2_dat2lmer_bs(model=fit, newdat=c2_dat2, nboot=100)#----------------------------------------------------------------------------### Using futmat_list# c2_dat4 has no replication for b. Hence the list of design matrices can not be# generated by lme4::lFormula() and have to be provided by hand via futmat_list.c2_dat4# Build a list containing the design matricesfml <- vector(length=4, "list")names(fml) <- c("a:b", "b", "a", "Residual")fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1))fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))fml[["Residual"]] <- diag(6)fmllmer_bs(model=fit, futmat_list=fml, nboot=100)Prediction intervals for future observations based on linear random effects models (DEPRECATED)
Description
This function is deprecated. Please uselmer_pi_unstruc(),lmer_pi_futvec() orlmer_pi_futmat().
Usage
lmer_pi( model, newdat = NULL, m = NULL, alternative = "both", alpha = 0.05, nboot = 10000, lambda_min = 0.01, lambda_max = 10, traceplot = TRUE, n_bisec = 30)Arguments
model | a random effects model of class |
newdat | a |
m | number of future observations |
alternative | either "both", "upper" or "lower". |
alpha | defines the level of confidence (1- |
nboot | number of bootstraps |
lambda_min | lower start value for bisection |
lambda_max | upper start value for bisection |
traceplot | if |
n_bisec | maximal number of bisection steps |
Details
This function returns a bootstrap calibrated prediction interval
[l,u] = \hat{y} \pm q \sqrt{\hat{var}(\hat{y} - y)}
with\hat{y} as the predicted future observation,y as the observed future observations,\sqrt{\hat{var}(\hat{y} - y)}as the prediction standard error andq as the bootstrap calibrated coefficient thatapproximates a quantile of the multivariate t-distribution.
Please note that this function relies on linear random effects models that arefitted withlmer() from the lme4 package. Random effects have to be specified as(1|random_effect).
Value
Ifnewdat is specified: Adata.frame that contains the future data,the historical mean (hist_mean), the calibrated coefficient (quant_calib),the prediction standard error (pred_se), the prediction interval (lower and upper)and a statement if the prediction interval covers the future observation (cover).
Ifm is specified: Adata.frame that contains the number of future observations (m)the historical mean (hist_mean), the calibrated coefficient (quant_calib),the prediction standard error (pred_se) and the prediction interval (lower and upper).
Ifalternative is set to "lower": Lower prediction limits are computed insteadof a prediction interval.
Ifalternative is set to "upper": Upper prediction limits are computed insteadof a prediction interval.
Iftraceplot=TRUE, a graphical overview about the bisection process is given.
Examples
# This function is deprecated.# Please use lmer_pi_unstruc() if you want exactly the same functionality.# Please use lmer_pi_futmat() or lmer_pi_futvec() if you want to take care# of the future experimental designPrediction intervals for future observations based on linear random effects models
Description
lmer_pi_futmat() calculates a bootstrap calibrated prediction interval for one or morefuture observation(s) based on linear random effects models. With this approach,the experimental design of the future data is taken into account (see below).
Usage
lmer_pi_futmat( model, newdat = NULL, futmat_list = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22")Arguments
model | a random effects model of class |
newdat | either 1 or a |
futmat_list | a list that contains design matrices for each random factor |
alternative | either "both", "upper" or "lower". |
alpha | defines the level of confidence (1- |
nboot | number of bootstraps |
delta_min | lower start value for bisection |
delta_max | upper start value for bisection |
tolerance | tolerance for the coverage probability in the bisection |
traceplot | if |
n_bisec | maximal number of bisection steps |
algorithm | either "MS22" or "MS22mod" (see details) |
Details
This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.
Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas
[l,u] = \hat{\mu} \pm q^{calib} \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1}\hat{\sigma}^2_c}
with\hat{\mu} as the expected future observation (historical mean) and\hat{\sigma}^2_c as thec=1, 2, ..., C variance components and\hat{\sigma}^2_{C+1}as the residual variance obtained from the randomeffects model fitted withlme4::lmer() andq^{calib} as the as the bootstrap-calibratedcoefficient used for interval calculation.
Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by
[l,u] = \Big[\hat{\mu} - q^{calib}_l \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}, \quad\hat{\mu} + q^{calib}_u \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c} \Big].
Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.
Ifnewdat is defined, the bootstrapped future observations used for the calibrationprocess mimic the structure of the data set provided vianewdat. Thedata sampling is based on a list of design matrices (one for each random factor)that can be obtained ifnewdat and the model formula are provided tolme4::lFormula(). Hence, each random factor that is part of the initialmodel must have at least two replicates innewdat.
If a random factor in the future data set does not have any replicate, a listthat contains design matrices (one for each random factor) can be provided viafutmat_list.
This function is an implementation of the PI given in Menssen and Schaarschmidt 2022section 3.2.4, except, that the bootstrap calibration values are drawn frombootstrap samples that mimic the future data as described above.
Value
lmer_pi_futmat() returns an object of classc("predint", "normalPI")with prediction intervals or limits in the first entry ($prediction).
References
Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260
Examples
# loading lme4library(lme4)# Fitting a random effects model based on c2_dat1fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)summary(fit)#----------------------------------------------------------------------------### Using newdat# Prediction interval using c2_dat2 as future datapred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100)summary(pred_int)# Upper prediction limit for m=1 future observationspred_u <- lmer_pi_futmat(model=fit, newdat=1, alternative="upper", nboot=100)summary(pred_u)#----------------------------------------------------------------------------### Using futmat_list# c2_dat4 has no replication for b. Hence the list of design matrices can not be# generated by lme4::lFormula() and have to be provided by hand via futmat_list.c2_dat4# Build a list containing the design matricesfml <- vector(length=4, "list")names(fml) <- c("a:b", "b", "a", "Residual")fml[["a:b"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))fml[["b"]] <- matrix(nrow=6, ncol=1, data=c(1,1,1,1,1,1))fml[["a"]] <- matrix(nrow=6, ncol=2, data=c(1,1,0,0,0,0, 0,0,1,1,1,1))fml[["Residual"]] <- diag(6)fml# Please note, that the design matrix for the interaction term a:b is also# provided even there is no replication for b, since it is assumed that# both, the historical and the future data descent from the same data generating# process.# Calculate the PIpred_fml <- lmer_pi_futmat(model=fit, futmat_list=fml, alternative="both", nboot=100)summary(pred_fml)#----------------------------------------------------------------------------# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.Prediction intervals for future observations based on linear random effects models
Description
lmer_pi_futvec() calculates a bootstrap calibrated prediction interval for one or morefuture observation(s) based on linear random effects models. With this approach,the experimental design of the future data is taken into account (see below).
Usage
lmer_pi_futvec( model, futvec, newdat = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22")Arguments
model | a random effects model of class lmerMod |
futvec | an integer vector that defines the structure of the future data based on therow numbers of the historical data. If |
newdat | a |
alternative | either "both", "upper" or "lower". |
alpha | defines the level of confidence (1- |
nboot | number of bootstraps |
delta_min | lower start value for bisection |
delta_max | upper start value for bisection |
tolerance | tolerance for the coverage probability in the bisection |
traceplot | if |
n_bisec | maximal number of bisection steps |
algorithm | either "MS22" or "MS22mod" (see details) |
Details
This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.
Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas
[l,u] = \hat{\mu} \pm q^{calib} \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1}\hat{\sigma}^2_c}
with\hat{\mu} as the expected future observation (historical mean) and\hat{\sigma}^2_c as thec=1, 2, ..., C variance components and\hat{\sigma}^2_{C+1}as the residual variance obtained from the randomeffects model fitted withlme4::lmer() andq^{calib} as the as the bootstrap-calibratedcoefficient used for interval calculation.
Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by
[l,u] = \Big[\hat{\mu} - q^{calib}_l \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}, \quad\hat{\mu} + q^{calib}_u \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c} \Big].
Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.
Be aware that the sampling structure of the historical data must contain the structure of thefuture data. This means that the observations per random factor must be less orequal in the future data compared to the historical data.
This function is an implementation of the PI given in Menssen and Schaarschmidt 2022 section 3.2.4except that the bootstrap calibration values are drawn from bootstrap samples thatmimic the future data.
Value
lmer_pi_futvec() returns an object of classc("predint", "normalPI")with prediction intervals or limits in the first entry ($prediction).
References
Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260
Examples
# loading lme4library(lme4)# Fitting a random effects model based on c2_dat1fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)summary(fit)#----------------------------------------------------------------------------### Prediction interval using c2_dat3 as future data# without printing c2_dat3 in the output# Row numbers of the historical data c2_dat1 that define the structure of# the future data c2_dat3futvec <- c(1, 2, 4, 5, 10, 11, 13, 14)# Calculating the PIpred_int <- lmer_pi_futvec(model=fit, futvec=futvec, nboot=100)summary(pred_int)#----------------------------------------------------------------------------### Calculating the PI with c2_dat3 printed in the outputpred_int_new <- lmer_pi_futvec(model=fit, futvec=futvec, newdat=c2_dat3, nboot=100)summary(pred_int_new)#----------------------------------------------------------------------------### Upper prediction limit for m=1 future observationpred_u <- lmer_pi_futvec(model=fit, futvec=1, alternative="upper", nboot=100)summary(pred_u)#----------------------------------------------------------------------------# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.Prediction intervals for future observations based on linear random effects models
Description
lmer_pi_unstruc() calculates a bootstrap calibrated prediction interval for one or morefuture observation(s) based on linear random effects models as described in section3.2.4. of Menssen and Schaarschmidt (2022).Please note, that the bootstrap calibration used here does not consider the samplingstructure of the future data, since the calibration values are drawn randomly frombootstrap data sets that have the same structure as the historical data.
Usage
lmer_pi_unstruc( model, newdat = NULL, m = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22")Arguments
model | a random effects model of class lmerMod |
newdat | a |
m | number of future observations |
alternative | either "both", "upper" or "lower". |
alpha | defines the level of confidence (1- |
nboot | number of bootstraps |
delta_min | lower start value for bisection |
delta_max | upper start value for bisection |
tolerance | tolerance for the coverage probability in the bisection |
traceplot | if |
n_bisec | maximal number of bisection steps |
algorithm | either "MS22" or "MS22mod" (see details) |
Details
This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.
Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas
[l,u] = \hat{\mu} \pm q^{calib} \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1}\hat{\sigma}^2_c}
with\hat{\mu} as the expected future observation (historical mean) and\hat{\sigma}^2_c as thec=1, 2, ..., C variance components and\hat{\sigma}^2_{C+1}as the residual variance obtained from the randomeffects model fitted withlme4::lmer() andq^{calib} as the as the bootstrap-calibratedcoefficient used for interval calculation.
Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by
[l,u] = \Big[\hat{\mu} - q^{calib}_l \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}, \quad\hat{\mu} + q^{calib}_u \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c} \Big].
Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.
This function is an direct implementation of the PI given in Menssen and Schaarschmidt2022 section 3.2.4.
Value
lmer_pi_futvec() returns an object of classc("predint", "normalPI")with prediction intervals or limits in the first entry ($prediction).
References
Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260
Examples
# loading lme4library(lme4)# Fitting a random effects model based on c2_dat1fit <- lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)summary(fit)# Prediction interval using c2_dat2 as future datapred_int <- lmer_pi_unstruc(model=fit, newdat=c2_dat2, alternative="both", nboot=100)summary(pred_int)# Upper prediction limit for m=3 future observationspred_u <- lmer_pi_unstruc(model=fit, m=3, alternative="upper", nboot=100)summary(pred_u)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.Historical mortality of male B6C3F1-mice
Description
This data set contains historical control data about the mortality of male B6C3F1-miceobtained in long term carcinogenicity studies at the National Toxicology Programpresented in NTP Historical Control Reports from 2013 to 2016.It was used in Menssen and Schaarschmidt 2019 as a real life example.
Usage
mortality_HCDFormat
Adata.frame with 2 rows and 10 columns:
- dead
no. of dead mice
- alive
no. of living mice
References
Menssen and Schaarschmidt (2019): Prediction intervals for overdispersed binomialdata with application to historical controls. Statistics in Medicine.doi:10.1002/sim.8124
NTP Historical Control Reports:https://ntp.niehs.nih.gov/data/controls
Simple uncalibrated prediction intervals for negative-binomial data
Description
nb_pi() is a helper function that is internally called byneg_bin_pi(). Itcalculates simple uncalibrated prediction intervals for negative-binomial datawith offsets.
Usage
nb_pi( newoffset, histoffset, lambda, kappa, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL)Arguments
newoffset | number of experimental units in the future clusters |
histoffset | number of experimental units in the historical clusters |
lambda | overall Poisson mean |
kappa | dispersion parameter |
q | quantile used for interval calculation |
alternative | either "both", "upper" or "lower". |
newdat | additional argument to specify the current data set |
histdat | additional argument to specify the historical data set |
algorithm | used to define the algorithm for calibration if called via |
Details
This function returns a simple uncalibrated prediction interval
[l,u]_m = n^*_m \hat{\lambda} \pm q \sqrt{n^*_m\frac{\hat{\lambda} + \hat{\kappa} \bar{n} \hat{\lambda}}{\bar{n} H} +(n^*_m \hat{\lambda} + \hat{\kappa} n^{*2}_m \hat{\lambda}^2)}
withn^*_m as the number of experimental units inm=1, 2, ... , M future clusters,\hat{\lambda} as the estimate for the Poisson mean obtained from thehistorical data,\hat{\kappa} as the estimate for the dispersion parameter,n_h as the number of experimental units per historical cluster and\bar{n}=\sum_h^{n_h} n_h / H.
The direct application of this uncalibrated prediction interval to real life datais not recommended. Please use theneg_bin_pi() function for real life applications.
Value
np_pi returns an object of classc("predint", "negativeBinomialPI").
Examples
# Prediction intervalnb_pred <- nb_pi(newoffset=3, lambda=3, kappa=0.04, histoffset=1:9, q=qnorm(1-0.05/2))summary(nb_pred)Prediction intervals for negative-binomial data
Description
neg_bin_pi() calculates bootstrap calibrated prediction intervals fornegative-binomial data.
Usage
neg_bin_pi( histdat, newdat = NULL, newoffset = NULL, alternative = "both", adjust = "within", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod")Arguments
histdat | a |
newdat |
|
newoffset | vector with future number of experimental units per historicalstudy. |
alternative | either "both", "upper" or "lower". |
adjust | specifies if simultaneous prediction should be done for severalcontrol groups of different studies ( |
alpha | defines the level of confidence ( |
nboot | number of bootstraps |
delta_min | lower start value for bisection |
delta_max | upper start value for bisection |
tolerance | tolerance for the coverage probability in the bisection |
traceplot | if |
n_bisec | maximal number of bisection steps |
algorithm | either "MS22" or "MS22mod" (see details) |
Details
This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.
Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas
[l,u]_m = n^*_m \hat{\lambda} \pm q \sqrt{n^*_m\frac{\hat{\lambda} + \hat{\kappa} \bar{n} \hat{\lambda}}{\bar{n} H} +(n^*_m \hat{\lambda} + \hat{\kappa} n^{*2}_m \hat{\lambda}^2)}
withn^*_m as the number of experimental units in the future clusters,\hat{\lambda} as the estimate for the Poisson mean obtained from thehistorical data,\hat{\kappa} as the estimate for the dispersion parameter,n_h as the number of experimental units per historical cluster and\bar{n}=\sum_h^{n_h} n_h / H.
Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by
[l,u] = \Big[n^*_m \hat{\lambda} - q^{calib}_l \sqrt{n^*_m\frac{\hat{\lambda} + \hat{\kappa} \bar{n} \hat{\lambda}}{\bar{n} H} +(n^*_m \hat{\lambda} + \hat{\kappa} n^{*2}_m \hat{\lambda}^2)}, \quad n^*_m \hat{\lambda} + q^{calib}_u \sqrt{n^*_m\frac{\hat{\lambda} + \hat{\kappa} \bar{n} \hat{\lambda}}{\bar{n} H} +(n^*_m \hat{\lambda} + \hat{\kappa} n^{*2}_m \hat{\lambda}^2)} \Big]
Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.
Value
neg_bin_pi() returns an object of classc("predint", "negativeBinomialPI")with prediction intervals or limits in the first entry ($prediction).
References
Menssen et al. (2025): Prediction Intervals for Overdispersed PoissonData and Their Application in Medical and Pre-Clinical Quality Control.Pharmaceutical Statisticsdoi:10.1002/pst.2447
Examples
# HCD from the Ames testames_HCD# Pointwise prediction interval for one future number of revertant colonies# obtained in three petridishespred_int <- neg_bin_pi(histdat=ames_HCD, newoffset=3, nboot=100)summary(pred_int)# Simultaneous prediction interval for the numbers of revertant colonies obtained in# the control and three treatment groups of a future trialpred_int_w <- neg_bin_pi(histdat=ames_HCD, newoffset=c(3, 3, 3, 3), adjust="within", nboot=100)summary(pred_int_w)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.Simple uncalibrated prediction intervals for normal distributed data
Description
normal_pi() is a helper function that is internally called by thelmer_pi_...() functions.It calculates simple uncalibrated prediction intervals for normal distributedobservations.
Usage
normal_pi( mu, pred_se, m = 1, q = qnorm(1 - 0.05/2), alternative = "both", futmat_list = NULL, futvec = NULL, newdat = NULL, histdat = NULL, algorithm = NULL)Arguments
mu | overall mean |
pred_se | standard error of the prediction |
m | number of future observations |
q | quantile used for interval calculation |
alternative | either "both", "upper" or "lower" |
futmat_list | used to add the list of future design matrices to the outputif called via |
futvec | used to add the vector of the historical row numbers that definethe future experimental design to the output if called via |
newdat | additional argument to specify the current data set |
histdat | additional argument to specify the historical data set |
algorithm | used to define the algorithm for calibration if called via |
Details
This function returns a simple uncalibrated prediction interval asgiven in Menssen and Schaarschmidt 2022
[l,u] = \hat{\mu} \pm q \sqrt{\widehat{var}(\hat{\mu}) + \sum_{c=1}^{C+1} \hat{\sigma}^2_c}
with\hat{\mu} as the expected future observation (historical mean) and\hat{\sigma}^2_c as thec=1, 2, ..., C variance components and\hat{\sigma}^2_{C+1}as the residual variance andq as the quantile used for interval calculation.
The direct application of this uncalibrated prediction interval to real life datais not recommended. Please use thelmer_pi_...() functions for real life applications.
Value
normal_pi() returns an object of classc("predint", "normalPI")with prediction intervals or limits in the first entry ($prediction).
References
Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260
Examples
# simple PInorm_pred <- normal_pi(mu=10, pred_se=3, m=1)summary(norm_pred)Estimation of the binomial proportion and the intra class correlation.
Description
pi_rho_est() estimates the overall binomial proportion\hat{\pi} and the intraclass correlation\hat{\rho} of data that is assumed to follow the beta-binomialdistribution. The estimation of\hat{\pi} and\hat{\rho} is done followingthe approach of Lui et al. 2000.
Usage
pi_rho_est(dat)Arguments
dat | a |
Value
a vector containing estimates for\pi and\rho
References
Lui, K.-J., Mayer, J.A. and Eckhardt, L: Confidence intervals for the risk ratiounder cluster sampling based on the beta-binomial model. Statistics in Medicine.2000;19:2933-2942.doi:10.1002/1097-0258(20001115)19:21<2933::AID-SIM591>3.0.CO;2-Q
Examples
# Estimates for bb_dat1pi_rho_est(bb_dat1)Plots ofpredint objects
Description
This function provides methodology for plotting the prediction intervals orlimits that are calculated using the functionality of thepredint package.
Usage
## S3 method for class 'predint'plot(x, ..., size = 4, width = 0.05, alpha = 0.5)Arguments
x | object of class |
... | arguments handed over to |
size | size of the dots |
width | margin of jittering |
alpha | opacity of dot colors |
Value
Sinceplot.predint() is based onggplot2::ggplot, it returnsan object of classc("gg", "ggplot").
Examples
### PI for quasi-Poisson datapred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100, traceplot = FALSE)### Plot the PIplot(pred_int)### Since plot.predint is based on ggplot, the grafic can be altered using# the methodology provided via ggplot2plot(pred_int)+ theme_classic()Print objects of classpredint
Description
Print objects of classpredint
Usage
## S3 method for class 'predint'print(x, ...)Arguments
x | an object of class |
... | additional arguments passed over to |
Value
prints output to the console
Quasi-binomial data (example 1)
Description
This data set contains sampled quasi-binomial data from 10 clusterseach of size 50. The data set was sampled withrqbinom(n=10, size=50, prob=0.1, phi=3).
Usage
qb_dat1Format
Adata.frame with 3 rows and 2 columns:
- succ
numbers of success
- fail
numbers of failures
Quasi-binomial data (example 2)
Description
This data set contains sampled quasi binomial data from 3 clusters withdifferent size.The data set was sampled withrqbinom(n=3, size=c(40, 50, 60), prob=0.1, phi=3).
Usage
qb_dat2Format
Adata.frame with 3 rows and 2 columns:
- succ
numbers of success
- fail
numbers of failures
Simple uncalibrated prediction intervals for quasi-binomial data
Description
qb_pi() is a helper function that is internally called byquasi_bin_pi(). Itcalculates simple uncalibrated prediction intervals for binarydata with constant overdispersion (quasi-binomial assumption).
Usage
qb_pi( newsize, histsize, pi, phi, q = qnorm(1 - 0.05/2), alternative = "both", newdat = NULL, histdat = NULL, algorithm = NULL)Arguments
newsize | number of experimental units in the historical clusters. |
histsize | number of experimental units in the future clusters. |
pi | binomial proportion |
phi | dispersion parameter |
q | quantile used for interval calculation |
alternative | either "both", "upper" or "lower" |
newdat | additional argument to specify the current data set |
histdat | additional argument to specify the historical data set |
algorithm | used to define the algorithm for calibration if called via |
Details
This function returns a simple uncalibrated prediction interval
[l,u]_m = n^*_m \hat{\pi} \pm q \sqrt{\hat{\phi} n^*_m \hat{\pi} (1- \hat{\pi}) +\frac{\hat{\phi} n^{*2}_m \hat{\pi} (1- \hat{\pi})}{\sum_h n_h}}
withn^*_m as the number of experimental units in them=1, 2, ... , M future clusters,\hat{\pi} as the estimate for the binomial proportion obtained from thehistorical data,\hat{\phi} as the estimate for the dispersion parameterandn_h as the number of experimental units per historical cluster.
The direct application of this uncalibrated prediction interval to real life datais not recommended. Please use thebeta_bin_pi() functions for real life applications.
Value
qb_pi returns an object of classc("predint", "quasiBinomailPI").
Examples
qb_pred <- qb_pi(newsize=50, pi=0.3, phi=3, histsize=c(50, 50, 30), q=qnorm(1-0.05/2))summary(qb_pred)Quasi-Poisson data (example 1)
Description
This data set contains sampled quasi-Poisson data for 10 clusters.The data set was sampled withrqpois(n=10, lambda=50, phi=3).
Usage
qp_dat1Format
A data.frame with two columns
- y
numbers of eventzs
- offset
size of experimental units
Quasi-Poisson data (example 2)
Description
This data set contains sampled quasi-Poisson data for 3 clusters.The data set was sampled withrqpois(n=3, lambda=50, phi=3).
Usage
qp_dat2Format
A data.frame with two columns
- y
numbers of eventzs
- offset
size of experimental units
Simple uncalibrated prediction intervals for quasi-Poisson data
Description
qp_pi() is a helper function that is internally called byquasi_pois_pi(). Itcalculates simple uncalibrated prediction intervals for Poissondata with constant overdispersion (quasi-Poisson assumption).
Usage
qp_pi( newoffset, histoffset, lambda, phi, q = qnorm(1 - 0.05/2), alternative = "both", adjust = NULL, newdat = NULL, histdat = NULL, algorithm = NULL)Arguments
newoffset | number of experimental units in the future clusters |
histoffset | number of experimental units in the historical clusters |
lambda | overall Poisson mean |
phi | dispersion parameter |
q | quantile used for interval calculation |
alternative | either "both", "upper" or "lower" |
adjust | only important if called via |
newdat | additional argument to specify the current data set |
histdat | additional argument to specify the historical data set |
algorithm | used to define the algorithm for calibration if called via |
Details
This function returns a simple uncalibrated prediction interval
[l,u]_m = n^*_m \hat{\lambda} \pm q \sqrt{n^*_m \hat{\phi} \hat{\lambda} + \frac{n^{*2}_m \hat{\phi} \hat{\lambda}}{\sum_h n_h}}
withn^*_m as the number of experimental units in them=1, 2, ... , M future clusters,\hat{\lambda} as the estimate for the Poisson mean obtained from thehistorical data,\hat{\phi} as the estimate for the dispersion parameterandn_h as the number of experimental units per historical cluster.
The direct application of this uncalibrated prediction interval to real life datais not recommended. Please use thequasi_pois_pi_pi() functions for real life applications.
Value
qp_pi returns an object of classc("predint", "quasiPoissonPI").
Examples
# Prediction intervalqp_pred <- qp_pi(newoffset=3, lambda=3, phi=3, histoffset=1:9, q=qnorm(1-0.05/2))summary(qp_pred)Prediction intervals for quasi-binomial data
Description
quasi_bin_pi() calculates bootstrap calibrated prediction intervals for binomialdata with constant overdispersion (quasi-binomial assumption).
Usage
quasi_bin_pi( histdat, newdat = NULL, newsize = NULL, alternative = "both", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod")Arguments
histdat | a |
newdat | a |
newsize | a vector containing the future cluster sizes |
alternative | either "both", "upper" or "lower". |
alpha | defines the level of confidence (1-alpha) |
nboot | number of bootstraps |
delta_min | lower start value for bisection |
delta_max | upper start value for bisection |
tolerance | tolerance for the coverage probability in the bisection |
traceplot | if |
n_bisec | maximal number of bisection steps |
algorithm | either "MS22" or "MS22mod" (see details) |
Details
This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.
Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas
[l,u]_m = n^*_m \hat{\pi} \pm q^{calib} \hat{se}(Y_m - y^*_m)
where
\hat{se}(Y_m - y^*_m) = \sqrt{\hat{\phi} n^*_m \hat{\pi} (1- \hat{\pi}) +\frac{\hat{\phi} n^{*2}_m \hat{\pi} (1- \hat{\pi})}{\sum_h n_h}}
withn^*_m as the number of experimental units in the future clusters,\hat{\pi} as the estimate for the binomial proportion obtained from thehistorical data,q^{calib} as the bootstrap-calibrated coefficient,\hat{\phi} as the estimate for the dispersion parameterandn_h as the number of experimental units per historical cluster.
Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by
[l,u] = \Big[n^*_m \hat{\pi} - q^{calib}_l \hat{se}(Y_m - y^*_m), \quadn^*_m \hat{\pi} + q^{calib}_u \hat{se}(Y_m - y^*_m) \Big]
Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.
Value
quasi_bin_pi returns an object of classc("predint", "quasiBinomialPI")with prediction intervals or limits in the first entry ($prediction).
References
Menssen and Schaarschmidt (2019): Prediction intervals for overdispersed binomialdata with application to historical controls. Statistics in Medicine.doi:10.1002/sim.8124
Menssen and Schaarschmidt (2022): Prediction intervals for all of M futureobservations based on linear random effects models. Statistica Neerlandica,doi:10.1111/stan.12260
Examples
# Pointwise prediction intervalpred_int <- quasi_bin_pi(histdat=mortality_HCD, newsize=40, nboot=100)summary(pred_int)# Pointwise upper prediction limitpred_u <- quasi_bin_pi(histdat=mortality_HCD, newsize=40, alternative="upper", nboot=100)summary(pred_u)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.Prediction intervals for quasi-Poisson data
Description
quasi_pois_pi() calculates bootstrap calibrated prediction intervals for Poissondata with constant overdispersion (quasi-Poisson).
Usage
quasi_pois_pi( histdat, newdat = NULL, newoffset = NULL, alternative = "both", adjust = "within", alpha = 0.05, nboot = 10000, delta_min = 0.01, delta_max = 10, tolerance = 0.001, traceplot = TRUE, n_bisec = 30, algorithm = "MS22mod")Arguments
histdat | a |
newdat | a |
newoffset | vector with future number of experimental units per historicalstudy. |
alternative | either "both", "upper" or "lower". |
adjust | specifies if simultaneous prediction should be done for severalcontrol groups of different studies ( |
alpha | defines the level of confidence ( |
nboot | number of bootstraps |
delta_min | lower start value for bisection |
delta_max | upper start value for bisection |
tolerance | tolerance for the coverage probability in the bisection |
traceplot | if |
n_bisec | maximal number of bisection steps |
algorithm | either "MS22" or "MS22mod" (see details) |
Details
This function returns bootstrap-calibrated prediction intervals as well aslower or upper prediction limits.
Ifalgorithm is set to "MS22", both limits of the prediction intervalare calibrated simultaneously using the algorithm described in Menssen andSchaarschmidt (2022), section 3.2.4. The calibrated prediction interval is givenas
[l,u]_m = n^*_m \hat{\lambda} \pm q^{calib} \sqrt{n^*_m \hat{\phi} \hat{\lambda} + \frac{n^{*2}_m \hat{\phi} \hat{\lambda}}{\sum_h n_h}}
withn^*_m as the number of experimental units in the future clusters,\hat{\lambda} as the estimate for the Poisson mean obtained from thehistorical data,q^{calib} as the bootstrap-calibrated coefficient,\hat{\phi} as the estimate for the dispersion parameterandn_h as the number of experimental units per historical cluster.
Ifalgorithm is set to "MS22mod", both limits of the prediction intervalare calibrated independently from each other. The resulting prediction intervalis given by
[l,u] = \Big[n^*_m \hat{\lambda} - q^{calib}_l \sqrt{n^*_m \hat{\phi} \hat{\lambda} + \frac{n^{*2}_m \hat{\phi} \hat{\lambda}}{\sum_h n_h}}, \quad n^*_m \hat{\lambda} + q^{calib}_u \sqrt{n^*_m \hat{\phi} \hat{\lambda} + \frac{n^{*2}_m \hat{\phi} \hat{\lambda}}{\sum_h n_h}} \Big]
Please note, that this modification does not affect the calibration procedure, if onlyprediction limits are of interest.
Value
quasi_pois_pi returns an object of classc("predint", "quasiPoissonPI")with prediction intervals or limits in the first entry ($prediction).
References
Menssen et al. (2025): Prediction Intervals for Overdispersed PoissonData and Their Application in Medical and Pre-Clinical Quality Control.Pharmaceutical Statisticsdoi:10.1002/pst.2447
Examples
#' # Historical dataqp_dat1# Future dataqp_dat2# Pointwise prediction intervalpred_int <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, nboot=100)summary(pred_int)# Pointwise upper predictionpred_u <- quasi_pois_pi(histdat=ames_HCD, newoffset=3, alternative="upper", nboot=100)summary(pred_u)# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.Sampling of beta-binomial data
Description
rbbinom() samples beta-binomial data according to Menssen and Schaarschmidt (2019).
Usage
rbbinom(n, size, prob, rho)Arguments
n | defines the number of clusters ( |
size | integer vector defining the number of trials per cluster ( |
prob | probability of success on each trial ( |
rho | intra class correlation ( |
Details
For beta binomial data withi=1, ... I clusters, the variance is
var(y_i)= n_i \pi (1-\pi) [1+ (n_i - 1) \rho]
with\rho as the intra class correlation coefficient
\rho = 1 / (1+a+b).
For the sampling(a+b) is defined as
(a+b)=(1-\rho)/\rho
wherea=\pi (a+b) andb=(a+b)-a. Then, the binomial proportionsfor each cluster are sampled from the beta distribution
\pi_i \sim Beta(a, b)
and the number of successes for each cluster are sampled to be
y_i \sim Bin(n_i, \pi_i).
In this parametrizationE(\pi_i)=\pi=a/(a+b) andE(y_i)=n_i \pi.Please note, that1+ (n_i-1) \rho is a constant if all cluster sizes arethe same and hence, in this special case, also the quasi-binomial assumption isfulfilled.
Value
adata.frame with two columns (succ, fail)
References
Menssen M, Schaarschmidt F.: Prediction intervals for overdispersed binomial datawith application to historical controls. Statistics in Medicine. 2019;38:2652-2663.doi:10.1002/sim.8124
Examples
# Sampling of example dataset.seed(234)bb_dat1 <- rbbinom(n=10, size=50, prob=0.1, rho=0.06)bb_dat1set.seed(234)bb_dat2 <- rbbinom(n=3, size=c(40, 50, 60), prob=0.1, rho=0.06)bb_dat2Sampling of negative binomial data
Description
rnbinom() samples negative-binomial data.The following description of the sampling process is based on the parametrizationused by Gsteiger et al. 2013.
Usage
rnbinom(n, lambda, kappa, offset = NULL)Arguments
n | defines the number of clusters ( |
lambda | defines the overall Poisson mean ( |
kappa | dispersion parameter ( |
offset | defines the number of experimental units per cluster ( |
Details
The variance of the negative-binomial distribution is
var(Y_i) = n_i \lambda (1+ \kappa n_i \lambda).
Negative-biomial observations can be sampled based on predefined values of\kappa,\lambda andn_i:
Define the parameters of the gamma distribution asa=\frac{1}{\kappa} andb_i=\frac{1}{\kappa n_i \lambda}. Then, sample the Poisson means for each cluster
\lambda_i \sim Gamma(a, b_i).
Finally, the observationsy_i are sampled from the Poisson distribution
y_i \sim Pois(\lambda_i)
Value
rnbinom() returns adata.frame with two columns:y as the observations andoffset as the number of offsets perobservation.
References
Gsteiger, S., Neuenschwander, B., Mercier, F. and Schmidli, H. (2013):Using historical control information for the design and analysis of clinicaltrials with overdispersed count data. Statistics in Medicine, 32: 3609-3622.doi:10.1002/sim.5851
Examples
# Sampling of negative-binomial observations# with different offsetsset.seed(123)rnbinom(n=5, lambda=5, kappa=0.13, offset=c(3,3,2,3,2))Sampling of overdispersed binomial data with constant overdispersion
Description
rqbinom samples overdispersed binomial data with constant overdispersion fromthe beta-binomial distribution such that the quasi-binomial assumption is fulfilled.
Usage
rqbinom(n, size, prob, phi)Arguments
n | defines the number of clusters ( |
size | integer vector defining the number of trials per cluster ( |
prob | probability of success on each trial ( |
phi | dispersion parameter ( |
Details
It is assumed that the dispersion parameter (\Phi)is constant for alli=1, ... I clusters, such that the variance becomes
var(y_i)=\Phi n_i \pi (1-\pi).
For the sampling(a+b)_i is defined as
(a+b)_i=(\Phi-n_i)/(1-\Phi)
wherea_i=\pi (a+b)_i andb_i=(a+b)_i-a_i. Then, the binomial proportionsfor each cluster are sampled from the beta distribution
\pi_i \sim Beta(a_i, b_i)
and the numbers of success for each cluster are sampled to be
y_i \sim Bin(n_i, \pi_i).
In this parametrizationE(\pi_i)=\pi andE(y_i)=n_i \pi.Please note, the quasi-binomial assumption is not in contradiction withthe beta-binomial distribution if all cluster sizes are the same.
Value
adata.frame with two columns (succ, fail)
Examples
# Sampling of example dataset.seed(456)qb_dat1 <- rqbinom(n=10, size=50, prob=0.1, phi=3)qb_dat1set.seed(456)qb_dat2 <- rqbinom(n=3, size=c(40, 50, 60), prob=0.1, phi=3)qb_dat2Sampling of overdispersed Poisson data with constant overdispersion
Description
rqpois() samples overdispersed Poisson data with constant overdispersion fromthe negative-binomial distribution such that the quasi-Poisson assumption is fulfilled.The following description of the sampling process is based on the parametrizationused by Gsteiger et al. 2013.
Usage
rqpois(n, lambda, phi, offset = NULL)Arguments
n | defines the number of clusters ( |
lambda | defines the overall Poisson mean ( |
phi | dispersion parameter ( |
offset | defines the number of experimental units per cluster ( |
Details
It is assumed that the dispersion parameter (\Phi)is constant for alli=1, ... I clusters, such that the variance becomes
var(y_i) = \Phi n_i \lambda
For the sampling\kappa_i is defined as
\kappa_i=(\Phi-1)/(n_i \lambda)
wherea_i=1/\kappa_i andb_i=1/(\kappa_i n_i \lambda). Then, the Poisson meansfor each cluster are sampled from the gamma distribution
\lambda_i \sim Gamma(a_i, b_i)
and the observations per cluster are sampled to be
y_i \sim Pois(\lambda_i).
Please note, that the quasi-Poisson assumption is not in contradiction with thenegative-binomial distribution, if the data structure is defined by the numberof clusters only (which is the case here) and the offsets are all the samen_h = n_{h´} = n.
Value
a data.frame containing the sampled observations and the offsets
References
Gsteiger, S., Neuenschwander, B., Mercier, F. and Schmidli, H. (2013):Using historical control information for the design and analysis of clinicaltrials with overdispersed count data. Statistics in Medicine, 32: 3609-3622.doi:10.1002/sim.5851
Examples
# set.seed(123)qp_dat1 <- rqpois(n=10, lambda=50, phi=3)qp_dat1# set.seed(123)qp_dat2 <- rqpois(n=3, lambda=50, phi=3)qp_dat2Summarizing objects of classpredint
Description
This function gives a summary about the prediction intervals (and limits)computed withpredint.
Usage
## S3 method for class 'predint'summary(object, ...)Arguments
object | object of class |
... | further arguments passed over to |
Value
Adata.frame containing the current data (if provided vianewdat),the prediction interval (or limit), the expected value for the future observation,the bootstrap calibrated coefficient(s), the prediction standard error anda statement about the coverage for each future observation, if new observationswere provided vianewdat.
Examples
# Fitting a random effects model based on c2_dat1fit <- lme4::lmer(y_ijk~(1|a)+(1|b)+(1|a:b), c2_dat1)# Prediction interval using c2_dat2 as future datapred_int <- lmer_pi_futmat(model=fit, newdat=c2_dat2, alternative="both", nboot=100)summary(pred_int)#----------------------------------------------------------------------------# Please note that nboot was set to 100 in order to decrease computing time# of the example. For a valid analysis set nboot=10000.