| Type: | Package |
| Title: | Flexible Parametric Survival and Multi-State Models |
| Version: | 2.3.2 |
| Date: | 2024-08-16 |
| Description: | Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models. |
| License: | GPL-2 |GPL-3 [expanded from: GPL (≥ 2)] |
| Depends: | survival, R (≥ 2.15.0) |
| Imports: | assertthat, deSolve, generics, magrittr, mstate (≥ 0.2.10),Matrix, muhaz, mvtnorm, numDeriv, quadprog, Rcpp (≥ 0.11.5),rlang, rstpm2, purrr, statmod, tibble, tidyr, dplyr,tidyselect, ggplot2 |
| Encoding: | UTF-8 |
| Suggests: | splines2, flexsurvcure, survminer, lubridate, rmarkdown,colorspace, eha, knitr, msm, testthat, TH.data, broom, covr |
| URL: | https://github.com/chjackson/flexsurv,http://chjackson.github.io/flexsurv/ |
| BugReports: | https://github.com/chjackson/flexsurv/issues |
| VignetteBuilder: | knitr |
| LazyData: | yes |
| LinkingTo: | Rcpp |
| RoxygenNote: | 7.3.1 |
| NeedsCompilation: | yes |
| Packaged: | 2024-08-16 15:05:50 UTC; Chris |
| Author: | Christopher Jackson [aut, cre], Paul Metcalfe [ctb], Jordan Amdahl [ctb], Matthew T. Warkentin [ctb], Michael Sweeting [ctb], Kevin Kunzmann [ctb] |
| Maintainer: | Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk> |
| Repository: | CRAN |
| Date/Publication: | 2024-08-17 05:50:02 UTC |
flexsurv: Flexible parametric survival and multi-state models
Description
flexsurv: Flexible parametric models for time-to-event data, including thegeneralized gamma, the generalized F and the Royston-Parmar spline model,and extensible to user-defined distributions.
Details
flexsurvreg fits parametric models for time-to-event(survival) data. Data may be right-censored, and/or left-censored, and/orleft-truncated. Several built-in parametric distributions are available.Any user-defined parametric model can also be employed by supplying a listwith basic information about the distribution, including the density orhazard and ideally also the cumulative distribution or hazard.
Covariates can be included using a linear model on any parameter of thedistribution, log-transformed to the real line if necessary. Thistypically defines an accelerated failure time or proportional hazardsmodel, depending on the distribution and parameter.
flexsurvspline fits the flexible survival model of Roystonand Parmar (2002) in which the log cumulative hazard is modelled as anatural cubic spline function of log time. Covariates can be included onany of the spline parameters, giving either a proportional hazards model oran arbitrarily-flexible time-dependent effect. Alternative proportionalodds or probit parameterisations are available.
Output from the models can be presented as survivor, cumulative hazard andhazard functions (summary.flexsurvreg). These can be plottedagainst nonparametric estimates (plot.flexsurvreg) to assessgoodness-of-fit. Any other user-defined function of the parameters may besummarised in the same way.
Multi-state models for time-to-event data can also be fitted with the samefunctions. Predictions from those models can then be made using thefunctionspmatrix.fs,pmatrix.simfs,totlos.fs,totlos.simfs, orsim.fmsm, or alternatively bymsfit.flexsurvregfollowed bymssample orprobtrans from the packagemstate.
Distribution (“dpqr”) functions for the generalized gamma and Fdistributions are given inGenGamma,GenF(preferred parameterisations) andGenGamma.orig,GenF.orig (original parameterisations).flexsurv also includes the standard Gompertz distributionwith unrestricted shape parameter, seeGompertz.
User guide
Theflexsurv user guide vignette explains themethods in detail, and gives several worked examples. A further vignetteflexsurv-examples gives a few more complicated examples, and usersare encouraged to submit their own.
Author(s)
Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk
References
Jackson, C. (2016). flexsurv: A Platform for ParametricSurvival Modeling in R. Journal of Statistical Software, 70(8), 1-33.doi:10.18637/jss.v070.i08
Royston, P. and Parmar, M. (2002). Flexible parametricproportional-hazards and proportional-odds models for censored survivaldata, with application to prognostic modelling and estimation of treatmenteffects. Statistics in Medicine 21(1):2175-2197.
Cox, C. (2008). The generalizedF distribution: An umbrella forparametric survival analysis. Statistics in Medicine 27:4301-4312.
Cox, C., Chu, H., Schneider, M. F. and Muñoz, A. (2007). Parametricsurvival analysis and taxonomy of hazard functions for the generalizedgamma distribution. Statistics in Medicine 26:4252-4374
See Also
Useful links:
Report bugs athttps://github.com/chjackson/flexsurv/issues
helper function to safely convert a Hessian matrix to covariance matrix
Description
helper function to safely convert a Hessian matrix to covariance matrix
Usage
.hess_to_cov(hessian, tol.solve = 1e-09, tol.evalues = 1e-05, ...)Arguments
hessian | hessian matrix to convert to covariance matrix (must be evaluated at MLE) |
tol.solve | tolerance used for solve() |
tol.evalues | accepted tolerance for negative eigenvalues of the covariance matrix |
... | arguments passed to Matrix::nearPD |
Numerical evaluation of the hessian of a function using numDeriv::hessian
Description
We perform a quick check about the expected runtime and adjust theprecision accordingly.
Usage
.hessian(f, x, seconds.warning = 60, default.r = 6, min.r = 2, ...)Arguments
f | function to compute Hessian for |
x | location to evaluate Hessian at |
seconds.warning | time threshold in seconds to trigger message andreduce the number of iterations for Richardson extrapolation ofnumDeriv::hessian |
default.r | default number of iterations (high-precision recommendationof numDeriv) |
min.r | minial number of iteration, must be at least 2, |
... | further arguments passed to method.args of numDeriv::hessian |
Akaike's information criterion from a flexible parametric multistate model
Description
Defined as the sum of the AICs of the transition-specific models.
Usage
## S3 method for class 'fmsm'AIC(object, ..., k = 2)Arguments
object | Object returned by |
... | Further arguments (currently unused). |
k | Penalty applied to number of parameters (defaults to the standard 2). |
Value
The sum of the AICs of the transition-specific models.
Second-order Akaike information criterion
Description
Generic function for the second-order Akaike information criterion.The only current implementation of this inflexsurv isinAICc.flexsurvreg, see there for more details.
Usage
AICc(object, ...)AICC(object, ...)Arguments
object | Fitted model object. |
... | Other arguments (currently unused). |
Details
This can be spelt either asAICC orAICc.
Second-order Akaike information criterion
Description
Second-order or "corrected" Akaike information criterion, oftenknown as AICc (see, e.g. Burnham and Anderson 2002). This isdefined as -2 log-likelihood +(2*p*n)/(n - p -1), whereasthe standard AIC is defined as -2 log-likelihood +2*p,wherep is the number of parameters andn is thesample size. The correction is intended to adjust AIC forsmall-sample bias, hence it only makes a difference to the resultfor smalln.
Usage
## S3 method for class 'flexsurvreg'AICc(object, cens = TRUE, ...)## S3 method for class 'flexsurvreg'AICC(object, cens = TRUE, ...)Arguments
object | Fitted model returned by |
cens | Include censored observations in the sample size term( |
... | Other arguments (currently unused). |
Details
This can be spelt either asAICC orAICc.
Value
The second-order AIC of the fitted model.
References
Burnham, K. P., Anderson, D. R. (2002) Model Selection and Multimodel Inference: a practical information-theoretic approach. Second edition. Springer: New York.
See Also
BIC,AIC,BIC.flexsurvreg,nobs.flexsurvreg
Bayesian Information Criterion (BIC) for comparison of flexsurvreg models
Description
Bayesian Information Criterion (BIC) for comparison of flexsurvreg models
Usage
## S3 method for class 'flexsurvreg'BIC(object, cens = TRUE, ...)Arguments
object | Fitted model returned by |
cens | Include censored observations in the sample size term( |
... | Other arguments (currently unused). |
Details
There is no "official" definition of what the sample sizeshould be for the use of BIC in censored survival analysis. BICis based on an approximation to Bayesian model comparison usingBayes factors and an implicit vague prior. Informally, thesample size describes the number of "units" giving rise to adistinct piece of information (Kass and Raftery 1995). Howevercensored observations provide less information than observedevents, in principle. The default used here is the number ofindividuals, for consistency with more familiar kinds ofstatistical modelling. However if censoring is heavy, then thenumber of events may be a better represent the amount ofinformation. Following these principles, the best approximationwould be expected to be somewere in between.
AIC and BIC are intended to measure different things. Briefly,AIC measures predictive ability, whereas BIC is expected to choosethe true model from a set of models where one of them is thetruth. Therefore BIC chooses simpler models for all but thetiniest sample sizes (log(n)>2,n>7). AIC might be preferred in thetypical situation where"all models are wrong but some are useful". AIC also gives similarresults to cross-validation (Stone 1977).
Value
The BIC of the fitted model. This is minus twice the log likelihood plusp*log(n), wherep is the number of parameters andn is the samplesize of the data. Ifweights was supplied toflexsurv, the sample size is defined as the sum of theweights.
References
Kass, R. E., & Raftery, A. E. (1995). Bayesfactors. Journal of the American Statistical Association,90(430), 773-795.
Stone, M. (1977). An asymptotic equivalence of choiceof model by cross‐validation and Akaike's criterion. Journal ofthe Royal Statistical Society: Series B (Methodological), 39(1),44-47.
See Also
BIC,AIC,AICC.flexsurvreg,nobs.flexsurvreg
Generalized F distribution
Description
Density, distribution function, hazards, quantile function andrandom generation for the generalized F distribution, using thereparameterisation by Prentice (1975).
Usage
dgenf(x, mu = 0, sigma = 1, Q, P, log = FALSE)pgenf(q, mu = 0, sigma = 1, Q, P, lower.tail = TRUE, log.p = FALSE)Hgenf(x, mu = 0, sigma = 1, Q, P)hgenf(x, mu = 0, sigma = 1, Q, P)qgenf(p, mu = 0, sigma = 1, Q, P, lower.tail = TRUE, log.p = FALSE)rgenf(n, mu = 0, sigma = 1, Q, P)Arguments
x,q | Vector of quantiles. |
mu | Vector of location parameters. |
sigma | Vector of scale parameters. |
Q | Vector of first shape parameters. |
P | Vector of second shape parameters. |
log,log.p | logical; if TRUE, probabilities p are given as log(p). |
lower.tail | logical; if TRUE (default), probabilities are |
p | Vector of probabilities. |
n | number of observations. If |
Details
Ify \sim F(2s_1, 2s_2), andw = \log(y) thenx = \exp(w\sigma + \mu) has the original generalized F distribution withlocation parameter\mu, scale parameter\sigma>0and shape parameterss_1,s_2.
In this more stable version described by Prentice (1975),s_1,s_2 are replaced by shape parametersQ,P, withP>0, and
s_1 = 2(Q^2 + 2P + Q\delta)^{-1}, \quad s_2 = 2(Q^2 + 2P -Q\delta)^{-1}
equivalently
Q = \left(\frac{1}{s_1} -\frac{1}{s_2}\right)\left(\frac{1}{s_1} + \frac{1}{s_2}\right)^{-1/2},\quad P = \frac{2}{s_1 + s_2}
Define\delta = (Q^2 + 2P)^{1/2}, andw = (\log(x) - \mu)\delta /\sigma,then the probability density function ofx is
f(x) =\frac{\delta (s_1/s_2)^{s_1} e^{s_1 w}}{\sigma x (1 + s_1 e^w/s_2) ^ {(s_1+ s_2)} B(s_1, s_2)}
Theoriginal parameterisation is available in this package asdgenf.orig, for the sake of completion / compatibility. Withthe above definitions,
dgenf(x, mu=mu, sigma=sigma, Q=Q, P=P) = dgenf.orig(x, mu=mu,sigma=sigma/delta, s1=s1, s2=s2)
The generalized F distribution withP=0 is equivalent to thegeneralized gamma distributiondgengamma, so thatdgenf(x, mu, sigma, Q, P=0) equalsdgengamma(x, mu, sigma,Q). The generalized gamma reduces further to several commondistributions, as described in theGenGamma help page.
The generalized F distribution includes the log-logistic distribution (seeeha::dllogis) as a further special case:
dgenf(x, mu=mu, sigma=sigma, Q=0, P=1) =eha::dllogis(x,shape=sqrt(2)/sigma, scale=exp(mu))
The range of hazard trajectories available under this distribution arediscussed in detail by Cox (2008). Jackson et al. (2010) give anapplication to modelling oral cancer survival for use in a health economicevaluation of screening.
Value
dgenf gives the density,pgenf gives the distributionfunction,qgenf gives the quantile function,rgenf generatesrandom deviates,Hgenf retuns the cumulative hazard andhgenfthe hazard.
Note
The parametersQ andP are usually calledq andp in the literature - they were made upper-case in these R functionsto avoid clashing with the conventional argumentsq in theprobability function andp in the quantile function.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
R. L. Prentice (1975). Discrimination among some parametricmodels. Biometrika 62(3):607-614.
Cox, C. (2008). The generalizedF distribution: An umbrella forparametric survival analysis. Statistics in Medicine 27:4301-4312.
Jackson, C. H. and Sharples, L. D. and Thompson, S. G. (2010). Survivalmodels in health economic evaluations: balancing fit and parsimony toimprove prediction. International Journal of Biostatistics 6(1):Article34.
See Also
Generalized F distribution (original parameterisation)
Description
Density, distribution function, quantile function and randomgeneration for the generalized F distribution, using the lessflexible original parameterisation described by Prentice (1975).
Usage
dgenf.orig(x, mu = 0, sigma = 1, s1, s2, log = FALSE)pgenf.orig(q, mu = 0, sigma = 1, s1, s2, lower.tail = TRUE, log.p = FALSE)Hgenf.orig(x, mu = 0, sigma = 1, s1, s2)hgenf.orig(x, mu = 0, sigma = 1, s1, s2)qgenf.orig(p, mu = 0, sigma = 1, s1, s2, lower.tail = TRUE, log.p = FALSE)rgenf.orig(n, mu = 0, sigma = 1, s1, s2)Arguments
x,q | vector of quantiles. |
mu | Vector of location parameters. |
sigma | Vector of scale parameters. |
s1 | Vector of first F shape parameters. |
s2 | vector of second F shape parameters. |
log,log.p | logical; if TRUE, probabilities p are given as log(p). |
lower.tail | logical; if TRUE (default), probabilities are |
p | vector of probabilities. |
n | number of observations. If |
Details
Ify \sim F(2s_1, 2s_2), andw = \log(y) thenx = \exp(w\sigma + \mu) has the original generalized F distribution withlocation parameter\mu, scale parameter\sigma>0and shape parameterss_1>0,s_2>0. The probability densityfunction ofx is
f(x | \mu, \sigma, s_1, s_2) = \frac{(s_1/s_2)^{s_1} e^{s_1 w}}{\sigma x (1 + s_1 e^w/s_2) ^ {(s_1 + s_2)} B(s_1, s_2)}
wherew = (\log(x) - \mu)/\sigma, andB(s_1,s_2) = \Gamma(s_1)\Gamma(s_2)/\Gamma(s_1+s_2) is the beta function.
Ass_2 \rightarrow \infty, the distributionofx tends towards an original generalized gammadistribution with the following parameters:
dgengamma.orig(x, shape=1/sigma, scale=exp(mu) /s1^sigma, k=s1)
SeeGenGamma.orig for how this includes severalother common distributions as special cases.
The alternative parameterisation of the generalized Fdistribution, originating from Prentice (1975) and given in thispackage asGenF, is preferred for statisticalmodelling, since it is more stable ass_1 tends toinfinity, and includes a further new class of distributions withnegative first shape parameter. The original is provided here forthe sake of completion and compatibility.
Value
dgenf.orig gives the density,pgenf.orig gives thedistribution function,qgenf.orig gives the quantile function,rgenf.orig generates random deviates,Hgenf.orig retuns thecumulative hazard andhgenf.orig the hazard.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
R. L. Prentice (1975). Discrimination among some parametricmodels. Biometrika 62(3):607-614.
See Also
Generalized gamma distribution
Description
Density, distribution function, hazards, quantile function andrandom generation for the generalized gamma distribution, usingthe parameterisation originating from Prentice (1974). Also knownas the (generalized) log-gamma distribution.
Usage
dgengamma(x, mu = 0, sigma = 1, Q, log = FALSE)pgengamma(q, mu = 0, sigma = 1, Q, lower.tail = TRUE, log.p = FALSE)Hgengamma(x, mu = 0, sigma = 1, Q)hgengamma(x, mu = 0, sigma = 1, Q)qgengamma(p, mu = 0, sigma = 1, Q, lower.tail = TRUE, log.p = FALSE)rgengamma(n, mu = 0, sigma = 1, Q)Arguments
x,q | vector of quantiles. |
mu | Vector of “location” parameters. |
sigma | Vector of “scale” parameters. Note the inconsistentmeanings of the term “scale” - this parameter is analogous to the(log-scale) standard deviation of the log-normal distribution, “sdlog” in |
Q | Vector of shape parameters. |
log,log.p | logical; if TRUE, probabilities p are given as log(p). |
lower.tail | logical; if TRUE (default), probabilities are |
p | vector of probabilities. |
n | number of observations. If |
Details
If\gamma \sim Gamma(Q^{-2}, 1) , andw =log(Q^2 \gamma) / Q, thenx = \exp(\mu + \sigmaw) follows the generalized gamma distribution withprobability density function
f(x | \mu, \sigma, Q) = \frac{|Q|(Q^{-2})^{Q^{-2}}}{\sigma x\Gamma(Q^{-2})} \exp(Q^{-2}(Qw - \exp(Qw)))
This parameterisation is preferred to the originalparameterisation of the generalized gamma by Stacy (1962) since itis more numerically stable near toQ=0 (the log-normaldistribution), and allowsQ<=0. The original is availablein this package asdgengamma.orig, for the sake ofcompletion and compatibility with other software - this isimplicitly restricted toQ>0 (ork>0 in the originalnotation). The parameters ofdgengamma anddgengamma.orig are related as follows.
dgengamma.orig(x, shape=shape, scale=scale, k=k) =
dgengamma(x, mu=log(scale) + log(k)/shape, sigma=1/(shape*sqrt(k)),Q=1/sqrt(k))
The generalized gamma distribution simplifies to the gamma,log-normal and Weibull distributions with the followingparameterisations:
dgengamma(x, mu, sigma, Q=0) | = | dlnorm(x, mu, sigma) |
dgengamma(x, mu, sigma, Q=1) | = | dweibull(x, shape=1/sigma, scale=exp(mu)) |
dgengamma(x, mu, sigma, Q=sigma) | = | dgamma(x,shape=1/sigma^2, rate=exp(-mu) / sigma^2) |
The properties of thegeneralized gamma and its applications to survival analysis are discussedin detail by Cox (2007).
The generalized F distributionGenF extends the generalizedgamma to four parameters.
Value
dgengamma gives the density,pgengamma gives thedistribution function,qgengamma gives the quantile function,rgengamma generates random deviates,Hgengamma retuns thecumulative hazard andhgengamma the hazard.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
Prentice, R. L. (1974). A log gamma model and its maximumlikelihood estimation. Biometrika 61(3):539-544.
Farewell, V. T. and Prentice, R. L. (1977). A study ofdistributional shape in life testing. Technometrics 19(1):69-75.
Lawless, J. F. (1980). Inference in the generalized gamma and loggamma distributions. Technometrics 22(3):409-419.
Cox, C., Chu, H., Schneider, M. F. and Muñoz, A. (2007).Parametric survival analysis and taxonomy of hazard functions forthe generalized gamma distribution. Statistics in Medicine26:4252-4374
Stacy, E. W. (1962). A generalization of the gamma distribution.Annals of Mathematical Statistics 33:1187-92
See Also
GenGamma.orig,GenF,Lognormal,GammaDist,Weibull.
Generalized gamma distribution (original parameterisation)
Description
Density, distribution function, hazards, quantile function andrandom generation for the generalized gamma distribution, usingthe original parameterisation from Stacy (1962).
Usage
dgengamma.orig(x, shape, scale = 1, k, log = FALSE)pgengamma.orig(q, shape, scale = 1, k, lower.tail = TRUE, log.p = FALSE)Hgengamma.orig(x, shape, scale = 1, k)hgengamma.orig(x, shape, scale = 1, k)qgengamma.orig(p, shape, scale = 1, k, lower.tail = TRUE, log.p = FALSE)rgengamma.orig(n, shape, scale = 1, k)Arguments
x,q | vector of quantiles. |
shape | vector of “Weibull” shape parameters. |
scale | vector of scale parameters. |
k | vector of “Gamma” shape parameters. |
log,log.p | logical; if TRUE, probabilities p are given as log(p). |
lower.tail | logical; if TRUE (default), probabilities are |
p | vector of probabilities. |
n | number of observations. If |
Details
Ifw \sim Gamma(k,1), thenx =\exp(w/shape + \log(scale))follows the original generalised gamma distribution with theparameterisation given here (Stacy 1962). Definingshape=b>0,scale=a>0,x hasprobability density
f(x | a, b, k) = \frac{b}{\Gamma(k)} \frac{x^{bk - 1}}{a^{bk}}
\exp(-(x/a)^b)
The original generalized gamma distribution simplifies to thegamma, exponential and Weibull distributions with the followingparameterisations:
dgengamma.orig(x, shape, scale, k=1) | = | dweibull(x, shape, scale) |
dgengamma.orig(x,shape=1, scale, k) | = | dgamma(x, shape=k,scale) |
dgengamma.orig(x, shape=1, scale, k=1) | = | dexp(x, rate=1/scale) |
Also as k tends to infinity, it tends to the log normal (as indlnorm) with the following parameters (Lawless,1980):
dlnorm(x, meanlog=log(scale) + log(k)/shape,sdlog=1/(shape*sqrt(k)))
For more stable behaviour as the distribution tends to the log-normal, analternative parameterisation was developed by Prentice (1974). This isgiven indgengamma, and is now preferred for statisticalmodelling. It is also more flexible, including a further new class ofdistributions with negative shapek.
The generalized F distributionGenF.orig, and its similaralternative parameterisationGenF, extend the generalizedgamma to four parameters.
Value
dgengamma.orig gives the density,pgengamma.origgives the distribution function,qgengamma.orig gives the quantilefunction,rgengamma.orig generates random deviates,Hgengamma.orig retuns the cumulative hazard andhgengamma.orig the hazard.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
Stacy, E. W. (1962). A generalization of the gammadistribution. Annals of Mathematical Statistics 33:1187-92.
Prentice, R. L. (1974). A log gamma model and its maximum likelihoodestimation. Biometrika 61(3):539-544.
Lawless, J. F. (1980). Inference in the generalized gamma and log gammadistributions. Technometrics 22(3):409-419.
See Also
GenGamma,GenF.orig,GenF,Lognormal,GammaDist,Weibull.
The Gompertz distribution
Description
Density, distribution function, hazards, quantile function and randomgeneration for the Gompertz distribution with unrestricted shape.
Usage
dgompertz(x, shape, rate = 1, log = FALSE)pgompertz(q, shape, rate = 1, lower.tail = TRUE, log.p = FALSE)qgompertz(p, shape, rate = 1, lower.tail = TRUE, log.p = FALSE)rgompertz(n, shape = 1, rate = 1)hgompertz(x, shape, rate = 1, log = FALSE)Hgompertz(x, shape, rate = 1, log = FALSE)Arguments
x,q | vector of quantiles. |
shape,rate | vector of shape and rate parameters. |
log,log.p | logical; if TRUE, probabilities p are given as log(p). |
lower.tail | logical; if TRUE (default), probabilities are |
p | vector of probabilities. |
n | number of observations. If |
Details
The Gompertz distribution withshape parametera andrate parameterb has probability density function
f(x | a, b) = be^{ax}\exp(-b/a (e^{ax} - 1))
and hazard
h(x | a, b) = b e^{ax}
The hazard is increasing for shapea>0 and decreasing fora<0.Fora=0 the Gompertz is equivalent to the exponential distributionwith constant hazard and rateb.
The probability distribution function is
F(x | a, b) = 1 - \exp(-b/a(e^{ax} - 1))
Thus ifa is negative, lettingx tend to infinity shows thatthere is a non-zero probability\exp(b/a) of livingforever. On these occasionsqgompertz andrgompertz willreturnInf.
Value
dgompertz gives the density,pgompertz gives thedistribution function,qgompertz gives the quantile function,hgompertz gives the hazard function,Hgompertz gives thecumulative hazard function, andrgompertz generates random deviates.
Note
Some implementations of the Gompertz restricta to be strictlypositive, which ensures that the probability of survival decreases to zeroasx increases to infinity. The more flexible implementation givenhere is consistent withstreg in Stata.
The functionseha::dgompertz and similar available in thepackageeha label the parameters the other way round, so that what iscalled theshape there is called therate here, and what iscalled1 / scale there is called theshape here. Theterminology here is consistent with the exponentialdexp andWeibulldweibull distributions in R.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
Stata Press (2007) Stata release 10 manual: Survival analysisand epidemiological tables.
See Also
The log-logistic distribution
Description
Density, distribution function, hazards, quantile function andrandom generation for the log-logistic distribution.
Usage
dllogis(x, shape = 1, scale = 1, log = FALSE)pllogis(q, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE)qllogis(p, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE)rllogis(n, shape = 1, scale = 1)hllogis(x, shape = 1, scale = 1, log = FALSE)Hllogis(x, shape = 1, scale = 1, log = FALSE)Arguments
x,q | vector of quantiles. |
shape,scale | vector of shape and scale parameters. |
log,log.p | logical; if TRUE, probabilities p are given aslog(p). |
lower.tail | logical; if TRUE (default), probabilities are |
p | vector of probabilities. |
n | number of observations. If |
Details
The log-logistic distribution withshape parametera>0 andscale parameterb>0 has probabilitydensity function
f(x | a, b) = (a/b) (x/b)^{a-1} / (1 + (x/b)^a)^2
and hazard
h(x | a, b) = (a/b) (x/b)^{a-1} / (1 + (x/b)^a)
forx>0. The hazard is decreasing for shapea\leq 1, and unimodal fora > 1.
The probability distribution function is
F(x | a, b) = 1 - 1/ (1 + (x/b)^a)
Ifa > 1, the mean isb c / sin(c), and ifa > 2the variance isb^2 * (2*c/sin(2*c) - c^2/sin(c)^2), wherec = \pi/a, otherwise these are undefined.
Value
dllogis gives the density,pllogis gives thedistribution function,qllogis gives the quantile function,hllogis gives the hazard function,Hllogis gives thecumulative hazard function, andrllogis generates randomdeviates.
Note
Various different parameterisations of this distribution areused. In the one used here, the interpretation of the parametersis the same as in the standard Weibull distribution(dweibull). Like the Weibull, the survivor functionis a transformation of(x/b)^a from the non-negative real line to [0,1],but with a different link function. Covariates onbrepresent time acceleration factors, or ratios of expectedsurvival.
The same parameterisation is also used ineha::dllogis in theeha package.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
Stata Press (2007) Stata release 10 manual: Survival analysisand epidemiological tables.
See Also
Royston/Parmar spline survival distribution
Description
Probability density, distribution, quantile, random generation, hazard, cumulative hazard, mean and restricted mean functions for the Royston/Parmarspline model. These functions have all parameters of the distribution collectedtogether in a single argumentgamma. For the equivalent functions withone argument per parameter, seeSurvsplinek.
Usage
dsurvspline( x, gamma, beta = 0, X = 0, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", offset = 0, log = FALSE)psurvspline( q, gamma, beta = 0, X = 0, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", offset = 0, lower.tail = TRUE, log.p = FALSE)qsurvspline( p, gamma, beta = 0, X = 0, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", offset = 0, lower.tail = TRUE, log.p = FALSE)rsurvspline( n, gamma, beta = 0, X = 0, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", offset = 0)Hsurvspline( x, gamma, beta = 0, X = 0, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", offset = 0)hsurvspline( x, gamma, beta = 0, X = 0, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", offset = 0)rmst_survspline( t, gamma, beta = 0, X = 0, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", offset = 0, start = 0)mean_survspline( gamma, beta = 0, X = 0, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", offset = 0)Arguments
x,q,t | Vector of times. |
gamma | Parameters describing the baseline spline function, asdescribed in |
beta | Vector of covariate effects. Not supported and ignored since version 2.3, and this argument will be removed in 2.4. |
X | Matrix of covariate values. Not supported and ignored since version 2.3, and this argument will be removed in 2.4. |
knots | Locations of knots on the axis of log time, supplied inincreasing order. Unlike in This may in principle be supplied as a matrix, in the same way as for |
scale |
|
timescale |
|
spline |
|
offset | An extra constant to add to the linear predictor |
log,log.p | Return log density or probability. |
lower.tail | logical; if TRUE (default), probabilities are |
p | Vector of probabilities. |
n | Number of random numbers to simulate. |
start | Optional left-truncation time or times. The returnedrestricted mean survival will be conditioned on survival up tothis time. |
Value
dsurvspline gives the density,psurvspline gives thedistribution function,hsurvspline gives the hazard andHsurvspline gives the cumulative hazard, as described inflexsurvspline.
qsurvspline gives the quantile function, which is computed by crudenumerical inversion (usingqgeneric).
rsurvspline generates random survival times by usingqsurvspline on a sample of uniform random numbers. Due to thenumerical root-finding involved inqsurvspline, it is slow comparedto typical random number generation functions.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
Royston, P. and Parmar, M. (2002). Flexible parametricproportional-hazards and proportional-odds models for censored survivaldata, with application to prognostic modelling and estimation of treatmenteffects. Statistics in Medicine 21(1):2175-2197.
Wang W, Yan J (2021). Shape-Restricted Regression Splines with RPackage splines2. Journal of Data Science, 19(3), 498-517.
See Also
Examples
## reduces to the weibullregscale <- 0.786; cf <- 1.82a <- 1/regscale; b <- exp(cf)dweibull(1, shape=a, scale=b)dsurvspline(1, gamma=c(log(1 / b^a), a)) # should be the same## reduces to the log-normalmeanlog <- 1.52; sdlog <- 1.11dlnorm(1, meanlog, sdlog) dsurvspline(1, gamma = c(-meanlog/sdlog, 1/sdlog), scale="normal")# should be the sameRoyston/Parmar spline survival distribution functions with one argument per parameter
Description
Probability density, distribution, quantile, random generation, hazard, cumulative hazard, mean and restricted mean functions for the Royston/Parmarspline model, with one argument per parameter. For the equivalent functions with all parameters collected together in a single argument, seeSurvspline.
Usage
mean_survspline0( gamma0, gamma1, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp")mean_survspline1( gamma0, gamma1, gamma2, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp")mean_survspline2( gamma0, gamma1, gamma2, gamma3, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp")mean_survspline3( gamma0, gamma1, gamma2, gamma3, gamma4, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp")mean_survspline4( gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp")mean_survspline5( gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp")mean_survspline6( gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp")mean_survspline7( gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, gamma8, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp")rmst_survspline0( t, gamma0, gamma1, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", start = 0)rmst_survspline1( t, gamma0, gamma1, gamma2, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", start = 0)rmst_survspline2( t, gamma0, gamma1, gamma2, gamma3, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", start = 0)rmst_survspline3( t, gamma0, gamma1, gamma2, gamma3, gamma4, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", start = 0)rmst_survspline4( t, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", start = 0)rmst_survspline5( t, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", start = 0)rmst_survspline6( t, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", start = 0)rmst_survspline7( t, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, gamma8, knots = c(-10, 10), scale = "hazard", timescale = "log", spline = "rp", start = 0)dsurvspline0( x, gamma0, gamma1, knots, scale = "hazard", timescale = "log", spline = "rp", log = FALSE)dsurvspline1( x, gamma0, gamma1, gamma2, knots, scale = "hazard", timescale = "log", spline = "rp", log = FALSE)dsurvspline2( x, gamma0, gamma1, gamma2, gamma3, knots, scale = "hazard", timescale = "log", spline = "rp", log = FALSE)dsurvspline3( x, gamma0, gamma1, gamma2, gamma3, gamma4, knots, scale = "hazard", timescale = "log", spline = "rp", log = FALSE)dsurvspline4( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, knots, scale = "hazard", timescale = "log", spline = "rp", log = FALSE)dsurvspline5( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, knots, scale = "hazard", timescale = "log", spline = "rp", log = FALSE)dsurvspline6( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, knots, scale = "hazard", timescale = "log", spline = "rp", log = FALSE)dsurvspline7( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, gamma8, knots, scale = "hazard", timescale = "log", spline = "rp", log = FALSE)psurvspline0( q, gamma0, gamma1, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)psurvspline1( q, gamma0, gamma1, gamma2, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)psurvspline2( q, gamma0, gamma1, gamma2, gamma3, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)psurvspline3( q, gamma0, gamma1, gamma2, gamma3, gamma4, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)psurvspline4( q, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)psurvspline5( q, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)psurvspline6( q, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)psurvspline7( q, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, gamma8, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)qsurvspline0( p, gamma0, gamma1, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)qsurvspline1( p, gamma0, gamma1, gamma2, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)qsurvspline2( p, gamma0, gamma1, gamma2, gamma3, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)qsurvspline3( p, gamma0, gamma1, gamma2, gamma3, gamma4, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)qsurvspline4( p, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)qsurvspline5( p, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)qsurvspline6( p, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)qsurvspline7( p, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, gamma8, knots, scale = "hazard", timescale = "log", spline = "rp", lower.tail = TRUE, log.p = FALSE)rsurvspline0( n, gamma0, gamma1, knots, scale = "hazard", timescale = "log", spline = "rp")rsurvspline1( n, gamma0, gamma1, gamma2, knots, scale = "hazard", timescale = "log", spline = "rp")rsurvspline2( n, gamma0, gamma1, gamma2, gamma3, knots, scale = "hazard", timescale = "log", spline = "rp")rsurvspline3( n, gamma0, gamma1, gamma2, gamma3, gamma4, knots, scale = "hazard", timescale = "log", spline = "rp")rsurvspline4( n, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, knots, scale = "hazard", timescale = "log", spline = "rp")rsurvspline5( n, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, knots, scale = "hazard", timescale = "log", spline = "rp")rsurvspline6( n, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, knots, scale = "hazard", timescale = "log", spline = "rp")rsurvspline7( n, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, gamma8, knots, scale = "hazard", timescale = "log", spline = "rp")hsurvspline0( x, gamma0, gamma1, knots, scale = "hazard", timescale = "log", spline = "rp")hsurvspline1( x, gamma0, gamma1, gamma2, knots, scale = "hazard", timescale = "log", spline = "rp")hsurvspline2( x, gamma0, gamma1, gamma2, gamma3, knots, scale = "hazard", timescale = "log", spline = "rp")hsurvspline3( x, gamma0, gamma1, gamma2, gamma3, gamma4, knots, scale = "hazard", timescale = "log", spline = "rp")hsurvspline4( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, knots, scale = "hazard", timescale = "log", spline = "rp")hsurvspline5( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, knots, scale = "hazard", timescale = "log", spline = "rp")hsurvspline6( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, knots, scale = "hazard", timescale = "log", spline = "rp")hsurvspline7( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, gamma8, knots, scale = "hazard", timescale = "log", spline = "rp")Hsurvspline0( x, gamma0, gamma1, knots, scale = "hazard", timescale = "log", spline = "rp")Hsurvspline1( x, gamma0, gamma1, gamma2, knots, scale = "hazard", timescale = "log", spline = "rp")Hsurvspline2( x, gamma0, gamma1, gamma2, gamma3, knots, scale = "hazard", timescale = "log", spline = "rp")Hsurvspline3( x, gamma0, gamma1, gamma2, gamma3, gamma4, knots, scale = "hazard", timescale = "log", spline = "rp")Hsurvspline4( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, knots, scale = "hazard", timescale = "log", spline = "rp")Hsurvspline5( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, knots, scale = "hazard", timescale = "log", spline = "rp")Hsurvspline6( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, knots, scale = "hazard", timescale = "log", spline = "rp")Hsurvspline7( x, gamma0, gamma1, gamma2, gamma3, gamma4, gamma5, gamma6, gamma7, gamma8, knots, scale = "hazard", timescale = "log", spline = "rp")Arguments
gamma0,gamma1,gamma2,gamma3,gamma4,gamma5,gamma6,gamma7,gamma8 | Parameters describing the baseline spline function, asdescribed in |
knots | Locations of knots on the axis of log time, supplied inincreasing order. Unlike in This may in principle be supplied as a matrix, in the same way as for |
scale |
|
timescale |
|
spline |
|
start | Optional left-truncation time or times. The returnedrestricted mean survival will be conditioned on survival up tothis time. |
x,q,t | Vector of times. |
log,log.p | Return log density or probability. |
lower.tail | logical; if TRUE (default), probabilities are |
p | Vector of probabilities. |
n | Number of random numbers to simulate. |
Details
These functions go up to 7 spline knots, or 9 parameters.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
Weibull distribution in proportional hazards parameterisation
Description
Density, distribution function, hazards, quantile function and randomgeneration for the Weibull distribution in its proportional hazardsparameterisation.
Usage
dweibullPH(x, shape, scale = 1, log = FALSE)pweibullPH(q, shape, scale = 1, lower.tail = TRUE, log.p = FALSE)qweibullPH(p, shape, scale = 1, lower.tail = TRUE, log.p = FALSE)hweibullPH(x, shape, scale = 1, log = FALSE)HweibullPH(x, shape, scale = 1, log = FALSE)rweibullPH(n, shape, scale = 1)Arguments
x,q | Vector of quantiles. |
shape | Vector of shape parameters. |
scale | Vector of scale parameters. |
log,log.p | logical; if TRUE, probabilities p are given as log(p). |
lower.tail | logical; if TRUE (default), probabilities are |
p | Vector of probabilities. |
n | number of observations. If |
Details
The Weibull distribution in proportional hazards parameterisation with‘shape’ parameter a and ‘scale’ parameter m has density given by
f(x) = a m x^{a-1} exp(- m x^a)
cumulative distribution functionF(x) = 1 - exp( -m x^a ), survivorfunctionS(x) = exp( -m x^a ), cumulative hazardm x^a andhazarda m x^{a-1}.
dweibull in base R has the alternative 'accelerated failuretime' (AFT) parameterisation with shape a and scale b. The shape parametera is the same in both versions. The scale parameters are related asb = m^{-1/a}, equivalently m = b^-a.
In survival modelling, covariates are typically included through a linearmodel on the log scale parameter. Thus, in the proportional hazards model,the coefficients in such a model onm are interpreted as log hazardratios.
In the AFT model, covariates onb are interpreted as timeacceleration factors. For example, doubling the value of a covariate withcoefficientbeta=log(2) would give half the expected survival time.These coefficients are related to the log hazard ratios\gamma as\beta = -\gamma / a.
Value
dweibullPH gives the density,pweibullPH gives thedistribution function,qweibullPH gives the quantile function,rweibullPH generates random deviates,HweibullPH retuns thecumulative hazard andhweibullPH the hazard.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
See Also
Aalen-Johansen nonparametric estimates comparable to a fitted flexsurvmixmodel
Description
Given a fitted flexsurvmix model, return the Aalen-Johansen estimates of theprobability of occupying each state at a series of times covering theobserved data. State 1 represents not having experienced any of thecompeting events, while state 2 and any further states correspond to havingexperienced each of the competing events respectively. These estimates canbe compared with the fitted probabilities returned byp_flexsurvmix to check the fit of aflexsurvmix model.
Usage
ajfit(x, newdata = NULL, tidy = TRUE)Arguments
x | Fitted model returned by |
newdata | Data frame of alternative covariate values to check fit for.Only factor covariates are supported. |
tidy | If |
Details
This is only supported for models with no covariates or models containingonly factor covariates.
For models with factor covariates, the Aalen-Johansen estimates are computedfor the subsets of the data defined innewdata. Ifnewdata isnot supplied, then this function returns state occupancy probabilities forall possible combinations of the factor levels.
The Aalen-Johansen estimates are computed usingsurvfit from thesurvival package (Therneau2020).
References
Therneau T (2020). _A Package for Survival Analysis in R_. Rpackage version 3.2-3, <URL: https://CRAN.R-project.org/package=survival>.
Forms a tidy data frame for plotting the fit of parametric mixturemulti-state models against nonparametric estimates
Description
This computes Aalen-Johansen estimates of the probability of occupying eachstate at a series of times, usingajfit. The equivalentestimates from the parametric model are then produced usingp_flexsurvmix, and concatenated with the nonparametricestimates to form a tidy data frame. This data frame can then simply beplotted usingggplot.
Usage
ajfit_flexsurvmix(x, maxt = NULL, startname = "Start", B = NULL)Arguments
x | Fitted model returned by |
maxt | Maximum time to produce parametric estimates for. By defaultthis is the maximum event time in the data, the maximum time we havenonparametric estimates for. |
startname | Label to give the state corresponding to "no event happenedyet". By default this is |
B | Number of simulation replications to use to calculate a confidenceinterval for the parametric estimates in |
Check the fit of Markov flexible parametric multi-state models againstnonparametric estimates.
Description
Computes both parametric and comparable Aalen-Johansen nonparametricestimates from a flexible parametric multi-state model, and returns themtogether in a tidy data frame. Only models with no covariates, or onlyfactor covariates, are supported. If there are factor covariates, then thenonparametric estimates are computed for subgroups defined by combinationsof the covariates. Another restriction of this function is that alltransitions must have the same covariates on them.
Usage
ajfit_fmsm(x, maxt = NULL, newdata = NULL)Arguments
x | Object returned by |
maxt | Maximum time to compute parametric estimates to. |
newdata | Data frame defining the subgroups to consider. This musthave a column for each covariate in the model. If omitted, then allpotential subgroups defined by combinations of factor covariates areincluded. Continuous covariates are not supported. |
Value
Tidy data frame containing both Aalen-Johansen and parametricestimates of state occupancy over time, and by subgroup if subgroups areincluded.
Augment data with information from a flexsurv model object
Description
Augment accepts a model object and a dataset and adds information about each observation in the dataset. Most commonly, this includes predicted values in the.fitted column, residuals in the.resid column, and standard errors for the fitted values in a.se.fit column. New columns always begin with a . prefix to avoid overwriting columns in the original dataset.
Usage
## S3 method for class 'flexsurvreg'augment( x, data = NULL, newdata = NULL, type.predict = "response", type.residuals = "response", ...)Arguments
x | Output from |
data | A |
newdata | A |
type.predict | Character indicating type of prediction to use. Passed to the |
type.residuals | Character indicating type of residuals to use. Passed to the type argument of |
... | Additional arguments. Not currently used. |
Details
If neither ofdata ornewdata are specified, thenmodel.frame(x) will be used. It is worth noting thatmodel.frame(x) will include aSurv object and not the original time-to-event variables used when fitting theflexsurvreg object. If the original data is desired, specifydata.
Value
Atibble containingdata ornewdata and possible additional columns:
.fittedFitted values of model.se.fitStandard errors of fitted values.residResiduals (not present ifnewdataspecified)
Examples
fit <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "exp")augment(fit, data = ovarian)Natural cubic spline basis
Description
Compute a basis for a natural cubic spline, by default using theparameterisation described by Royston and Parmar (2002). Used forflexible parametric survival models.
Usage
basis(knots, x, spline = "rp")Arguments
knots | Vector of knot locations in increasing order, including theboundary knots at the beginning and end. |
x | Vector of ordinates to compute the basis for. |
spline |
|
Details
The exact formula for the basis is given inflexsurvspline.
Value
A matrix with one row for each ordinate and one column for eachknot.
basis returns the basis, anddbasis returns its derivativewith respect tox.
fss anddfss are the same, but with the order of thearguments swapped around for consistency with similar functions in other Rpackages.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
Royston, P. and Parmar, M. (2002). Flexible parametricproportional-hazards and proportional-odds models for censored survivaldata, with application to prognostic modelling and estimation of treatmenteffects. Statistics in Medicine 21(1):2175-2197.
Wang W, Yan J (2021). Shape-Restricted Regression Splines with RPackage splines2. Journal of Data Science, 19(3), 498-517.
See Also
Breast cancer survival data
Description
Survival times of 686 patients with primary node positive breast cancer.
Usage
bcFormat
A data frame with 686 rows.
censrec | (numeric) | 1=dead, 0=censored |
rectime | (numeric) | Time of death or censoring in days |
group | (numeric) | Prognostic group:"Good","Medium" or"Poor", |
| from a regression model developed by Sauerbrei and Royston (1999). | ||
Source
German Breast Cancer Study Group, 1984-1989. Used as a referencedataset for the spline-based survival model of Royston and Parmar (2002),implemented here inflexsurvspline. Originally provided withthestpm (Royston 2001, 2004) andstpm2 (Lambert 2009, 2010)Stata modules.
References
Royston, P. and Parmar, M. (2002). Flexible parametricproportional-hazards and proportional-odds models for censored survivaldata, with application to prognostic modelling and estimation of treatmenteffects. Statistics in Medicine 21(1):2175-2197.
Sauerbrei, W. and Royston, P. (1999). Building multivariable prognostic anddiagnostic models: transformation of the predictors using fractionalpolynomials. Journal of the Royal Statistical Society, Series A 162:71-94.
See Also
Bootstrap confidence intervals for flexsurv output functions
Description
Calculate a confidence interval for a model output by repeatedly replacing the parameters in a fitted model object with a draw from the multivariate normal distribution of the maximum likelihood estimates, then recalculating the output function.
Usage
bootci.fmsm( x, B, fn, cl = 0.95, attrs = NULL, cores = NULL, sample = FALSE, ...)Arguments
x | Output from |
B | Number of parameter draws to use |
fn | Function to bootstrap the results of. It must have an argument named |
cl | Width of symmetric confidence interval, by default 0.95 |
attrs | Any attributes of the value returned from |
cores | Number of cores to use for parallel processing. |
sample | If |
... | Additional arguments to pass to |
Value
A matrix with two rows, giving the upper and lower confidence limits respectively. Each row is a vector of the same length as the unlisted result of the function corresponding tofncall.
Examples
## How to use bootci.fmsm## Write a function with one argument called x giving a fitted model,## and returning some results of the model. The results may be in any form. tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))bexp <- flexsurvreg(Surv(Tstart, Tstop, status) ~ trans, data=bosms3, dist="exp")summfn <- function(x, t){ resp <- flexsurv::pmatrix.fs(x, trans=tmat, t=t) rest <- flexsurv::totlos.fs(x, trans=tmat, t=t) list(resp, rest)}## Use bootci.fmsm to obtain the confidence interval## The matrix columns are in the order of the unlisted results of the original## summfn. You will have to rearrange them into the format that you want.## If summfn has any extra arguments, in this case \code{t}, make sure they are## passed through via the ... argument to bootci.fmsmbootci.fmsm(bexp, B=3, fn=summfn, t=10)bootci.fmsm(bexp, B=3, fn=summfn, t=5)Bronchiolitis obliterans syndrome after lung transplants
Description
A dataset containing histories of bronchiolitis obliterans syndrome (BOS)from lung transplant recipients. BOS is a chronic decline in lung function,often observed after lung transplantation.
Format
A data frame containing a sequence of observed or censoredtransitions to the next stage of severity or death. It is groupedby patient and includes histories of 204 patients. All patientsstart in state 1 (no BOS) at six months after transplant, and maysubsequently develop BOS or die.
bosms3 contains the data for a three-state model: no BOS, BOS ordeath.bosms4 uses a four-state representation: no BOS, mild BOS,moderate/severe BOS or death.
id | (numeric) | Patient identification number |
from | (numeric) | Observedstarting state of the transition |
to | (numeric) | Observedor potential ending state of the transition |
Tstart | (numeric) | Time at the start of the interval |
Tstop | (numeric) | Time at the end of the interval |
time | (numeric) | Time differenceTstart-Tstop |
status | (numeric) | 1 if the transition to stateto was observed, or0 if the transition to stateto was censored (for example, if thepatient was observed to move to a competing state) |
trans | (factor) | Number of the transitionfrom-to in the set ofallntrans allowed transitions, numbered from 1 tontrans. |
Details
The entry time of each patient into each stage of BOS was estimated byclinicians, based on their history of lung function measurements and acuterejection and infection episodes. BOS is only assumed to occur beyond sixmonths after transplant. In the first six months the function of eachpatient's new lung stabilises. Subsequently BOS is diagnosed by comparingthe lung function against the "baseline" value.
The same data are provided in themsm package, but in thenative format ofmsm to allow Markov models to be fitted.Inflexsurv, much more flexible models can be fitted.
Source
Papworth Hospital, U.K.
References
Heng. D. et al. (1998). Bronchiolitis Obliterans Syndrome:Incidence, Natural History, Prognosis, and Risk Factors. Journal of Heartand Lung Transplantation 17(12)1255–1263.
Extract model coefficients from fitted flexible survival models
Description
Extract model coefficients from fitted flexible survival models. Thispresents all parameter estimates, transformed to the real line if necessary.For example, shape or scale parameters, which are constrained to bepositive, are returned on the log scale.
Usage
## S3 method for class 'flexsurvreg'coef(object, ...)Arguments
object | Output from |
... | Further arguments passed to or from other methods. Currentlyunused. |
Details
This matches the behaviour ofcoef.default for standard R modelfamilies such asglm, where intercepts in regressionmodels are presented on the same scale as the covariate effects. Note thatany parameter in a distribution fitted byflexsurvreg orflexsurvreg may be an intercept in a regression model.
Value
This returns themod$res.t[,"est"] component of the fittedmodel objectmod. Seeflexsurvreg,flexsurvspline for full documentation of all components.
Author(s)
C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk
See Also
Cox-Snell residuals from a parametric survival model
Description
Cox-Snell residuals from a parametric survival model
Usage
coxsnell_flexsurvreg(x)Arguments
x | Object returned by |
Value
A data frame with a column calledest giving the Cox-Snell residual, defined as the fitted cumulative hazard at each data point.fitted cumulative hazard at the given observed data point, and other columns indicating the observation time,observed event status, and covariate values defining the data at this point.
The cumulative hazardsest should form a censored sample from an Exponential(1). Therefore to check the fit of the model, plot a nonparametric estimate of the cumulativehazard curve against a diagonal line through the origin, which is the theoretical cumulativehazard trajectory of the Exponential(1).
Examples
fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma") cs <- coxsnell_flexsurvreg(fitg) ## Model appears to fit well, with some small sample noise surv <- survfit(Surv(cs$est, ovarian$fustat) ~ 1) plot(surv, fun="cumhaz") abline(0, 1, col="red")Flexible parametric mixture models for times to competing events
Description
In a mixture model for competing events, an individual can experience one ofa set of different events. We specify a model for the probability that theywill experience each event before the others, and a model for the time tothe event conditionally on that event occurring first.
Usage
flexsurvmix( formula, data, event, dists, pformula = NULL, anc = NULL, partial_events = NULL, initp = NULL, inits = NULL, fixedpars = NULL, dfns = NULL, method = "direct", em.control = NULL, optim.control = NULL, aux = NULL, sr.control = survreg.control(), integ.opts, hess.control = NULL, ...)Arguments
formula | Survival model formula. The left hand side is a Alternatively, A list of formulae may also be used to indicate that for particularindividuals, different events may be observed in different ways, withdifferent censoring mechanisms. Each list component specifies the dataand censoring scheme for that mixture component. For example, suppose we are studying people admitted to hospital,and thecompeting states are death in hospital and discharge from hospital. Attime t we know that a particular individual is still alive, but we do notknow whether they are still in hospital, or have been discharged. In thiscase, if the individual were to die in hospital, their death time would beright censored at t. If the individual will be (or has been) dischargedbefore death, their discharge time is completely unknown, thusinterval-censored on (0,Inf). Therefore, we need to store different eventtime and status variables in the data for different alternative events.This is specified here as
where for this individual, The "dot" notation commonly used to indicate "all remaining variables" in aformula is not supported in |
data | Data frame containing variables mentioned in |
event | Variable in the data that specifies which of the alternativeevents is observed for which individual. If the individual's follow-up isright-censored, or if the event is otherwise unknown, this variable musthave the value Ideally this should be a factor, since the mixture components can then beeasily identified in the results with a name instead of a number. If thisis not already a factor, it is coerced to one. Then the levels of thefactor define the required order for the components of the list arguments |
dists | Vector specifying the parametric distribution to use for eachcomponent. The same distributions are supported as in |
pformula | Formula describing covariates to include on the componentmembership proabilities by multinomial logistic regression. The firstcomponent is treated as the baseline. The "dot" notation commonly used to indicate "all remaining variables" in aformula is not supported. |
anc | List of component-specific lists, of length equal to the numberof components. Each component-specific list is a list of formulaerepresenting covariate effects on parameters of the distribution. If there are covariates for one component but not others, then a listcontaining one null formula on the location parameter should be suppliedfor the component with no covariates, e.g Covariates on the location parameter may also be supplied here instead ofin |
partial_events | List specifying the factor levels of For example, suppose there are three alternative events called
|
initp | Initial values for component membership probabilities. Bydefault, these are assumed to be equal for each component. |
inits | List of component-specific vectors. Each component-specificvector contains the initial values for the parameters of thecomponent-specific model, as would be supplied as the |
fixedpars | Indexes of parameters to fix at their initial values andnot optimise. Arranged in the order: baseline mixing probabilities,covariates on mixing probabilities, time-to-event parameters by mixingcomponent. Within mixing components, time-to-event parameters are orderedin the same way as in If Not currently supported when using the EM algorithm. |
dfns | List of lists of user-defined distribution functions, one foreach mixture component. Each list component is specified as the |
method | Method for maximising the likelihood. Either |
em.control | List of settings to control EM algorithm fitting. Theonly options currently available are
For example, |
optim.control | List of options to pass as the |
aux | A named list of other arguments to pass to custom distributionfunctions. This is used, for example, by |
sr.control | For the models which use |
integ.opts | List of named arguments to pass to
|
hess.control | List of options to control covariance matrix computation. Available options are:
The Hessian is positive definite, thus invertible, at the maximumlikelihood. If the Hessian computed after optimisation convergence can'tbe inverted, this is either because the converged result is not themaximum likelihood (e.g. it could be a "saddle point"), or because thenumerical methods used to obtain the Hessian were inaccurate. If yoususpect that the Hessian was computed wrongly enough that it is notinvertible, but not wrongly enough that the nearest valid inverse would bean inaccurate estimate of the covariance matrix, then these tolerancevalues can be modified (reducing |
... | Optional arguments to the general-purpose optimisation routine |
Details
This differs from the more usual "competing risks" models, where we specify"cause-specific hazards" describing the time to each competing event. Thistime will not be observed for an individual if one of the competing eventshappens first. The event that happens first is defined by the minimum ofthe times to the alternative events.
Theflexsurvmix function fits a mixture model to data consisting of asingle time to an event for each individual, and an indicator for what typeof event occurs for that individual. The time to event may be observed orcensored, just as inflexsurvreg, and the type of event may beknown or unknown. In a typical application, where we follow up a set ofindividuals until they experience an event or a maximum follow-up time isreached, the event type is known if the time is observed, and the event typeis unknown when follow-up ends and the time is right-censored.
The model is fitted by maximum likelihood, either directly or by using anexpectation-maximisation (EM) algorithm, by wrappingflexsurvreg to compute the likelihood or to implement the Eand M steps.
Some worked examples are given in the package vignette about multi-statemodelling, which can be viewed by runningvignette("multistate", package="flexsurv").
Value
List of objects containing information about the fitted model. Theimportant one isres, a data frame containing the parameterestimates and associated information.
References
Jackson, C. H. and Tom, B. D. M. and Kirwan, P. D. and Mandal, S.and Seaman, S. R. and Kunzmann, K. and Presanis, A. M. and De Angelis, D. (2022)A comparison of two frameworks for multi-state modelling, applied to outcomesafter hospital admissions with COVID-19. Statistical Methods in Medical Research31(9) 1656-1674.
Larson, M. G., & Dinse, G. E. (1985). A mixture model for theregression analysis of competing risks data. Journal of the RoyalStatistical Society: Series C (Applied Statistics), 34(3), 201-211.
Lau, B., Cole, S. R., & Gange, S. J. (2009). Competing risk regressionmodels for epidemiologic data. American Journal of Epidemiology, 170(2),244-256.
Flexible parametric regression for time-to-event data
Description
Parametric modelling or regression for time-to-event data. Several built-indistributions are available, and users may supply their own.
Usage
flexsurvreg( formula, anc = NULL, data, weights, bhazard, rtrunc, subset, na.action, dist, inits, fixedpars = NULL, dfns = NULL, aux = NULL, cl = 0.95, integ.opts = NULL, sr.control = survreg.control(), hessian = TRUE, hess.control = NULL, ...)Arguments
formula | A formula expression in conventional R linear modellingsyntax. The response must be a survival object as returned by the
If there are no covariates, specify If the right hand side is specified as By default, covariates are placed on the “location” parameter of thedistribution, typically the "scale" or "rate" parameter, through a linearmodel, or a log-linear model if this parameter must be positive. Thisgives an accelerated failure time model or a proportional hazards model(see Covariates can be placed on other (“ancillary”) parameters by using thename of the parameter as a “function” in the formula. For example, in aWeibull model, the following expresses the scale parameter in terms of ageand a treatment variable
However, if the names of the ancillary parameters clash with any realfunctions that might be used in formulae (such as
| |||||||||||||||||||||||||||||||||||||||||
anc | An alternative and safer way to model covariates on ancillaryparameters, that is, parameters other than the main location parameter ofthe distribution. This is a named list of formulae, with the name of eachcomponent giving the parameter to be modelled. The model above can alsobe defined as:
| |||||||||||||||||||||||||||||||||||||||||
data | A data frame in which to find variables supplied in | |||||||||||||||||||||||||||||||||||||||||
weights | Optional numeric variable giving weights for eachindividual in the data. The fitted model is then defined bymaximising the weighted sum of the individual-specific log-likelihoods. | |||||||||||||||||||||||||||||||||||||||||
bhazard | Optional variable giving expected hazards for relativesurvival models. The model is described by Nelson et al. (2007).
If For relative survival models, the log-likelihood returned by | |||||||||||||||||||||||||||||||||||||||||
rtrunc | Optional variable giving individual-specific right-truncationtimes. Used for analysing data with "retrospective ascertainment". Forexample, suppose we want to estimate the distribution of the time fromonset of a disease to death, but have only observed cases known to havedied by the current date. In this case, times from onset to death forindividuals in the data are right-truncated by the current date minus theonset date. Predicted survival times for new cases can then be describedby an un-truncated version of the fitted distribution. These models can suffer from weakly identifiable parameters andbadly-behaved likelihood functions, and it is advised to compareconvergence for different initial values by supplying different | |||||||||||||||||||||||||||||||||||||||||
subset | Vector of integers or logicals specifying the subset of theobservations to be used in the fit. | |||||||||||||||||||||||||||||||||||||||||
na.action | a missing-data filter function, applied after any 'subset'argument has been used. Default is | |||||||||||||||||||||||||||||||||||||||||
dist | Typically, one of the strings in the first column of thefollowing table, identifying a built-in distribution. This table alsoidentifies the location parameters, and whether covariates on theseparameters represent a proportional hazards (PH) or accelerated failuretime (AFT) model. In an accelerated failure time model, the covariatespeeds up or slows down the passage of time. So if the coefficient(presented on the log scale) is log(2), then doubling the covariate valuewould give half the expected survival time.
Alternatively, Very flexible spline-based distributions can also be fitted with The parameterisations of the built-in distributions used here are the sameas in their built-in distribution functions: A package vignette "Distributions reference" lists the survivor functionsand covariate effect parameterisations used by each built-in distribution. For the Weibull, exponential and log-normal distributions, The Weibull parameterisation is different from that in Similarly in the exponential distribution, the rate, rather than the mean,is modelled on covariates. The object | |||||||||||||||||||||||||||||||||||||||||
inits | An optional numeric vector giving initial values for eachunknown parameter. These are numbered in the order: baseline parameters(in the order they appear in the distribution function, e.g. shape beforescale in the Weibull), covariate effects on the location parameter,covariate effects on the remaining parameters. This is the same order asthe printed estimates in the fitted model. If not specified, default initial values are chosen from a simple summaryof the survival or censoring times, for example the mean is often used toinitialize scale parameters. See the object | |||||||||||||||||||||||||||||||||||||||||
fixedpars | Vector of indices of parameters whose values will be fixedat their initial values during the optimisation. The indices are orderedas in | |||||||||||||||||||||||||||||||||||||||||
dfns | An alternative way to define a custom survival distribution (seesection “Custom distributions” below). A list whose components mayinclude
If | |||||||||||||||||||||||||||||||||||||||||
aux | A named list of other arguments to pass to custom distributionfunctions. This is used, for example, by | |||||||||||||||||||||||||||||||||||||||||
cl | Width of symmetric confidence intervals for maximum likelihoodestimates, by default 0.95. | |||||||||||||||||||||||||||||||||||||||||
integ.opts | List of named arguments to pass to
| |||||||||||||||||||||||||||||||||||||||||
sr.control | For the models which use | |||||||||||||||||||||||||||||||||||||||||
hessian | Calculate the covariances and confidence intervals for theparameters. Defaults to | |||||||||||||||||||||||||||||||||||||||||
hess.control | List of options to control covariance matrix computation. Available options are:
The Hessian is positive definite, thus invertible, at the maximumlikelihood. If the Hessian computed after optimisation convergence can'tbe inverted, this is either because the converged result is not themaximum likelihood (e.g. it could be a "saddle point"), or because thenumerical methods used to obtain the Hessian were inaccurate. If yoususpect that the Hessian was computed wrongly enough that it is notinvertible, but not wrongly enough that the nearest valid inverse would bean inaccurate estimate of the covariance matrix, then these tolerancevalues can be modified (reducing | |||||||||||||||||||||||||||||||||||||||||
... | Optional arguments to the general-purpose optimisation routine |
Details
Parameters are estimated by maximum likelihood using the algorithmsavailable in the standard Roptim function. Parametersdefined to be positive are estimated on the log scale. Confidence intervalsare estimated from the Hessian at the maximum, and transformed back to theoriginal scale of the parameters.
The usage offlexsurvreg is intended to be similar tosurvreg in thesurvival package.
Value
A list of class"flexsurvreg" containing information aboutthe fitted model. Components of interest to users may include:
call | A copy of the function call, for use in post-processing. |
dlist | List defining the survival distribution used. |
res | Matrix of maximum likelihood estimates and confidence limits,with parameters on their natural scales. |
res.t | Matrix of maximumlikelihood estimates and confidence limits, with parameters alltransformed to the real line (using a log transform for all built-inmodels where this is necessary). The |
coefficients | The transformed maximum likelihoodestimates, as in |
loglik | Log-likelihood. This will differ from Stata, where the sumof the log uncensored survival times is added to the log-likelihood insurvival models, to remove dependency on the time scale. For relative survival models specified with |
logliki | Vector of individual contributions to the log-likelihood |
AIC | Akaike's information criterion (-2*log likelihood + 2*number ofestimated parameters) |
cov | Covariance matrix of the parameters, onthe real-line scale (e.g. log scale), which can be extracted with |
data | Data used in the model fit. To extractthis in the standard R formats, use use |
Custom distributions
flexsurvreg is intended to beeasy to extend to handle new distributions. To define a new distributionfor use inflexsurvreg, construct a list with the followingelements:
"name"A string naming the distribution. If thisis called
"dist", for example, then there must be visible in theworking environment, at least, eithera) a function called
ddistwhich defines the probability density,or
b) a function called
hdistwhich defines the hazard.Ideally, in case a) there should also be a function called
pdistwhich defines the probability distribution or cumulative density, and incase b) there should be a function calledHdistdefining thecumulative hazard. If these additional functions are not provided,flexsurv attempts to automatically create them by numericallyintegrating the density or hazard function. However, model fitting willbe much slower, or may not even work at all, if the analytic versions ofthese functions are not available.The functions must accept vector arguments (representing different times,or alternative values for each parameter) and return the results as avector. The function
Vectorizemay be helpful for doingthis: see the example below.These functions may be in an add-on package (see below for an example) ormay be user-written. If they are user-written they must be defined in theglobal environment, or supplied explicitly through thedfnsargumenttoflexsurvreg. The latter may be useful if the functions arecreated dynamically (as in the source offlexsurvspline) and thusnot visible through R's scoping rules.Arguments other than parameters must be named in the conventional way –for example
xfor the first argument of the density function orhazard, as indnorm(x, ...)andqfor the firstargument of the probability function. Density functions should also havean argumentlog, after the parameters, which whenTRUE,computes the log density, using a numerically stable additive formula ifpossible.Additional functions with names beginning with
"DLd"and"DLS"may be defined to calculate the derivatives of the log densityand log survival probability, with respect to the parameters of thedistribution. The parameters are expressed on the real line, for exampleafter log transformation if they are defined as positive. The firstargument must be namedt, representing the time, and the remainingarguments must be named as the parameters of the density function. Thefunction must return a matrix with rows corresponding to times, and columnscorresponding to the parameters of the distribution. The derivatives areused, if available, to speed up the model fitting withoptim."pars"Vector of strings naming the parameters of thedistribution. These must be the same names as the arguments of the densityand probability functions.
"location"Name of the main parameter governing the mean ofthe distribution. This is the default parameter on which covariates areplaced in the
formulasupplied toflexsurvreg."transforms"List of Rfunctions which transform the range of values taken by each parameter ontothe real line. For example,
c(log, log)for a distribution with twopositive parameters."inv.transforms"List of R functions defining thecorresponding inverse transformations. Note these must be lists, even forsingle parameter distributions they should be supplied as, e.g.
c(exp)orlist(exp)."inits"A function of theobserved survival times
t(including right-censoring times, andusing the halfway point for interval-censored times) which returns a vectorof reasonable initial values for maximum likelihood estimation of eachparameter. For example,function(t){ c(1, mean(t)) }will alwaysinitialize the first of two parameters at 1, and the second (a scaleparameter, for instance) at the mean oft.
For example, suppose we want to use an extreme value survival distribution.This is available in the CRAN packageeha, which providesconventionally-defined density and probability functions calledeha::dEV andeha::pEV. See the Examples belowfor the custom list in this case, and the subsequent command to fit themodel.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
Jackson, C. (2016). flexsurv: A Platform for ParametricSurvival Modeling in R. Journal of Statistical Software, 70(8), 1-33.doi:10.18637/jss.v070.i08
Cox, C. (2008) The generalizedF distribution: An umbrella forparametric survival analysis. Statistics in Medicine 27:4301-4312.
Cox, C., Chu, H., Schneider, M. F. and Muñoz, A. (2007) Parametric survivalanalysis and taxonomy of hazard functions for the generalized gammadistribution. Statistics in Medicine 26:4252-4374
Jackson, C. H. and Sharples, L. D. and Thompson, S. G. (2010) Survivalmodels in health economic evaluations: balancing fit and parsimony toimprove prediction. International Journal of Biostatistics 6(1):Article 34.
Nelson, C. P., Lambert, P. C., Squire, I. B., & Jones, D. R. (2007).Flexible parametric models for relative survival, with application incoronary heart disease. Statistics in medicine, 26(30), 5486-5498.
See Also
flexsurvspline for flexible survival modelling usingthe spline model of Royston and Parmar.
plot.flexsurvreg andlines.flexsurvreg to plotfitted survival, hazards and cumulative hazards from models fitted byflexsurvreg andflexsurvspline.
Examples
## Compare generalized gamma fit with Weibullfitg <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="gengamma")fitgfitw <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="weibull")fitwplot(fitg)lines(fitw, col="blue", lwd.ci=1, lty.ci=1)## Identical AIC, probably not enough data in this simple example for a## very flexible model to be worthwhile.## Custom distribution## make "dEV" and "pEV" from eha package (if installed)## available to the working environmentif (require("eha")) {custom.ev <- list(name="EV", pars=c("shape","scale"), location="scale", transforms=c(log, log), inv.transforms=c(exp, exp), inits=function(t){ c(1, median(t)) })fitev <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist=custom.ev)fitevlines(fitev, col="purple", col.ci="purple")}## Custom distribution: supply the hazard function onlyhexp2 <- function(x, rate=1){ rate } # exponential distributionhexp2 <- Vectorize(hexp2)custom.exp2 <- list(name="exp2", pars=c("rate"), location="rate", transforms=c(log), inv.transforms=c(exp), inits=function(t)1/mean(t))flexsurvreg(Surv(futime, fustat) ~ 1, data = ovarian, dist=custom.exp2)flexsurvreg(Surv(futime, fustat) ~ 1, data = ovarian, dist="exp")## should give same answerFlexible parametric models for right-truncated, uncensored data defined by times of initial and final events.
Description
This function estimates the distribution of the time between an initial and final event, in situations where individuals are only observed if they have experienced both events before a certain time, thus they are right-truncated at this time. The time of the initial event provides information about the time from initial to final event, given the truncated observation scheme, and initial events are assumed to occur with an exponential growth rate.
Usage
flexsurvrtrunc( t, tinit, rtrunc, tmax, data = NULL, method = "joint", dist, theta = NULL, fixed.theta = TRUE, inits = NULL, fixedpars = NULL, dfns = NULL, integ.opts = NULL, cl = 0.95, optim_control = list())Arguments
t | Vector of time differences between an initial and final event for a set of individuals. | |||||||||||||||||||||||||||||||||||||||||
tinit | Absolute time of the initial event for each individual. | |||||||||||||||||||||||||||||||||||||||||
rtrunc | Individual-specific right truncation points on the same scale as | |||||||||||||||||||||||||||||||||||||||||
tmax | Maximum possible time between initial and final events that could have been observed. This is only used in | |||||||||||||||||||||||||||||||||||||||||
data | Data frame containing | |||||||||||||||||||||||||||||||||||||||||
method | If | |||||||||||||||||||||||||||||||||||||||||
dist | Typically, one of the strings in the first column of thefollowing table, identifying a built-in distribution. This table alsoidentifies the location parameters, and whether covariates on theseparameters represent a proportional hazards (PH) or accelerated failuretime (AFT) model. In an accelerated failure time model, the covariatespeeds up or slows down the passage of time. So if the coefficient(presented on the log scale) is log(2), then doubling the covariate valuewould give half the expected survival time.
Alternatively, Very flexible spline-based distributions can also be fitted with The parameterisations of the built-in distributions used here are the sameas in their built-in distribution functions: A package vignette "Distributions reference" lists the survivor functionsand covariate effect parameterisations used by each built-in distribution. For the Weibull, exponential and log-normal distributions, The Weibull parameterisation is different from that in Similarly in the exponential distribution, the rate, rather than the mean,is modelled on covariates. The object | |||||||||||||||||||||||||||||||||||||||||
theta | Initial value (or fixed value) for the exponential growth rate | |||||||||||||||||||||||||||||||||||||||||
fixed.theta | Should | |||||||||||||||||||||||||||||||||||||||||
inits | Initial values for the parameters of the parametric survival distributon. If not supplied, a heuristic is used. as is done in | |||||||||||||||||||||||||||||||||||||||||
fixedpars | Integer indices of the parameters of the survival distribution that should be fixed to their values supplied in | |||||||||||||||||||||||||||||||||||||||||
dfns | An alternative way to define a custom survival distribution (seesection “Custom distributions” below). A list whose components mayinclude
If | |||||||||||||||||||||||||||||||||||||||||
integ.opts | List of named arguments to pass to
| |||||||||||||||||||||||||||||||||||||||||
cl | Width of symmetric confidence intervals for maximum likelihoodestimates, by default 0.95. | |||||||||||||||||||||||||||||||||||||||||
optim_control | List to supply as the |
Details
Covariates are not currently supported.
Note thatflexsurvreg, with anrtrunc argument, can fit models for a similar kind of data, but those models ignore the information provided by the time of the initial event.
A nonparametric estimator of survival under right-truncation is also provided insurvrtrunc. See Seaman et al. (2020) for a full comparison of the alternative models.
References
Seaman, S., Presanis, A. and Jackson, C. (2020) Estimating a Time-to-EventDistribution from Right-Truncated Data in an Epidemic: a Review of Methods
See Also
Examples
set.seed(1) ## simulate time to initial eventX <- rexp(1000, 0.2)## simulate time between initial and final eventT <- rgamma(1000, 2, 10) ## right-truncate to keep only those with final event time## before tmaxtmax <- 40obs <- X + T < tmax rtrunc <- tmax - Xdat <- data.frame(X, T, rtrunc)[obs,]flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="gamma", theta=0.2)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="gamma", theta=0.2, fixed.theta=FALSE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="gamma", theta=0.2, inits=c(1, 8))flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="gamma", theta=0.2, method="final")flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="gamma", fixed.theta=TRUE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="weibull", fixed.theta=TRUE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="lnorm", fixed.theta=TRUE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="gengamma", fixed.theta=TRUE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat, dist="gompertz", fixed.theta=TRUE)Flexible survival regression using the Royston/Parmar spline model.
Description
Flexible parametric modelling of time-to-event data using the spline modelof Royston and Parmar (2002).
Usage
flexsurvspline( formula, data, weights, bhazard, rtrunc, subset, k = 0, knots = NULL, bknots = NULL, scale = "hazard", timescale = "log", spline = "rp", ...)Arguments
formula | A formula expression in conventional R linear modellingsyntax. The response must be a survival object as returned by the
specifies a model where the log cumulative hazard (by default, see If there are no covariates, specify Time-varying covariate effects can be specified using the method describedin Therefore a model with one internal spline knot, where the equivalents ofthe Weibull shape and scale parameters, but not the higher-order term
or alternatively (and more safely, see
|
data | A data frame in which to find variables supplied in |
weights | Optional variable giving case weights. |
bhazard | Optional variable giving expected hazards for relativesurvival models. |
rtrunc | Optional variable giving individual right-truncation times (see |
subset | Vector of integers or logicals specifying the subset of theobservations to be used in the fit. |
k | Number of knots in the spline. The default |
knots | Locations of knots on the axis of log time (or time, see |
bknots | Locations of boundary knots, on the axis of log time (ortime, see |
scale | If If If |
timescale | If |
spline |
|
... | Any other arguments to be passed to or through |
Details
This function works as a wrapper aroundflexsurvreg bydynamically constructing a custom distribution usingdsurvspline,psurvspline andunroll.function.
In the spline-based survival model of Royston and Parmar (2002), atransformationg(S(t,z)) of the survival function is modelled as anatural cubic spline function of log timex = \log(t)plus linear effects of covariatesz.
g(S(t,z)) = s(x, \bm{\gamma}) + \bm{\beta}^T \mathbf{z}
The proportional hazards model (scale="hazard") definesg(S(t,\mathbf{z})) = \log(-\log(S(t,\mathbf{z}))) =\log(H(t,\mathbf{z})), thelog cumulative hazard.
The proportional odds model (scale="odds") definesg(S(t,\mathbf{z})) =\log(S(t,\mathbf{z})^{-1} - 1), the logcumulative odds.
The probit model (scale="normal") definesg(S(t,\mathbf{z})) = -\Phi^{-1}(S(t,\mathbf{z})), where\Phi^{-1}() is the inverse normaldistribution functionqnorm.
With no knots, the spline reduces to a linear function, and these modelsare equivalent to Weibull, log-logistic and lognormal models respectively.
The spline coefficients\gamma_j: j=1, 2 \ldots, which are called the "ancillary parameters" above, may also bemodelled as linear functions of covariates\mathbf{z}, as
\gamma_j(\mathbf{z}) = \gamma_{j0} + \gamma_{j1}z_1 + \gamma_{j2}z_2+ ...
giving a model where the effects of covariates are arbitrarily flexiblefunctions of time: a non-proportional hazards or odds model.
Natural cubic splines are cubic splines constrained to be linear beyondboundary knotsk_{min},k_{max}. The spline function isdefined as
s(x,\boldsymbol{\gamma}) = \gamma_0 + \gamma_1 x + \gamma_2 v_1(x) + \ldots +
\gamma_{m+1} v_m(x)
wherev_j(x) is thejth basis function
v_j(x) = (x - k_j)^3_+ - \lambda_j(x - k_{min})^3_+ - (1 -
\lambda_j) (x - k_{max})^3_+
\lambda_j = \frac{k_{max} - k_j}{k_{max} - k_{min}}
and(x - a)_+ = max(0, x - a).
Value
A list of class"flexsurvreg" with the same elements asdescribed inflexsurvreg, and including extra componentsdescribing the spline model. See in particular:
k | Number of knots. |
knots | Location of knots on the log timeaxis. |
scale | The |
res | Matrix of maximum likelihood estimates and confidence limits.Spline coefficients are labelled Coefficients In the Weibull model, for example, In the log-logistic model with shape In the log-normal model with log-scale mean |
loglik | Themaximised log-likelihood. This will differ from Stata, where the sum ofthe log uncensored survival times is added to the log-likelihood insurvival models, to remove dependency on the time scale. |
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
References
Royston, P. and Parmar, M. (2002). Flexible parametricproportional-hazards and proportional-odds models for censored survivaldata, with application to prognostic modelling and estimation of treatmenteffects. Statistics in Medicine 21(1):2175-2197.
Wang W, Yan J (2021). Shape-Restricted Regression Splines with RPackage splines2. Journal of Data Science, 19(3), 498-517.
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modelingin R. Journal of Statistical Software, 70(8), 1-33.doi:10.18637/jss.v070.i08
See Also
flexsurvreg for flexible survival modelling usinggeneral parametric distributions.
plot.flexsurvreg andlines.flexsurvreg to plotfitted survival, hazards and cumulative hazards from models fitted byflexsurvspline andflexsurvreg.
Examples
## Best-fitting model to breast cancer data from Royston and Parmar (2002)## One internal knot (2 df) and cumulative odds scalespl <- flexsurvspline(Surv(recyrs, censrec) ~ group, data=bc, k=1, scale="odds")## Fitted survivalplot(spl, lwd=3, ci=FALSE)## Simple Weibull model fits much less wellsplw <- flexsurvspline(Surv(recyrs, censrec) ~ group, data=bc, k=0, scale="hazard")lines(splw, col="blue", ci=FALSE)## Alternative way of fitting the Weibull## Not run: splw2 <- flexsurvreg(Surv(recyrs, censrec) ~ group, data=bc, dist="weibull")## End(Not run)Constructor for a mixture multi-state model based on flexsurvmix
Description
Constructor for a mixture multi-state model based on flexsurvmix
Usage
fmixmsm(...)Arguments
... | Named arguments. Each argument should be a fitted model asreturned by |
Value
A list offlexsurvmix objects, with the followingattribute(s):
pathways A list of all potential pathways until absorption, formodels without cycles. For models with cycles this will have an elementhas_cycle=TRUE, plus the pathways discovered before the functionfound the cycle and gave up.
Construct a multi-state model from a set of parametric survival models
Description
Construct a multi-state model from a set of parametric survival models
Usage
fmsm(..., trans)Arguments
... | Objects returned by |
trans | A matrix of integers specifying which models correspond towhich transitions. The |
Value
A list containing the objects given in..., and withattributes"trans" and"statenames" defining the transitionstructure matrix and state names, and with list components named todescribe the transitions they correspond to. If any of the arguments in... are named, then these are used to define the transition names,otherwise default names are chosen based on the state names.
Evaluate baseline time-to-event distribution parameters given covariate values in a flexsurvmix model
Description
Evaluate baseline time-to-event distribution parameters given covariate values in a flexsurvmix model
Usage
get_basepars(x, newdata, event)Arguments
x | Fitted model object |
newdata | Data frame of alternative covariate values |
event | Event |
Glance at a flexsurv model object
Description
Glance accepts a model object and returns a tibble with exactly one row of model summaries.
Usage
## S3 method for class 'flexsurvreg'glance(x, ...)Arguments
x | Output from |
... | Not currently used. |
Value
A one-rowtibble containing columns:
NNumber of observations used in fittingeventsNumber of eventscensoredNumber of censored eventstriskTotal length of time-at-risk (i.e. follow-up)dfDegrees of freedom (i.e. number of estimated parameters)logLikLog-likelihoodAICAkaike's "An Information Criteria"BICBayesian Information Criteria
Examples
fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")glance(fitg)Hazard and cumulative hazard functions
Description
Hazard and cumulative hazard functions for distributions which are builtinto flexsurv, and whose distribution functions are in base R.
Usage
hexp(x, rate = 1, log = FALSE)Hexp(x, rate = 1, log = FALSE)hgamma(x, shape, rate = 1, log = FALSE)Hgamma(x, shape, rate = 1, log = FALSE)hlnorm(x, meanlog = 0, sdlog = 1, log = FALSE)Hlnorm(x, meanlog = 0, sdlog = 1, log = FALSE)hweibull(x, shape, scale = 1, log = FALSE)Hweibull(x, shape, scale = 1, log = FALSE)Arguments
x | Vector of quantiles |
rate | Rate parameter (exponential and gamma) |
log | Compute log hazard or log cumulative hazard |
shape | Shape parameter (Weibull and gamma) |
meanlog | Mean on the log scale (log normal) |
sdlog | Standard deviation on the log scale (log normal) |
scale | Scale parameter (Weibull) |
Details
For the exponential and the Weibull these are available analytically, andso are programmed here in numerically stable and efficient forms.
For the gamma and log-normal, these are simply computed as minus the log ofthe survivor function (cumulative hazard) or the ratio of the density andsurvivor function (hazard), so are not expected to be robust to extremevalues or quick to compute.
Value
Hazard (functions beginning 'h') or cumulative hazard (functionsbeginning 'H').
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
See Also
dexp,dweibull,dgamma,dlnorm,dgompertz,dgengamma,dgenf
Hazard ratio as a function of time from a parametric survival model
Description
Hazard ratio as a function of time from a parametric survival model
Usage
hr_flexsurvreg( x, newdata = NULL, t = NULL, start = 0, ci = TRUE, B = 1000, cl = 0.95, na.action = na.pass)Arguments
x | Object returned by |
newdata | A data frame with two rows, each specifying a set of covariate values.The hazard ratio is calculated as hazard(z2)/hazard(z1), where z1 is the first row of
|
t | Times to calculate fitted values for. By default, these are thesorted unique observation (including censoring) times in the data - forleft-truncated datasets these are the "stop" times. |
start | Optional left-truncation time or times. The returnedsurvival, hazard or cumulative hazard will be conditioned on survival up tothis time. Predicted times returned with A vector of the same length as |
ci | Set to |
B | Number of simulations from the normal asymptotic distribution ofthe estimates used to calculate confidence intervals or standard errors.Decrease for greaterspeed at the expense of accuracy, or set |
cl | Width of symmetric confidence intervals, relative to 1. |
na.action | Function determining what should be done with missing values in |
Value
A data frame with estimate and confidence limits for the hazard ratio, andone row for each of the times requested int.
Add fitted flexible survival curves to a plot
Description
Add fitted survival (or hazard or cumulative hazard) curves from aflexsurvreg model fit to an existing plot.
Usage
## S3 method for class 'flexsurvreg'lines( x, newdata = NULL, X = NULL, type = "survival", t = NULL, est = TRUE, ci = NULL, B = 1000, cl = 0.95, col = "red", lty = 1, lwd = 2, col.ci = NULL, lty.ci = 2, lwd.ci = 1, ...)Arguments
x | Output from |
newdata | Covariate values to produce fitted curves for, as a dataframe, as described in |
X | Covariate values to produce fitted curves for, as a matrix, asdescribed in |
type |
|
t | Vector of times to plot fitted values for. |
est | Plot fitted curves ( |
ci | Plot confidence intervals for fitted curves. |
B | Number of simulations controlling accuracy of confidenceintervals, as used in |
cl | Width of confidence intervals, by default 0.95 for 95%intervals. |
col | Colour of the fitted curve(s). |
lty | Line type of the fitted curve(s). |
lwd | Line width of the fitted curve(s). |
col.ci | Colour of the confidence limits, defaulting to the same asfor the fitted curve. |
lty.ci | Line type of the confidence limits. |
lwd.ci | Line width of the confidence limits, defaulting to the sameas for the fitted curve. |
... | Other arguments to be passed to the generic |
Details
Equivalent toplot.flexsurvreg(...,add=TRUE).
Author(s)
C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk
See Also
Log likelihood from a flexsurvreg model
Description
Log likelihood from a flexsurvreg model
Usage
## S3 method for class 'flexsurvreg'logLik(object, ...)Arguments
object | A fitted model object of class |
... | Other arguments (currently unused). |
Value
Log-likelihood (numeric) with additional attributesdf (degrees of freedom, or numberof parameters that were estimated), and number of observationsnobs (including observedevents and censored observations).
Mean and restricted mean survival functions
Description
Mean and restricted mean survival time functions for distributions which are builtinto flexsurv.
Usage
mean_exp(rate = 1)rmst_exp(t, rate = 1, start = 0)mean_gamma(shape, rate = 1)rmst_gamma(t, shape, rate = 1, start = 0)rmst_genf(t, mu, sigma, Q, P, start = 0)mean_genf(mu, sigma, Q, P)rmst_genf.orig(t, mu, sigma, s1, s2, start = 0)mean_genf.orig(mu, sigma, s1, s2)rmst_gengamma(t, mu = 0, sigma = 1, Q, start = 0)mean_gengamma(mu = 0, sigma = 1, Q)rmst_gengamma.orig(t, shape, scale = 1, k, start = 0)mean_gengamma.orig(shape, scale = 1, k)rmst_gompertz(t, shape, rate = 1, start = 0)mean_gompertz(shape, rate = 1)mean_llogis(shape = 1, scale = 1)rmst_llogis(t, shape = 1, scale = 1, start = 0)mean_lnorm(meanlog = 0, sdlog = 1)rmst_lnorm(t, meanlog = 0, sdlog = 1, start = 0)mean_weibull(shape, scale = 1)rmst_weibull(t, shape, scale = 1, start = 0)rmst_weibullPH(t, shape, scale = 1, start = 0)mean_weibullPH(shape, scale = 1)Arguments
rate | Rate parameter (exponential and gamma) |
t | Vector of times to which restricted mean survival time is evaluated |
start | Optional left-truncation time or times. The returnedrestricted mean survival will be conditioned on survival up tothis time. |
shape | Shape parameter (Weibull, gamma, log-logistic, generalized gamma [orig],generalized F [orig]) |
mu | Mean on the log scale (generalized gamma, generalized F) |
sigma | Standard deviation on the log scale (generalized gamma, generalized F) |
Q | Vector of first shape parameters (generalized gamma, generalized F) |
P | Vector of second shape parameters (generalized F) |
s1 | Vector of first F shape parameters (generalized F [orig]) |
s2 | vector of second F shape parameters (generalized F [orig]) |
scale | Scale parameter (Weibull, log-logistic, generalized gamma [orig],generalized F [orig]) |
k | vector of shape parameters (generalized gamma [orig]). |
meanlog | Mean on the log scale (log normal) |
sdlog | Standard deviation on the log scale (log normal) |
Details
For the exponential, Weibull, log-logistic, lognormal, and gamma, mean survival isprovided analytically. Restricted mean survival for the exponential distributionis also provided analytically. Mean and restricted means for other distributionsare calculated via numeric integration.
Value
mean survival (functions beginning 'mean') or restricted mean survival(functions beginning 'rmst_').
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
See Also
dexp,dweibull,dgamma,dlnorm,dgompertz,dgengamma,dgenf
Mean times to events from a flexsurvmix model
Description
This returns the mean of each event-specific parametric time-to-eventdistribution in the mixture model, which is the mean time to eventconditionally on that event being the one that happens.
Usage
mean_flexsurvmix(x, newdata = NULL, B = NULL)Arguments
x | Fitted model object returned from |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
B | Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. If |
Value
Mean times to next event conditionally on each alternative event,given the specified covariate values.
Mean time to final state in a mixture multi-state model
Description
Calculate the mean time from the start of the process to a final (or"absorbing") state in a mixture multi-state model. Models with cycles arenot supported.
Usage
meanfinal_fmixmsm(x, newdata = NULL, final = FALSE, B = NULL)Arguments
x | Object returned by |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
final | If |
B | Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. If |
Value
A data frame of mean times to absorption, by covariate values andpathway (or by final state)
Model frame from a flexsurvmix object
Description
Returns a list of data frames, with each component containing thedata that were used for the model fit for that mixture component.
Usage
## S3 method for class 'flexsurvmix'model.frame(formula, ...)Arguments
formula | Fitted model object from |
... | Further arguments (currently unused). |
Value
A list of data frames
Extract original data fromflexsurvreg objects.
Description
Extract the data from a model fitted withflexsurvreg.
Usage
## S3 method for class 'flexsurvreg'model.frame(formula, ...)## S3 method for class 'flexsurvreg'model.matrix(object, par = NULL, ...)Arguments
formula | A fitted model object, as returned by |
... | Further arguments (not used). |
object | A fitted model object, as returned by |
par | String naming the parameter whose linear model matrix isdesired. The default value of |
Value
model.frame returns a data frame with all the originalvariables used for the model fit.
model.matrix returns a design matrix for a part of the model thatincludes covariates. The required part is indicated by the"par"argument (see above).
Author(s)
C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk
See Also
flexsurvreg,model.frame,model.matrix.
Cumulative intensity function for parametric multi-state models
Description
Cumulative transition-specific intensity/hazard functions forfully-parametric multi-state or competing risks models, using apiecewise-constant approximation that will allow prediction using thefunctions in themstate package.
Usage
msfit.flexsurvreg( object, t, newdata = NULL, variance = TRUE, tvar = "trans", trans, B = 1000)Arguments
object | Output from The model should have been fitted to data consisting of one row for eachobserved transition and additional rows corresponding to censored times tocompeting transitions. This is the "long" format, or counting processformat, as explained in theflexsurv vignette. The model should contain a categorical covariate indicating the transition.In Alternatively, if the parameters (including covariate effects) are assumedto be different between different transitions, then a list oftransition-specific models can be formed. This list has one component foreach permitted transition in the multi-state model. This is morecomputationally efficient, particularly for larger models and datasets.See the example below, and the vignette. |
t | Vector of times. These do not need to be the same as the observedevent times, and since the model is parametric, they can be outside therange of the data. A grid of more frequent times will provide a betterapproximation to the cumulative hazard trajectory for prediction with |
newdata | A data frame specifying the values of covariates in thefitted model, other than the transition number. This must be specified ifthere are other covariates. The variable names should be the same as thosein the fitted model formula. There must be either one value per covariate(the typical situation) or |
variance | Calculate the variances and covariances of the transitioncumulative hazards ( |
tvar | Name of the categorical variable in the model formula thatrepresents the transition number. This should have been defined as a factor,with factor levels that correspond to elements of |
trans | Matrix indicating allowed transitions in the multi-statemodel, in the format understood bymstate: a matrix of integers whose |
B | Number of simulations from the normal asymptotic distribution usedto calculate variances. Decrease for greater speed at the expense ofaccuracy. |
Value
An object of class"msfit", in the same form as the objectsused in themstate package. Themsfit methodfrommstate returns the equivalent cumulative intensities for Coxregression models fitted withcoxph.
Author(s)
C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk
References
Liesbeth C. de Wreede, Marta Fiocco, Hein Putter (2011).mstate: An R Package for the Analysis of Competing Risks andMulti-State Models.Journal of Statistical Software, 38(7), 1-30.doi:10.18637/jss.v038.i07
Mandel, M. (2013). "Simulation based confidence intervals for functionswith complicated derivatives." The American Statistician 67(2):76-81
See Also
flexsurv provides alternative functions designedspecifically for predicting from parametric multi-state models withoutcallingmstate. These includepmatrix.fs andpmatrix.simfs for the transition probability matrix, andtotlos.fs andtotlos.simfs for expected totallengths of stay in states. These are generally more efficient than goingviamstate.
Examples
## 3 state illness-death model for bronchiolitis obliterans## Compare clock-reset / semi-Markov multi-state models## Simple exponential model (reduces to Markov)bexp <- flexsurvreg(Surv(years, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))mexp <- msfit.flexsurvreg(bexp, t=seq(0,12,by=0.1), trans=tmat, tvar="trans", variance=FALSE)## Cox semi-parametric model within each transitionbcox <- coxph(Surv(years, status) ~ strata(trans), data=bosms3)if (require("mstate")){mcox <- mstate::msfit(bcox, trans=tmat)## Flexible parametric spline-based model bspl <- flexsurvspline(Surv(years, status) ~ trans + gamma1(trans), data=bosms3, k=3)mspl <- msfit.flexsurvreg(bspl, t=seq(0,12,by=0.1), trans=tmat, tvar="trans", variance=FALSE)## Compare fit: exponential model is OK but the spline is betterplot(mcox, lwd=1, xlim=c(0, 12), ylim=c(0,4))cols <- c("black","red","green")for (i in 1:3){ lines(mexp$Haz$time[mexp$Haz$trans==i], mexp$Haz$Haz[mexp$Haz$trans==i], col=cols[i], lwd=2, lty=2) lines(mspl$Haz$time[mspl$Haz$trans==i], mspl$Haz$Haz[mspl$Haz$trans==i], col=cols[i], lwd=3)}legend("topright", lwd=c(1,2,3), lty=c(1,2,1), c("Cox", "Exponential", "Flexible parametric"), bty="n")}## Fit a list of models, one for each transition## More computationally efficient, but only valid if parameters## are different between transitions.## Not run: bexp.list <- vector(3, mode="list")for (i in 1:3) { bexp.list[[i]] <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==i), data=bosms3, dist="exp")}## The list of models can be passed to this and other functions,## as if it were a single multi-state model. msfit.flexsurvreg(bexp.list, t=seq(0,12,by=0.1), trans=tmat)## End(Not run)Number of observations contributing to a fitted flexible survival model
Description
Number of observations contributing to a fitted flexible survival model
Usage
## S3 method for class 'flexsurvreg'nobs(object, cens = TRUE, ...)Arguments
object | Output from |
cens | Include censored observations in the number. |
... | Further arguments passed to or from other methods. Currentlyunused. |
Details
By default, this matches the behaviour of thenobs method forsurvreg objects, including both censored and uncensored observations.
If a weightedflexsurvreg analysis was done, then this function returns the sum of the weights.
Value
This returns themod$N component of the fittedmodel objectmod. Seeflexsurvreg,flexsurvspline for full documentation of all components.
Author(s)
C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk
See Also
Simulate from the asymptotic normal distribution of parameter estimates.
Description
Produce a matrix of alternative parameter estimates under samplinguncertainty, at covariate values supplied by the user. Used bysummary.flexsurvreg for obtaining confidence intervals aroundfunctions of parameters.
Usage
normboot.flexsurvreg( x, B, newdata = NULL, X = NULL, transform = FALSE, raw = FALSE, tidy = FALSE, rawsim = NULL)Arguments
x | A fitted model from |
B | Number of samples. |
newdata | Data frame or list containing the covariate values toevaluate the parameters at. If there are covariates in the model, at leastone of |
X | Alternative (less convenient) format for covariate values: amatrix with one row, with one column for each covariate or factor contrast.Formed from all the "model matrices", one for each named parameter of thedistribution, with intercepts excluded, |
transform |
|
raw | Return samples of the baseline parameters and the covariateeffects, rather than the default of adjusting the baseline parameters forcovariates. |
tidy | If |
rawsim | allows input of raw samples from a previous run of |
Value
Ifnewdata includes only one covariate combination, a matrixwill be returned withB rows, and one column for each namedparameter of the survival distribution.
If more than one covariate combination is requested (e.g.newdata isa data frame with more than one row), then a list of matrices will bereturned, one for each covariate combination.
Author(s)
C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk
References
Mandel, M. (2013). "Simulation based confidence intervals forfunctions with complicated derivatives." The American Statistician (inpress).
See Also
Examples
fite <- flexsurvreg(Surv(futime, fustat) ~ age, data = ovarian, dist="exp") normboot.flexsurvreg(fite, B=10, newdata=list(age=50)) normboot.flexsurvreg(fite, B=10, X=matrix(50,nrow=1)) normboot.flexsurvreg(fite, B=10, newdata=list(age=0)) ## closer to... fite$resTransition probabilities from a flexsurvmix model
Description
These quantities are variously known as transition probabilities, or stateoccupancy probabilities, or values of the "cumulative incidence" function,or values of the "subdistribution" function. They are the probabilities thatan individual has experienced an event of a particular kind by timet.
Usage
p_flexsurvmix(x, newdata = NULL, startname = "start", t = 1, B = NULL)Arguments
x | Fitted model object returned from |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
startname | Name of the state where individuals start. This considersthe model as a multi-state model where people start in this state, and maytransition to one of the competing events. |
t | Vector of times |
B | Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. If |
Details
Note that "cumulative incidence" is a misnomer, as "incidence" typicallymeans a hazard, and the quantities computed here are not cumulative hazards,but probabilities.
Value
A data frame with transition probabilities by time, covariate valueand destination state.
Transition-specific parameters in a flexible parametric multi-state model
Description
List of maximum likelihood estimates of transition-specific parameters ina flexible parametric multi-state model, at given covariate values.
Usage
pars.fmsm(x, trans, newdata = NULL, tvar = "trans")Arguments
x | A multi-state model fitted with
|
trans | Matrix indicating allowed transitions. See |
newdata | A data frame specifying the values of covariates in thefitted model, other than the transition number. See |
tvar | Variable in the data representing the transition type. Notrequired if |
Value
A list with one component for each permitted transition. Each componenthas one element for each parameter of the parametric distribution that generatesthe corresponding event in the multi-state model.
Author(s)
Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.
Fitted densities for times to events in a flexsurvmix model
Description
This returns an estimate of the probability density for the time to eachcompeting event, at a vector of times supplied by the user.
Usage
pdf_flexsurvmix(x, newdata = NULL, t = NULL)Arguments
x | Fitted model object returned from |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
t | Vector of times at which to evaluate the probability density |
Value
A data frame with each row giving the fitted densitydens fora combination of covariate values, time and competing event.
Probabilities of final states in a flexible parametric competing risks model
Description
This requires the model to be Markov, and is not valid for semi-Markovmodels, as it works by wrappingpmatrix.fs to calculate thetransition probability over a very large time. As it also works on afmsm object formed from transition-specific time-to-event models,it therefore only works on competing risks models, defined by just one startingstate with multiple destination states representing competing events. For these models, this function returns the probability governing whichcompeting event happens next. However this function simply wrapspmatrix.fs,so for other models,pmatrix.fs orpmatrix.simfs can be used with alarge forecast timet.
Usage
pfinal_fmsm(x, newdata = NULL, fromstate, maxt = 1e+05, B = 0, cores = NULL)Arguments
x | Object returned by |
newdata | Data frame of covariate values, with one column percovariate, and one row per alternative value. |
fromstate | State from which to calculate the transition probabilitystate. This should refer to the name of a row of the transition matrix |
maxt | Large time to use for forecasting final state probabilities.The transition probability from zero to this time is used. Note |
B | Number of simulations to use to calculate 95% confidence intervalsbased on the asymptotic normal distribution of the basic parameterestimates. If |
cores | Number of processor cores to use. If |
Value
A data frame with one row per covariate value and destination state,giving the state in columnstate, and probability in columnval. Additional columnslower andupper for theconfidence limits are returned ifB=0.
Plots of fitted flexible survival models
Description
Plot fitted survival, cumulative hazard or hazard from a parametric modelagainst nonparametric estimates to diagnose goodness-of-fit. Alternativelyplot a user-defined function of the model parameters against time.
Usage
## S3 method for class 'flexsurvreg'plot( x, newdata = NULL, X = NULL, type = "survival", fn = NULL, t = NULL, start = 0, est = TRUE, ci = NULL, B = 1000, cl = 0.95, col.obs = "black", lty.obs = 1, lwd.obs = 1, col = "red", lty = 1, lwd = 2, col.ci = NULL, lty.ci = 2, lwd.ci = 1, ylim = NULL, add = FALSE, ...)Arguments
x | Output from |
newdata | Data frame containing covariate values to produce fittedvalues for. See If there are only factor covariates in the model, then Kaplan-Meier (ornonparametric hazard...) curves are plotted for all distinct groups, andby default, fitted curves are also plotted for these groups. To plotKaplan-Meier and fitted curves for only a subset of groups, use If there are any continuous covariates, then a single populationKaplan-Meier curve is drawn. By default, a single fitted curve is drawnwith the covariates set to their mean values in the data - for categoricalcovariates, the means of the 0/1 indicator variables are taken. |
X | Alternative way to supply covariate values, as a model matrix.See |
type |
Ignored if |
fn | Custom function of the parameters to summarise against time. Thefirst two arguments of the function must be |
t | Vector of times to plot fitted values for, see |
start | Left-truncation points, see |
est | Plot fitted curves ( |
ci | Plot confidence intervals for fitted curves. By default, this is |
B | Number of simulations controlling accuracy of confidenceintervals, as used in |
cl | Width of confidence intervals, by default 0.95 for 95%intervals. |
col.obs | Colour of the nonparametric curve. |
lty.obs | Line type of the nonparametric curve. |
lwd.obs | Line width of the nonparametric curve. |
col | Colour of the fitted parametric curve(s). |
lty | Line type of the fitted parametric curve(s). |
lwd | Line width of the fitted parametric curve(s). |
col.ci | Colour of the fitted confidence limits, defaulting to thesame as for the fitted curve. |
lty.ci | Line type of the fitted confidence limits. |
lwd.ci | Line width of the fitted confidence limits. |
ylim | y-axis limits: vector of two elements. |
add | If |
... | Other options to be passed to |
Note
Some standard plot arguments such as"xlim","xlab" may notwork. This function was designed as a quick check of model fit. Userswanting publication-quality graphs are advised to set up an empty plot withthe desired axes first (e.g. withplot(...,type="n",...)), then usesuitablelines functions to add lines.
If case weights were used to fit the model, these are used when producingnonparametric estimates of survival and cumulative hazard, but not forthe hazard estimates.
Author(s)
C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk
See Also
Plot standardized metrics from a fitted flexsurv model
Description
Plot standardized metrics such as the marginal survival, restricted mean survival and hazard, based on a fitted flexsurv model.
Usage
## S3 method for class 'standsurv'plot(x, contrast = FALSE, ci = FALSE, expected = FALSE, ...)Arguments
x | A standsurv object returned by |
contrast | Should contrasts of standardized metrics be plotted. Defaultsto FALSE |
ci | Should confidence intervals be plotted (if calculated in |
expected | Should the marginal expected survival / hazard also be plotted? This can only be invoked if |
... | Not currently used |
Value
A ggplot showing the standardized metric calculated bystandsurv over time. Modification of the plot ispossible by adding further ggplot objects, see Examples.
Examples
## Use bc dataset, with an age variable appended## mean age is higher in those with smaller observed survival times newbc <- bcnewbc$age <- rnorm(dim(bc)[1], mean = 65-scale(newbc$recyrs, scale=FALSE), sd = 5)## Fit a Weibull flexsurv model with group and age as covariatesweib_age <- flexsurvreg(Surv(recyrs, censrec) ~ group+age, data=newbc, dist="weibull")## Calculate standardized survival and the difference in standardized survival## for the three levels of group across a grid of survival timesstandsurv_weib_age <- standsurv(weib_age, at = list(list(group="Good"), list(group="Medium"), list(group="Poor")), t=seq(0,7, length=100), contrast = "difference", ci=TRUE, boot = TRUE, B=10, seed=123)plot(standsurv_weib_age)plot(standsurv_weib_age) + ggplot2::theme_bw() + ggplot2::ylab("Survival") + ggplot2::xlab("Time (years)") + ggplot2::guides(color=ggplot2::guide_legend(title="Prognosis"), fill=ggplot2::guide_legend(title="Prognosis"))plot(standsurv_weib_age, contrast=TRUE, ci=TRUE) + ggplot2::ylab("Difference in survival")Plot nonparametric estimates of survival from right-truncated data.
Description
plot.survrtrunc creates a new plot, whilelines.survrtrunc adds lines to an exising plot.
Usage
## S3 method for class 'survrtrunc'plot(x, ...)## S3 method for class 'survrtrunc'lines(x, ...)Arguments
x | Object of class |
... | Other arguments to be passed to |
Transition probability matrix from a fully-parametric, time-inhomogeneousMarkov multi-state model
Description
The transition probability matrix for time-inhomogeneous Markov multi-statemodels fitted to time-to-event data withflexsurvreg. Thishasr,s entry giving the probability that an individual is in states at timet, given they are in stater at time0.
Usage
pmatrix.fs( x, trans = NULL, t = 1, newdata = NULL, condstates = NULL, ci = FALSE, tvar = "trans", sing.inf = 1e+10, B = 1000, cl = 0.95, tidy = FALSE, ...)Arguments
x | A model fitted with
|
trans | Matrix indicating allowed transitions. See |
t | Time or vector of times to predict state occupancy probabilitiesfor. |
newdata | A data frame specifying the values of covariates in thefitted model, other than the transition number. See |
condstates | xInstead of the unconditional probability of being in state This is used, for example, in competing risks situations, e.g. if the competing states are death or recovery from a disease, and we want to compute the probability a patient has died, given they have died or recovered. If these are absorbing states, then as |
ci | Return a confidence interval calculated by simulating from theasymptotic normal distribution of the maximum likelihood estimates. Turnedoff by default, since this is computationally intensive. If turned on,users should increase |
tvar | Variable in the data representing the transition type. Notrequired if |
sing.inf | If there is a singularity in the observed hazard, forexample a Weibull distribution with |
B | Number of simulations from the normal asymptotic distribution usedto calculate variances. Decrease for greater speed at the expense ofaccuracy. |
cl | Width of symmetric confidence intervals, relative to 1. |
tidy | If TRUE then return the results as a tidy data frame |
... | Arguments passed to |
Details
This is computed by solving the Kolmogorov forward differential equationnumerically, using the methods in thedeSolve package. Theequation is
\frac{dP(t)}{dt} = P(t) Q(t)
whereP(t) is the transition probability matrix for timet, andQ(t) is the transition hazard or intensity as a function oft.The initial condition isP(0) = I.
Note that the packagemsm has a similar methodpmatrix.msm.pmatrix.fs should give the same results aspmatrix.msm whenboth of these conditions hold:
the time-to-event distribution is exponential for alltransitions, thus the
flexsurvregmodel was fitted withdist="exp"and the model is time-homogeneous.themsmmodel was fitted with
exacttimes=TRUE, thus all the event times areknown, and there are no time-dependent covariates.
msm only allows exponential or piecewise-exponential time-to-eventdistributions, whileflexsurvreg allows more flexible models.msm however was designed in particular for panel data, where theprocess is observed only at arbitrary times, thus the times of transitionare unknown, which makes flexible models difficult.
This function is only valid for Markov ("clock-forward") multi-statemodels, though no warning or error is currently given if the model is notMarkov. Seepmatrix.simfs for the equivalent for semi-Markov("clock-reset") models.
Value
The transition probability matrix, ift is of length 1. Ift is longer, return a list of matrices, or a data frame iftidy is TRUE.
Ifci=TRUE, each element has attributes"lower" and"upper" giving matrices of the corresponding confidence limits.These are formatted for printing but may be extracted usingattr().
Author(s)
Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.
See Also
pmatrix.simfs,totlos.fs,msfit.flexsurvreg.
Examples
# BOS example in vignette, and in msfit.flexsurvregbexp <- flexsurvreg(Surv(Tstart, Tstop, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))# more likely to be dead (state 3) as time moves on, or if start with# BOS (state 2)pmatrix.fs(bexp, t=c(5,10), trans=tmat)Transition probability matrix from a fully-parametric, semi-Markovmulti-state model
Description
The transition probability matrix for semi-Markov multi-state models fittedto time-to-event data withflexsurvreg. This hasr,sentry giving the probability that an individual is in states at timet, given they are in stater at time0.
Usage
pmatrix.simfs( x, trans, t = 1, newdata = NULL, ci = FALSE, tvar = "trans", tcovs = NULL, M = 1e+05, B = 1000, cl = 0.95, cores = NULL, tidy = FALSE)Arguments
x | A model fitted with
|
trans | Matrix indicating allowed transitions. See |
t | Time to predict state occupancy probabilities for. This can be a single number or a vector of different numbers. |
newdata | A data frame specifying the values of covariates in thefitted model, other than the transition number. See |
ci | Return a confidence interval calculated by simulating from theasymptotic normal distribution of the maximum likelihood estimates. Thisis turned off by default, since two levels of simulation are required. Ifturned on, users should adjust |
tvar | Variable in the data representing the transition type. Notrequired if |
tcovs | Predictable time-dependent covariates such as age, see |
M | Number of individuals to simulate in order to approximate thetransition probabilities. Users should adjust this to obtain the requiredprecision. |
B | Number of simulations from the normal asymptotic distribution usedto calculate confidence limits. Decrease for greater speed at the expense ofaccuracy. |
cl | Width of symmetric confidence intervals, relative to 1. |
cores | Number of processor cores used when calculating confidence limits by repeated simulation. The default uses single-core processing. |
tidy | If |
Details
This is computed by simulating a large number of individualsM usingthe maximum likelihood estimates of the fitted model and the functionsim.fmsm. Therefore this requires a random sampling functionfor the parametric survival model to be available: see the "Details"section ofsim.fmsm. This will be available for all built-indistributions, though users may need to write this for custom models.
Note the random sampling method forflexsurvspline models iscurrently very inefficient, so that looping over theM individualswill be very slow.
pmatrix.fs is a more efficient method based on solving theKolmogorov forward equation numerically, which requires the multi-statemodel to be Markov. No error or warning is given if runningpmatrix.simfs with a Markov model, but this is still invalid.
Value
The transition probability matrix. Ifci=TRUE, there areattributes"lower" and"upper" giving matrices of thecorresponding confidence limits. These are formatted for printing but maybe extracted usingattr().
Author(s)
Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.
See Also
pmatrix.fs,sim.fmsm,totlos.simfs,msfit.flexsurvreg.
Examples
# BOS example in vignette, and in msfit.flexsurvregbexp <- flexsurvreg(Surv(years, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))# more likely to be dead (state 3) as time moves on, or if start with# BOS (state 2)pmatrix.simfs(bexp, t=5, trans=tmat)pmatrix.simfs(bexp, t=10, trans=tmat)# these results should converge to those in help(pmatrix.fs), as M# increases here and ODE solving precision increases there, since with# an exponential distribution, the semi-Markov model is the same as the# Markov model.Probability of each pathway taken through a mixture multi-state model
Description
Probability of each pathway taken through a mixture multi-state model
Usage
ppath_fmixmsm(x, newdata = NULL, final = FALSE, B = NULL)Arguments
x | Object returned by |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
final | If |
B | Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. If |
Value
Data frame of pathway probabilities by covariate value and pathway.
Predictions from flexible survival models
Description
Predict outcomes from flexible survival models at the covariate valuesspecified innewdata.
Usage
## S3 method for class 'flexsurvreg'predict( object, newdata, type = "response", times, start = 0, conf.int = FALSE, conf.level = 0.95, se.fit = FALSE, p = c(0.1, 0.9), ...)Arguments
object | Output from |
newdata | Data frame containing covariate values at which to producefitted values. There must be a column for every covariate in the modelformula used to fit If |
type | Character vector for the type of predictions desired.
|
times | Vector of time horizons at which to compute fitted values.Only applies when If not specified, predictions for For |
start | Optional left-truncation time or times. The returnedsurvival, hazard, or cumulative hazard will be conditioned on survival upto this time. |
conf.int | Logical. Should confidence intervals be returned?Default is |
conf.level | Width of symmetric confidence intervals, relative to 1. |
se.fit | Logical. Should standard errors of fitted values be returned?Default is |
p | Vector of quantiles at which to return fitted values when |
... | Not currently used. |
Value
Atibble with same number of rows asnewdata and in the same order. If multiple predictions arerequested, atibble containing a single list-columnof data frames.
For the list-column of data frames - the dimensions of each data framewill be identical. Rows are added for each value oftimes orp requested.
This function is a wrapper aroundsummary.flexsurvreg,designed to helpflexsurv to integrate with the "tidymodels"ecosystem, in particular thecensored package.summary.flexsurvreg returns the same results but in a moreconventional format.
Author(s)
Matthew T. Warkentin (https://github.com/mattwarkentin)
See Also
summary.flexsurvreg,residuals.flexsurvreg
Examples
fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")## Simplest prediction: mean or median, for covariates defined by original datasetpredict(fitg)predict(fitg, type = "quantile", p = 0.5)## Simple prediction for user-defined covariate valuespredict(fitg, newdata = data.frame(age = c(40, 50, 60)))predict(fitg, type = "quantile", p = 0.5, newdata = data.frame(age = c(40,50,60)))## Predict multiple quantiles and unnestrequire(tidyr)pr <- predict(fitg, type = "survival", times = c(600, 800))tidyr::unnest(pr, .pred)Probabilities of competing events from a flexsurvmix model
Description
Probabilities of competing events from a flexsurvmix model
Usage
probs_flexsurvmix(x, newdata = NULL, B = NULL)Arguments
x | Fitted model object returned from |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
B | Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. If |
Value
A data frame containing the probability that each of the competingevents will occur next, by event and by any covariate values specified innewdata.
Quantiles of the distribution of the time until reaching a final state in amixture multi-state model
Description
Calculate the quantiles of the time from the start of the process to eachpossible final (or "absorbing") state in a mixture multi-state model.Models with cycles are not supported.
Usage
qfinal_fmixmsm( x, newdata = NULL, final = FALSE, B = NULL, n = 10000, probs = c(0.025, 0.5, 0.975))Arguments
x | Object returned by |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
final | If |
B | Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. If |
n | Number of individual-level simulations to use to characterise thetime-to-event distributions |
probs | Quantiles to calculate, by default, |
Value
Data frame of quantiles of the time to final state by pathway andcovariate value, or by final state and covariate value.
Generic function to find quantiles of a distribution
Description
Generic function to find the quantiles of a distribution, given theequivalent probability distribution function.
Usage
qgeneric(pdist, p, matargs = NULL, scalarargs = NULL, ...)Arguments
pdist | Probability distribution function, for example, |
p | Vector of probabilities to find the quantiles for. |
matargs | Character vector giving the elements of |
scalarargs | Character vector naming scalar arguments of the distribution function that cannot be vectorised. This is used for the arguments |
... | The remaining arguments define parameters of the distribution This may also contain the standard arguments If the distribution is bounded above or below, then this should containarguments |
Details
This function is used by default for custom distributions for which aquantile function is not provided.
It works by finding the root of the equationh(q) = pdist(q) - p = 0.Starting from the interval(-1, 1), the interval width is expanded by50% untilh() is of opposite sign at either end. The root is thenfound usinguniroot.
This assumes a suitably smooth, continuous distribution.
Value
Vector of quantiles of the distribution atp.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
Examples
qnorm(c(0.025, 0.975), 0, 1)qgeneric(pnorm, c(0.025, 0.975), mean=0, sd=1) # must name the argumentsQuantiles of time-to-event distributions in a flexsurvmix model
Description
This returns the quantiles of each event-specific parametric time-to-eventdistribution in the mixture model, which describes the time to the eventconditionally on that event being the one that happens.
Usage
quantile_flexsurvmix(x, newdata = NULL, B = NULL, probs = c(0.025, 0.5, 0.975))Arguments
x | Fitted model object returned from |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
B | Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. If |
probs | Vector of alternative quantiles, by default |
Objects exported from other packages
Description
These objects are imported from other packages. Follow the linksbelow to see their documentation.
Calculate residuals for flexible survival models
Description
Calculates residuals forflexsurvreg orflexsurvspline model fits.
Usage
## S3 method for class 'flexsurvreg'residuals(object, type = "response", ...)Arguments
object | Output from |
type | Character string for the type of residual desired. Currently only |
... | Not currently used. |
Details
Residuals oftype = "response" are calculated as the naive difference between the observed survival and the covariate-specific predicted mean survival frompredict.flexsurvreg, ignoring whether the event time is observed or censored.
type="coxsnell" returns the Cox-Snell residual, defined as the estimated cumulative hazard at each data point. To check the fit of theA more fully featured utility for this is provided in the functioncoxsnell_flexsurvreg.
Value
Numeric vector with the same length asnobs(object).
See Also
Examples
fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")residuals(fitg, type="response")Restricted mean times to events from a flexsurvmix model
Description
This returns the restricted mean of each event-specific parametric time-to-eventdistribution in the mixture model, which is the mean time to eventconditionally on that event being the one that happens, and conditionallyon the event time being less than some time horizontot.
Usage
rmst_flexsurvmix(x, newdata = NULL, tot = Inf, B = NULL)Arguments
x | Fitted model object returned from |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
tot | Time horizon to compute the restricted mean until. |
B | Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. If |
Value
Restricted mean times to next event conditionally on each alternative event,given the specified covariate values.
Generic function to find restricted mean survival time for some distribution
Description
Generic function to find the restricted mean of a distribution, given theequivalent probability distribution function, using numeric integration.
Usage
rmst_generic(pdist, t, start = 0, matargs = NULL, scalarargs = NULL, ...)Arguments
pdist | Probability distribution function, for example, |
t | Vector of times at which rmst is evaluated |
start | Optional left-truncation time or times. The returnedrestricted mean survival will be conditioned on survival up tothis time. |
matargs | Character vector giving the elements of |
scalarargs | Character vector naming scalar arguments of the distribution function that cannot be vectorised. This is used, for example, for the arguments |
... | The remaining arguments define parameters of the distribution |
Details
This function is used by default for custom distributions for which anrmst function is not provided.
This assumes a suitably smooth, continuous distribution.
Value
Vector of restricted mean survival times of the distribution atp.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
Examples
rmst_lnorm(500, start=250, meanlog=7.4225, sdlog = 1.1138)rmst_generic(plnorm, 500, start=250, meanlog=7.4225, sdlog = 1.1138)# must name the argumentsSimulate paths through a fully parametric semi-Markov multi-state model
Description
Simulate changes of state and transition times from a semi-Markovmulti-state model fitted usingflexsurvreg.
Usage
sim.fmsm( x, trans = NULL, t, newdata = NULL, start = 1, M = 10, tvar = "trans", tcovs = NULL, tidy = FALSE)Arguments
x | A model fitted with Alternatively |
trans | Matrix indicating allowed transitions. See |
t | Time, or vector of times for each of the |
newdata | A data frame specifying the values of covariates in thefitted model, other than the transition number. See |
start | Starting state, or vector of starting states for each of the |
M | Number of individual trajectories to simulate. |
tvar | Variable in the data representing the transition type. Notrequired if |
tcovs | Names of "predictable" time-dependent covariates in |
tidy | If |
Details
sim.fmsm relies on the presence of a function to sample randomnumbers from the parametric survival distribution used in the fitted modelx, for examplerweibull for Weibull models. Ifx was fitted using a custom distribution, calleddist say,then there must be a function called (something like)rdist eitherin the working environment, or supplied through thedfns argument toflexsurvreg. This must be in the same format as standard Rfunctions such asrweibull, with first argumentn, andremaining arguments giving the parameters of the distribution. It must bevectorised with respect to the parameter arguments.
This function is only valid for semi-Markov ("clock-reset") models, thoughno warning or error is currently given if the model is not of this type. Anequivalent for time-inhomogeneous Markov ("clock-forward") models hascurrently not been implemented.
Value
Iftidy=TRUE, a data frame with one row for each simulated transition, giving the individual IDid, start statestart, end stateend, transition labeltrans, time of the transition since the start of the process (time), and time since the previous transition (delay).
Iftidy=FALSE, a list of two matrices namedst andt. The rows ofeach matrix represent simulated individuals. The columns oftcontain the times when the individual changes state, to the correspondingstates inst.
The first columns will always contain the starting states and the startingtimes. The last column oft represents either the time when theindividual moves to an absorbing state, or right-censoring in a transientstate at the time given in thet argument tosim.fmsm.
Author(s)
Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.
See Also
Examples
bexp <- flexsurvreg(Surv(years, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))sim.fmsm(bexp, M=10, t=5, trans=tmat)Simulate and summarise final outcomes from a flexible parametric multi-statemodel
Description
Estimates the probability of each final outcome ("absorbing" state), and themean and quantiles of the time to that outcome for people who experience it,by simulating a large sample of individuals from the model. This can be usedfor both Markov and semi-Markov models.
Usage
simfinal_fmsm( x, newdata = NULL, probs = c(0.025, 0.5, 0.975), t = 1000, M = 1e+05, B = 0, cores = NULL)Arguments
x | Object returned by |
newdata | Data frame of covariate values, with one column percovariate, and one row per alternative value. |
probs | Quantiles to calculate, by default, |
t | Maximum time to simulate to, passed to |
M | Number of individuals to simulate. |
B | Number of simulations to use to calculate 95% confidence intervalsbased on the asymptotic normal distribution of the basic parameterestimates. If |
cores | Number of processor cores to use. If |
Details
For a competing risks model, i.e. a model defined by just one starting stateand multiple destination states representing competing events, this returnsthe probability governing the next event that happens, and the distribution of the time to each event conditionally on that event happening.
Value
A tidy data frame with rows for each combination of covariate valuesand quantity of interest. The quantity of interest is identified in thecolumnquantity, and the value of the quantity is inval,with additional columnslower andupper giving 95%confidence intervals for the quantity, ifB>0.
Reformat simulated multi-state data with one row per simulated transition
Description
Reformat simulated multi-state data with one row per simulated transition
Usage
simfs_bytrans(simfs)Arguments
simfs | Output from |
Value
Data frame with four columns giving transition start state, transition end state, transition name and the time taken by the transition.
Simulate times to competing events from a mixture multi-state model
Description
Simulate times to competing events from a mixture multi-state model
Usage
simt_flexsurvmix(x, newdata = NULL, n)Arguments
x | Fitted model object returned from |
newdata | Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model. |
n | Number of simulations |
Value
Data frame withn*m rows and a column for each competingevent, wherem is the number of alternative covariate values, thatis the number of rows ofnewdata. The simulated time representsthe time to that event conditionally on that event being the one thatoccurs. This function doesn't simulate which event occurs.
Simulate censored time-to-event data from a fitted flexsurvreg model
Description
Simulate censored time-to-event data from a fitted flexsurvreg model
Usage
## S3 method for class 'flexsurvreg'simulate( object, nsim = 1, seed = NULL, newdata = NULL, start = NULL, censtime = NULL, tidy = FALSE, ...)Arguments
object | Object returned by |
nsim | Number of simulations per row in |
seed | Random number seed. This is returned with the result of thisfunction, as described in |
newdata | Data frame defining alternative sets of covariate values to simulate with.If omitted, this defaults to the data originally used to fit the model. |
start | Optional left-truncation time or times. The returnedsurvival, hazard or cumulative hazard will be conditioned on survival up tothis time. Predicted times returned with A vector of the same length as |
censtime | A right-censoring time, or vector of times matching the rowsof |
tidy | If If In either case, the simulated time and indicator for whether the time isan event time (rather than a time of right-censoring) are returned indifferent columns. |
... | Other arguments (not currently used). |
Value
A data frame, with format determined by whethertidy was specified.
Examples
fit <- flexsurvreg(formula = Surv(futime, fustat) ~ rx, data = ovarian, dist="weibull")fit2 <- flexsurvspline(formula = Surv(futime, fustat) ~ rx, data = ovarian, k=3)nd = data.frame(rx=1:2)simulate(fit, seed=1002, newdata=nd)simulate(fit, seed=1002, newdata=nd, start=500)simulate(fit2, nsim=3, seed=1002, newdata=nd)simulate(fit2, nsim=3, seed=1002, newdata=nd, start=c(500,1000))Marginal survival and hazards of fitted flexsurvreg models
Description
Returns a tidy data.frame of marginal survival probabilities, or hazards, restricted mean survival, or quantiles of the marginal survival functionat user-defined time points and covariate patterns.Standardization is performed over any undefined covariates in the model. The user provides the data to standardize over. Contrasts can be calculated resulting in estimates of the average treatment effect or the average treatment effect in the treated if a treated subset of the data are supplied.
Usage
standsurv( object, newdata = NULL, at = list(list()), atreference = 1, type = "survival", t = NULL, ci = FALSE, se = FALSE, boot = FALSE, B = NULL, cl = 0.95, trans = "log", contrast = NULL, trans.contrast = NULL, seed = NULL, rmap, ratetable, scale.ratetable = 365.25, n.gauss.quad = 100, quantiles = 0.5, interval = c(1e-08, 500))Arguments
object | Output from |
newdata | Data frame containing covariate values to produce marginalvalues for. If not specified then the fitted model data.frame is used.There must be a column for every covariate in the model formulafor which the user wishes to standardize over. These are in the same formatas the original data, with factors as a single variable, not 0/1 contrasts.Any covariates that are to be fixed should be specified in |
at | A list of scenarios in which specific covariates are fixed to certain values. Each element of |
atreference | The reference scenario for making contrasts. Default is 1(i.e. the first element of |
type |
|
t | Times to calculate marginal values at. |
ci | Should confidence intervals be calculated? Defaults to FALSE |
se | Should standard errors be calculated? Defaults to FALSE |
boot | Should bootstrapping be used to calculate standard error and confidence intervals? Defaults to FALSE, in which case the delta method is used |
B | Number of bootstrap simulations from the normal asymptotic distribution of the estimates used to calculate confidence intervals or standard errors. Decrease for greater speed at the expense of accuracy. Only specify if |
cl | Width of symmetric confidence intervals, relative to 1. |
trans | Transformation to apply when calculating standard errors via thedelta method to obtain confidence intervals. The default transformation is "log". Other possible names are "none", "loglog", "logit". |
contrast | Contrasts between standardized measures defined by |
trans.contrast | Transformation to apply when calculating standard errorsfor contrasts via the delta method to obtain confidence intervals. The defaulttransformation is "none" for differences in survival, hazard, quantiles, or RMST, and "log" for ratios of survival, hazard, quantiles or RMST. |
seed | The random seed to use (for bootstrapping confidence intervals) |
rmap | An list that maps data set names to expected ratetable names. This must be specified if all-cause survival and hazards are required afterfitting a relative survival model. This can also be specified if expectedrates are required for plotting purposes. See the details section below. |
ratetable | A table of expected event rates (see |
scale.ratetable | Transformation from the time scale of the fitted flexsurv model to the time scale in |
n.gauss.quad | Number of Gaussian quadrature points used for integrating the all-cause survival function when calculating RMST in a relative survival framework (default = 100) |
quantiles | If |
interval | Interval of survival times for quantile root finding. Default is c(1e-08, 500). |
Details
The syntax ofstandsurv follows closely that of Stata'sstandsurv command written by Paul Lambert and Michael Crowther. The function calculates standardized (marginal) measures including standardizedsurvival functions, standardized restricted mean survival times, quantilesand the hazard of standardized survival. The standardized survival is defined as
S_s(t|X=x) = E(S(t|X=x,Z)) = \frac{1}{N} \sum_{i=1}^N S(t|X=x,Z=z_i)
The hazard of the standardized survival is a weighted average of individual hazard functions at time t, weighted by the survivalfunction at this time:
h_s(t|X=x) = \frac{\sum_{i=1}^N S(t|X=x,Z=z_i)h(t|X=x,Z=z_i)}{\sum_{i=1}^N S(t|X=x,Z=z_i)}
Marginal expected survival and hazards can be calculated by providing a population-based lifetable of class ratetable inratetable and a mapping between stratification factors in the lifetable and the user datasetusingrmap. If these stratification factors are not in the fittedsurvival model then the user must specify them innewdata along withthe covariates of the model. The marginal expected survival is calculated using the "Ederer" method that assumes no censoring as this is most relevant approach for forecasting (seesurvexp). A worked example is given below.
Marginal all-cause survival and hazards can be calculated after fitting arelative survival model, which utilise the expected survival from a populationratetable. See Rutherford et al. (Chapter 6) for further details.
Value
Atibble containing one row for each time-point. The column naming convention isat{i} for the ith scenariowith corresponding confidence intervals (if specified) namedat{i}_lciandat{i}_uci. Contrasts are namedcontrast{k}_{j} for the comparison of the kth versus the jthat scenario.
In addition tidy long-format data.frames are returned in the attributesstandsurv_at andstandsurv_contrast. These can be passed toggplot for plotting purposes (seeplot.standsurv).
Author(s)
Michael Sweeting <mikesweeting79@gmail.com>
References
Paul Lambert, 2021. "STANDSURV: Stata module to compute standardized (marginal) survival and related functions," Statistical Software Components S458991, Boston College Department of Economics. https://ideas.repec.org/c/boc/bocode/s458991.html
Rutherford, MJ, Lambert PC, Sweeting MJ, Pennington B, Crowther MJ, Abrams KR,Latimer NR. 2020. "NICE DSU Technical Support Document 21: Flexible Methods for Survival Analysis" https://nicedsu.sites.sheffield.ac.uk/tsds/flexible-methods-for-survival-analysis-tsd
Examples
## mean age is higher in those with smaller observed survival times newbc <- bcset.seed(1)newbc$age <- rnorm(dim(bc)[1], mean = 65-scale(newbc$recyrs, scale=FALSE), sd = 5)## Fit a Weibull flexsurv model with group and age as covariatesweib_age <- flexsurvreg(Surv(recyrs, censrec) ~ group+age, data=newbc, dist="weibull") ## Calculate standardized survival and the difference in standardized survival## for the three levels of group across a grid of survival times standsurv_weib_age <- standsurv(weib_age, at = list(list(group="Good"), list(group="Medium"), list(group="Poor")), t=seq(0,7, length.out=100), contrast = "difference", ci=FALSE)standsurv_weib_age## Calculate hazard of standardized survival and the marginal hazard ratio## for the three levels of group across a grid of survival times## 10 bootstraps for confidence intervals (this should be larger)## Not run: haz_standsurv_weib_age <- standsurv(weib_age, at = list(list(group="Good"), list(group="Medium"), list(group="Poor")), t=seq(0,7, length.out=100), type="hazard", contrast = "ratio", boot = TRUE, B=10, ci=TRUE)haz_standsurv_weib_age plot(haz_standsurv_weib_age, ci=TRUE)## Hazard ratio plot shows a decreasing marginal HR ## Whereas the conditional HR is constant (model is a PH model)plot(haz_standsurv_weib_age, contrast=TRUE, ci=TRUE)## Calculate standardized survival from a Weibull model together with expected## survival matching to US lifetables# age at diagnosis in days. This is required to match to US ratetable, whose# timescale is measured in daysnewbc$agedays <- floor(newbc$age * 365.25) ## Create some random diagnosis dates centred on 01/01/2010 with SD=1 year## These will be used to match to expected rates in the lifetablenewbc$diag <- as.Date(floor(rnorm(dim(newbc)[1], mean = as.Date("01/01/2010", "%d/%m/%Y"), sd=365)), origin="1970-01-01")## Create sex (assume all are female)newbc$sex <- factor("female")standsurv_weib_expected <- standsurv(weib_age, at = list(list(group="Good"), list(group="Medium"), list(group="Poor")), t=seq(0,7, length.out=100), rmap=list(sex = sex, year = diag, age = agedays), ratetable = survival::survexp.us, scale.ratetable = 365.25, newdata = newbc)## Plot marginal survival with expected survival superimposed plot(standsurv_weib_expected, expected=TRUE)## End(Not run)Summaries of fitted flexible survival models
Description
Return fitted survival, cumulative hazard or hazard at a series of timesfrom a fittedflexsurvreg orflexsurvsplinemodel.
Usage
## S3 method for class 'flexsurvreg'summary( object, newdata = NULL, X = NULL, type = "survival", fn = NULL, t = NULL, quantiles = 0.5, start = 0, cross = TRUE, ci = TRUE, se = FALSE, B = 1000, cl = 0.95, tidy = FALSE, na.action = na.pass, ...)Arguments
object | Output from |
newdata | Data frame containing covariate values to produce fittedvalues for. Or a list that can be coerced to such a data frame. Theremust be a column for every covariate in the model formula, and one row forevery combination of covariates the fitted values are wanted for. Theseare in the same format as the original data, with factors as a singlevariable, not 0/1 contrasts. If this is omitted, if there are any continuous covariates, then a singlesummary is provided with all covariates set to their mean values in thedata - for categorical covariates, the means of the 0/1 indicator variablesare taken. If there are only factor covariates in the model, then alldistinct groups are used by default. |
X | Alternative way of defining covariate values to produce fittedvalues for. Since version 0.4, Columns of For “factor” (categorical) covariates, the values of the contrastsrepresenting factor levels (as returned by the |
type |
Ignored if |
fn | Custom function of the parameters to summarise against time.This has optional first two arguments |
t | Times to calculate fitted values for. By default, these are thesorted unique observation (including censoring) times in the data - forleft-truncated datasets these are the "stop" times. |
quantiles | If |
start | Optional left-truncation time or times. The returnedsurvival, hazard or cumulative hazard will be conditioned on survival up tothis time. Predicted times returned with A vector of the same length as |
cross | If If |
ci | Set to |
se | Set to |
B | Number of simulations from the normal asymptotic distribution ofthe estimates used to calculate confidence intervals or standard errors.Decrease for greaterspeed at the expense of accuracy, or set |
cl | Width of symmetric confidence intervals, relative to 1. |
tidy | If |
na.action | Function determining what should be done with missing values in |
... | Further arguments passed to or from other methods. Currentlyunused. |
Details
Time-dependent covariates are not currently supported. The covariatevalues are assumed to be constant through time for each fitted curve.
Value
Iftidy=FALSE, a list with one component for each uniquecovariate value (if there are only categorical covariates) or one component(if there are no covariates or any continuous covariates). Each of thesecomponents is a matrix with one row for each time int, giving theestimated survival (or cumulative hazard, or hazard) and 95% confidencelimits. These list components are named with the covariate names andvalues which define them.
Iftidy=TRUE, a data frame is returned instead. This is formed bystacking the above list components, with additional columns to identify thecovariate values that each block corresponds to.
If there are multiple summaries, an additional list component namedX contains a matrix with the exact values of contrasts (dummycovariates) defining each summary.
Theplot.flexsurvreg function can be used to quickly plotthese model-based summaries against empirical summaries such asKaplan-Meier curves, to diagnose model fit.
Confidence intervals are obtained by sampling randomly from the asymptoticnormal distribution of the maximum likelihood estimates and then taking quantiles(see, e.g. Mandel (2013)).
Author(s)
C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk
References
Mandel, M. (2013). "Simulation based confidence intervals forfunctions with complicated derivatives." The American Statistician (inpress).
See Also
Summarise quantities of interest from fitted flexsurvrtrunc models
Description
This function extracts quantities of interest from the untruncated version of a model with individual-specific right truncation points fitted byflexsurvrtrunc. Note that covariates arecurrently not supported byflexsurvrtrunc.
Usage
## S3 method for class 'flexsurvrtrunc'summary( object, type = "survival", fn = NULL, t = NULL, quantiles = 0.5, ci = TRUE, se = FALSE, B = 1000, cl = 0.95, ...)Arguments
object | Output from |
type |
Ignored if |
fn | Custom function of the parameters to summarise against time.This has optional first argument |
t | Times to calculate fitted values for. By default, these are thesorted unique observation (including censoring) times in the data - forleft-truncated datasets these are the "stop" times. |
quantiles | If |
ci | Set to |
se | Set to |
B | Number of simulations from the normal asymptotic distribution ofthe estimates used to calculate confidence intervals or standard errors.Decrease for greaterspeed at the expense of accuracy, or set |
cl | Width of symmetric confidence intervals, relative to 1. |
... | Further arguments passed to or from other methods. Currentlyunused. |
Nonparametric estimator of survival from right-truncated, uncensored data
Description
Estimates the survivor function from right-truncated, uncensored data byreversing time, interpreting the data as left-truncated, applying theKaplan-Meier / Lynden-Bell estimator and transforming back.
Usage
survrtrunc(t, rtrunc, tmax, data = NULL, eps = 0.001, conf.int = 0.95)Arguments
t | Vector of observed times from an initial event to a final event. |
rtrunc | Individual-specific right truncation points, so that eachindividual's survival time |
tmax | Maximum possible time to event that could have been observed. |
data | Data frame to find |
eps | Small number that is added to |
conf.int | Confidence level, defaulting to 0.95. |
Details
Note that this does not estimate the untruncated survivor function - insteadit estimates the survivor function truncated above at a time defined by themaximum possible time that might have been observed in the data.
DefineX as the time of the initial event,Y as the time of thefinal event, then we wish to determine the distribution ofT = Y- X.
Observations are only recorded ifY \leq t_{max}. Then thedistribution ofT in the resulting sample is right-truncated byrtrunc = t_{max} - X.
Equivalently, the distribution oft_{max} - T is left-truncated, sinceit is only observed ift_{max} - T \geq X. Then the standardKaplan-Meier type estimator as implemented insurvfit is used (as described by Lynden-Bell, 1971)and the results transformed back.
This situation might happen in a disease epidemic, whereX is the dateof disease onset for an individual,Y is the date of death, and wewish to estimate the distribution of the timeT from onset to death,given we have only observed people who have died by the datet_{max}.
If the estimated survival is unstable at the highest times, then considerreplacingtmax by a slightly lower value, then if necessary, removingindividuals witht > tmax, so that the estimand is changed to thesurvivor function truncated over a slightly narrower interval.
Value
A list with components:
time Time points where the estimated survival changes.
surv Estimated survival attime, truncated above attmax.
se.surv Standard error of survival.
std.err Standard error of -log(survival). Named this way for consistency withsurvfit.
lower Lower confidence limits for survival.
upper Upper confidence limits for survival.
References
D. Lynden-Bell (1971) A method of allowing for known observationalselection in small samples applied to 3CR quasars. Monthly Notices of theRoyal Astronomical Society, 155:95–118.
Seaman, S., Presanis, A. and Jackson, C. (2020) Review of methods forestimating distribution of time to event from right-truncated data.
Examples
## simulate some event time dataset.seed(1)X <- rweibull(100, 2, 10)T <- rweibull(100, 2, 10)## truncate abovetmax <- 20obs <- X + T < tmaxrtrunc <- tmax - Xdat <- data.frame(X, T, rtrunc)[obs,]sf <- survrtrunc(T, rtrunc, data=dat, tmax=tmax)plot(sf, conf.int=TRUE)## Kaplan-Meier estimate ignoring truncation is biasedsfnaive <- survfit(Surv(T) ~ 1, data=dat)lines(sfnaive, conf.int=TRUE, lty=2, col="red")## truncate above the maximum observed timetmax <- max(X + T) + 10obs <- X + T < tmaxrtrunc <- tmax - Xdat <- data.frame(X, T, rtrunc)[obs,]sf <- survrtrunc(T, rtrunc, data=dat, tmax=tmax)plot(sf, conf.int=TRUE)## estimates identical to the standard Kaplan-Meiersfnaive <- survfit(Surv(T) ~ 1, data=dat)lines(sfnaive, conf.int=TRUE, lty=2, col="red")Tidy a flexsurv model object
Description
Tidy summarizes information about the components of the model into a tidy data frame.
Usage
## S3 method for class 'flexsurvreg'tidy( x, conf.int = FALSE, conf.level = 0.95, pars = "all", transform = "none", ...)Arguments
x | Output from |
conf.int | Logical. Should confidence intervals be returned? Default is |
conf.level | The confidence level to use for the confidence interval if |
pars | One of |
transform | Character vector of transformations to apply to requested Users can specify one or both types of transformations:
See |
... | Not currently used. |
Details
flexsurvreg models estimate two types of coefficients, baseline distribution parameters, and covariate effects which act on the baseline distribution. By design,flexsurvreg returns distribution parameters on the same scale as is found in the relevantd/p/q/r functions. Covariate effects are returned on the log-scale, which represents either log-time ratios (accelerated failure time models) or log-hazard ratios for proportional hazard models. By default,tidy() will return baseline distribution parameters on their natural scale and covariate effects on the log-scale.
To transform the baseline distribution parameters to the real-value number line (the scale used for estimation), pass the character argument"baseline.real" totransform. To get time ratios or hazard ratios, pass"coefs.exp" totransform. These transformations may be done together by submitting both arguments as a character vector.
Value
Atibble containing the columns:term,estimate,std.error,statistic,p.value,conf.low, andconf.high, by default.
statistic andp.value are only provided for covariate effects (NA for baseline distribution parameters). These are computed as Wald-type test statistics with p-values from a standard normal distribution.
Examples
fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")tidy(fitg)tidy(fitg, pars = "coefs", transform = "coefs.exp")Tidy a standsurv object.
Description
This function is used internally bystandsurv and tidydata.frames are automatically returned by the function.
Usage
## S3 method for class 'standsurv'tidy(x, ...)Arguments
x | A standsurv object. |
... | Not currently used. |
Value
Returns additional tidy data.frames (tibbles)stored as attributes named standpred_at and standpred_contrast.
Total length of stay in particular states for a fully-parametric,time-inhomogeneous Markov multi-state model
Description
The matrix whoser,s entry is the expected amount of time spent instates for a time-inhomogeneous, continuous-time Markov multi-stateprocess that starts in stater, up to a maximum timet. This isdefined as the integral of the corresponding transition probability up tothat time.
Usage
totlos.fs( x, trans = NULL, t = 1, newdata = NULL, ci = FALSE, tvar = "trans", sing.inf = 1e+10, B = 1000, cl = 0.95, ...)Arguments
x | A model fitted with
|
trans | Matrix indicating allowed transitions. See |
t | Time or vector of times to predict up to. Must be finite. |
newdata | A data frame specifying the values of covariates in thefitted model, other than the transition number. See |
ci | Return a confidence interval calculated by simulating from theasymptotic normal distribution of the maximum likelihood estimates. Turnedoff by default, since this is computationally intensive. If turned on,users should increase |
tvar | Variable in the data representing the transition type. Notrequired if |
sing.inf | If there is a singularity in the observed hazard, forexample a Weibull distribution with |
B | Number of simulations from the normal asymptotic distribution usedto calculate variances. Decrease for greater speed at the expense ofaccuracy. |
cl | Width of symmetric confidence intervals, relative to 1. |
... | Arguments passed to |
Details
This is computed by solving a second order extension of the Kolmogorovforward differential equation numerically, using the methods in thedeSolve package. The equation is expressed as a linearsystem
\frac{dT(t)}{dt} = P(t)
\frac{dP(t)}{dt} = P(t) Q(t)
and solved forT(t) andP(t) simultaneously, whereT(t)is the matrix of total lengths of stay,P(t) is the transitionprobability matrix for timet, andQ(t) is the transitionhazard or intensity as a function oft. The initial conditions areT(0) = 0 andP(0) = I.
Note that the packagemsm has a similar methodtotlos.msm.totlos.fs should give the same results astotlos.msm whenboth of these conditions hold:
the time-to-event distribution is exponential for alltransitions, thus the
flexsurvregmodel was fitted withdist="exp", and is time-homogeneous.themsm model wasfitted with
exacttimes=TRUE, thus all the event times are known, andthere are no time-dependent covariates.
msm only allows exponential or piecewise-exponential time-to-eventdistributions, whileflexsurvreg allows more flexible models.msm however was designed in particular for panel data, where theprocess is observed only at arbitrary times, thus the times of transitionare unknown, which makes flexible models difficult.
This function is only valid for Markov ("clock-forward") multi-statemodels, though no warning or error is currently given if the model is notMarkov. Seetotlos.simfs for the equivalent for semi-Markov("clock-reset") models.
Value
The matrix of lengths of stayT(t), ift is of length1, or a list of matrices ift is longer.
Ifci=TRUE, each element has attributes"lower" and"upper" giving matrices of the corresponding confidence limits.These are formatted for printing but may be extracted usingattr().
The result also has an attributeP giving the transition probabilitymatrices, since these are unavoidably computed as a side effect. These aresuppressed for printing, but can be extracted withattr(...,"P").
Author(s)
Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.
See Also
totlos.simfs,pmatrix.fs,msfit.flexsurvreg.
Examples
# BOS example in vignette, and in msfit.flexsurvregbexp <- flexsurvreg(Surv(Tstart, Tstop, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))# predict 4 years spent without BOS, 3 years with BOS, before death# As t increases, this should convergetotlos.fs(bexp, t=10, trans=tmat)totlos.fs(bexp, t=1000, trans=tmat)totlos.fs(bexp, t=c(5,10), trans=tmat)# Answers should match results in help(totlos.simfs) up to Monte Carlo# error there / ODE solving precision here, since with an exponential# distribution, the "semi-Markov" model there is the same as the Markov# model hereExpected total length of stay in specific states, from a fully-parametric,semi-Markov multi-state model
Description
The expected total time spent in each state for semi-Markov multi-statemodels fitted to time-to-event data withflexsurvreg. Thisis defined by the integral of the transition probability matrix, thoughthis is not analytically possible and is computed by simulation.
Usage
totlos.simfs( x, trans, t = 1, start = 1, newdata = NULL, ci = FALSE, tvar = "trans", tcovs = NULL, group = NULL, M = 1e+05, B = 1000, cl = 0.95, cores = NULL)Arguments
x | A model fitted with
|
trans | Matrix indicating allowed transitions. See |
t | Maximum time to predict to. |
start | Starting state. |
newdata | A data frame specifying the values of covariates in thefitted model, other than the transition number. See |
ci | Return a confidence interval calculated by simulating from theasymptotic normal distribution of the maximum likelihood estimates. Thisis turned off by default, since two levels of simulation are required. Ifturned on, users should adjust |
tvar | Variable in the data representing the transition type. Notrequired if |
tcovs | Predictable time-dependent covariates such as age, see |
group | Optional grouping for the states. For example, if there arefour states, and |
M | Number of individuals to simulate in order to approximate thetransition probabilities. Users should adjust this to obtain the requiredprecision. |
B | Number of simulations from the normal asymptotic distribution usedto calculate confidence limits. Decrease for greater speed at the expense ofaccuracy. |
cl | Width of symmetric confidence intervals, relative to 1. |
cores | Number of processor cores used when calculating confidence limits by repeated simulation. The default uses single-core processing. |
Details
This is computed by simulating a large number of individualsM usingthe maximum likelihood estimates of the fitted model and the functionsim.fmsm. Therefore this requires a random sampling functionfor the parametric survival model to be available: see the "Details"section ofsim.fmsm. This will be available for all built-indistributions, though users may need to write this for custom models.
Note the random sampling method forflexsurvspline models iscurrently very inefficient, so that looping overM will be veryslow.
The equivalent function for time-inhomogeneous Markov models istotlos.fs. Note neither of these functions give errors orwarnings if used with the wrong type of model, but the results will beinvalid.
Value
The expected total time spent in each state (or group of statesgiven bygroup) up to timet, and corresponding confidenceintervals if requested.
Author(s)
Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.
See Also
pmatrix.simfs,sim.fmsm,msfit.flexsurvreg.
Examples
# BOS example in vignette, and in msfit.flexsurvregbexp <- flexsurvreg(Surv(years, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))# predict 4 years spent without BOS, 3 years with BOS, before death# As t increases, this should convergetotlos.simfs(bexp, t=10, trans=tmat)totlos.simfs(bexp, t=1000, trans=tmat)Convert a function with matrix arguments to a function with vectorarguments.
Description
Given a function with matrix arguments, construct an equivalent functionwhich takes vector arguments defined by the columns of the matrix. The newfunction simply usescbind on the vector arguments to make a matrix,and calls the old one.
Usage
unroll.function(mat.fn, ...)Arguments
mat.fn | A function with any number of arguments, some of which arematrices. |
... | A series of other arguments. Their names define whicharguments of
will make a new function Calling
should give the same answer as
|
Value
The new function, with vector arguments.
Usage inflexsurv
This is used byflexsurvspline to allow spline models, whichhave an arbitrary number of parameters, to be fitted usingflexsurvreg.
The “custom distributions” facility offlexsurvregexpects the user-supplied probability density and distributionfunctions to have one explicitly named argument for each scalarparameter, and given R vectorisation, each of those argumentscould be supplied as a vector of alternative parameter values.
However, spline models have a varying number of scalar parameters,determined by the number of knots in the spline.dsurvspline andpsurvspline have anargument calledgamma. This can be supplied as a matrix,with number of columnsn determined by the number of knots(plus 2), and rows referring to alternative parameter values. Thefollowing statements are used in the source offlexsurvspline:
dfn <-unroll.function(dsurvspline, gamma=0:(nk-1)) pfn <-unroll.function(psurvspline, gamma=0:(nk-1))
to convert these into functions with argumentsgamma0,gamma1,...,gamman, corresponding to the columnsofgamma, wheren = nk-1, and with other argumentsin the same format.
Author(s)
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
See Also
Examples
fn <- unroll.function(ncol, x=1:3)fn(1:3, 1:3, 1:3) # equivalent to...ncol(cbind(1:3,1:3,1:3))Variance-covariance matrix from a flexsurvreg model
Description
Variance-covariance matrix from a flexsurvreg model
Usage
## S3 method for class 'flexsurvreg'vcov(object, ...)Arguments
object | A fitted model object of class |
... | Other arguments (currently unused). |
Value
Variance-covariance matrix of the estimated parameters, onthe scale that they were estimated on (for positive parametersthis is the log scale).