Movatterモバイル変換

Type:

Package

Title:

Flexible Parametric Survival and Multi-State Models

Version:

2.3.2

Date:

2024-08-16

Description:

Flexible parametric models for time-to-event data, including the Royston-Parmar spline model, generalized gamma and generalized F distributions. Any user-defined parametric distribution can be fitted, given at least an R function defining the probability density or hazard. There are also tools for fitting and predicting from fully parametric multi-state models, based on either cause-specific hazards or mixture models.

License:

GPL-2 |GPL-3 [expanded from: GPL (≥ 2)]

Depends:

survival, R (≥ 2.15.0)

Imports:

assertthat, deSolve, generics, magrittr, mstate (≥ 0.2.10),Matrix, muhaz, mvtnorm, numDeriv, quadprog, Rcpp (≥ 0.11.5),rlang, rstpm2, purrr, statmod, tibble, tidyr, dplyr,tidyselect, ggplot2

Encoding:

UTF-8

Suggests:

splines2, flexsurvcure, survminer, lubridate, rmarkdown,colorspace, eha, knitr, msm, testthat, TH.data, broom, covr

URL:

https://github.com/chjackson/flexsurv,http://chjackson.github.io/flexsurv/

BugReports:

https://github.com/chjackson/flexsurv/issues

VignetteBuilder:

knitr

LazyData:

yes

LinkingTo:

Rcpp

RoxygenNote:

7.3.1

NeedsCompilation:

yes

Packaged:

2024-08-16 15:05:50 UTC; Chris

Author:

Christopher Jackson [aut, cre], Paul Metcalfe [ctb], Jordan Amdahl [ctb], Matthew T. Warkentin [ctb], Michael Sweeting [ctb], Kevin Kunzmann [ctb]

Maintainer:

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Repository:

CRAN

Date/Publication:

2024-08-17 05:50:02 UTC

flexsurv: Flexible parametric survival and multi-state models

Description

flexsurv: Flexible parametric models for time-to-event data, including thegeneralized gamma, the generalized F and the Royston-Parmar spline model,and extensible to user-defined distributions.

Details

flexsurvreg fits parametric models for time-to-event(survival) data. Data may be right-censored, and/or left-censored, and/orleft-truncated. Several built-in parametric distributions are available.Any user-defined parametric model can also be employed by supplying a listwith basic information about the distribution, including the density orhazard and ideally also the cumulative distribution or hazard.

Covariates can be included using a linear model on any parameter of thedistribution, log-transformed to the real line if necessary. Thistypically defines an accelerated failure time or proportional hazardsmodel, depending on the distribution and parameter.

flexsurvspline fits the flexible survival model of Roystonand Parmar (2002) in which the log cumulative hazard is modelled as anatural cubic spline function of log time. Covariates can be included onany of the spline parameters, giving either a proportional hazards model oran arbitrarily-flexible time-dependent effect. Alternative proportionalodds or probit parameterisations are available.

Output from the models can be presented as survivor, cumulative hazard andhazard functions (summary.flexsurvreg). These can be plottedagainst nonparametric estimates (plot.flexsurvreg) to assessgoodness-of-fit. Any other user-defined function of the parameters may besummarised in the same way.

Multi-state models for time-to-event data can also be fitted with the samefunctions. Predictions from those models can then be made using thefunctionspmatrix.fs,pmatrix.simfs,totlos.fs,totlos.simfs, orsim.fmsm, or alternatively bymsfit.flexsurvregfollowed bymssample orprobtrans from the packagemstate.

Distribution (“dpqr”) functions for the generalized gamma and Fdistributions are given inGenGamma,GenF(preferred parameterisations) andGenGamma.orig,GenF.orig (original parameterisations).flexsurv also includes the standard Gompertz distributionwith unrestricted shape parameter, seeGompertz.

User guide

Theflexsurv user guide vignette explains themethods in detail, and gives several worked examples. A further vignetteflexsurv-examples gives a few more complicated examples, and usersare encouraged to submit their own.

Author(s)

Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk

References

Jackson, C. (2016). flexsurv: A Platform for ParametricSurvival Modeling in R. Journal of Statistical Software, 70(8), 1-33.doi:10.18637/jss.v070.i08

Royston, P. and Parmar, M. (2002). Flexible parametricproportional-hazards and proportional-odds models for censored survivaldata, with application to prognostic modelling and estimation of treatmenteffects. Statistics in Medicine 21(1):2175-2197.

Cox, C. (2008). The generalizedF distribution: An umbrella forparametric survival analysis. Statistics in Medicine 27:4301-4312.

Cox, C., Chu, H., Schneider, M. F. and Muñoz, A. (2007). Parametricsurvival analysis and taxonomy of hazard functions for the generalizedgamma distribution. Statistics in Medicine 26:4252-4374

helper function to safely convert a Hessian matrix to covariance matrix

Description

helper function to safely convert a Hessian matrix to covariance matrix

Usage

.hess_to_cov(hessian, tol.solve = 1e-09, tol.evalues = 1e-05, ...)

Arguments

hessian

hessian matrix to convert to covariance matrix (must be evaluated at MLE)

tol.solve

tolerance used for solve()

tol.evalues

accepted tolerance for negative eigenvalues of the covariance matrix

...

arguments passed to Matrix::nearPD

Numerical evaluation of the hessian of a function using numDeriv::hessian

Description

We perform a quick check about the expected runtime and adjust theprecision accordingly.

Usage

.hessian(f, x, seconds.warning = 60, default.r = 6, min.r = 2, ...)

Arguments

f

function to compute Hessian for

x

location to evaluate Hessian at

seconds.warning

time threshold in seconds to trigger message andreduce the number of iterations for Richardson extrapolation ofnumDeriv::hessian

default.r

default number of iterations (high-precision recommendationof numDeriv)

min.r

minial number of iteration, must be at least 2,

...

further arguments passed to method.args of numDeriv::hessian

Akaike's information criterion from a flexible parametric multistate model

Description

Defined as the sum of the AICs of the transition-specific models.

Usage

## S3 method for class 'fmsm'AIC(object, ..., k = 2)

Arguments

object

Object returned byfmsm representing a multistate model.

...

Further arguments (currently unused).

k

Penalty applied to number of parameters (defaults to the standard 2).

Value

The sum of the AICs of the transition-specific models.

Second-order Akaike information criterion

Description

Generic function for the second-order Akaike information criterion.The only current implementation of this inflexsurv isinAICc.flexsurvreg, see there for more details.

Usage

AICc(object, ...)AICC(object, ...)

Arguments

object

Fitted model object.

...

Other arguments (currently unused).

Details

This can be spelt either asAICC orAICc.

Second-order Akaike information criterion

Description

Second-order or "corrected" Akaike information criterion, oftenknown as AICc (see, e.g. Burnham and Anderson 2002). This isdefined as -2 log-likelihood +(2*p*n)/(n - p -1), whereasthe standard AIC is defined as -2 log-likelihood +2*p,wherep is the number of parameters andn is thesample size. The correction is intended to adjust AIC forsmall-sample bias, hence it only makes a difference to the resultfor smalln.

Usage

## S3 method for class 'flexsurvreg'AICc(object, cens = TRUE, ...)## S3 method for class 'flexsurvreg'AICC(object, cens = TRUE, ...)

Arguments

object

Fitted model returned byflexsurvreg(orflexsurvspline).

cens

Include censored observations in the sample size term(n) used in this calculation. SeeBIC.flexsurvreg for a discussion of the issueswith defining the sample size.

...

Other arguments (currently unused).

Details

This can be spelt either asAICC orAICc.

Value

The second-order AIC of the fitted model.

References

Burnham, K. P., Anderson, D. R. (2002) Model Selection and Multimodel Inference: a practical information-theoretic approach. Second edition. Springer: New York.

Bayesian Information Criterion (BIC) for comparison of flexsurvreg models

Description

Bayesian Information Criterion (BIC) for comparison of flexsurvreg models

Usage

## S3 method for class 'flexsurvreg'BIC(object, cens = TRUE, ...)

Arguments

object

Fitted model returned byflexsurvreg(orflexsurvspline).

cens

Include censored observations in the sample size term(n) used in the calculation of BIC.

...

Other arguments (currently unused).

Details

There is no "official" definition of what the sample sizeshould be for the use of BIC in censored survival analysis. BICis based on an approximation to Bayesian model comparison usingBayes factors and an implicit vague prior. Informally, thesample size describes the number of "units" giving rise to adistinct piece of information (Kass and Raftery 1995). Howevercensored observations provide less information than observedevents, in principle. The default used here is the number ofindividuals, for consistency with more familiar kinds ofstatistical modelling. However if censoring is heavy, then thenumber of events may be a better represent the amount ofinformation. Following these principles, the best approximationwould be expected to be somewere in between.

AIC and BIC are intended to measure different things. Briefly,AIC measures predictive ability, whereas BIC is expected to choosethe true model from a set of models where one of them is thetruth. Therefore BIC chooses simpler models for all but thetiniest sample sizes (log(n)>2,n>7). AIC might be preferred in thetypical situation where"all models are wrong but some are useful". AIC also gives similarresults to cross-validation (Stone 1977).

Value

The BIC of the fitted model. This is minus twice the log likelihood plusp*log(n), wherep is the number of parameters andn is the samplesize of the data. Ifweights was supplied toflexsurv, the sample size is defined as the sum of theweights.

References

Kass, R. E., & Raftery, A. E. (1995). Bayesfactors. Journal of the American Statistical Association,90(430), 773-795.

Stone, M. (1977). An asymptotic equivalence of choiceof model by cross‐validation and Akaike's criterion. Journal ofthe Royal Statistical Society: Series B (Methodological), 39(1),44-47.

Generalized F distribution

Description

Density, distribution function, hazards, quantile function andrandom generation for the generalized F distribution, using thereparameterisation by Prentice (1975).

Usage

dgenf(x, mu = 0, sigma = 1, Q, P, log = FALSE)pgenf(q, mu = 0, sigma = 1, Q, P, lower.tail = TRUE, log.p = FALSE)Hgenf(x, mu = 0, sigma = 1, Q, P)hgenf(x, mu = 0, sigma = 1, Q, P)qgenf(p, mu = 0, sigma = 1, Q, P, lower.tail = TRUE, log.p = FALSE)rgenf(n, mu = 0, sigma = 1, Q, P)

Arguments

x,q

Vector of quantiles.

mu

Vector of location parameters.

sigma

Vector of scale parameters.

Q

Vector of first shape parameters.

P

Vector of second shape parameters.

log,log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities areP(X\le x), otherwise,P(X > x).

p

Vector of probabilities.

n

number of observations. Iflength(n) > 1, the length istaken to be the number required.

Details

Ify \sim F(2s_1, 2s_2), andw = \log(y) thenx = \exp(w\sigma + \mu) has the original generalized F distribution withlocation parameter\mu, scale parameter\sigma>0and shape parameterss_1,s_2.

In this more stable version described by Prentice (1975),s_1,s_2 are replaced by shape parametersQ,P, withP>0, and

s_1 = 2(Q^2 + 2P + Q\delta)^{-1}, \quad s_2 = 2(Q^2 + 2P -Q\delta)^{-1}

equivalently

Q = \left(\frac{1}{s_1} -\frac{1}{s_2}\right)\left(\frac{1}{s_1} + \frac{1}{s_2}\right)^{-1/2},\quad P = \frac{2}{s_1 + s_2}

Define\delta = (Q^2 + 2P)^{1/2}, andw = (\log(x) - \mu)\delta /\sigma,then the probability density function ofx is

f(x) =\frac{\delta (s_1/s_2)^{s_1} e^{s_1 w}}{\sigma x (1 + s_1 e^w/s_2) ^ {(s_1+ s_2)} B(s_1, s_2)}

Theoriginal parameterisation is available in this package asdgenf.orig, for the sake of completion / compatibility. Withthe above definitions,

dgenf(x, mu=mu, sigma=sigma, Q=Q, P=P) = dgenf.orig(x, mu=mu,sigma=sigma/delta, s1=s1, s2=s2)

The generalized F distribution withP=0 is equivalent to thegeneralized gamma distributiondgengamma, so thatdgenf(x, mu, sigma, Q, P=0) equalsdgengamma(x, mu, sigma,Q). The generalized gamma reduces further to several commondistributions, as described in theGenGamma help page.

The generalized F distribution includes the log-logistic distribution (seeeha::dllogis) as a further special case:

dgenf(x, mu=mu, sigma=sigma, Q=0, P=1) =eha::dllogis(x,shape=sqrt(2)/sigma, scale=exp(mu))

The range of hazard trajectories available under this distribution arediscussed in detail by Cox (2008). Jackson et al. (2010) give anapplication to modelling oral cancer survival for use in a health economicevaluation of screening.

Value

dgenf gives the density,pgenf gives the distributionfunction,qgenf gives the quantile function,rgenf generatesrandom deviates,Hgenf retuns the cumulative hazard andhgenfthe hazard.

Note

The parametersQ andP are usually calledq andp in the literature - they were made upper-case in these R functionsto avoid clashing with the conventional argumentsq in theprobability function andp in the quantile function.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

R. L. Prentice (1975). Discrimination among some parametricmodels. Biometrika 62(3):607-614.

Cox, C. (2008). The generalizedF distribution: An umbrella forparametric survival analysis. Statistics in Medicine 27:4301-4312.

Jackson, C. H. and Sharples, L. D. and Thompson, S. G. (2010). Survivalmodels in health economic evaluations: balancing fit and parsimony toimprove prediction. International Journal of Biostatistics 6(1):Article34.

Generalized F distribution (original parameterisation)

Description

Density, distribution function, quantile function and randomgeneration for the generalized F distribution, using the lessflexible original parameterisation described by Prentice (1975).

Usage

dgenf.orig(x, mu = 0, sigma = 1, s1, s2, log = FALSE)pgenf.orig(q, mu = 0, sigma = 1, s1, s2, lower.tail = TRUE, log.p = FALSE)Hgenf.orig(x, mu = 0, sigma = 1, s1, s2)hgenf.orig(x, mu = 0, sigma = 1, s1, s2)qgenf.orig(p, mu = 0, sigma = 1, s1, s2, lower.tail = TRUE, log.p = FALSE)rgenf.orig(n, mu = 0, sigma = 1, s1, s2)

Arguments

x,q

vector of quantiles.

mu

Vector of location parameters.

sigma

Vector of scale parameters.

s1

Vector of first F shape parameters.

s2

vector of second F shape parameters.

log,log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities areP(X\le x), otherwise,P(X > x).

p

vector of probabilities.

n

number of observations. Iflength(n) > 1, the length istaken to be the number required.

Details

Ify \sim F(2s_1, 2s_2), andw = \log(y) thenx = \exp(w\sigma + \mu) has the original generalized F distribution withlocation parameter\mu, scale parameter\sigma>0and shape parameterss_1>0,s_2>0. The probability densityfunction ofx is

f(x | \mu, \sigma, s_1, s_2) = \frac{(s_1/s_2)^{s_1} e^{s_1 w}}{\sigma x (1 + s_1 e^w/s_2) ^ {(s_1 + s_2)} B(s_1, s_2)}

wherew = (\log(x) - \mu)/\sigma, andB(s_1,s_2) = \Gamma(s_1)\Gamma(s_2)/\Gamma(s_1+s_2) is the beta function.

Ass_2 \rightarrow \infty, the distributionofx tends towards an original generalized gammadistribution with the following parameters:

dgengamma.orig(x, shape=1/sigma, scale=exp(mu) /s1^sigma, k=s1)

SeeGenGamma.orig for how this includes severalother common distributions as special cases.

The alternative parameterisation of the generalized Fdistribution, originating from Prentice (1975) and given in thispackage asGenF, is preferred for statisticalmodelling, since it is more stable ass_1 tends toinfinity, and includes a further new class of distributions withnegative first shape parameter. The original is provided here forthe sake of completion and compatibility.

Value

dgenf.orig gives the density,pgenf.orig gives thedistribution function,qgenf.orig gives the quantile function,rgenf.orig generates random deviates,Hgenf.orig retuns thecumulative hazard andhgenf.orig the hazard.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

R. L. Prentice (1975). Discrimination among some parametricmodels. Biometrika 62(3):607-614.

Generalized gamma distribution

Description

Density, distribution function, hazards, quantile function andrandom generation for the generalized gamma distribution, usingthe parameterisation originating from Prentice (1974). Also knownas the (generalized) log-gamma distribution.

Usage

dgengamma(x, mu = 0, sigma = 1, Q, log = FALSE)pgengamma(q, mu = 0, sigma = 1, Q, lower.tail = TRUE, log.p = FALSE)Hgengamma(x, mu = 0, sigma = 1, Q)hgengamma(x, mu = 0, sigma = 1, Q)qgengamma(p, mu = 0, sigma = 1, Q, lower.tail = TRUE, log.p = FALSE)rgengamma(n, mu = 0, sigma = 1, Q)

Arguments

x,q

vector of quantiles.

mu

Vector of “location” parameters.

sigma

Vector of “scale” parameters. Note the inconsistentmeanings of the term “scale” - this parameter is analogous to the(log-scale) standard deviation of the log-normal distribution, “sdlog” indlnorm, rather than the “scale” parameter of the gammadistributiondgamma. Constrained to be positive.

Q

Vector of shape parameters.

log,log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities areP(X\le x), otherwise,P(X > x).

p

vector of probabilities.

n

number of observations. Iflength(n) > 1, the length istaken to be the number required.

Details

If\gamma \sim Gamma(Q^{-2}, 1) , andw =log(Q^2 \gamma) / Q, thenx = \exp(\mu + \sigmaw) follows the generalized gamma distribution withprobability density function

f(x | \mu, \sigma, Q) = \frac{|Q|(Q^{-2})^{Q^{-2}}}{\sigma x\Gamma(Q^{-2})} \exp(Q^{-2}(Qw - \exp(Qw)))

This parameterisation is preferred to the originalparameterisation of the generalized gamma by Stacy (1962) since itis more numerically stable near toQ=0 (the log-normaldistribution), and allowsQ<=0. The original is availablein this package asdgengamma.orig, for the sake ofcompletion and compatibility with other software - this isimplicitly restricted toQ>0 (ork>0 in the originalnotation). The parameters ofdgengamma anddgengamma.orig are related as follows.

dgengamma.orig(x, shape=shape, scale=scale, k=k) =

dgengamma(x, mu=log(scale) + log(k)/shape, sigma=1/(shape*sqrt(k)),Q=1/sqrt(k))

The generalized gamma distribution simplifies to the gamma,log-normal and Weibull distributions with the followingparameterisations:

`dgengamma(x, mu, sigma, Q=0)`	`=`	`dlnorm(x, mu, sigma)`
`dgengamma(x, mu, sigma, Q=1)`	`=`	`dweibull(x, shape=1/sigma, scale=exp(mu))`
`dgengamma(x, mu, sigma, Q=sigma)`	`=`	`dgamma(x,shape=1/sigma^2, rate=exp(-mu) / sigma^2)`

The properties of thegeneralized gamma and its applications to survival analysis are discussedin detail by Cox (2007).

The generalized F distributionGenF extends the generalizedgamma to four parameters.

Value

dgengamma gives the density,pgengamma gives thedistribution function,qgengamma gives the quantile function,rgengamma generates random deviates,Hgengamma retuns thecumulative hazard andhgengamma the hazard.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

Prentice, R. L. (1974). A log gamma model and its maximumlikelihood estimation. Biometrika 61(3):539-544.

Farewell, V. T. and Prentice, R. L. (1977). A study ofdistributional shape in life testing. Technometrics 19(1):69-75.

Lawless, J. F. (1980). Inference in the generalized gamma and loggamma distributions. Technometrics 22(3):409-419.

Cox, C., Chu, H., Schneider, M. F. and Muñoz, A. (2007).Parametric survival analysis and taxonomy of hazard functions forthe generalized gamma distribution. Statistics in Medicine26:4252-4374

Stacy, E. W. (1962). A generalization of the gamma distribution.Annals of Mathematical Statistics 33:1187-92

Generalized gamma distribution (original parameterisation)

Description

Density, distribution function, hazards, quantile function andrandom generation for the generalized gamma distribution, usingthe original parameterisation from Stacy (1962).

Usage

dgengamma.orig(x, shape, scale = 1, k, log = FALSE)pgengamma.orig(q, shape, scale = 1, k, lower.tail = TRUE, log.p = FALSE)Hgengamma.orig(x, shape, scale = 1, k)hgengamma.orig(x, shape, scale = 1, k)qgengamma.orig(p, shape, scale = 1, k, lower.tail = TRUE, log.p = FALSE)rgengamma.orig(n, shape, scale = 1, k)

Arguments

x,q

vector of quantiles.

shape

vector of “Weibull” shape parameters.

scale

vector of scale parameters.

k

vector of “Gamma” shape parameters.

log,log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities areP(X\le x), otherwise,P(X > x).

p

vector of probabilities.

n

number of observations. Iflength(n) > 1, the length istaken to be the number required.

Details

Ifw \sim Gamma(k,1), thenx =\exp(w/shape + \log(scale))follows the original generalised gamma distribution with theparameterisation given here (Stacy 1962). Definingshape=b>0,scale=a>0,x hasprobability density

f(x | a, b, k) = \frac{b}{\Gamma(k)} \frac{x^{bk - 1}}{a^{bk}}

\exp(-(x/a)^b)

The original generalized gamma distribution simplifies to thegamma, exponential and Weibull distributions with the followingparameterisations:

`dgengamma.orig(x, shape, scale, k=1)`	`=`	`dweibull(x, shape, scale)`
`dgengamma.orig(x,shape=1, scale, k)`	`=`	`dgamma(x, shape=k,scale)`
`dgengamma.orig(x, shape=1, scale, k=1)`	`=`	`dexp(x, rate=1/scale)`

Also as k tends to infinity, it tends to the log normal (as indlnorm) with the following parameters (Lawless,1980):

dlnorm(x, meanlog=log(scale) + log(k)/shape,sdlog=1/(shape*sqrt(k)))

For more stable behaviour as the distribution tends to the log-normal, analternative parameterisation was developed by Prentice (1974). This isgiven indgengamma, and is now preferred for statisticalmodelling. It is also more flexible, including a further new class ofdistributions with negative shapek.

The generalized F distributionGenF.orig, and its similaralternative parameterisationGenF, extend the generalizedgamma to four parameters.

Value

dgengamma.orig gives the density,pgengamma.origgives the distribution function,qgengamma.orig gives the quantilefunction,rgengamma.orig generates random deviates,Hgengamma.orig retuns the cumulative hazard andhgengamma.orig the hazard.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

Stacy, E. W. (1962). A generalization of the gammadistribution. Annals of Mathematical Statistics 33:1187-92.

Prentice, R. L. (1974). A log gamma model and its maximum likelihoodestimation. Biometrika 61(3):539-544.

Lawless, J. F. (1980). Inference in the generalized gamma and log gammadistributions. Technometrics 22(3):409-419.

The Gompertz distribution

Description

Density, distribution function, hazards, quantile function and randomgeneration for the Gompertz distribution with unrestricted shape.

Usage

dgompertz(x, shape, rate = 1, log = FALSE)pgompertz(q, shape, rate = 1, lower.tail = TRUE, log.p = FALSE)qgompertz(p, shape, rate = 1, lower.tail = TRUE, log.p = FALSE)rgompertz(n, shape = 1, rate = 1)hgompertz(x, shape, rate = 1, log = FALSE)Hgompertz(x, shape, rate = 1, log = FALSE)

Arguments

x,q

vector of quantiles.

shape,rate

vector of shape and rate parameters.

log,log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities areP(X\le x), otherwise,P(X > x).

p

vector of probabilities.

n

number of observations. Iflength(n) > 1, the length istaken to be the number required.

Details

The Gompertz distribution withshape parametera andrate parameterb has probability density function

f(x | a, b) = be^{ax}\exp(-b/a (e^{ax} - 1))

and hazard

h(x | a, b) = b e^{ax}

The hazard is increasing for shapea>0 and decreasing fora<0.Fora=0 the Gompertz is equivalent to the exponential distributionwith constant hazard and rateb.

The probability distribution function is

F(x | a, b) = 1 - \exp(-b/a(e^{ax} - 1))

Thus ifa is negative, lettingx tend to infinity shows thatthere is a non-zero probability\exp(b/a) of livingforever. On these occasionsqgompertz andrgompertz willreturnInf.

Value

dgompertz gives the density,pgompertz gives thedistribution function,qgompertz gives the quantile function,hgompertz gives the hazard function,Hgompertz gives thecumulative hazard function, andrgompertz generates random deviates.

Note

Some implementations of the Gompertz restricta to be strictlypositive, which ensures that the probability of survival decreases to zeroasx increases to infinity. The more flexible implementation givenhere is consistent withstreg in Stata.

The functionseha::dgompertz and similar available in thepackageeha label the parameters the other way round, so that what iscalled theshape there is called therate here, and what iscalled1 / scale there is called theshape here. Theterminology here is consistent with the exponentialdexp andWeibulldweibull distributions in R.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

Stata Press (2007) Stata release 10 manual: Survival analysisand epidemiological tables.

The log-logistic distribution

Description

Density, distribution function, hazards, quantile function andrandom generation for the log-logistic distribution.

Usage

dllogis(x, shape = 1, scale = 1, log = FALSE)pllogis(q, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE)qllogis(p, shape = 1, scale = 1, lower.tail = TRUE, log.p = FALSE)rllogis(n, shape = 1, scale = 1)hllogis(x, shape = 1, scale = 1, log = FALSE)Hllogis(x, shape = 1, scale = 1, log = FALSE)

Arguments

x,q

vector of quantiles.

shape,scale

vector of shape and scale parameters.

log,log.p

logical; if TRUE, probabilities p are given aslog(p).

lower.tail

logical; if TRUE (default), probabilities areP(X \le x), otherwise,P(X > x).

p

vector of probabilities.

n

number of observations. Iflength(n) > 1, thelength is taken to be the number required.

Details

The log-logistic distribution withshape parametera>0 andscale parameterb>0 has probabilitydensity function

f(x | a, b) = (a/b) (x/b)^{a-1} / (1 + (x/b)^a)^2

and hazard

h(x | a, b) = (a/b) (x/b)^{a-1} / (1 + (x/b)^a)

forx>0. The hazard is decreasing for shapea\leq 1, and unimodal fora > 1.

The probability distribution function is

F(x | a, b) = 1 - 1/ (1 + (x/b)^a)

Ifa > 1, the mean isb c / sin(c), and ifa > 2the variance isb^2 * (2*c/sin(2*c) - c^2/sin(c)^2), wherec = \pi/a, otherwise these are undefined.

Value

dllogis gives the density,pllogis gives thedistribution function,qllogis gives the quantile function,hllogis gives the hazard function,Hllogis gives thecumulative hazard function, andrllogis generates randomdeviates.

Note

Various different parameterisations of this distribution areused. In the one used here, the interpretation of the parametersis the same as in the standard Weibull distribution(dweibull). Like the Weibull, the survivor functionis a transformation of(x/b)^a from the non-negative real line to [0,1],but with a different link function. Covariates onbrepresent time acceleration factors, or ratios of expectedsurvival.

The same parameterisation is also used ineha::dllogis in theeha package.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

Stata Press (2007) Stata release 10 manual: Survival analysisand epidemiological tables.

Royston/Parmar spline survival distribution

Description

Probability density, distribution, quantile, random generation, hazard, cumulative hazard, mean and restricted mean functions for the Royston/Parmarspline model. These functions have all parameters of the distribution collectedtogether in a single argumentgamma. For the equivalent functions withone argument per parameter, seeSurvsplinek.

Usage

dsurvspline(  x,  gamma,  beta = 0,  X = 0,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  offset = 0,  log = FALSE)psurvspline(  q,  gamma,  beta = 0,  X = 0,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  offset = 0,  lower.tail = TRUE,  log.p = FALSE)qsurvspline(  p,  gamma,  beta = 0,  X = 0,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  offset = 0,  lower.tail = TRUE,  log.p = FALSE)rsurvspline(  n,  gamma,  beta = 0,  X = 0,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  offset = 0)Hsurvspline(  x,  gamma,  beta = 0,  X = 0,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  offset = 0)hsurvspline(  x,  gamma,  beta = 0,  X = 0,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  offset = 0)rmst_survspline(  t,  gamma,  beta = 0,  X = 0,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  offset = 0,  start = 0)mean_survspline(  gamma,  beta = 0,  X = 0,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  offset = 0)

Arguments

x,q,t

Vector of times.

gamma

Parameters describing the baseline spline function, asdescribed inflexsurvspline. This may be supplied as avector with number of elements equal to the length ofknots, inwhich case the parameters are common to all times. Alternatively a matrixmay be supplied, with rows corresponding to different times, and columnscorresponding toknots.

beta

Vector of covariate effects. Not supported and ignored since version 2.3, and this argument will be removed in 2.4.

X

Matrix of covariate values. Not supported and ignored since version 2.3, and this argument will be removed in 2.4.

knots

Locations of knots on the axis of log time, supplied inincreasing order. Unlike inflexsurvspline, these includethe two boundary knots. If there are no additional knots, the boundarylocations are not used. If there are one or more additional knots, theboundary knots should be at or beyond the minimum and maximum values of thelog times. Inflexsurvspline these are exactly at theminimum and maximum values.

This may in principle be supplied as a matrix, in the same way as forgamma, but in most applications the knots will be fixed.

scale

"hazard","odds", or"normal", asdescribed inflexsurvspline. With the default of no knots inaddition to the boundaries, this model reduces to the Weibull, log-logisticand log-normal respectively. The scale must be common to all times.

timescale

"log" or"identity" as described inflexsurvspline.

spline

"rp" to use the natural cubic spline basisdescribed in Royston and Parmar."splines2ns" to use thealternative natural cubic spline basis from thesplines2package (Wang and Yan 2021), which may be better behaved due tothe basis being orthogonal.

offset

An extra constant to add to the linear predictor\eta. Not supported and ignored since version 2.3, and this argument will be removed in 2.4.

log,log.p

Return log density or probability.

lower.tail

logical; if TRUE (default), probabilities areP(X\le x), otherwise,P(X > x).

p

Vector of probabilities.

n

Number of random numbers to simulate.

start

Optional left-truncation time or times. The returnedrestricted mean survival will be conditioned on survival up tothis time.

Value

dsurvspline gives the density,psurvspline gives thedistribution function,hsurvspline gives the hazard andHsurvspline gives the cumulative hazard, as described inflexsurvspline.

qsurvspline gives the quantile function, which is computed by crudenumerical inversion (usingqgeneric).

rsurvspline generates random survival times by usingqsurvspline on a sample of uniform random numbers. Due to thenumerical root-finding involved inqsurvspline, it is slow comparedto typical random number generation functions.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

Wang W, Yan J (2021). Shape-Restricted Regression Splines with RPackage splines2. Journal of Data Science, 19(3), 498-517.

Examples

## reduces to the weibullregscale <- 0.786; cf <- 1.82a <- 1/regscale; b <- exp(cf)dweibull(1, shape=a, scale=b)dsurvspline(1, gamma=c(log(1 / b^a), a)) # should be the same## reduces to the log-normalmeanlog <- 1.52; sdlog <- 1.11dlnorm(1, meanlog, sdlog) dsurvspline(1, gamma = c(-meanlog/sdlog, 1/sdlog), scale="normal")# should be the same

Royston/Parmar spline survival distribution functions with one argument per parameter

Description

Probability density, distribution, quantile, random generation, hazard, cumulative hazard, mean and restricted mean functions for the Royston/Parmarspline model, with one argument per parameter. For the equivalent functions with all parameters collected together in a single argument, seeSurvspline.

Usage

mean_survspline0(  gamma0,  gamma1,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp")mean_survspline1(  gamma0,  gamma1,  gamma2,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp")mean_survspline2(  gamma0,  gamma1,  gamma2,  gamma3,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp")mean_survspline3(  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp")mean_survspline4(  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp")mean_survspline5(  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp")mean_survspline6(  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp")mean_survspline7(  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  gamma8,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp")rmst_survspline0(  t,  gamma0,  gamma1,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  start = 0)rmst_survspline1(  t,  gamma0,  gamma1,  gamma2,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  start = 0)rmst_survspline2(  t,  gamma0,  gamma1,  gamma2,  gamma3,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  start = 0)rmst_survspline3(  t,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  start = 0)rmst_survspline4(  t,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  start = 0)rmst_survspline5(  t,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  start = 0)rmst_survspline6(  t,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  start = 0)rmst_survspline7(  t,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  gamma8,  knots = c(-10, 10),  scale = "hazard",  timescale = "log",  spline = "rp",  start = 0)dsurvspline0(  x,  gamma0,  gamma1,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  log = FALSE)dsurvspline1(  x,  gamma0,  gamma1,  gamma2,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  log = FALSE)dsurvspline2(  x,  gamma0,  gamma1,  gamma2,  gamma3,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  log = FALSE)dsurvspline3(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  log = FALSE)dsurvspline4(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  log = FALSE)dsurvspline5(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  log = FALSE)dsurvspline6(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  log = FALSE)dsurvspline7(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  gamma8,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  log = FALSE)psurvspline0(  q,  gamma0,  gamma1,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)psurvspline1(  q,  gamma0,  gamma1,  gamma2,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)psurvspline2(  q,  gamma0,  gamma1,  gamma2,  gamma3,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)psurvspline3(  q,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)psurvspline4(  q,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)psurvspline5(  q,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)psurvspline6(  q,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)psurvspline7(  q,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  gamma8,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)qsurvspline0(  p,  gamma0,  gamma1,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)qsurvspline1(  p,  gamma0,  gamma1,  gamma2,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)qsurvspline2(  p,  gamma0,  gamma1,  gamma2,  gamma3,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)qsurvspline3(  p,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)qsurvspline4(  p,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)qsurvspline5(  p,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)qsurvspline6(  p,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)qsurvspline7(  p,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  gamma8,  knots,  scale = "hazard",  timescale = "log",  spline = "rp",  lower.tail = TRUE,  log.p = FALSE)rsurvspline0(  n,  gamma0,  gamma1,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")rsurvspline1(  n,  gamma0,  gamma1,  gamma2,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")rsurvspline2(  n,  gamma0,  gamma1,  gamma2,  gamma3,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")rsurvspline3(  n,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")rsurvspline4(  n,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")rsurvspline5(  n,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")rsurvspline6(  n,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")rsurvspline7(  n,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  gamma8,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")hsurvspline0(  x,  gamma0,  gamma1,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")hsurvspline1(  x,  gamma0,  gamma1,  gamma2,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")hsurvspline2(  x,  gamma0,  gamma1,  gamma2,  gamma3,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")hsurvspline3(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")hsurvspline4(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")hsurvspline5(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")hsurvspline6(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")hsurvspline7(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  gamma8,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")Hsurvspline0(  x,  gamma0,  gamma1,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")Hsurvspline1(  x,  gamma0,  gamma1,  gamma2,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")Hsurvspline2(  x,  gamma0,  gamma1,  gamma2,  gamma3,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")Hsurvspline3(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")Hsurvspline4(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")Hsurvspline5(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")Hsurvspline6(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")Hsurvspline7(  x,  gamma0,  gamma1,  gamma2,  gamma3,  gamma4,  gamma5,  gamma6,  gamma7,  gamma8,  knots,  scale = "hazard",  timescale = "log",  spline = "rp")

Arguments

gamma0,gamma1,gamma2,gamma3,gamma4,gamma5,gamma6,gamma7,gamma8

Parameters describing the baseline spline function, asdescribed inflexsurvspline.

knots

This may in principle be supplied as a matrix, in the same way as forgamma, but in most applications the knots will be fixed.

scale

timescale

"log" or"identity" as described inflexsurvspline.

spline

start

Optional left-truncation time or times. The returnedrestricted mean survival will be conditioned on survival up tothis time.

x,q,t

Vector of times.

log,log.p

Return log density or probability.

lower.tail

logical; if TRUE (default), probabilities areP(X\le x), otherwise,P(X > x).

p

Vector of probabilities.

n

Number of random numbers to simulate.

Details

These functions go up to 7 spline knots, or 9 parameters.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Weibull distribution in proportional hazards parameterisation

Description

Density, distribution function, hazards, quantile function and randomgeneration for the Weibull distribution in its proportional hazardsparameterisation.

Usage

dweibullPH(x, shape, scale = 1, log = FALSE)pweibullPH(q, shape, scale = 1, lower.tail = TRUE, log.p = FALSE)qweibullPH(p, shape, scale = 1, lower.tail = TRUE, log.p = FALSE)hweibullPH(x, shape, scale = 1, log = FALSE)HweibullPH(x, shape, scale = 1, log = FALSE)rweibullPH(n, shape, scale = 1)

Arguments

x,q

Vector of quantiles.

shape

Vector of shape parameters.

scale

Vector of scale parameters.

log,log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities areP(X\le x), otherwise,P(X > x).

p

Vector of probabilities.

n

number of observations. Iflength(n) > 1, the length istaken to be the number required.

Details

The Weibull distribution in proportional hazards parameterisation with‘shape’ parameter a and ‘scale’ parameter m has density given by

f(x) = a m x^{a-1} exp(- m x^a)

cumulative distribution functionF(x) = 1 - exp( -m x^a ), survivorfunctionS(x) = exp( -m x^a ), cumulative hazardm x^a andhazarda m x^{a-1}.

dweibull in base R has the alternative 'accelerated failuretime' (AFT) parameterisation with shape a and scale b. The shape parametera is the same in both versions. The scale parameters are related asb = m^{-1/a}, equivalently m = b^-a.

In survival modelling, covariates are typically included through a linearmodel on the log scale parameter. Thus, in the proportional hazards model,the coefficients in such a model onm are interpreted as log hazardratios.

In the AFT model, covariates onb are interpreted as timeacceleration factors. For example, doubling the value of a covariate withcoefficientbeta=log(2) would give half the expected survival time.These coefficients are related to the log hazard ratios\gamma as\beta = -\gamma / a.

Value

dweibullPH gives the density,pweibullPH gives thedistribution function,qweibullPH gives the quantile function,rweibullPH generates random deviates,HweibullPH retuns thecumulative hazard andhweibullPH the hazard.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Aalen-Johansen nonparametric estimates comparable to a fitted flexsurvmixmodel

Description

Given a fitted flexsurvmix model, return the Aalen-Johansen estimates of theprobability of occupying each state at a series of times covering theobserved data. State 1 represents not having experienced any of thecompeting events, while state 2 and any further states correspond to havingexperienced each of the competing events respectively. These estimates canbe compared with the fitted probabilities returned byp_flexsurvmix to check the fit of aflexsurvmix model.

Usage

ajfit(x, newdata = NULL, tidy = TRUE)

Arguments

x

Fitted model returned byflexsurvmix.

newdata

Data frame of alternative covariate values to check fit for.Only factor covariates are supported.

tidy

IfTRUE then a single tidy data frame is returned.Otherwise the function returns the object returned bysurvfit, or alist of these objects if we are computing subset-specific estimates.

Details

This is only supported for models with no covariates or models containingonly factor covariates.

For models with factor covariates, the Aalen-Johansen estimates are computedfor the subsets of the data defined innewdata. Ifnewdata isnot supplied, then this function returns state occupancy probabilities forall possible combinations of the factor levels.

The Aalen-Johansen estimates are computed usingsurvfit from thesurvival package (Therneau2020).

References

Therneau T (2020). _A Package for Survival Analysis in R_. Rpackage version 3.2-3, <URL: https://CRAN.R-project.org/package=survival>.

Forms a tidy data frame for plotting the fit of parametric mixturemulti-state models against nonparametric estimates

Description

This computes Aalen-Johansen estimates of the probability of occupying eachstate at a series of times, usingajfit. The equivalentestimates from the parametric model are then produced usingp_flexsurvmix, and concatenated with the nonparametricestimates to form a tidy data frame. This data frame can then simply beplotted usingggplot.

Usage

ajfit_flexsurvmix(x, maxt = NULL, startname = "Start", B = NULL)

Arguments

x

Fitted model returned byflexsurvmix.

maxt

Maximum time to produce parametric estimates for. By defaultthis is the maximum event time in the data, the maximum time we havenonparametric estimates for.

startname

Label to give the state corresponding to "no event happenedyet". By default this is"Start".

B

Number of simulation replications to use to calculate a confidenceinterval for the parametric estimates inp_flexsurvmix. Comparable intervals for the Aalen-Johansen estimates are returned if thisis set. Otherwise ifB=NULL then no intervals are returned.

Check the fit of Markov flexible parametric multi-state models againstnonparametric estimates.

Description

Computes both parametric and comparable Aalen-Johansen nonparametricestimates from a flexible parametric multi-state model, and returns themtogether in a tidy data frame. Only models with no covariates, or onlyfactor covariates, are supported. If there are factor covariates, then thenonparametric estimates are computed for subgroups defined by combinationsof the covariates. Another restriction of this function is that alltransitions must have the same covariates on them.

Usage

ajfit_fmsm(x, maxt = NULL, newdata = NULL)

Arguments

x

Object returned byfmsm representing a flexibleparametric multi-state model. This must be Markov, rather thansemi-Markov, and no check is performed for this. Note that all"competing risks" style models, with just one source state and multipledestination states, are Markov, so those are fine here.

maxt

Maximum time to compute parametric estimates to.

newdata

Data frame defining the subgroups to consider. This musthave a column for each covariate in the model. If omitted, then allpotential subgroups defined by combinations of factor covariates areincluded. Continuous covariates are not supported.

Value

Tidy data frame containing both Aalen-Johansen and parametricestimates of state occupancy over time, and by subgroup if subgroups areincluded.

Augment data with information from a flexsurv model object

Description

Augment accepts a model object and a dataset and adds information about each observation in the dataset. Most commonly, this includes predicted values in the.fitted column, residuals in the.resid column, and standard errors for the fitted values in a.se.fit column. New columns always begin with a . prefix to avoid overwriting columns in the original dataset.

Usage

## S3 method for class 'flexsurvreg'augment(  x,  data = NULL,  newdata = NULL,  type.predict = "response",  type.residuals = "response",  ...)

Arguments

x

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

data

Adata.frame ortibble containing the original data that was used to produce the objectx.

newdata

Adata.frame ortibble containing all the original predictors used to createx. Defaults toNULL, indicating that nothing has been passed tonewdata. Ifnewdata is specified, thedata argument will be ignored.

type.predict

Character indicating type of prediction to use. Passed to thetype argument of thepredict generic. Allowed arguments vary with model class, so be sure to read thepredict.my_class documentation.

type.residuals

Character indicating type of residuals to use. Passed to the type argument ofresiduals generic. Allowed arguments vary with model class, so be sure to read theresiduals.my_class documentation.

...

Additional arguments. Not currently used.

Details

If neither ofdata ornewdata are specified, thenmodel.frame(x) will be used. It is worth noting thatmodel.frame(x) will include aSurv object and not the original time-to-event variables used when fitting theflexsurvreg object. If the original data is desired, specifydata.

Value

Atibble containingdata ornewdata and possible additional columns:

.fitted Fitted values of model
.se.fit Standard errors of fitted values
.resid Residuals (not present ifnewdata specified)

Examples

fit <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "exp")augment(fit, data = ovarian)

Natural cubic spline basis

Description

Compute a basis for a natural cubic spline, by default using theparameterisation described by Royston and Parmar (2002). Used forflexible parametric survival models.

Usage

basis(knots, x, spline = "rp")

Arguments

knots

Vector of knot locations in increasing order, including theboundary knots at the beginning and end.

x

Vector of ordinates to compute the basis for.

spline

Details

The exact formula for the basis is given inflexsurvspline.

Value

A matrix with one row for each ordinate and one column for eachknot.

basis returns the basis, anddbasis returns its derivativewith respect tox.

fss anddfss are the same, but with the order of thearguments swapped around for consistency with similar functions in other Rpackages.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

Wang W, Yan J (2021). Shape-Restricted Regression Splines with RPackage splines2. Journal of Data Science, 19(3), 498-517.

Breast cancer survival data

Description

Survival times of 686 patients with primary node positive breast cancer.

Usage

bc

Format

A data frame with 686 rows.

`censrec`	(numeric)	1=dead, 0=censored
`rectime`	(numeric)	Time of death or censoring in days
`group`	(numeric)	Prognostic group:`"Good"`,`"Medium"` or`"Poor"`,
		from a regression model developed by Sauerbrei and Royston (1999).

Source

German Breast Cancer Study Group, 1984-1989. Used as a referencedataset for the spline-based survival model of Royston and Parmar (2002),implemented here inflexsurvspline. Originally provided withthestpm (Royston 2001, 2004) andstpm2 (Lambert 2009, 2010)Stata modules.

References

Sauerbrei, W. and Royston, P. (1999). Building multivariable prognostic anddiagnostic models: transformation of the predictors using fractionalpolynomials. Journal of the Royal Statistical Society, Series A 162:71-94.

Bootstrap confidence intervals for flexsurv output functions

Description

Calculate a confidence interval for a model output by repeatedly replacing the parameters in a fitted model object with a draw from the multivariate normal distribution of the maximum likelihood estimates, then recalculating the output function.

Usage

bootci.fmsm(  x,  B,  fn,  cl = 0.95,  attrs = NULL,  cores = NULL,  sample = FALSE,  ...)

Arguments

x

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object. Or a list of such objects, defining a multi-state model.

B

Number of parameter draws to use

fn

Function to bootstrap the results of. It must have an argument namedx giving a fitted flexsurv model object. This may return a value with any format, e.g. list, matrix or vector, as long as it can be converted to a numeric vector withunlist. See the example below.

cl

Width of symmetric confidence interval, by default 0.95

attrs

Any attributes of the value returned fromfn which we want confidence intervals for. These will be unlisted, if possible, and appended to the result vector.

cores

Number of cores to use for parallel processing.

sample

IfTRUE then the bootstrap sample itself is returned. IfFALSE then the quantiles of the sample are returned giving a confidence interval.

...

Additional arguments to pass tofn.

Value

A matrix with two rows, giving the upper and lower confidence limits respectively. Each row is a vector of the same length as the unlisted result of the function corresponding tofncall.

Examples

## How to use bootci.fmsm## Write a function with one argument called x giving a fitted model,## and returning some results of the model.  The results may be in any form.   tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))bexp <- flexsurvreg(Surv(Tstart, Tstop, status) ~ trans, data=bosms3, dist="exp")summfn <- function(x, t){ resp <-  flexsurv::pmatrix.fs(x, trans=tmat, t=t) rest <- flexsurv::totlos.fs(x, trans=tmat, t=t) list(resp, rest)}## Use bootci.fmsm to obtain the confidence interval## The matrix columns are in the order of the unlisted results of the original## summfn.  You will have to rearrange them into the format that you want.## If summfn has any extra arguments, in this case \code{t}, make sure they are## passed through via the ... argument to bootci.fmsmbootci.fmsm(bexp, B=3, fn=summfn, t=10)bootci.fmsm(bexp, B=3, fn=summfn, t=5)

Bronchiolitis obliterans syndrome after lung transplants

Description

A dataset containing histories of bronchiolitis obliterans syndrome (BOS)from lung transplant recipients. BOS is a chronic decline in lung function,often observed after lung transplantation.

Format

A data frame containing a sequence of observed or censoredtransitions to the next stage of severity or death. It is groupedby patient and includes histories of 204 patients. All patientsstart in state 1 (no BOS) at six months after transplant, and maysubsequently develop BOS or die.

bosms3 contains the data for a three-state model: no BOS, BOS ordeath.bosms4 uses a four-state representation: no BOS, mild BOS,moderate/severe BOS or death.

`id`	(numeric)	Patient identification number
`from`	(numeric)	Observedstarting state of the transition
`to`	(numeric)	Observedor potential ending state of the transition
`Tstart`	(numeric)	Time at the start of the interval
`Tstop`	(numeric)	Time at the end of the interval
`time`	(numeric)	Time difference`Tstart`-`Tstop`
`status`	(numeric)	1 if the transition to state`to` was observed, or0 if the transition to state`to` was censored (for example, if thepatient was observed to move to a competing state)
`trans`	(factor)	Number of the transition`from`-`to` in the set ofall`ntrans` allowed transitions, numbered from 1 to`ntrans`.

Details

The entry time of each patient into each stage of BOS was estimated byclinicians, based on their history of lung function measurements and acuterejection and infection episodes. BOS is only assumed to occur beyond sixmonths after transplant. In the first six months the function of eachpatient's new lung stabilises. Subsequently BOS is diagnosed by comparingthe lung function against the "baseline" value.

The same data are provided in themsm package, but in thenative format ofmsm to allow Markov models to be fitted.Inflexsurv, much more flexible models can be fitted.

Source

Papworth Hospital, U.K.

References

Heng. D. et al. (1998). Bronchiolitis Obliterans Syndrome:Incidence, Natural History, Prognosis, and Risk Factors. Journal of Heartand Lung Transplantation 17(12)1255–1263.

Extract model coefficients from fitted flexible survival models

Description

Extract model coefficients from fitted flexible survival models. Thispresents all parameter estimates, transformed to the real line if necessary.For example, shape or scale parameters, which are constrained to bepositive, are returned on the log scale.

Usage

## S3 method for class 'flexsurvreg'coef(object, ...)

Arguments

object

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

...

Further arguments passed to or from other methods. Currentlyunused.

Details

This matches the behaviour ofcoef.default for standard R modelfamilies such asglm, where intercepts in regressionmodels are presented on the same scale as the covariate effects. Note thatany parameter in a distribution fitted byflexsurvreg orflexsurvreg may be an intercept in a regression model.

Value

This returns themod$res.t[,"est"] component of the fittedmodel objectmod. Seeflexsurvreg,flexsurvspline for full documentation of all components.

Author(s)

C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk

Cox-Snell residuals from a parametric survival model

Description

Cox-Snell residuals from a parametric survival model

Usage

coxsnell_flexsurvreg(x)

Arguments

x

Object returned byflexsurvreg orflexsurvspline representing a fitted survival model

Value

A data frame with a column calledest giving the Cox-Snell residual, defined as the fitted cumulative hazard at each data point.fitted cumulative hazard at the given observed data point, and other columns indicating the observation time,observed event status, and covariate values defining the data at this point.

The cumulative hazardsest should form a censored sample from an Exponential(1). Therefore to check the fit of the model, plot a nonparametric estimate of the cumulativehazard curve against a diagonal line through the origin, which is the theoretical cumulativehazard trajectory of the Exponential(1).

Examples

  fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")  cs <- coxsnell_flexsurvreg(fitg)    ## Model appears to fit well, with some small sample noise   surv <- survfit(Surv(cs$est, ovarian$fustat) ~ 1)  plot(surv, fun="cumhaz")  abline(0, 1, col="red")

Flexible parametric mixture models for times to competing events

Description

In a mixture model for competing events, an individual can experience one ofa set of different events. We specify a model for the probability that theywill experience each event before the others, and a model for the time tothe event conditionally on that event occurring first.

Usage

flexsurvmix(  formula,  data,  event,  dists,  pformula = NULL,  anc = NULL,  partial_events = NULL,  initp = NULL,  inits = NULL,  fixedpars = NULL,  dfns = NULL,  method = "direct",  em.control = NULL,  optim.control = NULL,  aux = NULL,  sr.control = survreg.control(),  integ.opts,  hess.control = NULL,  ...)

Arguments

formula

Survival model formula. The left hand side is aSurvobject specified as inflexsurvreg. This may define variouskinds of censoring, as described inSurv. Any covariates onthe right hand side of this formula will be placed on the locationparameter for every component-specific distribution. Covariates on otherparameters of the component-specific distributions may be supplied usingtheanc argument.

Alternatively,formula may be a list of formulae, with onecomponent for each alternative event. This may be used to specifydifferent covariates on the location parameter for different components.

A list of formulae may also be used to indicate that for particularindividuals, different events may be observed in different ways, withdifferent censoring mechanisms. Each list component specifies the dataand censoring scheme for that mixture component.

For example, suppose we are studying people admitted to hospital,and thecompeting states are death in hospital and discharge from hospital. Attime t we know that a particular individual is still alive, but we do notknow whether they are still in hospital, or have been discharged. In thiscase, if the individual were to die in hospital, their death time would beright censored at t. If the individual will be (or has been) dischargedbefore death, their discharge time is completely unknown, thusinterval-censored on (0,Inf). Therefore, we need to store different eventtime and status variables in the data for different alternative events.This is specified here as

formula = list("discharge" = Surv(t1di, t2di, type="interval2"), "death" = Surv(t1de, status_de))

where for this individual,(t1di, t2di) = (0, Inf) and(t1de, status_de) = (t, 0).

The "dot" notation commonly used to indicate "all remaining variables" in aformula is not supported inflexsurvmix.

data

Data frame containing variables mentioned informula,event andanc.

event

Variable in the data that specifies which of the alternativeevents is observed for which individual. If the individual's follow-up isright-censored, or if the event is otherwise unknown, this variable musthave the valueNA.

Ideally this should be a factor, since the mixture components can then beeasily identified in the results with a name instead of a number. If thisis not already a factor, it is coerced to one. Then the levels of thefactor define the required order for the components of the list argumentsdists,anc,inits anddfns. Alternatively, ifthe components of the list arguments are named according to the levels ofevent, then the components can be arranged in any order.

dists

Vector specifying the parametric distribution to use for eachcomponent. The same distributions are supported as inflexsurvreg.

pformula

Formula describing covariates to include on the componentmembership proabilities by multinomial logistic regression. The firstcomponent is treated as the baseline.

The "dot" notation commonly used to indicate "all remaining variables" in aformula is not supported.

anc

List of component-specific lists, of length equal to the numberof components. Each component-specific list is a list of formulaerepresenting covariate effects on parameters of the distribution.

If there are covariates for one component but not others, then a listcontaining one null formula on the location parameter should be suppliedfor the component with no covariates, e.glist(rate=~1) if thelocation parameter is calledrate.

Covariates on the location parameter may also be supplied here instead ofinformula. Supplying them inanc allows some componentsbut not others to have covariates on their location parameter. If a covariateon the location parameter was provided informula, and there are covariates on other parameters, then a null formula should be included for the location parameter inanc, e.glist(rate=~1)

partial_events

List specifying the factor levels ofeventwhich indicate knowledge that an individual will not experience particularevents, but may experience others. The names of the list indicate codesthat indicate partial knowledge for some individuals. The list componentis a vector, which must be a subset oflevels(event) defining theevents that a person with the corresponding event code may experience.

For example, suppose there are three alternative events called"disease1","disease2" and"disease3", and for someindividuals we know that they will not experience"disease2", butthey may experience the other two events. In that case we must create anew factor level, called, for example"disease1or3", and set thevalue ofevent to be"disease1or3" for those individuals.Then we use the"partial_events" argument to tellflexsurvmix what the potential events are for individuals with thisnew factor level.

partial_events = list("disease1or3" = c("disease1","disease3"))

initp

Initial values for component membership probabilities. Bydefault, these are assumed to be equal for each component.

inits

List of component-specific vectors. Each component-specificvector contains the initial values for the parameters of thecomponent-specific model, as would be supplied as theinits argument offlexsurvreg. By default, a heuristic is used to obtaininitial values, which depends on the parametric distribution being used,but is usually based on the empirical mean and/or variance of the survivaltimes.

fixedpars

Indexes of parameters to fix at their initial values andnot optimise. Arranged in the order: baseline mixing probabilities,covariates on mixing probabilities, time-to-event parameters by mixingcomponent. Within mixing components, time-to-event parameters are orderedin the same way as inflexsurvreg.

Iffixedpars=TRUE then all parameters will be fixed and thefunction simply calculates the log-likelihood at the initial values.

Not currently supported when using the EM algorithm.

dfns

List of lists of user-defined distribution functions, one foreach mixture component. Each list component is specified as thedfns argument offlexsurvreg.

method

Method for maximising the likelihood. Either"em" forthe EM algorithm, or"direct" for direct maximisation.

em.control

List of settings to control EM algorithm fitting. Theonly options currently available are

trace set to 1 to print the parameter estimates at each iterationof the EM algorithm

reltol convergence criterion. The algorithm stops if the loglikelihood changes by a relative amount less thanreltol. Thedefault is the same as inoptim, that is,sqrt(.Machine$double.eps).

var.method method to compute the covariance matrix."louis"for the method of Louis (1982), or"direct"for direct numericalcalculation of the Hessian of the log likelihood.

optim.p.control A list that is passed as thecontrolargument tooptim in the M step for the component membershipprobability parameters. The optimisation in the M step for thetime-to-event parameters can be controlled by theoptim.controlargument toflexsurvmix.

For example,em.control = list(trace=1, reltol=1e-12).

optim.control

List of options to pass as thecontrol argumenttooptim, which is used bymethod="direct" or in theM step for the time-to-event parameters inmethod="em". Bydefault, this usesfnscale=10000 andndeps=rep(1e-06,p)wherep is the number of parameters being estimated, unless theuser specifies these options explicitly.

aux

A named list of other arguments to pass to custom distributionfunctions. This is used, for example, byflexsurvspline tosupply the knot locations and modelling scale (e.g. hazard or odds). Thiscannot be used to fix parameters of a distribution — usefixedpars for that.

sr.control

For the models which usesurvreg to find themaximum likelihood estimates (Weibull, exponential, log-normal), this listis passed as thecontrol argument tosurvreg.

integ.opts

List of named arguments to pass tointegrate, if a custom density or hazard is provided withoutits cumulative version. For example,

integ.opts = list(rel.tol=1e-12)

hess.control

List of options to control covariance matrix computation. Available options are:

numeric. IfTRUE then numerical methods are usedto compute the Hessian for models where an analytic Hessian isavailable. These models include the Weibull (both versions),exponential, Gompertz and spline models with hazard or oddsscale. The default is to use the analytic Hessian for thesemodels. For all other models, numerical methods are always usedto compute the Hessian, whether or not this option is set.

tol.solve. The tolerance used forsolvewhen inverting the Hessian (default.Machine$double.eps)

tol.evalues The accepted tolerance for negativeeigenvalues in the covariance matrix (default1e-05).

The Hessian is positive definite, thus invertible, at the maximumlikelihood. If the Hessian computed after optimisation convergence can'tbe inverted, this is either because the converged result is not themaximum likelihood (e.g. it could be a "saddle point"), or because thenumerical methods used to obtain the Hessian were inaccurate. If yoususpect that the Hessian was computed wrongly enough that it is notinvertible, but not wrongly enough that the nearest valid inverse would bean inaccurate estimate of the covariance matrix, then these tolerancevalues can be modified (reducingtol.solve or increasingtol.evalues) to allow the inverse to be computed.

...

Optional arguments to the general-purpose optimisation routineoptim. For example, the BFGS optimisation algorithm is thedefault inflexsurvreg, but this can be changed, for exampletomethod="Nelder-Mead" which can be more robust to poor initialvalues. If the optimisation fails to converge, consider normalising theproblem using, for example,control=list(fnscale = 2500), forexample, replacing 2500 by a number of the order of magnitude of thelikelihood. If 'false' convergence is reported with anon-positive-definite Hessian, then consider tightening the tolerancecriteria for convergence. If the optimisation takes a long time,intermediate steps can be printed using thetrace argument of thecontrol list. Seeoptim for details.

Details

This differs from the more usual "competing risks" models, where we specify"cause-specific hazards" describing the time to each competing event. Thistime will not be observed for an individual if one of the competing eventshappens first. The event that happens first is defined by the minimum ofthe times to the alternative events.

Theflexsurvmix function fits a mixture model to data consisting of asingle time to an event for each individual, and an indicator for what typeof event occurs for that individual. The time to event may be observed orcensored, just as inflexsurvreg, and the type of event may beknown or unknown. In a typical application, where we follow up a set ofindividuals until they experience an event or a maximum follow-up time isreached, the event type is known if the time is observed, and the event typeis unknown when follow-up ends and the time is right-censored.

The model is fitted by maximum likelihood, either directly or by using anexpectation-maximisation (EM) algorithm, by wrappingflexsurvreg to compute the likelihood or to implement the Eand M steps.

Some worked examples are given in the package vignette about multi-statemodelling, which can be viewed by runningvignette("multistate", package="flexsurv").

Value

List of objects containing information about the fitted model. Theimportant one isres, a data frame containing the parameterestimates and associated information.

References

Jackson, C. H. and Tom, B. D. M. and Kirwan, P. D. and Mandal, S.and Seaman, S. R. and Kunzmann, K. and Presanis, A. M. and De Angelis, D. (2022)A comparison of two frameworks for multi-state modelling, applied to outcomesafter hospital admissions with COVID-19. Statistical Methods in Medical Research31(9) 1656-1674.

Larson, M. G., & Dinse, G. E. (1985). A mixture model for theregression analysis of competing risks data. Journal of the RoyalStatistical Society: Series C (Applied Statistics), 34(3), 201-211.

Lau, B., Cole, S. R., & Gange, S. J. (2009). Competing risk regressionmodels for epidemiologic data. American Journal of Epidemiology, 170(2),244-256.

Flexible parametric regression for time-to-event data

Description

Parametric modelling or regression for time-to-event data. Several built-indistributions are available, and users may supply their own.

Usage

flexsurvreg(  formula,  anc = NULL,  data,  weights,  bhazard,  rtrunc,  subset,  na.action,  dist,  inits,  fixedpars = NULL,  dfns = NULL,  aux = NULL,  cl = 0.95,  integ.opts = NULL,  sr.control = survreg.control(),  hessian = TRUE,  hess.control = NULL,  ...)

Arguments

formula

A formula expression in conventional R linear modellingsyntax. The response must be a survival object as returned by theSurv function, and any covariates are given on theright-hand side. For example,

Surv(time, dead) ~ age + sex

Surv objects oftype="right","counting","interval1" or"interval2" are supported, corresponding toright-censored, left-truncated or interval-censored observations.

If there are no covariates, specify1 on the right hand side, forexampleSurv(time, dead) ~ 1.

If the right hand side is specified as. all remaining variables areincluded as covariates. For example,Surv(time, dead) ~ .corresponds toSurv(time, dead) ~ age + sex ifdata containsthe variablestime,dead,age, andsex.

By default, covariates are placed on the “location” parameter of thedistribution, typically the "scale" or "rate" parameter, through a linearmodel, or a log-linear model if this parameter must be positive. Thisgives an accelerated failure time model or a proportional hazards model(seedist below) depending on how the distribution isparameterised.

Covariates can be placed on other (“ancillary”) parameters by using thename of the parameter as a “function” in the formula. For example, in aWeibull model, the following expresses the scale parameter in terms of ageand a treatment variabletreat, and the shape parameter in terms ofsex and treatment.

Surv(time, dead) ~ age + treat + shape(sex) + shape(treat)

However, if the names of the ancillary parameters clash with any realfunctions that might be used in formulae (such asI(), orfactor()), then those functions will not work in the formula. Asafer way to model covariates on ancillary parameters is through theanc argument toflexsurvreg.

survreg users should also note that the functionstrata() is ignored, so that any covariates surrounded bystrata() are applied to the location parameter. Likewise thefunctionfrailty() is not handled.

anc

An alternative and safer way to model covariates on ancillaryparameters, that is, parameters other than the main location parameter ofthe distribution. This is a named list of formulae, with the name of eachcomponent giving the parameter to be modelled. The model above can alsobe defined as:

Surv(time, dead) ~ age + treat, anc = list(shape = ~ sex + treat)

data

A data frame in which to find variables supplied informula. If not given, the variables should be in theworking environment.

weights

Optional numeric variable giving weights for eachindividual in the data. The fitted model is then defined bymaximising the weighted sum of the individual-specific log-likelihoods.

bhazard

Optional variable giving expected hazards for relativesurvival models. The model is described by Nelson et al. (2007).

bhazard should contain a vector of values for each person inthe data.

For people with observed events,bhazard refers to thehazard at the observed event time.
For people whose event time isleft-censored or interval-censored,bhazard should contain theprobability of dying by the end of the corresponding interval,conditionally on being alive at the start.
For people whose event timeis right-censored, the value ofbhazard is ignored and does notneed to be specified.

Ifbhazard is supplied, then the parameter estimates returned byflexsurvreg and the outputs returned bysummary.flexsurvregdescribe the parametric model for relative survival.

For relative survival models, the log-likelihood returned byflexsurvreg is a partiallog-likelihood, which omits a constant term defined by the sum of thecumulative hazards at the event or censoring time for each individual. Hence this constant must be added if a full likelihood is needed.

rtrunc

Optional variable giving individual-specific right-truncationtimes. Used for analysing data with "retrospective ascertainment". Forexample, suppose we want to estimate the distribution of the time fromonset of a disease to death, but have only observed cases known to havedied by the current date. In this case, times from onset to death forindividuals in the data are right-truncated by the current date minus theonset date. Predicted survival times for new cases can then be describedby an un-truncated version of the fitted distribution.

These models can suffer from weakly identifiable parameters andbadly-behaved likelihood functions, and it is advised to compareconvergence for different initial values by supplying differentinits arguments toflexsurvreg.

subset

Vector of integers or logicals specifying the subset of theobservations to be used in the fit.

na.action

a missing-data filter function, applied after any 'subset'argument has been used. Default isoptions()$na.action.

dist

Typically, one of the strings in the first column of thefollowing table, identifying a built-in distribution. This table alsoidentifies the location parameters, and whether covariates on theseparameters represent a proportional hazards (PH) or accelerated failuretime (AFT) model. In an accelerated failure time model, the covariatespeeds up or slows down the passage of time. So if the coefficient(presented on the log scale) is log(2), then doubling the covariate valuewould give half the expected survival time.

`"gengamma"`	Generalized gamma (stable)	mu	AFT
`"gengamma.orig"`	Generalized gamma (original)	scale	AFT
`"genf"`	Generalized F (stable)	mu	AFT
`"genf.orig"`	Generalized F (original)	mu	AFT
`"weibull"`	Weibull	scale	AFT
`"gamma"`	Gamma	rate	AFT
`"exp"`	Exponential	rate	PH
`"llogis"`	Log-logistic	scale	AFT
`"lnorm"`	Log-normal	meanlog	AFT
`"gompertz"`	Gompertz	rate	PH

"exponential" and"lognormal" can be used as aliases for"exp" and"lnorm", for compatibility withsurvreg.

Alternatively,dist can be a list specifying a custom distribution.See section “Custom distributions” below for how to construct this list.

Very flexible spline-based distributions can also be fitted withflexsurvspline.

The parameterisations of the built-in distributions used here are the sameas in their built-in distribution functions:dgengamma,dgengamma.orig,dgenf,dgenf.orig,dweibull,dgamma,dexp,dlnorm,dgompertz,respectively. The functions in base R are used where available,otherwise, they are provided in this package.

A package vignette "Distributions reference" lists the survivor functionsand covariate effect parameterisations used by each built-in distribution.

For the Weibull, exponential and log-normal distributions,flexsurvreg simply works by callingsurvreg toobtain the maximum likelihood estimates, then callingoptimto double-check convergence and obtain the covariance matrix forflexsurvreg's preferred parameterisation.

The Weibull parameterisation is different from that insurvreg, instead it is consistent withdweibull. The"scale" reported bysurvreg is equivalent to1/shape as definedbydweibull and henceflexsurvreg. The firstcoefficient(Intercept) reported bysurvregis equivalent tolog(scale) indweibull andflexsurvreg.

Similarly in the exponential distribution, the rate, rather than the mean,is modelled on covariates.

The objectflexsurv.dists lists the names of the built-indistributions, their parameters, location parameter, functions used totransform the parameter ranges to and from the real line, and thefunctions used to generate initial values of each parameter forestimation.

inits

An optional numeric vector giving initial values for eachunknown parameter. These are numbered in the order: baseline parameters(in the order they appear in the distribution function, e.g. shape beforescale in the Weibull), covariate effects on the location parameter,covariate effects on the remaining parameters. This is the same order asthe printed estimates in the fitted model.

If not specified, default initial values are chosen from a simple summaryof the survival or censoring times, for example the mean is often used toinitialize scale parameters. See the objectflexsurv.dists for theexact methods used. If the likelihood surface may be uneven, it isadvised to run the optimisation starting from various different initialvalues to ensure convergence to the true global maximum.

fixedpars

Vector of indices of parameters whose values will be fixedat their initial values during the optimisation. The indices are orderedas ininits. For example, in a stable generalized Gamma model withtwo covariates, to fix the third of three generalized gamma parameters(the shapeQ, see the help forGenGamma) and thesecond covariate, specifyfixedpars = c(3, 5)

dfns

An alternative way to define a custom survival distribution (seesection “Custom distributions” below). A list whose components mayinclude"d","p","h", or"H" containing theprobability density, cumulative distribution, hazard, or cumulative hazardfunctions of the distribution. For example,

list(d=dllogis, p=pllogis).

Ifdfns is used, a customdlist must still be provided, butdllogis andpllogis need not be visible from the globalenvironment. This is useful ifflexsurvreg is called within otherfunctions or environments where the distribution functions are alsodefined dynamically.

aux

cl

Width of symmetric confidence intervals for maximum likelihoodestimates, by default 0.95.

integ.opts

List of named arguments to pass tointegrate, if a custom density or hazard is provided withoutits cumulative version. For example,

integ.opts = list(rel.tol=1e-12)

sr.control

For the models which usesurvreg to find themaximum likelihood estimates (Weibull, exponential, log-normal), this listis passed as thecontrol argument tosurvreg.

hessian

Calculate the covariances and confidence intervals for theparameters. Defaults toTRUE.

hess.control

List of options to control covariance matrix computation. Available options are:

tol.solve. The tolerance used forsolvewhen inverting the Hessian (default.Machine$double.eps)

tol.evalues The accepted tolerance for negativeeigenvalues in the covariance matrix (default1e-05).

...

Details

Parameters are estimated by maximum likelihood using the algorithmsavailable in the standard Roptim function. Parametersdefined to be positive are estimated on the log scale. Confidence intervalsare estimated from the Hessian at the maximum, and transformed back to theoriginal scale of the parameters.

The usage offlexsurvreg is intended to be similar tosurvreg in thesurvival package.

Value

A list of class"flexsurvreg" containing information aboutthe fitted model. Components of interest to users may include:

call

A copy of the function call, for use in post-processing.

dlist

List defining the survival distribution used.

res

Matrix of maximum likelihood estimates and confidence limits,with parameters on their natural scales.

res.t

Matrix of maximumlikelihood estimates and confidence limits, with parameters alltransformed to the real line (using a log transform for all built-inmodels where this is necessary). Thecoef,vcovandconfint methods forflexsurvreg objects work onthis scale.

coefficients

The transformed maximum likelihoodestimates, as inres.t. Callingcoef() on aflexsurvreg object simply returns this component.

loglik

Log-likelihood. This will differ from Stata, where the sumof the log uncensored survival times is added to the log-likelihood insurvival models, to remove dependency on the time scale.

For relative survival models specified withbhazard, this is a partial log-likelihood which omits a constant term defined by the sum of thecumulative hazards over all event or censoring times.

logliki

Vector of individual contributions to the log-likelihood

AIC

Akaike's information criterion (-2*log likelihood + 2*number ofestimated parameters)

cov

Covariance matrix of the parameters, onthe real-line scale (e.g. log scale), which can be extracted withvcov.

data

Data used in the model fit. To extractthis in the standard R formats, use usemodel.frame.flexsurvreg ormodel.matrix.flexsurvreg.

Custom distributions

flexsurvreg is intended to beeasy to extend to handle new distributions. To define a new distributionfor use inflexsurvreg, construct a list with the followingelements:

"name"

A string naming the distribution. If thisis called"dist", for example, then there must be visible in theworking environment, at least, either

a) a function calledddist which defines the probability density,

b) a function calledhdist which defines the hazard.

Ideally, in case a) there should also be a function calledpdistwhich defines the probability distribution or cumulative density, and incase b) there should be a function calledHdist defining thecumulative hazard. If these additional functions are not provided,flexsurv attempts to automatically create them by numericallyintegrating the density or hazard function. However, model fitting willbe much slower, or may not even work at all, if the analytic versions ofthese functions are not available.

The functions must accept vector arguments (representing different times,or alternative values for each parameter) and return the results as avector. The functionVectorize may be helpful for doingthis: see the example below.These functions may be in an add-on package (see below for an example) ormay be user-written. If they are user-written they must be defined in theglobal environment, or supplied explicitly through thedfns argumenttoflexsurvreg. The latter may be useful if the functions arecreated dynamically (as in the source offlexsurvspline) and thusnot visible through R's scoping rules.

Arguments other than parameters must be named in the conventional way –for examplex for the first argument of the density function orhazard, as indnorm(x, ...) andq for the firstargument of the probability function. Density functions should also havean argumentlog, after the parameters, which whenTRUE,computes the log density, using a numerically stable additive formula ifpossible.

Additional functions with names beginning with"DLd" and"DLS" may be defined to calculate the derivatives of the log densityand log survival probability, with respect to the parameters of thedistribution. The parameters are expressed on the real line, for exampleafter log transformation if they are defined as positive. The firstargument must be namedt, representing the time, and the remainingarguments must be named as the parameters of the density function. Thefunction must return a matrix with rows corresponding to times, and columnscorresponding to the parameters of the distribution. The derivatives areused, if available, to speed up the model fitting withoptim.

"pars"

Vector of strings naming the parameters of thedistribution. These must be the same names as the arguments of the densityand probability functions.

"location"

Name of the main parameter governing the mean ofthe distribution. This is the default parameter on which covariates areplaced in theformula supplied toflexsurvreg.

"transforms"

List of Rfunctions which transform the range of values taken by each parameter ontothe real line. For example,c(log, log) for a distribution with twopositive parameters.

"inv.transforms"

List of R functions defining thecorresponding inverse transformations. Note these must be lists, even forsingle parameter distributions they should be supplied as, e.g.c(exp) orlist(exp).

"inits"

A function of theobserved survival timest (including right-censoring times, andusing the halfway point for interval-censored times) which returns a vectorof reasonable initial values for maximum likelihood estimation of eachparameter. For example,function(t){ c(1, mean(t)) } will alwaysinitialize the first of two parameters at 1, and the second (a scaleparameter, for instance) at the mean oft.

For example, suppose we want to use an extreme value survival distribution.This is available in the CRAN packageeha, which providesconventionally-defined density and probability functions calledeha::dEV andeha::pEV. See the Examples belowfor the custom list in this case, and the subsequent command to fit themodel.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

Jackson, C. (2016). flexsurv: A Platform for ParametricSurvival Modeling in R. Journal of Statistical Software, 70(8), 1-33.doi:10.18637/jss.v070.i08

Cox, C. (2008) The generalizedF distribution: An umbrella forparametric survival analysis. Statistics in Medicine 27:4301-4312.

Cox, C., Chu, H., Schneider, M. F. and Muñoz, A. (2007) Parametric survivalanalysis and taxonomy of hazard functions for the generalized gammadistribution. Statistics in Medicine 26:4252-4374

Jackson, C. H. and Sharples, L. D. and Thompson, S. G. (2010) Survivalmodels in health economic evaluations: balancing fit and parsimony toimprove prediction. International Journal of Biostatistics 6(1):Article 34.

Nelson, C. P., Lambert, P. C., Squire, I. B., & Jones, D. R. (2007).Flexible parametric models for relative survival, with application incoronary heart disease. Statistics in medicine, 26(30), 5486-5498.

Examples

## Compare generalized gamma fit with Weibullfitg <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="gengamma")fitgfitw <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="weibull")fitwplot(fitg)lines(fitw, col="blue", lwd.ci=1, lty.ci=1)## Identical AIC, probably not enough data in this simple example for a## very flexible model to be worthwhile.## Custom distribution## make "dEV" and "pEV" from eha package (if installed)##   available to the working environmentif (require("eha")) {custom.ev <- list(name="EV",                      pars=c("shape","scale"),                      location="scale",                      transforms=c(log, log),                      inv.transforms=c(exp, exp),                      inits=function(t){ c(1, median(t)) })fitev <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian,                    dist=custom.ev)fitevlines(fitev, col="purple", col.ci="purple")}## Custom distribution: supply the hazard function onlyhexp2 <- function(x, rate=1){ rate } # exponential distributionhexp2 <- Vectorize(hexp2)custom.exp2 <- list(name="exp2", pars=c("rate"), location="rate",                    transforms=c(log), inv.transforms=c(exp),                    inits=function(t)1/mean(t))flexsurvreg(Surv(futime, fustat) ~ 1, data = ovarian, dist=custom.exp2)flexsurvreg(Surv(futime, fustat) ~ 1, data = ovarian, dist="exp")## should give same answer

Flexible parametric models for right-truncated, uncensored data defined by times of initial and final events.

Description

This function estimates the distribution of the time between an initial and final event, in situations where individuals are only observed if they have experienced both events before a certain time, thus they are right-truncated at this time. The time of the initial event provides information about the time from initial to final event, given the truncated observation scheme, and initial events are assumed to occur with an exponential growth rate.

Usage

flexsurvrtrunc(  t,  tinit,  rtrunc,  tmax,  data = NULL,  method = "joint",  dist,  theta = NULL,  fixed.theta = TRUE,  inits = NULL,  fixedpars = NULL,  dfns = NULL,  integ.opts = NULL,  cl = 0.95,  optim_control = list())

Arguments

t

Vector of time differences between an initial and final event for a set of individuals.

tinit

Absolute time of the initial event for each individual.

rtrunc

Individual-specific right truncation points on the same scale ast, so that each individual's survival timet would not have been observed if it was greater than the corresponding element ofrtrunc. Only used inmethod="joint". Inmethod="final", the right-truncation is implicit.

tmax

Maximum possible time between initial and final events that could have been observed. This is only used inmethod="joint", and is ignored inmethod="final".

data

Data frame containingt,rtrunc andtinit.

method

If"joint" then the "joint-conditional" method is used. If"final" then the "conditional-on-final" method is used. The "conditional-on-initial" method can be implemented by usingflexsurvreg with artrunc argument. These methods are all described in Seaman et al. (2020).

dist

`"gengamma"`	Generalized gamma (stable)	mu	AFT
`"gengamma.orig"`	Generalized gamma (original)	scale	AFT
`"genf"`	Generalized F (stable)	mu	AFT
`"genf.orig"`	Generalized F (original)	mu	AFT
`"weibull"`	Weibull	scale	AFT
`"gamma"`	Gamma	rate	AFT
`"exp"`	Exponential	rate	PH
`"llogis"`	Log-logistic	scale	AFT
`"lnorm"`	Log-normal	meanlog	AFT
`"gompertz"`	Gompertz	rate	PH

"exponential" and"lognormal" can be used as aliases for"exp" and"lnorm", for compatibility withsurvreg.

Alternatively,dist can be a list specifying a custom distribution.See section “Custom distributions” below for how to construct this list.

Very flexible spline-based distributions can also be fitted withflexsurvspline.

A package vignette "Distributions reference" lists the survivor functionsand covariate effect parameterisations used by each built-in distribution.

Similarly in the exponential distribution, the rate, rather than the mean,is modelled on covariates.

theta

Initial value (or fixed value) for the exponential growth ratetheta. Defaults to 1.

fixed.theta

Shouldtheta be fixed at its initial value or estimated. This only applies tomethod="joint". Formethod="final",theta must be fixed.

inits

Initial values for the parameters of the parametric survival distributon. If not supplied, a heuristic is used. as is done inflexsurvreg.

fixedpars

Integer indices of the parameters of the survival distribution that should be fixed to their values supplied ininits. Same length asinits.

dfns

list(d=dllogis, p=pllogis).

integ.opts

List of named arguments to pass tointegrate, if a custom density or hazard is provided withoutits cumulative version. For example,

integ.opts = list(rel.tol=1e-12)

cl

Width of symmetric confidence intervals for maximum likelihoodestimates, by default 0.95.

optim_control

List to supply as thecontrol argument tooptim to control the likelihood maximisation.

Details

Covariates are not currently supported.

Note thatflexsurvreg, with anrtrunc argument, can fit models for a similar kind of data, but those models ignore the information provided by the time of the initial event.

A nonparametric estimator of survival under right-truncation is also provided insurvrtrunc. See Seaman et al. (2020) for a full comparison of the alternative models.

References

Seaman, S., Presanis, A. and Jackson, C. (2020) Estimating a Time-to-EventDistribution from Right-Truncated Data in an Epidemic: a Review of Methods

Examples

set.seed(1) ## simulate time to initial eventX <- rexp(1000, 0.2)## simulate time between initial and final eventT <- rgamma(1000, 2, 10) ## right-truncate to keep only those with final event time## before tmaxtmax <- 40obs <- X + T < tmax rtrunc <- tmax - Xdat <- data.frame(X, T, rtrunc)[obs,]flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="gamma", theta=0.2)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="gamma", theta=0.2, fixed.theta=FALSE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="gamma", theta=0.2, inits=c(1, 8))flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="gamma", theta=0.2, method="final")flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="gamma", fixed.theta=TRUE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="weibull", fixed.theta=TRUE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="lnorm", fixed.theta=TRUE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="gengamma", fixed.theta=TRUE)flexsurvrtrunc(t=T, rtrunc=rtrunc, tinit=X, tmax=40, data=dat,                dist="gompertz", fixed.theta=TRUE)

Flexible survival regression using the Royston/Parmar spline model.

Description

Flexible parametric modelling of time-to-event data using the spline modelof Royston and Parmar (2002).

Usage

flexsurvspline(  formula,  data,  weights,  bhazard,  rtrunc,  subset,  k = 0,  knots = NULL,  bknots = NULL,  scale = "hazard",  timescale = "log",  spline = "rp",  ...)

Arguments

formula

Surv(time, dead) ~ age + sex

specifies a model where the log cumulative hazard (by default, seescale) is a linear function of the covariatesage andsex.

If there are no covariates, specify1 on the right hand side, forexampleSurv(time, dead) ~ 1.

Time-varying covariate effects can be specified using the method describedinflexsurvreg for placing covariates on ancillaryparameters. The ancillary parameters here are namedgamma1,...,gammar wherer is the number of knotsk plusone (the “degrees of freedom” as defined by Royston and Parmar). So forthe default Weibull model, there is just one ancillary parametergamma1.

Therefore a model with one internal spline knot, where the equivalents ofthe Weibull shape and scale parameters, but not the higher-order termgamma2, vary with age and sex, can be specified as:

Surv(time, dead) ~ age + sex + gamma1(age) + gamma1(sex)

or alternatively (and more safely, seeflexsurvreg)

Surv(time, dead) ~ age + sex, anc=list(gamma1=~age + sex)

Surv objects oftype="right","counting","interval1" or"interval2" are supported, corresponding toright-censored, left-truncated or interval-censored observations.

data

A data frame in which to find variables supplied informula. If not given, the variables should be in the workingenvironment.

weights

Optional variable giving case weights.

bhazard

Optional variable giving expected hazards for relativesurvival models.

rtrunc

Optional variable giving individual right-truncation times (seeflexsurvreg). Note that these models can suffer from weakly identifiable parameters andbadly-behaved likelihood functions, and it is advised to compareconvergence for different initial values by supplying differentinits arguments toflexsurvspline.

subset

Vector of integers or logicals specifying the subset of theobservations to be used in the fit.

k

Number of knots in the spline. The defaultk=0 gives aWeibull, log-logistic or lognormal model, if"scale" is"hazard","odds" or"normal" respectively.kis equivalent todf-1 in the notation ofstpm for Stata. Theknots are then chosen as equally-spaced quantiles of the log uncensoredsurvival times, for example, at the median with one knot, or at the 33%and 67% quantiles of log time (or time, see"timescale") with twoknots. To override this default knot placement, specifyknotsinstead.

knots

Locations of knots on the axis of log time (or time, see"timescale"). If not specified, knot locations are chosen asdescribed ink above. Eitherk orknots must bespecified. If both are specified,knots overridesk.

bknots

Locations of boundary knots, on the axis of log time (ortime, see"timescale"). If not supplied, these are are chosen asthe minimum and maximum log death time.

scale

If"hazard", the log cumulative hazard is modelled as aspline function.

If"odds", the log cumulative odds is modelled as a spline function.

If"normal",-\Phi^{-1}(S(t)) is modelled as aspline function, where\Phi^{-1}() is the inverse normaldistribution functionqnorm.

timescale

If"log" (the default) the log cumulative hazard(or alternative) is modelled as a spline function of log time. If"identity", it is modelled as a spline function of time, howeverthis model would not satisfy the desirable property that the cumulative hazard(or alternative) should approach 0 at time zero.

spline

"rp" to use the natural cubic spline basisdescribed in Royston and Parmar.

"splines2ns" to use the alternative natural cubic splinebasis from thesplines2 package (Wang and Yan 2021),which may be better behaved due to the basis being orthogonal.

...

Any other arguments to be passed to or throughflexsurvreg, for example,anc,inits,fixedpars,weights,subset,na.action, and anyoptions to control optimisation. Seeflexsurvreg.

Details

This function works as a wrapper aroundflexsurvreg bydynamically constructing a custom distribution usingdsurvspline,psurvspline andunroll.function.

In the spline-based survival model of Royston and Parmar (2002), atransformationg(S(t,z)) of the survival function is modelled as anatural cubic spline function of log timex = \log(t)plus linear effects of covariatesz.

g(S(t,z)) = s(x, \bm{\gamma}) + \bm{\beta}^T \mathbf{z}

The proportional hazards model (scale="hazard") definesg(S(t,\mathbf{z})) = \log(-\log(S(t,\mathbf{z}))) =\log(H(t,\mathbf{z})), thelog cumulative hazard.

The proportional odds model (scale="odds") definesg(S(t,\mathbf{z})) =\log(S(t,\mathbf{z})^{-1} - 1), the logcumulative odds.

The probit model (scale="normal") definesg(S(t,\mathbf{z})) = -\Phi^{-1}(S(t,\mathbf{z})), where\Phi^{-1}() is the inverse normaldistribution functionqnorm.

With no knots, the spline reduces to a linear function, and these modelsare equivalent to Weibull, log-logistic and lognormal models respectively.

The spline coefficients\gamma_j: j=1, 2 \ldots, which are called the "ancillary parameters" above, may also bemodelled as linear functions of covariates\mathbf{z}, as

\gamma_j(\mathbf{z}) = \gamma_{j0} + \gamma_{j1}z_1 + \gamma_{j2}z_2+ ...

giving a model where the effects of covariates are arbitrarily flexiblefunctions of time: a non-proportional hazards or odds model.

Natural cubic splines are cubic splines constrained to be linear beyondboundary knotsk_{min},k_{max}. The spline function isdefined as

s(x,\boldsymbol{\gamma}) = \gamma_0 + \gamma_1 x + \gamma_2 v_1(x) + \ldots +

\gamma_{m+1} v_m(x)

wherev_j(x) is thejth basis function

v_j(x) = (x - k_j)^3_+ - \lambda_j(x - k_{min})^3_+ - (1 -

\lambda_j) (x - k_{max})^3_+

\lambda_j = \frac{k_{max} - k_j}{k_{max} - k_{min}}

and(x - a)_+ = max(0, x - a).

Value

A list of class"flexsurvreg" with the same elements asdescribed inflexsurvreg, and including extra componentsdescribing the spline model. See in particular:

k

Number of knots.

knots

Location of knots on the log timeaxis.

scale

Thescale of the model, hazard, odds or normal.

res

Matrix of maximum likelihood estimates and confidence limits.Spline coefficients are labelled"gamma...", and covariate effectsare labelled with the names of the covariates.

Coefficientsgamma1,gamma2,... here are the equivalent ofs0,s1,... in Statastreg, andgamma0 is the equivalentof thexb constant term. To reproduce results, use thenoorthog option in Stata, since no orthogonalisation is performed onthe spline basis here.

In the Weibull model, for example,gamma0,gamma1 are-shape*log(scale), shape respectively indweibull orflexsurvreg notation, or (-Intercept/scale,1/scale) insurvreg notation.

In the log-logistic model with shapea and scaleb (as ineha::dllogis from theeha package),1/b^a isequivalent toexp(gamma0), anda is equivalent togamma1.

In the log-normal model with log-scale meanmu and standarddeviationsigma,-mu/sigma is equivalent togamma0 and1/sigma is equivalent togamma1.

loglik

Themaximised log-likelihood. This will differ from Stata, where the sum ofthe log uncensored survival times is added to the log-likelihood insurvival models, to remove dependency on the time scale.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

References

Wang W, Yan J (2021). Shape-Restricted Regression Splines with RPackage splines2. Journal of Data Science, 19(3), 498-517.

Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modelingin R. Journal of Statistical Software, 70(8), 1-33.doi:10.18637/jss.v070.i08

Examples

## Best-fitting model to breast cancer data from Royston and Parmar (2002)## One internal knot (2 df) and cumulative odds scalespl <- flexsurvspline(Surv(recyrs, censrec) ~ group, data=bc, k=1, scale="odds")## Fitted survivalplot(spl, lwd=3, ci=FALSE)## Simple Weibull model fits much less wellsplw <- flexsurvspline(Surv(recyrs, censrec) ~ group, data=bc, k=0, scale="hazard")lines(splw, col="blue", ci=FALSE)## Alternative way of fitting the Weibull## Not run: splw2 <- flexsurvreg(Surv(recyrs, censrec) ~ group, data=bc, dist="weibull")## End(Not run)

Constructor for a mixture multi-state model based on flexsurvmix

Description

Constructor for a mixture multi-state model based on flexsurvmix

Usage

fmixmsm(...)

Arguments

...

Named arguments. Each argument should be a fitted model asreturned byflexsurvmix. The name of each argument namesthe starting state for that model.

Value

A list offlexsurvmix objects, with the followingattribute(s):

pathways A list of all potential pathways until absorption, formodels without cycles. For models with cycles this will have an elementhas_cycle=TRUE, plus the pathways discovered before the functionfound the cycle and gave up.

Construct a multi-state model from a set of parametric survival models

Description

Construct a multi-state model from a set of parametric survival models

Usage

fmsm(..., trans)

Arguments

...

Objects returned byflexsurvreg orflexsurvspline representing fitted survival models.

trans

A matrix of integers specifying which models correspond towhich transitions. Ther,s entry isi if theithargument specified in... is the model for the stater tostates transition. The entry should beNA if the transition is disallowed.

Value

A list containing the objects given in..., and withattributes"trans" and"statenames" defining the transitionstructure matrix and state names, and with list components named todescribe the transitions they correspond to. If any of the arguments in... are named, then these are used to define the transition names,otherwise default names are chosen based on the state names.

Evaluate baseline time-to-event distribution parameters given covariate values in a flexsurvmix model

Description

Evaluate baseline time-to-event distribution parameters given covariate values in a flexsurvmix model

Usage

get_basepars(x, newdata, event)

Arguments

x

Fitted model object

newdata

Data frame of alternative covariate values

event

Event

Glance at a flexsurv model object

Description

Glance accepts a model object and returns a tibble with exactly one row of model summaries.

Usage

## S3 method for class 'flexsurvreg'glance(x, ...)

Arguments

x

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

...

Not currently used.

Value

A one-rowtibble containing columns:

N Number of observations used in fitting
events Number of events
censored Number of censored events
trisk Total length of time-at-risk (i.e. follow-up)
df Degrees of freedom (i.e. number of estimated parameters)
logLik Log-likelihood
AIC Akaike's "An Information Criteria"
BIC Bayesian Information Criteria

Examples

fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")glance(fitg)

Hazard and cumulative hazard functions

Description

Hazard and cumulative hazard functions for distributions which are builtinto flexsurv, and whose distribution functions are in base R.

Usage

hexp(x, rate = 1, log = FALSE)Hexp(x, rate = 1, log = FALSE)hgamma(x, shape, rate = 1, log = FALSE)Hgamma(x, shape, rate = 1, log = FALSE)hlnorm(x, meanlog = 0, sdlog = 1, log = FALSE)Hlnorm(x, meanlog = 0, sdlog = 1, log = FALSE)hweibull(x, shape, scale = 1, log = FALSE)Hweibull(x, shape, scale = 1, log = FALSE)

Arguments

x

Vector of quantiles

rate

Rate parameter (exponential and gamma)

log

Compute log hazard or log cumulative hazard

shape

Shape parameter (Weibull and gamma)

meanlog

Mean on the log scale (log normal)

sdlog

Standard deviation on the log scale (log normal)

scale

Scale parameter (Weibull)

Details

For the exponential and the Weibull these are available analytically, andso are programmed here in numerically stable and efficient forms.

For the gamma and log-normal, these are simply computed as minus the log ofthe survivor function (cumulative hazard) or the ratio of the density andsurvivor function (hazard), so are not expected to be robust to extremevalues or quick to compute.

Value

Hazard (functions beginning 'h') or cumulative hazard (functionsbeginning 'H').

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Hazard ratio as a function of time from a parametric survival model

Description

Hazard ratio as a function of time from a parametric survival model

Usage

hr_flexsurvreg(  x,  newdata = NULL,  t = NULL,  start = 0,  ci = TRUE,  B = 1000,  cl = 0.95,  na.action = na.pass)

Arguments

x

Object returned byflexsurvreg orflexsurvspline.

newdata

A data frame with two rows, each specifying a set of covariate values.The hazard ratio is calculated as hazard(z2)/hazard(z1), where z1 is the first row ofnewdata and z2 is the second row.

newdata must be supplied unless the modelx includes just one covariate.With one covariate, a default is constructed, which defines the hazard ratio betweenthe second and first level of the factor (if the covariate is a factor), or betweena value of 1 and a value of 0 (if the covariate is numeric).

t

Times to calculate fitted values for. By default, these are thesorted unique observation (including censoring) times in the data - forleft-truncated datasets these are the "stop" times.

start

Optional left-truncation time or times. The returnedsurvival, hazard or cumulative hazard will be conditioned on survival up tothis time. Predicted times returned with"rmst","mean","median" or"quantile"will be times since time zero, not times since thestart time.

A vector of the same length ast can be supplied to allow differenttruncation times for each prediction time, though this doesn't make sensein the usual case where this function is used to calculate a predictedtrajectory for a single individual. This is why the defaultstarttime was changed for version 0.4 offlexsurv - this was previously avector of the start times observed in the data.

ci

Set toFALSE to omit confidence intervals.

B

Number of simulations from the normal asymptotic distribution ofthe estimates used to calculate confidence intervals or standard errors.Decrease for greaterspeed at the expense of accuracy, or setB=0 to turn off calculationof CIs and SEs.

cl

Width of symmetric confidence intervals, relative to 1.

na.action

Function determining what should be done with missing values innewdata. Ifna.pass (the default) then summaries ofNA are produced for missing covariate values. Ifna.omit, then missing values are dropped, the behaviour ofsummary.flexsurvreg beforeflexsurv version 1.2.

Value

A data frame with estimate and confidence limits for the hazard ratio, andone row for each of the times requested int.

Add fitted flexible survival curves to a plot

Description

Add fitted survival (or hazard or cumulative hazard) curves from aflexsurvreg model fit to an existing plot.

Usage

## S3 method for class 'flexsurvreg'lines(  x,  newdata = NULL,  X = NULL,  type = "survival",  t = NULL,  est = TRUE,  ci = NULL,  B = 1000,  cl = 0.95,  col = "red",  lty = 1,  lwd = 2,  col.ci = NULL,  lty.ci = 2,  lwd.ci = 1,  ...)

Arguments

x

Output fromflexsurvreg, representing a fittedsurvival model object.

newdata

Covariate values to produce fitted curves for, as a dataframe, as described inplot.flexsurvreg.

X

Covariate values to produce fitted curves for, as a matrix, asdescribed inplot.flexsurvreg.

type

"survival" for survival,"cumhaz" for cumulativehazard, or"hazard" for hazard, as inplot.flexsurvreg.

t

Vector of times to plot fitted values for.

est

Plot fitted curves (TRUE orFALSE.)

ci

Plot confidence intervals for fitted curves.

B

Number of simulations controlling accuracy of confidenceintervals, as used insummary.

cl

Width of confidence intervals, by default 0.95 for 95%intervals.

col

Colour of the fitted curve(s).

lty

Line type of the fitted curve(s).

lwd

Line width of the fitted curve(s).

col.ci

Colour of the confidence limits, defaulting to the same asfor the fitted curve.

lty.ci

Line type of the confidence limits.

lwd.ci

Line width of the confidence limits, defaulting to the sameas for the fitted curve.

...

Other arguments to be passed to the genericplotandlines functions.

Details

Equivalent toplot.flexsurvreg(...,add=TRUE).

Author(s)

C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk

Log likelihood from a flexsurvreg model

Description

Log likelihood from a flexsurvreg model

Usage

## S3 method for class 'flexsurvreg'logLik(object, ...)

Arguments

object

A fitted model object of classflexsurvreg, e.g. as returned byflexsurvreg orflexsurvspline.

...

Other arguments (currently unused).

Value

Log-likelihood (numeric) with additional attributesdf (degrees of freedom, or numberof parameters that were estimated), and number of observationsnobs (including observedevents and censored observations).

Mean and restricted mean survival functions

Description

Mean and restricted mean survival time functions for distributions which are builtinto flexsurv.

Usage

mean_exp(rate = 1)rmst_exp(t, rate = 1, start = 0)mean_gamma(shape, rate = 1)rmst_gamma(t, shape, rate = 1, start = 0)rmst_genf(t, mu, sigma, Q, P, start = 0)mean_genf(mu, sigma, Q, P)rmst_genf.orig(t, mu, sigma, s1, s2, start = 0)mean_genf.orig(mu, sigma, s1, s2)rmst_gengamma(t, mu = 0, sigma = 1, Q, start = 0)mean_gengamma(mu = 0, sigma = 1, Q)rmst_gengamma.orig(t, shape, scale = 1, k, start = 0)mean_gengamma.orig(shape, scale = 1, k)rmst_gompertz(t, shape, rate = 1, start = 0)mean_gompertz(shape, rate = 1)mean_llogis(shape = 1, scale = 1)rmst_llogis(t, shape = 1, scale = 1, start = 0)mean_lnorm(meanlog = 0, sdlog = 1)rmst_lnorm(t, meanlog = 0, sdlog = 1, start = 0)mean_weibull(shape, scale = 1)rmst_weibull(t, shape, scale = 1, start = 0)rmst_weibullPH(t, shape, scale = 1, start = 0)mean_weibullPH(shape, scale = 1)

Arguments

rate

Rate parameter (exponential and gamma)

t

Vector of times to which restricted mean survival time is evaluated

start

Optional left-truncation time or times. The returnedrestricted mean survival will be conditioned on survival up tothis time.

shape

Shape parameter (Weibull, gamma, log-logistic, generalized gamma [orig],generalized F [orig])

mu

Mean on the log scale (generalized gamma, generalized F)

sigma

Standard deviation on the log scale (generalized gamma, generalized F)

Q

Vector of first shape parameters (generalized gamma, generalized F)

P

Vector of second shape parameters (generalized F)

s1

Vector of first F shape parameters (generalized F [orig])

s2

vector of second F shape parameters (generalized F [orig])

scale

Scale parameter (Weibull, log-logistic, generalized gamma [orig],generalized F [orig])

k

vector of shape parameters (generalized gamma [orig]).

meanlog

Mean on the log scale (log normal)

sdlog

Standard deviation on the log scale (log normal)

Details

For the exponential, Weibull, log-logistic, lognormal, and gamma, mean survival isprovided analytically. Restricted mean survival for the exponential distributionis also provided analytically. Mean and restricted means for other distributionsare calculated via numeric integration.

Value

mean survival (functions beginning 'mean') or restricted mean survival(functions beginning 'rmst_').

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Mean times to events from a flexsurvmix model

Description

This returns the mean of each event-specific parametric time-to-eventdistribution in the mixture model, which is the mean time to eventconditionally on that event being the one that happens.

Usage

mean_flexsurvmix(x, newdata = NULL, B = NULL)

Arguments

x

Fitted model object returned fromflexsurvmix.

newdata

Data frame or list of covariate values. If omitted for amodel with covariates, a default is used, defined by all combinations offactors if the only covariates in the model are factors, or all covariatevalues of zero if there are any non-factor covariates in the model.

B

Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. IfB=NULL then intervals are not computed.

Value

Mean times to next event conditionally on each alternative event,given the specified covariate values.

Mean time to final state in a mixture multi-state model

Description

Calculate the mean time from the start of the process to a final (or"absorbing") state in a mixture multi-state model. Models with cycles arenot supported.

Usage

meanfinal_fmixmsm(x, newdata = NULL, final = FALSE, B = NULL)

Arguments

x

Object returned byfmixmsm, representing a multi-statemodel built from piecing together mixture models fitted byflexsurvmix.

newdata

final

IfTRUE then the mean time to the final state iscalculated for each final state, by taking a weighted average of the meantime to travel each pathway ending in that final state, weighted by theprobability of the pathway. IfFALSE (the default) then aseparate mean is calculated for each pathway.

B

Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. IfB=NULL then intervals are not computed.

Value

A data frame of mean times to absorption, by covariate values andpathway (or by final state)

Model frame from a flexsurvmix object

Description

Returns a list of data frames, with each component containing thedata that were used for the model fit for that mixture component.

Usage

## S3 method for class 'flexsurvmix'model.frame(formula, ...)

Arguments

formula

Fitted model object fromflexsurvmix.

...

Further arguments (currently unused).

Value

A list of data frames

Extract original data from`flexsurvreg` objects.

Description

Extract the data from a model fitted withflexsurvreg.

Usage

## S3 method for class 'flexsurvreg'model.frame(formula, ...)## S3 method for class 'flexsurvreg'model.matrix(object, par = NULL, ...)

Arguments

formula

A fitted model object, as returned byflexsurvreg.

...

Further arguments (not used).

object

A fitted model object, as returned byflexsurvreg.

par

String naming the parameter whose linear model matrix isdesired.

The default value ofpar=NULL returns a matrix consisting of themodel matrices for all models in the objectcbinded together, withthe intercepts excluded. This is not really a “model matrix” in theusual sense, however, the columns directly correspond to the covariatecoefficients in the matrix of estimates from the fitted model.

Value

model.frame returns a data frame with all the originalvariables used for the model fit.

model.matrix returns a design matrix for a part of the model thatincludes covariates. The required part is indicated by the"par"argument (see above).

Author(s)

C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk

Cumulative intensity function for parametric multi-state models

Description

Cumulative transition-specific intensity/hazard functions forfully-parametric multi-state or competing risks models, using apiecewise-constant approximation that will allow prediction using thefunctions in themstate package.

Usage

msfit.flexsurvreg(  object,  t,  newdata = NULL,  variance = TRUE,  tvar = "trans",  trans,  B = 1000)

Arguments

object

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

The model should have been fitted to data consisting of one row for eachobserved transition and additional rows corresponding to censored times tocompeting transitions. This is the "long" format, or counting processformat, as explained in theflexsurv vignette.

The model should contain a categorical covariate indicating the transition.Inflexsurv this variable can have any name, indicated here by thetvar argument. In the Cox models demonstrated bymstate it isusually included in model formulae asstrata(trans), but note thatthestrata function does not do anything inflexsurv. Theformula supplied toflexsurvreg should be precise about whichparameters are assumed to vary with the transition type.

Alternatively, if the parameters (including covariate effects) are assumedto be different between different transitions, then a list oftransition-specific models can be formed. This list has one component foreach permitted transition in the multi-state model. This is morecomputationally efficient, particularly for larger models and datasets.See the example below, and the vignette.

t

Vector of times. These do not need to be the same as the observedevent times, and since the model is parametric, they can be outside therange of the data. A grid of more frequent times will provide a betterapproximation to the cumulative hazard trajectory for prediction withprobtrans ormssample, at thecost of greater computational expense.

newdata

A data frame specifying the values of covariates in thefitted model, other than the transition number. This must be specified ifthere are other covariates. The variable names should be the same as thosein the fitted model formula. There must be either one value per covariate(the typical situation) orn values per covariate, a different onefor each of then allowed transitions.

variance

Calculate the variances and covariances of the transitioncumulative hazards (TRUE orFALSE). This is based onsimulation from the normal asymptotic distribution of the estimates, whichis computationally-expensive.

tvar

Name of the categorical variable in the model formula thatrepresents the transition number. This should have been defined as a factor,with factor levels that correspond to elements oftrans, conventionally a sequence ofintegers starting from 1. Not required ifx is a list oftransition-specific models.

trans

Matrix indicating allowed transitions in the multi-statemodel, in the format understood bymstate: a matrix of integers whoser,s entry isi if theith transition type (reading acrossrows) isr,s, and hasNAs on the diagonal and where ther,s transition is disallowed.

B

Number of simulations from the normal asymptotic distribution usedto calculate variances. Decrease for greater speed at the expense ofaccuracy.

Value

An object of class"msfit", in the same form as the objectsused in themstate package. Themsfit methodfrommstate returns the equivalent cumulative intensities for Coxregression models fitted withcoxph.

Author(s)

C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk

References

Liesbeth C. de Wreede, Marta Fiocco, Hein Putter (2011).mstate: An R Package for the Analysis of Competing Risks andMulti-State Models.Journal of Statistical Software, 38(7), 1-30.doi:10.18637/jss.v038.i07

Mandel, M. (2013). "Simulation based confidence intervals for functionswith complicated derivatives." The American Statistician 67(2):76-81

Examples

## 3 state illness-death model for bronchiolitis obliterans## Compare clock-reset / semi-Markov multi-state models## Simple exponential model (reduces to Markov)bexp <- flexsurvreg(Surv(years, status) ~ trans,                    data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))mexp <- msfit.flexsurvreg(bexp, t=seq(0,12,by=0.1),                          trans=tmat, tvar="trans", variance=FALSE)## Cox semi-parametric model within each transitionbcox <- coxph(Surv(years, status) ~ strata(trans), data=bosms3)if (require("mstate")){mcox <- mstate::msfit(bcox, trans=tmat)## Flexible parametric spline-based model bspl <- flexsurvspline(Surv(years, status) ~ trans + gamma1(trans),                       data=bosms3, k=3)mspl <- msfit.flexsurvreg(bspl, t=seq(0,12,by=0.1),                         trans=tmat, tvar="trans", variance=FALSE)## Compare fit: exponential model is OK but the spline is betterplot(mcox, lwd=1, xlim=c(0, 12), ylim=c(0,4))cols <- c("black","red","green")for (i in 1:3){    lines(mexp$Haz$time[mexp$Haz$trans==i], mexp$Haz$Haz[mexp$Haz$trans==i],             col=cols[i], lwd=2, lty=2)    lines(mspl$Haz$time[mspl$Haz$trans==i], mspl$Haz$Haz[mspl$Haz$trans==i],             col=cols[i], lwd=3)}legend("topright", lwd=c(1,2,3), lty=c(1,2,1),   c("Cox", "Exponential", "Flexible parametric"), bty="n")}## Fit a list of models, one for each transition## More computationally efficient, but only valid if parameters## are different between transitions.## Not run: bexp.list <- vector(3, mode="list")for (i in 1:3) {   bexp.list[[i]] <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==i),                                data=bosms3, dist="exp")}## The list of models can be passed to this and other functions,## as if it were a single multi-state model. msfit.flexsurvreg(bexp.list, t=seq(0,12,by=0.1), trans=tmat)## End(Not run)

Number of observations contributing to a fitted flexible survival model

Description

Number of observations contributing to a fitted flexible survival model

Usage

## S3 method for class 'flexsurvreg'nobs(object, cens = TRUE, ...)

Arguments

object

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

cens

Include censored observations in the number.TRUE by default.IfFALSE then the number of observed events is returned. SeeBIC.flexsurvreg for a discussion of the issueswith defining the sample size for censored data.

...

Further arguments passed to or from other methods. Currentlyunused.

Details

By default, this matches the behaviour of thenobs method forsurvreg objects, including both censored and uncensored observations.

If a weightedflexsurvreg analysis was done, then this function returns the sum of the weights.

Value

This returns themod$N component of the fittedmodel objectmod. Seeflexsurvreg,flexsurvspline for full documentation of all components.

Author(s)

C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk

Simulate from the asymptotic normal distribution of parameter estimates.

Description

Produce a matrix of alternative parameter estimates under samplinguncertainty, at covariate values supplied by the user. Used bysummary.flexsurvreg for obtaining confidence intervals aroundfunctions of parameters.

Usage

normboot.flexsurvreg(  x,  B,  newdata = NULL,  X = NULL,  transform = FALSE,  raw = FALSE,  tidy = FALSE,  rawsim = NULL)

Arguments

x

A fitted model fromflexsurvreg (orflexsurvspline).

B

Number of samples.

newdata

Data frame or list containing the covariate values toevaluate the parameters at. If there are covariates in the model, at leastone ofnewdata orX must be supplied, unlessraw=TRUE.

X

Alternative (less convenient) format for covariate values: amatrix with one row, with one column for each covariate or factor contrast.Formed from all the "model matrices", one for each named parameter of thedistribution, with intercepts excluded,cbinded together.

transform

TRUE if the results should be transformed to thereal-line scale, typically by log if the parameter is defined as positive.The defaultFALSE returns parameters on the natural scale.

raw

Return samples of the baseline parameters and the covariateeffects, rather than the default of adjusting the baseline parameters forcovariates.

tidy

IfFALSE (the default) thena list is returned. IfTRUE a data frame is returned, consistingof the list elementsrbinded together, with integer variableslabelling the covariate number and simulation replicate number.

rawsim

allows input of raw samples from a previous run ofnormboot.flexsurvreg. This is useful if runningnormboot.flexsurvreg multiple time on the same dataset but withcounterfactual contrasts, e.g. treat =0 vs. treat =1.Used instandsurv.flexsurvreg.

Value

Ifnewdata includes only one covariate combination, a matrixwill be returned withB rows, and one column for each namedparameter of the survival distribution.

If more than one covariate combination is requested (e.g.newdata isa data frame with more than one row), then a list of matrices will bereturned, one for each covariate combination.

Author(s)

C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk

References

Mandel, M. (2013). "Simulation based confidence intervals forfunctions with complicated derivatives." The American Statistician (inpress).

Examples

    fite <- flexsurvreg(Surv(futime, fustat) ~ age, data = ovarian, dist="exp")    normboot.flexsurvreg(fite, B=10, newdata=list(age=50))    normboot.flexsurvreg(fite, B=10, X=matrix(50,nrow=1))    normboot.flexsurvreg(fite, B=10, newdata=list(age=0))  ## closer to...    fite$res

Transition probabilities from a flexsurvmix model

Description

These quantities are variously known as transition probabilities, or stateoccupancy probabilities, or values of the "cumulative incidence" function,or values of the "subdistribution" function. They are the probabilities thatan individual has experienced an event of a particular kind by timet.

Usage

p_flexsurvmix(x, newdata = NULL, startname = "start", t = 1, B = NULL)

Arguments

x

Fitted model object returned fromflexsurvmix.

newdata

startname

Name of the state where individuals start. This considersthe model as a multi-state model where people start in this state, and maytransition to one of the competing events.

t

Vector of timest to calculate the probabilities oftransition by.

B

Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. IfB=NULL then intervals are not computed.

Details

Note that "cumulative incidence" is a misnomer, as "incidence" typicallymeans a hazard, and the quantities computed here are not cumulative hazards,but probabilities.

Value

A data frame with transition probabilities by time, covariate valueand destination state.

Transition-specific parameters in a flexible parametric multi-state model

Description

List of maximum likelihood estimates of transition-specific parameters ina flexible parametric multi-state model, at given covariate values.

Usage

pars.fmsm(x, trans, newdata = NULL, tvar = "trans")

Arguments

x

A multi-state model fitted withflexsurvreg. Seemsfit.flexsurvreg for the required form of the model and thedata.

x can also be a list offlexsurvreg models, with onecomponent for each permitted transition in the multi-state model, asillustrated inmsfit.flexsurvreg.

trans

Matrix indicating allowed transitions. Seemsfit.flexsurvreg.

newdata

A data frame specifying the values of covariates in thefitted model, other than the transition number. Seemsfit.flexsurvreg.

tvar

Variable in the data representing the transition type. Notrequired ifx is a list of models.

Value

A list with one component for each permitted transition. Each componenthas one element for each parameter of the parametric distribution that generatesthe corresponding event in the multi-state model.

Author(s)

Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.

Fitted densities for times to events in a flexsurvmix model

Description

This returns an estimate of the probability density for the time to eachcompeting event, at a vector of times supplied by the user.

Usage

pdf_flexsurvmix(x, newdata = NULL, t = NULL)

Arguments

x

Fitted model object returned fromflexsurvmix.

newdata

t

Vector of times at which to evaluate the probability density

Value

A data frame with each row giving the fitted densitydens fora combination of covariate values, time and competing event.

Probabilities of final states in a flexible parametric competing risks model

Description

This requires the model to be Markov, and is not valid for semi-Markovmodels, as it works by wrappingpmatrix.fs to calculate thetransition probability over a very large time. As it also works on afmsm object formed from transition-specific time-to-event models,it therefore only works on competing risks models, defined by just one startingstate with multiple destination states representing competing events. For these models, this function returns the probability governing whichcompeting event happens next. However this function simply wrapspmatrix.fs,so for other models,pmatrix.fs orpmatrix.simfs can be used with alarge forecast timet.

Usage

pfinal_fmsm(x, newdata = NULL, fromstate, maxt = 1e+05, B = 0, cores = NULL)

Arguments

x

Object returned byfmsm, representing a multi-statemodel formed from transition-specific time-to-event models fitted byflexsurvreg.

newdata

Data frame of covariate values, with one column percovariate, and one row per alternative value.

fromstate

State from which to calculate the transition probabilitystate. This should refer to the name of a row of the transition matrixattr(x,trans).

maxt

Large time to use for forecasting final state probabilities.The transition probability from zero to this time is used. NoteInf will not work. The default is100000.

B

Number of simulations to use to calculate 95% confidence intervalsbased on the asymptotic normal distribution of the basic parameterestimates. IfB=0 then no intervals are calculated.

cores

Number of processor cores to use. IfNULL (the default)then a single core is used.

Value

A data frame with one row per covariate value and destination state,giving the state in columnstate, and probability in columnval. Additional columnslower andupper for theconfidence limits are returned ifB=0.

Plots of fitted flexible survival models

Description

Plot fitted survival, cumulative hazard or hazard from a parametric modelagainst nonparametric estimates to diagnose goodness-of-fit. Alternativelyplot a user-defined function of the model parameters against time.

Usage

## S3 method for class 'flexsurvreg'plot(  x,  newdata = NULL,  X = NULL,  type = "survival",  fn = NULL,  t = NULL,  start = 0,  est = TRUE,  ci = NULL,  B = 1000,  cl = 0.95,  col.obs = "black",  lty.obs = 1,  lwd.obs = 1,  col = "red",  lty = 1,  lwd = 2,  col.ci = NULL,  lty.ci = 2,  lwd.ci = 1,  ylim = NULL,  add = FALSE,  ...)

Arguments

x

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

newdata

Data frame containing covariate values to produce fittedvalues for. Seesummary.flexsurvreg.

If there are only factor covariates in the model, then Kaplan-Meier (ornonparametric hazard...) curves are plotted for all distinct groups, andby default, fitted curves are also plotted for these groups. To plotKaplan-Meier and fitted curves for only a subset of groups, useplot(survfit()) followed bylines.flexsurvreg().

If there are any continuous covariates, then a single populationKaplan-Meier curve is drawn. By default, a single fitted curve is drawnwith the covariates set to their mean values in the data - for categoricalcovariates, the means of the 0/1 indicator variables are taken.

X

Alternative way to supply covariate values, as a model matrix.Seesummary.flexsurvreg.newdata is an easier way.

type

"survival" for survival, to be plotted againstKaplan-Meier estimates fromplot.survfit.

"cumhaz" for cumulative hazard, plotted against transformedKaplan-Meier estimates fromplot.survfit.

"hazard" for hazard, to be plotted against smooth nonparametricestimates frommuhaz. The nonparametric estimatestend to be unstable, and these plots are intended just to roughly indicatethe shape of the hazards through time. Themin.time andmax.time options tomuhaz may sometimes need tobe passed as arguments toplot.flexsurvreg to avoid an errorhere.

Ignored if"fn" is specified.

fn

Custom function of the parameters to summarise against time. Thefirst two arguments of the function must bet representing time, andstart representing left-truncation points, and any remainingarguments must be parameters of the distribution. It should return avector of the same length ast.

t

Vector of times to plot fitted values for, seesummary.flexsurvreg.

start

Left-truncation points, seesummary.flexsurvreg.

est

Plot fitted curves (TRUE orFALSE.)

ci

Plot confidence intervals for fitted curves. By default, this isTRUE if one observed/fitted curve is plotted, andFALSE ifmultiple curves are plotted.

B

Number of simulations controlling accuracy of confidenceintervals, as used insummary.Decrease for greater speed at the expense of accuracy, or setB=0 toturn off calculation of CIs.

cl

Width of confidence intervals, by default 0.95 for 95%intervals.

col.obs

Colour of the nonparametric curve.

lty.obs

Line type of the nonparametric curve.

lwd.obs

Line width of the nonparametric curve.

col

Colour of the fitted parametric curve(s).

lty

Line type of the fitted parametric curve(s).

lwd

Line width of the fitted parametric curve(s).

col.ci

Colour of the fitted confidence limits, defaulting to thesame as for the fitted curve.

lty.ci

Line type of the fitted confidence limits.

lwd.ci

Line width of the fitted confidence limits.

ylim

y-axis limits: vector of two elements.

add

IfTRUE, add lines to an existing plot, otherwise newaxes are drawn.

...

Other options to be passed toplot.survfit ormuhaz, for example, to control the smoothness of thenonparametric hazard estimates. Themin.time andmax.timeoptions tomuhaz may sometimes need to be changed fromthe defaults.

Note

Some standard plot arguments such as"xlim","xlab" may notwork. This function was designed as a quick check of model fit. Userswanting publication-quality graphs are advised to set up an empty plot withthe desired axes first (e.g. withplot(...,type="n",...)), then usesuitablelines functions to add lines.

If case weights were used to fit the model, these are used when producingnonparametric estimates of survival and cumulative hazard, but not forthe hazard estimates.

Author(s)

C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk

Plot standardized metrics from a fitted flexsurv model

Description

Plot standardized metrics such as the marginal survival, restricted mean survival and hazard, based on a fitted flexsurv model.

Usage

## S3 method for class 'standsurv'plot(x, contrast = FALSE, ci = FALSE, expected = FALSE, ...)

Arguments

x

A standsurv object returned bystandsurv

contrast

Should contrasts of standardized metrics be plotted. Defaultsto FALSE

ci

Should confidence intervals be plotted (if calculated instandsurv)?

expected

Should the marginal expected survival / hazard also be plotted? This can only be invoked ifrmap andratetable have been passed tostandsurv

...

Not currently used

Value

A ggplot showing the standardized metric calculated bystandsurv over time. Modification of the plot ispossible by adding further ggplot objects, see Examples.

Examples

## Use bc dataset, with an age variable appended## mean age is higher in those with smaller observed survival times newbc <- bcnewbc$age <- rnorm(dim(bc)[1], mean = 65-scale(newbc$recyrs, scale=FALSE), sd = 5)## Fit a Weibull flexsurv model with group and age as covariatesweib_age <- flexsurvreg(Surv(recyrs, censrec) ~ group+age, data=newbc,                       dist="weibull")## Calculate standardized survival and the difference in standardized survival## for the three levels of group across a grid of survival timesstandsurv_weib_age <- standsurv(weib_age,                                           at = list(list(group="Good"),                                                     list(group="Medium"),                                                     list(group="Poor")),                                           t=seq(0,7, length=100),                                           contrast = "difference", ci=TRUE,                                           boot = TRUE, B=10, seed=123)plot(standsurv_weib_age)plot(standsurv_weib_age) + ggplot2::theme_bw() + ggplot2::ylab("Survival") + ggplot2::xlab("Time (years)") +  ggplot2::guides(color=ggplot2::guide_legend(title="Prognosis"),                               fill=ggplot2::guide_legend(title="Prognosis"))plot(standsurv_weib_age, contrast=TRUE, ci=TRUE) +  ggplot2::ylab("Difference in survival")

Plot nonparametric estimates of survival from right-truncated data.

Description

plot.survrtrunc creates a new plot, whilelines.survrtrunc adds lines to an exising plot.

Usage

## S3 method for class 'survrtrunc'plot(x, ...)## S3 method for class 'survrtrunc'lines(x, ...)

Arguments

x

Object of class"survrtrunc" as returned bysurvrtrunc.

...

Other arguments to be passed toplot.survfit orlines.survfit.

Transition probability matrix from a fully-parametric, time-inhomogeneousMarkov multi-state model

Description

The transition probability matrix for time-inhomogeneous Markov multi-statemodels fitted to time-to-event data withflexsurvreg. Thishasr,s entry giving the probability that an individual is in states at timet, given they are in stater at time0.

Usage

pmatrix.fs(  x,  trans = NULL,  t = 1,  newdata = NULL,  condstates = NULL,  ci = FALSE,  tvar = "trans",  sing.inf = 1e+10,  B = 1000,  cl = 0.95,  tidy = FALSE,  ...)

Arguments

x

A model fitted withflexsurvreg. Seemsfit.flexsurvreg for the required form of the model and thedata. Additionally, this must be a Markov / clock-forward model, but canbe time-inhomogeneous. See the package vignette for further explanation.

x can also be a list of models, with one component for eachpermitted transition, as illustrated inmsfit.flexsurvreg.

trans

Matrix indicating allowed transitions. Seemsfit.flexsurvreg.

t

Time or vector of times to predict state occupancy probabilitiesfor.

newdata

A data frame specifying the values of covariates in thefitted model, other than the transition number. Seemsfit.flexsurvreg.

condstates

xInstead of the unconditional probability of being in states at timet given stater at time 0, return the probability conditional on being in a particular subset of states at timet. This subset is specified in thecondstates argument, as a vector of character labels or integers.

This is used, for example, in competing risks situations, e.g. if the competing states are death or recovery from a disease, and we want to compute the probability a patient has died, given they have died or recovered. If these are absorbing states, then ast increases, this converges to the case fatality ratio. To compute this, sett to a very large number,Inf will not work.

ci

Return a confidence interval calculated by simulating from theasymptotic normal distribution of the maximum likelihood estimates. Turnedoff by default, since this is computationally intensive. If turned on,users should increaseB until the results reach the desiredprecision.

tvar

Variable in the data representing the transition type. Notrequired ifx is a list of models.

sing.inf

If there is a singularity in the observed hazard, forexample a Weibull distribution withshape < 1 has infinite hazard att=0, then as a workaround, the hazard is assumed to be a largefinite number,sing.inf, at this time. The results should not besensitive to the exact value assumed, but users should make sure byadjusting this parameter in these cases.

B

Number of simulations from the normal asymptotic distribution usedto calculate variances. Decrease for greater speed at the expense ofaccuracy.

cl

Width of symmetric confidence intervals, relative to 1.

tidy

If TRUE then return the results as a tidy data frame

...

Arguments passed toode indeSolve.

Details

This is computed by solving the Kolmogorov forward differential equationnumerically, using the methods in thedeSolve package. Theequation is

\frac{dP(t)}{dt} = P(t) Q(t)

whereP(t) is the transition probability matrix for timet, andQ(t) is the transition hazard or intensity as a function oft.The initial condition isP(0) = I.

Note that the packagemsm has a similar methodpmatrix.msm.pmatrix.fs should give the same results aspmatrix.msm whenboth of these conditions hold:

the time-to-event distribution is exponential for alltransitions, thus theflexsurvreg model was fitted withdist="exp" and the model is time-homogeneous.
themsmmodel was fitted withexacttimes=TRUE, thus all the event times areknown, and there are no time-dependent covariates.

msm only allows exponential or piecewise-exponential time-to-eventdistributions, whileflexsurvreg allows more flexible models.msm however was designed in particular for panel data, where theprocess is observed only at arbitrary times, thus the times of transitionare unknown, which makes flexible models difficult.

This function is only valid for Markov ("clock-forward") multi-statemodels, though no warning or error is currently given if the model is notMarkov. Seepmatrix.simfs for the equivalent for semi-Markov("clock-reset") models.

Value

The transition probability matrix, ift is of length 1. Ift is longer, return a list of matrices, or a data frame iftidy is TRUE.

Ifci=TRUE, each element has attributes"lower" and"upper" giving matrices of the corresponding confidence limits.These are formatted for printing but may be extracted usingattr().

Author(s)

Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.

Examples

# BOS example in vignette, and in msfit.flexsurvregbexp <- flexsurvreg(Surv(Tstart, Tstop, status) ~ trans,                    data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))# more likely to be dead (state 3) as time moves on, or if start with# BOS (state 2)pmatrix.fs(bexp, t=c(5,10), trans=tmat)

Transition probability matrix from a fully-parametric, semi-Markovmulti-state model

Description

The transition probability matrix for semi-Markov multi-state models fittedto time-to-event data withflexsurvreg. This hasr,sentry giving the probability that an individual is in states at timet, given they are in stater at time0.

Usage

pmatrix.simfs(  x,  trans,  t = 1,  newdata = NULL,  ci = FALSE,  tvar = "trans",  tcovs = NULL,  M = 1e+05,  B = 1000,  cl = 0.95,  cores = NULL,  tidy = FALSE)

Arguments

x

A model fitted withflexsurvreg. Seemsfit.flexsurvreg for the required form of the model and thedata. Additionally this should be semi-Markov, so that the time variablerepresents the time since the last transition. In other words the responseshould be of the formSurv(time,status). See the package vignettefor further explanation.

x can also be a list offlexsurvreg models,with one component for each permitted transition, as illustratedinmsfit.flexsurvreg. This can be constructed byfmsm.

trans

Matrix indicating allowed transitions. Seemsfit.flexsurvreg. This is not required ifx is a list constructed byfmsm.

t

Time to predict state occupancy probabilities for. This can be a single number or a vector of different numbers.

newdata

A data frame specifying the values of covariates in thefitted model, other than the transition number. Seemsfit.flexsurvreg.

ci

Return a confidence interval calculated by simulating from theasymptotic normal distribution of the maximum likelihood estimates. Thisis turned off by default, since two levels of simulation are required. Ifturned on, users should adjustB and/orM until the resultsreach the desired precision. The simulation overM is generallyvectorised, therefore increasingB is usually more expensive thanincreasingM.

tvar

Variable in the data representing the transition type. Notrequired ifx is a list of models.

tcovs

Predictable time-dependent covariates such as age, seesim.fmsm.

M

Number of individuals to simulate in order to approximate thetransition probabilities. Users should adjust this to obtain the requiredprecision.

B

Number of simulations from the normal asymptotic distribution usedto calculate confidence limits. Decrease for greater speed at the expense ofaccuracy.

cl

Width of symmetric confidence intervals, relative to 1.

cores

Number of processor cores used when calculating confidence limits by repeated simulation. The default uses single-core processing.

tidy

IfTRUE then the results are returned as a tidy data frame with columns for the estimate and confidence limits, and rows per state transition and time interval.

Details

This is computed by simulating a large number of individualsM usingthe maximum likelihood estimates of the fitted model and the functionsim.fmsm. Therefore this requires a random sampling functionfor the parametric survival model to be available: see the "Details"section ofsim.fmsm. This will be available for all built-indistributions, though users may need to write this for custom models.

Note the random sampling method forflexsurvspline models iscurrently very inefficient, so that looping over theM individualswill be very slow.

pmatrix.fs is a more efficient method based on solving theKolmogorov forward equation numerically, which requires the multi-statemodel to be Markov. No error or warning is given if runningpmatrix.simfs with a Markov model, but this is still invalid.

Value

The transition probability matrix. Ifci=TRUE, there areattributes"lower" and"upper" giving matrices of thecorresponding confidence limits. These are formatted for printing but maybe extracted usingattr().

Author(s)

Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.

Examples

# BOS example in vignette, and in msfit.flexsurvregbexp <- flexsurvreg(Surv(years, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))# more likely to be dead (state 3) as time moves on, or if start with# BOS (state 2)pmatrix.simfs(bexp, t=5, trans=tmat)pmatrix.simfs(bexp, t=10, trans=tmat)# these results should converge to those in help(pmatrix.fs), as M# increases here and ODE solving precision increases there, since with# an exponential distribution, the semi-Markov model is the same as the# Markov model.

Probability of each pathway taken through a mixture multi-state model

Description

Probability of each pathway taken through a mixture multi-state model

Usage

ppath_fmixmsm(x, newdata = NULL, final = FALSE, B = NULL)

Arguments

x

Object returned byfmixmsm, representing a multi-statemodel built from piecing together mixture models fitted byflexsurvmix.

newdata

final

IfTRUE then the probabilities of pathways with the samefinal state are added together, to produce the probability of eachultimate outcome or absorbing state from the multi-state model.

B

Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. IfB=NULL then intervals are not computed.

Value

Data frame of pathway probabilities by covariate value and pathway.

Predictions from flexible survival models

Description

Predict outcomes from flexible survival models at the covariate valuesspecified innewdata.

Usage

## S3 method for class 'flexsurvreg'predict(  object,  newdata,  type = "response",  times,  start = 0,  conf.int = FALSE,  conf.level = 0.95,  se.fit = FALSE,  p = c(0.1, 0.9),  ...)

Arguments

object

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

newdata

Data frame containing covariate values at which to producefitted values. There must be a column for every covariate in the modelformula used to fitobject, and one row for every combination ofcovariate values at which to obtain the fitted predictions.

Ifnewdata is omitted, then the original data used to fit the modelare used, as extracted bymodel.frame(object). However this willcurrently not work if the model formula contains functions, e.g.~ factor(x). The names of the model frame must correspond tovariables in the original data.

type

Character vector for the type of predictions desired.

"response" for mean survival time (the default)."mean" isan acceptable synonym
"quantile" for quantiles of the survival distribution as specifiedbyp
"rmst" for restricted mean survival time
"survival" for survival probabilities
"cumhaz" for cumulative hazards
"hazard" for hazards
"link" for fitted values of the location parameter, analogous tothe linear predictor in generalized linear models (type = "lp" andtype = "linear" are acceptable synonyms). This is on the naturalscale of the parameter, not the log scale.

times

Vector of time horizons at which to compute fitted values.Only applies whentype is"survival","cumhaz","hazard", or"rmst". Will be silently ignored for all othertypes.

If not specified, predictions for"survival","cumhaz", and"hazard" will be made at each observed event time inmodel.frame(object).

For"rmst", whentimes is not specified predictions will bemade at the maximum observed event time from the data used to fitobject. Specifyingtimes = Inf is valid, and will returnmean survival (equal totype = "response").

start

Optional left-truncation time or times. The returnedsurvival, hazard, or cumulative hazard will be conditioned on survival upto this time.start must be length 1 or the same length astimes.Predicted times returned withtype"rmst" or"quantile"will be times since time zero, not times since thestart time.

conf.int

Logical. Should confidence intervals be returned?Default isFALSE.

conf.level

Width of symmetric confidence intervals, relative to 1.

se.fit

Logical. Should standard errors of fitted values be returned?Default isFALSE.

p

Vector of quantiles at which to return fitted values whentype = "quantile". Default isc(0.1, 0.9).

...

Not currently used.

Value

Atibble with same number of rows asnewdata and in the same order. If multiple predictions arerequested, atibble containing a single list-columnof data frames.

For the list-column of data frames - the dimensions of each data framewill be identical. Rows are added for each value oftimes orp requested.

This function is a wrapper aroundsummary.flexsurvreg,designed to helpflexsurv to integrate with the "tidymodels"ecosystem, in particular thecensored package.summary.flexsurvreg returns the same results but in a moreconventional format.

Author(s)

Matthew T. Warkentin (https://github.com/mattwarkentin)

Examples

fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")## Simplest prediction: mean or median, for covariates defined by original datasetpredict(fitg)predict(fitg, type = "quantile", p = 0.5)## Simple prediction for user-defined covariate valuespredict(fitg, newdata = data.frame(age = c(40, 50, 60)))predict(fitg, type = "quantile", p = 0.5, newdata = data.frame(age = c(40,50,60)))## Predict multiple quantiles and unnestrequire(tidyr)pr <- predict(fitg, type = "survival", times = c(600, 800))tidyr::unnest(pr, .pred)

Probabilities of competing events from a flexsurvmix model

Description

Probabilities of competing events from a flexsurvmix model

Usage

probs_flexsurvmix(x, newdata = NULL, B = NULL)

Arguments

x

Fitted model object returned fromflexsurvmix.

newdata

B

Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. IfB=NULL then intervals are not computed.

Value

A data frame containing the probability that each of the competingevents will occur next, by event and by any covariate values specified innewdata.

Quantiles of the distribution of the time until reaching a final state in amixture multi-state model

Description

Calculate the quantiles of the time from the start of the process to eachpossible final (or "absorbing") state in a mixture multi-state model.Models with cycles are not supported.

Usage

qfinal_fmixmsm(  x,  newdata = NULL,  final = FALSE,  B = NULL,  n = 10000,  probs = c(0.025, 0.5, 0.975))

Arguments

x

Object returned byfmixmsm, representing a multi-statemodel built from piecing together mixture models fitted byflexsurvmix.

newdata

final

B

Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. IfB=NULL then intervals are not computed.

n

Number of individual-level simulations to use to characterise thetime-to-event distributions

probs

Quantiles to calculate, by default,c(0.025, 0.5, 0.975)

Value

Data frame of quantiles of the time to final state by pathway andcovariate value, or by final state and covariate value.

Generic function to find quantiles of a distribution

Description

Generic function to find the quantiles of a distribution, given theequivalent probability distribution function.

Usage

qgeneric(pdist, p, matargs = NULL, scalarargs = NULL, ...)

Arguments

pdist

Probability distribution function, for example,pnorm for the normal distribution, which must be defined inthe current workspace. This should accept and return vectorised parametersand values. It should also return the correct values for the entire realline, for example a positive distribution should havepdist(x)==0forx<0.

p

Vector of probabilities to find the quantiles for.

matargs

Character vector giving the elements of... whichrepresent vector parameters of the distribution. Empty by default. Whenvectorised, these will become matrices. This is used for the argumentsgamma andknots inqsurvspline.

scalarargs

Character vector naming scalar arguments of the distribution function that cannot be vectorised. This is used for the argumentsscale andtimescale inqsurvspline.

...

The remaining arguments define parameters of the distributionpdist. These MUST be named explicitly.

This may also contain the standard argumentslog.p (logical; defaultFALSE, ifTRUE, probabilities p are given as log(p)), andlower.tail (logical; ifTRUE (default), probabilities are P[X<= x] otherwise, P[X > x].).

If the distribution is bounded above or below, then this should containargumentslbound andubound respectively, and these will bereturned ifp is 0 or 1 respectively. Defaults to-Inf andInf respectively.

Details

This function is used by default for custom distributions for which aquantile function is not provided.

It works by finding the root of the equationh(q) = pdist(q) - p = 0.Starting from the interval(-1, 1), the interval width is expanded by50% untilh() is of opposite sign at either end. The root is thenfound usinguniroot.

This assumes a suitably smooth, continuous distribution.

Value

Vector of quantiles of the distribution atp.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Examples

qnorm(c(0.025, 0.975), 0, 1)qgeneric(pnorm, c(0.025, 0.975), mean=0, sd=1) # must name the arguments

Quantiles of time-to-event distributions in a flexsurvmix model

Description

This returns the quantiles of each event-specific parametric time-to-eventdistribution in the mixture model, which describes the time to the eventconditionally on that event being the one that happens.

Usage

quantile_flexsurvmix(x, newdata = NULL, B = NULL, probs = c(0.025, 0.5, 0.975))

Arguments

x

Fitted model object returned fromflexsurvmix.

newdata

B

Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. IfB=NULL then intervals are not computed.

probs

Vector of alternative quantiles, by defaultc(0.025, 0.95, 0.975)giving the median and a 95% interval.

Objects exported from other packages

Description

These objects are imported from other packages. Follow the linksbelow to see their documentation.

generics: augment,glance,tidy

Calculate residuals for flexible survival models

Description

Calculates residuals forflexsurvreg orflexsurvspline model fits.

Usage

## S3 method for class 'flexsurvreg'residuals(object, type = "response", ...)

Arguments

object

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

type

Character string for the type of residual desired. Currently only"response" and"coxsnell" are supported. More residual types may become available in future versions.

...

Not currently used.

Details

Residuals oftype = "response" are calculated as the naive difference between the observed survival and the covariate-specific predicted mean survival frompredict.flexsurvreg, ignoring whether the event time is observed or censored.

type="coxsnell" returns the Cox-Snell residual, defined as the estimated cumulative hazard at each data point. To check the fit of theA more fully featured utility for this is provided in the functioncoxsnell_flexsurvreg.

Value

Numeric vector with the same length asnobs(object).

Examples

fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")residuals(fitg, type="response")

Restricted mean times to events from a flexsurvmix model

Description

This returns the restricted mean of each event-specific parametric time-to-eventdistribution in the mixture model, which is the mean time to eventconditionally on that event being the one that happens, and conditionallyon the event time being less than some time horizontot.

Usage

rmst_flexsurvmix(x, newdata = NULL, tot = Inf, B = NULL)

Arguments

x

Fitted model object returned fromflexsurvmix.

newdata

tot

Time horizon to compute the restricted mean until.

B

Number of simulations to use to compute 95% confidence intervals,based on the asymptotic multivariate normal distribution of the basicparameter estimates. IfB=NULL then intervals are not computed.

Value

Restricted mean times to next event conditionally on each alternative event,given the specified covariate values.

Generic function to find restricted mean survival time for some distribution

Description

Generic function to find the restricted mean of a distribution, given theequivalent probability distribution function, using numeric integration.

Usage

rmst_generic(pdist, t, start = 0, matargs = NULL, scalarargs = NULL, ...)

Arguments

pdist

t

Vector of times at which rmst is evaluated

start

Optional left-truncation time or times. The returnedrestricted mean survival will be conditioned on survival up tothis time.

matargs

scalarargs

Character vector naming scalar arguments of the distribution function that cannot be vectorised. This is used, for example, for the argumentsscale andtimescale inpsurvspline.

...

The remaining arguments define parameters of the distributionpdist. These MUST be named explicitly.

Details

This function is used by default for custom distributions for which anrmst function is not provided.

This assumes a suitably smooth, continuous distribution.

Value

Vector of restricted mean survival times of the distribution atp.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Examples

rmst_lnorm(500, start=250, meanlog=7.4225, sdlog = 1.1138)rmst_generic(plnorm, 500, start=250, meanlog=7.4225, sdlog = 1.1138)# must name the arguments

Simulate paths through a fully parametric semi-Markov multi-state model

Description

Simulate changes of state and transition times from a semi-Markovmulti-state model fitted usingflexsurvreg.

Usage

sim.fmsm(  x,  trans = NULL,  t,  newdata = NULL,  start = 1,  M = 10,  tvar = "trans",  tcovs = NULL,  tidy = FALSE)

Arguments

x

A model fitted withflexsurvreg. Seemsfit.flexsurvreg for the required form of the model and thedata.

Alternativelyx can be a list of fittedflexsurvregmodel objects. Theith element of this list is the modelcorresponding to theith transition intrans. This is a moreefficient way to fit a multi-state model, but only valid if the parametersare different between different transitions.

trans

Matrix indicating allowed transitions. Seemsfit.flexsurvreg.

t

Time, or vector of times for each of theM individuals, tosimulate trajectories until.

newdata

A data frame specifying the values of covariates in thefitted model, other than the transition number. Seemsfit.flexsurvreg.

start

Starting state, or vector of starting states for each of theM individuals.

M

Number of individual trajectories to simulate.

tvar

Variable in the data representing the transition type. Notrequired ifx is a list of models.

tcovs

Names of "predictable" time-dependent covariates innewdata, i.e. those whose values change at the same rate as time.Age is a typical example. During simulation, their values will be updatedafter each transition time, by adding the current time to the valuesupplied innewdata. This assumes the covariate is measured in thesame unit as time.tcovs is supplied as a character vector.

tidy

IfTRUE then the simulated data are returned as a tidy data frame with one row per simulated transition. Seesimfs_bytrans for a function to rearrange the data into this format if it was simulated in non-tidy format. Currently this includes only event times, and excludes any times of censoring that are reported whentidy=FALSE.

Details

sim.fmsm relies on the presence of a function to sample randomnumbers from the parametric survival distribution used in the fitted modelx, for examplerweibull for Weibull models. Ifx was fitted using a custom distribution, calleddist say,then there must be a function called (something like)rdist eitherin the working environment, or supplied through thedfns argument toflexsurvreg. This must be in the same format as standard Rfunctions such asrweibull, with first argumentn, andremaining arguments giving the parameters of the distribution. It must bevectorised with respect to the parameter arguments.

This function is only valid for semi-Markov ("clock-reset") models, thoughno warning or error is currently given if the model is not of this type. Anequivalent for time-inhomogeneous Markov ("clock-forward") models hascurrently not been implemented.

Value

Iftidy=TRUE, a data frame with one row for each simulated transition, giving the individual IDid, start statestart, end stateend, transition labeltrans, time of the transition since the start of the process (time), and time since the previous transition (delay).

Iftidy=FALSE, a list of two matrices namedst andt. The rows ofeach matrix represent simulated individuals. The columns oftcontain the times when the individual changes state, to the correspondingstates inst.

The first columns will always contain the starting states and the startingtimes. The last column oft represents either the time when theindividual moves to an absorbing state, or right-censoring in a transientstate at the time given in thet argument tosim.fmsm.

Author(s)

Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.

Examples

bexp <- flexsurvreg(Surv(years, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))sim.fmsm(bexp, M=10, t=5, trans=tmat)

Simulate and summarise final outcomes from a flexible parametric multi-statemodel

Description

Estimates the probability of each final outcome ("absorbing" state), and themean and quantiles of the time to that outcome for people who experience it,by simulating a large sample of individuals from the model. This can be usedfor both Markov and semi-Markov models.

Usage

simfinal_fmsm(  x,  newdata = NULL,  probs = c(0.025, 0.5, 0.975),  t = 1000,  M = 1e+05,  B = 0,  cores = NULL)

Arguments

x

Object returned byfmsm, representing a multi-statemodel formed from transition-specific time-to-event models fitted byflexsurvreg.

newdata

Data frame of covariate values, with one column percovariate, and one row per alternative value.

probs

Quantiles to calculate, by default,c(0.025, 0.5, 0.975)for a median and 95% interval.

t

Maximum time to simulate to, passed tosim.fmsm, sothat the summaries are taken from the subset of individuals in thesimulated data who are in the absorbing state at this time.

M

Number of individuals to simulate.

B

Number of simulations to use to calculate 95% confidence intervalsbased on the asymptotic normal distribution of the basic parameterestimates. IfB=0 then no intervals are calculated.

cores

Number of processor cores to use. IfNULL (the default)then a single core is used.

Details

For a competing risks model, i.e. a model defined by just one starting stateand multiple destination states representing competing events, this returnsthe probability governing the next event that happens, and the distribution of the time to each event conditionally on that event happening.

Value

A tidy data frame with rows for each combination of covariate valuesand quantity of interest. The quantity of interest is identified in thecolumnquantity, and the value of the quantity is inval,with additional columnslower andupper giving 95%confidence intervals for the quantity, ifB>0.

Reformat simulated multi-state data with one row per simulated transition

Description

Reformat simulated multi-state data with one row per simulated transition

Usage

simfs_bytrans(simfs)

Arguments

simfs

Output fromsim.fmsm representing simulated histories from a multi-state model.

Value

Data frame with four columns giving transition start state, transition end state, transition name and the time taken by the transition.

Simulate times to competing events from a mixture multi-state model

Description

Simulate times to competing events from a mixture multi-state model

Usage

simt_flexsurvmix(x, newdata = NULL, n)

Arguments

x

Fitted model object returned fromflexsurvmix.

newdata

n

Number of simulations

Value

Data frame withn*m rows and a column for each competingevent, wherem is the number of alternative covariate values, thatis the number of rows ofnewdata. The simulated time representsthe time to that event conditionally on that event being the one thatoccurs. This function doesn't simulate which event occurs.

Simulate censored time-to-event data from a fitted flexsurvreg model

Description

Simulate censored time-to-event data from a fitted flexsurvreg model

Usage

## S3 method for class 'flexsurvreg'simulate(  object,  nsim = 1,  seed = NULL,  newdata = NULL,  start = NULL,  censtime = NULL,  tidy = FALSE,  ...)

Arguments

object

Object returned byflexsurvreg.

nsim

Number of simulations per row innewdata.

seed

Random number seed. This is returned with the result of thisfunction, as described insimulate for thelm method.

newdata

Data frame defining alternative sets of covariate values to simulate with.If omitted, this defaults to the data originally used to fit the model.

start

censtime

A right-censoring time, or vector of times matching the rowsofnewdata. IfNULL (the default) then uncensored times to eventsare simulated.

tidy

IfTRUE then a "tidy" or "long"-format data frame isreturned, with rows defined by combinations of covariates and simulationreplicates. The simulation replicate is indicated in the column namedi.

IfFALSE, then a data frame is returned with one row per set ofcovariate values, and different columns for different simulationreplicates. This is the traditional format for 'simulate' methods in baseR.

In either case, the simulated time and indicator for whether the time isan event time (rather than a time of right-censoring) are returned indifferent columns.

...

Other arguments (not currently used).

Value

A data frame, with format determined by whethertidy was specified.

Examples

fit <- flexsurvreg(formula = Surv(futime, fustat) ~ rx, data = ovarian, dist="weibull")fit2 <- flexsurvspline(formula = Surv(futime, fustat) ~ rx, data = ovarian, k=3)nd = data.frame(rx=1:2)simulate(fit, seed=1002, newdata=nd)simulate(fit, seed=1002, newdata=nd, start=500)simulate(fit2, nsim=3, seed=1002, newdata=nd)simulate(fit2, nsim=3, seed=1002, newdata=nd, start=c(500,1000))

Marginal survival and hazards of fitted flexsurvreg models

Description

Returns a tidy data.frame of marginal survival probabilities, or hazards, restricted mean survival, or quantiles of the marginal survival functionat user-defined time points and covariate patterns.Standardization is performed over any undefined covariates in the model. The user provides the data to standardize over. Contrasts can be calculated resulting in estimates of the average treatment effect or the average treatment effect in the treated if a treated subset of the data are supplied.

Usage

standsurv(  object,  newdata = NULL,  at = list(list()),  atreference = 1,  type = "survival",  t = NULL,  ci = FALSE,  se = FALSE,  boot = FALSE,  B = NULL,  cl = 0.95,  trans = "log",  contrast = NULL,  trans.contrast = NULL,  seed = NULL,  rmap,  ratetable,  scale.ratetable = 365.25,  n.gauss.quad = 100,  quantiles = 0.5,  interval = c(1e-08, 500))

Arguments

object

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

newdata

Data frame containing covariate values to produce marginalvalues for. If not specified then the fitted model data.frame is used.There must be a column for every covariate in the model formulafor which the user wishes to standardize over. These are in the same formatas the original data, with factors as a single variable, not 0/1 contrasts.Any covariates that are to be fixed should be specified inat. There should be one row for every combination of covariates in which to standardize over. If newdata contains a variable named '(weights)' then a weighted mean will be used to create the standardized estimates. This is thedefault behaviour if the fitted model contains case weights, which are stored in the fitted model data.frame.

at

A list of scenarios in which specific covariates are fixed to certain values. Each element ofat must itself be a list. For example,for a covariategroup with levels "Good", "Medium" and "Poor", the standardized survival plots for each group averaging over all other covariates is specified usingat=list(list(group="Good"), list(group="Medium"), list(group="Poor")).

atreference

The reference scenario for making contrasts. Default is 1(i.e. the first element ofat).

type

"survival" for marginal survival probabilities. In a relative survival framework this returns the marginal all-cause survival (see details).

"hazard" for the hazard of the marginal survival probability. In a relative survival framework this returns the marginal all-cause hazard (see details).

"rmst" for standardized restricted mean survival.

"relsurvival" for marginal relative survival (can only be specifiedif a relative survival model has been fitted in flexsurv).

"excesshazard" for marginal excess hazards (can only be specifiedif a relative survival model has been fitted in flexsurv).

"quantile" for quantiles of the marginal all-cause survival distribution. Thequantiles option also needs to be provided.

t

Times to calculate marginal values at.

ci

Should confidence intervals be calculated? Defaults to FALSE

se

Should standard errors be calculated? Defaults to FALSE

boot

Should bootstrapping be used to calculate standard error and confidence intervals? Defaults to FALSE, in which case the delta method is used

B

Number of bootstrap simulations from the normal asymptotic distribution of the estimates used to calculate confidence intervals or standard errors. Decrease for greater speed at the expense of accuracy. Only specify ifboot = TRUE

cl

Width of symmetric confidence intervals, relative to 1.

trans

Transformation to apply when calculating standard errors via thedelta method to obtain confidence intervals. The default transformation is "log". Other possible names are "none", "loglog", "logit".

contrast

Contrasts between standardized measures defined byatscenarios. Options are"difference" and"ratio". There will ben-1 new columns created where n is the number ofat scenarios. Defaultis NULL (i.e. no contrasts are calculated).

trans.contrast

Transformation to apply when calculating standard errorsfor contrasts via the delta method to obtain confidence intervals. The defaulttransformation is "none" for differences in survival, hazard, quantiles, or RMST, and "log" for ratios of survival, hazard, quantiles or RMST.

seed

The random seed to use (for bootstrapping confidence intervals)

rmap

An list that maps data set names to expected ratetable names. This must be specified if all-cause survival and hazards are required afterfitting a relative survival model. This can also be specified if expectedrates are required for plotting purposes. See the details section below.

ratetable

A table of expected event rates (seeratetable)

scale.ratetable

Transformation from the time scale of the fitted flexsurv model to the time scale inratetable. For example, if the analysis time of the fitted model is in years and the ratetable is in units/day then we should usescale.ratetable = 365.25. This is the default as often the ratetable will be in units/day (see example).

n.gauss.quad

Number of Gaussian quadrature points used for integrating the all-cause survival function when calculating RMST in a relative survival framework (default = 100)

quantiles

Iftype="quantile", this specifies the quantiles of the survival time distribution to return estimates for.

interval

Interval of survival times for quantile root finding. Default is c(1e-08, 500).

Details

The syntax ofstandsurv follows closely that of Stata'sstandsurv command written by Paul Lambert and Michael Crowther. The function calculates standardized (marginal) measures including standardizedsurvival functions, standardized restricted mean survival times, quantilesand the hazard of standardized survival. The standardized survival is defined as

S_s(t|X=x) = E(S(t|X=x,Z)) = \frac{1}{N} \sum_{i=1}^N S(t|X=x,Z=z_i)

The hazard of the standardized survival is a weighted average of individual hazard functions at time t, weighted by the survivalfunction at this time:

h_s(t|X=x) = \frac{\sum_{i=1}^N S(t|X=x,Z=z_i)h(t|X=x,Z=z_i)}{\sum_{i=1}^N S(t|X=x,Z=z_i)}

Marginal expected survival and hazards can be calculated by providing a population-based lifetable of class ratetable inratetable and a mapping between stratification factors in the lifetable and the user datasetusingrmap. If these stratification factors are not in the fittedsurvival model then the user must specify them innewdata along withthe covariates of the model. The marginal expected survival is calculated using the "Ederer" method that assumes no censoring as this is most relevant approach for forecasting (seesurvexp). A worked example is given below.

Marginal all-cause survival and hazards can be calculated after fitting arelative survival model, which utilise the expected survival from a populationratetable. See Rutherford et al. (Chapter 6) for further details.

Value

Atibble containing one row for each time-point. The column naming convention isat{i} for the ith scenariowith corresponding confidence intervals (if specified) namedat{i}_lciandat{i}_uci. Contrasts are namedcontrast{k}_{j} for the comparison of the kth versus the jthat scenario.

In addition tidy long-format data.frames are returned in the attributesstandsurv_at andstandsurv_contrast. These can be passed toggplot for plotting purposes (seeplot.standsurv).

Author(s)

Michael Sweeting <mikesweeting79@gmail.com>

References

Paul Lambert, 2021. "STANDSURV: Stata module to compute standardized (marginal) survival and related functions," Statistical Software Components S458991, Boston College Department of Economics. https://ideas.repec.org/c/boc/bocode/s458991.html

Rutherford, MJ, Lambert PC, Sweeting MJ, Pennington B, Crowther MJ, Abrams KR,Latimer NR. 2020. "NICE DSU Technical Support Document 21: Flexible Methods for Survival Analysis" https://nicedsu.sites.sheffield.ac.uk/tsds/flexible-methods-for-survival-analysis-tsd

Examples

## mean age is higher in those with smaller observed survival times newbc <- bcset.seed(1)newbc$age <- rnorm(dim(bc)[1], mean = 65-scale(newbc$recyrs, scale=FALSE), sd = 5)## Fit a Weibull flexsurv model with group and age as covariatesweib_age <- flexsurvreg(Surv(recyrs, censrec) ~ group+age, data=newbc,                        dist="weibull")                       ## Calculate standardized survival and the difference in standardized survival## for the three levels of group across a grid of survival times                        standsurv_weib_age <- standsurv(weib_age,                                            at = list(list(group="Good"),                                                      list(group="Medium"),                                                      list(group="Poor")),                                            t=seq(0,7, length.out=100),                                           contrast = "difference", ci=FALSE)standsurv_weib_age## Calculate hazard of standardized survival and the marginal hazard ratio## for the three levels of group across a grid of survival times## 10 bootstraps for confidence intervals (this should be larger)## Not run:           haz_standsurv_weib_age <- standsurv(weib_age,                                            at = list(list(group="Good"),                                                      list(group="Medium"),                                                      list(group="Poor")),                                            t=seq(0,7, length.out=100),                                           type="hazard",                                           contrast = "ratio", boot = TRUE,                                           B=10, ci=TRUE)haz_standsurv_weib_age                                            plot(haz_standsurv_weib_age, ci=TRUE)## Hazard ratio plot shows a decreasing marginal HR ## Whereas the conditional HR is constant (model is a PH model)plot(haz_standsurv_weib_age, contrast=TRUE, ci=TRUE)## Calculate standardized survival from a Weibull model together with expected## survival matching to US lifetables# age at diagnosis in days. This is required to match to US ratetable, whose# timescale is measured in daysnewbc$agedays <- floor(newbc$age * 365.25)  ## Create some random diagnosis dates centred on 01/01/2010 with SD=1 year## These will be used to match to expected rates in the lifetablenewbc$diag <- as.Date(floor(rnorm(dim(newbc)[1],                      mean = as.Date("01/01/2010", "%d/%m/%Y"), sd=365)),                      origin="1970-01-01")## Create sex (assume all are female)newbc$sex <- factor("female")standsurv_weib_expected <- standsurv(weib_age,                                            at = list(list(group="Good"),                                                      list(group="Medium"),                                                      list(group="Poor")),                                            t=seq(0,7, length.out=100),                                           rmap=list(sex = sex,                                                     year = diag,                                                     age = agedays),                                           ratetable = survival::survexp.us,                                           scale.ratetable = 365.25,                                           newdata = newbc)## Plot marginal survival with expected survival superimposed                                            plot(standsurv_weib_expected, expected=TRUE)## End(Not run)

Summaries of fitted flexible survival models

Description

Return fitted survival, cumulative hazard or hazard at a series of timesfrom a fittedflexsurvreg orflexsurvsplinemodel.

Usage

## S3 method for class 'flexsurvreg'summary(  object,  newdata = NULL,  X = NULL,  type = "survival",  fn = NULL,  t = NULL,  quantiles = 0.5,  start = 0,  cross = TRUE,  ci = TRUE,  se = FALSE,  B = 1000,  cl = 0.95,  tidy = FALSE,  na.action = na.pass,  ...)

Arguments

object

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

newdata

Data frame containing covariate values to produce fittedvalues for. Or a list that can be coerced to such a data frame. Theremust be a column for every covariate in the model formula, and one row forevery combination of covariates the fitted values are wanted for. Theseare in the same format as the original data, with factors as a singlevariable, not 0/1 contrasts.

If this is omitted, if there are any continuous covariates, then a singlesummary is provided with all covariates set to their mean values in thedata - for categorical covariates, the means of the 0/1 indicator variablesare taken. If there are only factor covariates in the model, then alldistinct groups are used by default.

X

Alternative way of defining covariate values to produce fittedvalues for. Since version 0.4,newdata is an easier way thatdoesn't require the user to create factor contrasts, butX has beenkept for backwards compatibility.

Columns ofX represent different covariates, and rows representmultiple combinations of covariate values. For examplematrix(c(1,2),nrow=2) if there is only one covariate in the model,and we want survival for covariate values of 1 and 2. A vector can also besupplied if just one combination of covariates is needed.

For “factor” (categorical) covariates, the values of the contrastsrepresenting factor levels (as returned by thecontrastsfunction) should be used. For example, for a covariateagegroupspecified as an unordered factor with levels20-29, 30-39, 40-49,50-59, and baseline level20-29, there are three contrasts. Toreturn summaries for groups20-29 and40-49, supplyX =rbind(c(0,0,0), c(0,1,0)), since all contrasts are zero for the baselinelevel, and the second contrast is “turned on” for the third level40-49.

type

"survival" for survival probabilities.

"cumhaz" for cumulative hazards.

"hazard" for hazards.

"rmst" for restricted mean survival.

"mean" for mean survival.

"median" for median survival (alternative totype="quantile" withquantiles=0.5).

"quantile" for quantiles of the survival time distribution.

"link" for the fitted value of the location parameter (i.e. the "linear predictor" but on the natural scale of the parameter, not on the log scale)

Ignored if"fn" is specified.

fn

Custom function of the parameters to summarise against time.This has optional first two argumentst representing time, andstart representing left-truncation points, and any remainingarguments must be parameters of the distribution. It should be vectorised, andreturn a vector corresponding to the vectors given byt,start andthe parameter vectors.

t

Times to calculate fitted values for. By default, these are thesorted unique observation (including censoring) times in the data - forleft-truncated datasets these are the "stop" times.

quantiles

Iftype="quantile", this specifies the quantiles of the survival time distribution to return estimates for.

start

cross

IfTRUE (the default) then summaries are calculated for all combinations of timesspecified int and covariate vectors specifed innewdata.

IfFALSE,then the timest should be of length equal to the number of rows innewdata,and one summary is produced for each row ofnewdata paired with the correspondingelement oft. This is used, e.g. when determining Cox-Snell residuals.

ci

Set toFALSE to omit confidence intervals.

se

Set toTRUE to include standard errors.

B

cl

Width of symmetric confidence intervals, relative to 1.

tidy

IfTRUE, then the results are returned as a tidy dataframe instead of a list. This can help with using theggplot2package to compare summaries for different covariate values.

na.action

...

Further arguments passed to or from other methods. Currentlyunused.

Details

Time-dependent covariates are not currently supported. The covariatevalues are assumed to be constant through time for each fitted curve.

Value

Iftidy=FALSE, a list with one component for each uniquecovariate value (if there are only categorical covariates) or one component(if there are no covariates or any continuous covariates). Each of thesecomponents is a matrix with one row for each time int, giving theestimated survival (or cumulative hazard, or hazard) and 95% confidencelimits. These list components are named with the covariate names andvalues which define them.

Iftidy=TRUE, a data frame is returned instead. This is formed bystacking the above list components, with additional columns to identify thecovariate values that each block corresponds to.

If there are multiple summaries, an additional list component namedX contains a matrix with the exact values of contrasts (dummycovariates) defining each summary.

Theplot.flexsurvreg function can be used to quickly plotthese model-based summaries against empirical summaries such asKaplan-Meier curves, to diagnose model fit.

Confidence intervals are obtained by sampling randomly from the asymptoticnormal distribution of the maximum likelihood estimates and then taking quantiles(see, e.g. Mandel (2013)).

Author(s)

C. H. Jacksonchris.jackson@mrc-bsu.cam.ac.uk

References

Mandel, M. (2013). "Simulation based confidence intervals forfunctions with complicated derivatives." The American Statistician (inpress).

Summarise quantities of interest from fitted flexsurvrtrunc models

Description

This function extracts quantities of interest from the untruncated version of a model with individual-specific right truncation points fitted byflexsurvrtrunc. Note that covariates arecurrently not supported byflexsurvrtrunc.

Usage

## S3 method for class 'flexsurvrtrunc'summary(  object,  type = "survival",  fn = NULL,  t = NULL,  quantiles = 0.5,  ci = TRUE,  se = FALSE,  B = 1000,  cl = 0.95,  ...)

Arguments

object

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

type

"survival" for survival probabilities.

"cumhaz" for cumulative hazards.

"hazard" for hazards.

"rmst" for restricted mean survival.

"mean" for mean survival.

"median" for median survival (alternative totype="quantile" withquantiles=0.5).

"quantile" for quantiles of the survival time distribution.

Ignored if"fn" is specified.

fn

Custom function of the parameters to summarise against time.This has optional first argumentt representing time, and any remainingarguments must be parameters of the distribution. It should return avector of the same length ast.

t

Times to calculate fitted values for. By default, these are thesorted unique observation (including censoring) times in the data - forleft-truncated datasets these are the "stop" times.

quantiles

Iftype="quantile", this specifies the quantiles of the survival time distribution to return estimates for.

ci

Set toFALSE to omit confidence intervals.

se

Set toTRUE to include standard errors.

B

cl

Width of symmetric confidence intervals, relative to 1.

...

Further arguments passed to or from other methods. Currentlyunused.

Nonparametric estimator of survival from right-truncated, uncensored data

Description

Estimates the survivor function from right-truncated, uncensored data byreversing time, interpreting the data as left-truncated, applying theKaplan-Meier / Lynden-Bell estimator and transforming back.

Usage

survrtrunc(t, rtrunc, tmax, data = NULL, eps = 0.001, conf.int = 0.95)

Arguments

t

Vector of observed times from an initial event to a final event.

rtrunc

Individual-specific right truncation points, so that eachindividual's survival timet would not have been observed if it wasgreater than the corresponding element ofrtrunc. If any of theseare greater thantmax, then the actual individual-level truncationpoint for these individuals is taken to betmax.

tmax

Maximum possible time to event that could have been observed.

data

Data frame to findt andrtrunc in. If notsupplied, these should be in the working environment.

eps

Small number that is added tot before implementing thetime-reversed estimator, to ensure the risk set is consistent betweenforward and reverse time scales. It should be just large enough thatt+eps is not==t. This should not need changing from thedefault of 0.001, unlesst are extremely large or small and thedata are rounded to integer.

conf.int

Confidence level, defaulting to 0.95.

Details

Note that this does not estimate the untruncated survivor function - insteadit estimates the survivor function truncated above at a time defined by themaximum possible time that might have been observed in the data.

DefineX as the time of the initial event,Y as the time of thefinal event, then we wish to determine the distribution ofT = Y- X.

Observations are only recorded ifY \leq t_{max}. Then thedistribution ofT in the resulting sample is right-truncated byrtrunc = t_{max} - X.

Equivalently, the distribution oft_{max} - T is left-truncated, sinceit is only observed ift_{max} - T \geq X. Then the standardKaplan-Meier type estimator as implemented insurvfit is used (as described by Lynden-Bell, 1971)and the results transformed back.

This situation might happen in a disease epidemic, whereX is the dateof disease onset for an individual,Y is the date of death, and wewish to estimate the distribution of the timeT from onset to death,given we have only observed people who have died by the datet_{max}.

If the estimated survival is unstable at the highest times, then considerreplacingtmax by a slightly lower value, then if necessary, removingindividuals witht > tmax, so that the estimand is changed to thesurvivor function truncated over a slightly narrower interval.

Value

A list with components:

time Time points where the estimated survival changes.

surv Estimated survival attime, truncated above attmax.

se.surv Standard error of survival.

std.err Standard error of -log(survival). Named this way for consistency withsurvfit.

lower Lower confidence limits for survival.

upper Upper confidence limits for survival.

References

D. Lynden-Bell (1971) A method of allowing for known observationalselection in small samples applied to 3CR quasars. Monthly Notices of theRoyal Astronomical Society, 155:95–118.

Seaman, S., Presanis, A. and Jackson, C. (2020) Review of methods forestimating distribution of time to event from right-truncated data.

Examples

## simulate some event time dataset.seed(1)X <- rweibull(100, 2, 10)T <- rweibull(100, 2, 10)## truncate abovetmax <- 20obs <- X + T < tmaxrtrunc <- tmax - Xdat <- data.frame(X, T, rtrunc)[obs,]sf <-    survrtrunc(T, rtrunc, data=dat, tmax=tmax)plot(sf, conf.int=TRUE)## Kaplan-Meier estimate ignoring truncation is biasedsfnaive <- survfit(Surv(T) ~ 1, data=dat)lines(sfnaive, conf.int=TRUE, lty=2, col="red")## truncate above the maximum observed timetmax <- max(X + T) + 10obs <- X + T < tmaxrtrunc <- tmax - Xdat <- data.frame(X, T, rtrunc)[obs,]sf <-    survrtrunc(T, rtrunc, data=dat, tmax=tmax)plot(sf, conf.int=TRUE)## estimates identical to the standard Kaplan-Meiersfnaive <- survfit(Surv(T) ~ 1, data=dat)lines(sfnaive, conf.int=TRUE, lty=2, col="red")

Tidy a flexsurv model object

Description

Tidy summarizes information about the components of the model into a tidy data frame.

Usage

## S3 method for class 'flexsurvreg'tidy(  x,  conf.int = FALSE,  conf.level = 0.95,  pars = "all",  transform = "none",  ...)

Arguments

x

Output fromflexsurvreg orflexsurvspline, representing a fitted survival model object.

conf.int

Logical. Should confidence intervals be returned? Default isFALSE.

conf.level

The confidence level to use for the confidence interval ifconf.int = TRUE. Default is0.95.

pars

One of"all","baseline", or"coefs" for all parameters, baseline distribution parameters, or covariate effects (i.e. regression betas), respectively. Default is"all".

transform

Character vector of transformations to apply to requestedpars. Default is"none", which returnspars as-is.

Users can specify one or both types of transformations:

"baseline.real" which transforms the baseline distribution parameters to the real number line used for estimation.
"coefs.exp" which exponentiates the covariate effects.

SeeDetails for a more complete explanation.

...

Not currently used.

Details

flexsurvreg models estimate two types of coefficients, baseline distribution parameters, and covariate effects which act on the baseline distribution. By design,flexsurvreg returns distribution parameters on the same scale as is found in the relevantd/p/q/r functions. Covariate effects are returned on the log-scale, which represents either log-time ratios (accelerated failure time models) or log-hazard ratios for proportional hazard models. By default,tidy() will return baseline distribution parameters on their natural scale and covariate effects on the log-scale.

To transform the baseline distribution parameters to the real-value number line (the scale used for estimation), pass the character argument"baseline.real" totransform. To get time ratios or hazard ratios, pass"coefs.exp" totransform. These transformations may be done together by submitting both arguments as a character vector.

Value

Atibble containing the columns:term,estimate,std.error,statistic,p.value,conf.low, andconf.high, by default.

statistic andp.value are only provided for covariate effects (NA for baseline distribution parameters). These are computed as Wald-type test statistics with p-values from a standard normal distribution.

Examples

fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ age, data = ovarian, dist = "gengamma")tidy(fitg)tidy(fitg, pars = "coefs", transform = "coefs.exp")

Tidy a standsurv object.

Description

This function is used internally bystandsurv and tidydata.frames are automatically returned by the function.

Usage

## S3 method for class 'standsurv'tidy(x, ...)

Arguments

x

A standsurv object.

...

Not currently used.

Value

Returns additional tidy data.frames (tibbles)stored as attributes named standpred_at and standpred_contrast.

Total length of stay in particular states for a fully-parametric,time-inhomogeneous Markov multi-state model

Description

The matrix whoser,s entry is the expected amount of time spent instates for a time-inhomogeneous, continuous-time Markov multi-stateprocess that starts in stater, up to a maximum timet. This isdefined as the integral of the corresponding transition probability up tothat time.

Usage

totlos.fs(  x,  trans = NULL,  t = 1,  newdata = NULL,  ci = FALSE,  tvar = "trans",  sing.inf = 1e+10,  B = 1000,  cl = 0.95,  ...)

Arguments

x

x can also be a list of models, with one component for eachpermitted transition, as illustrated inmsfit.flexsurvreg.

trans

Matrix indicating allowed transitions. Seemsfit.flexsurvreg.

t

Time or vector of times to predict up to. Must be finite.

newdata

A data frame specifying the values of covariates in thefitted model, other than the transition number. Seemsfit.flexsurvreg.

ci

tvar

Variable in the data representing the transition type. Notrequired ifx is a list of models.

sing.inf

B

Number of simulations from the normal asymptotic distribution usedto calculate variances. Decrease for greater speed at the expense ofaccuracy.

cl

Width of symmetric confidence intervals, relative to 1.

...

Arguments passed toode indeSolve.

Details

This is computed by solving a second order extension of the Kolmogorovforward differential equation numerically, using the methods in thedeSolve package. The equation is expressed as a linearsystem

\frac{dT(t)}{dt} = P(t)

\frac{dP(t)}{dt} = P(t) Q(t)

and solved forT(t) andP(t) simultaneously, whereT(t)is the matrix of total lengths of stay,P(t) is the transitionprobability matrix for timet, andQ(t) is the transitionhazard or intensity as a function oft. The initial conditions areT(0) = 0 andP(0) = I.

Note that the packagemsm has a similar methodtotlos.msm.totlos.fs should give the same results astotlos.msm whenboth of these conditions hold:

the time-to-event distribution is exponential for alltransitions, thus theflexsurvreg model was fitted withdist="exp", and is time-homogeneous.
themsm model wasfitted withexacttimes=TRUE, thus all the event times are known, andthere are no time-dependent covariates.

This function is only valid for Markov ("clock-forward") multi-statemodels, though no warning or error is currently given if the model is notMarkov. Seetotlos.simfs for the equivalent for semi-Markov("clock-reset") models.

Value

The matrix of lengths of stayT(t), ift is of length1, or a list of matrices ift is longer.

Ifci=TRUE, each element has attributes"lower" and"upper" giving matrices of the corresponding confidence limits.These are formatted for printing but may be extracted usingattr().

The result also has an attributeP giving the transition probabilitymatrices, since these are unavoidably computed as a side effect. These aresuppressed for printing, but can be extracted withattr(...,"P").

Author(s)

Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.

Examples

# BOS example in vignette, and in msfit.flexsurvregbexp <- flexsurvreg(Surv(Tstart, Tstop, status) ~ trans,                    data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))# predict 4 years spent without BOS, 3 years with BOS, before death# As t increases, this should convergetotlos.fs(bexp, t=10, trans=tmat)totlos.fs(bexp, t=1000, trans=tmat)totlos.fs(bexp, t=c(5,10), trans=tmat)# Answers should match results in help(totlos.simfs) up to Monte Carlo# error there / ODE solving precision here, since with an exponential# distribution, the "semi-Markov" model there is the same as the Markov# model here

Expected total length of stay in specific states, from a fully-parametric,semi-Markov multi-state model

Description

The expected total time spent in each state for semi-Markov multi-statemodels fitted to time-to-event data withflexsurvreg. Thisis defined by the integral of the transition probability matrix, thoughthis is not analytically possible and is computed by simulation.

Usage

totlos.simfs(  x,  trans,  t = 1,  start = 1,  newdata = NULL,  ci = FALSE,  tvar = "trans",  tcovs = NULL,  group = NULL,  M = 1e+05,  B = 1000,  cl = 0.95,  cores = NULL)

Arguments

x

x can also be a list offlexsurvreg models,with one component for each permitted transition, as illustratedinmsfit.flexsurvreg. This can be constructed byfmsm.

trans

Matrix indicating allowed transitions. Seemsfit.flexsurvreg. This is not required ifx is a list constructed byfmsm.

t

Maximum time to predict to.

start

Starting state.

newdata

A data frame specifying the values of covariates in thefitted model, other than the transition number. Seemsfit.flexsurvreg.

ci

tvar

Variable in the data representing the transition type. Notrequired ifx is a list of models.

tcovs

Predictable time-dependent covariates such as age, seesim.fmsm.

group

Optional grouping for the states. For example, if there arefour states, andgroup=c(1,1,2,2), thentotlos.simfsreturns the expected total time in states 1 and 2 combined, and states 3and 4 combined.

M

Number of individuals to simulate in order to approximate thetransition probabilities. Users should adjust this to obtain the requiredprecision.

B

Number of simulations from the normal asymptotic distribution usedto calculate confidence limits. Decrease for greater speed at the expense ofaccuracy.

cl

Width of symmetric confidence intervals, relative to 1.

cores

Number of processor cores used when calculating confidence limits by repeated simulation. The default uses single-core processing.

Details

Note the random sampling method forflexsurvspline models iscurrently very inefficient, so that looping overM will be veryslow.

The equivalent function for time-inhomogeneous Markov models istotlos.fs. Note neither of these functions give errors orwarnings if used with the wrong type of model, but the results will beinvalid.

Value

The expected total time spent in each state (or group of statesgiven bygroup) up to timet, and corresponding confidenceintervals if requested.

Author(s)

Christopher Jacksonchris.jackson@mrc-bsu.cam.ac.uk.

Examples

# BOS example in vignette, and in msfit.flexsurvregbexp <- flexsurvreg(Surv(years, status) ~ trans, data=bosms3, dist="exp")tmat <- rbind(c(NA,1,2),c(NA,NA,3),c(NA,NA,NA))# predict 4 years spent without BOS, 3 years with BOS, before death# As t increases, this should convergetotlos.simfs(bexp, t=10, trans=tmat)totlos.simfs(bexp, t=1000, trans=tmat)

Convert a function with matrix arguments to a function with vectorarguments.

Description

Given a function with matrix arguments, construct an equivalent functionwhich takes vector arguments defined by the columns of the matrix. The newfunction simply usescbind on the vector arguments to make a matrix,and calls the old one.

Usage

unroll.function(mat.fn, ...)

Arguments

mat.fn

A function with any number of arguments, some of which arematrices.

...

A series of other arguments. Their names define whicharguments ofmat.fn are matrices. Their values define a vector ofstrings to be appended to the names of the arguments in the new function.For example

fn <- unroll.function(oldfn, gamma=1:3, alpha=0:1)

will make a new functionfn with argumentsgamma1,gamma2,gamma3,alpha0,alpha1.

Calling

fn(gamma1=a,gamma2=b,gamma3=c,alpha0=d,alpha1=e)

should give the same answer as

oldfn(gamma=cbind(a,b,c),alpha=cbind(d,e))

Value

The new function, with vector arguments.

Usage inflexsurv

This is used byflexsurvspline to allow spline models, whichhave an arbitrary number of parameters, to be fitted usingflexsurvreg.

The “custom distributions” facility offlexsurvregexpects the user-supplied probability density and distributionfunctions to have one explicitly named argument for each scalarparameter, and given R vectorisation, each of those argumentscould be supplied as a vector of alternative parameter values.

However, spline models have a varying number of scalar parameters,determined by the number of knots in the spline.dsurvspline andpsurvspline have anargument calledgamma. This can be supplied as a matrix,with number of columnsn determined by the number of knots(plus 2), and rows referring to alternative parameter values. Thefollowing statements are used in the source offlexsurvspline:

 dfn <-unroll.function(dsurvspline, gamma=0:(nk-1)) pfn <-unroll.function(psurvspline, gamma=0:(nk-1))

to convert these into functions with argumentsgamma0,gamma1,...,gamman, corresponding to the columnsofgamma, wheren = nk-1, and with other argumentsin the same format.

Author(s)

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Examples

fn <- unroll.function(ncol, x=1:3)fn(1:3, 1:3, 1:3) # equivalent to...ncol(cbind(1:3,1:3,1:3))

Variance-covariance matrix from a flexsurvreg model

Description

Variance-covariance matrix from a flexsurvreg model

Usage

## S3 method for class 'flexsurvreg'vcov(object, ...)

Arguments

object

A fitted model object of classflexsurvreg, e.g. as returned byflexsurvreg orflexsurvspline.

...

Other arguments (currently unused).

Value

Variance-covariance matrix of the estimated parameters, onthe scale that they were estimated on (for positive parametersthis is the log scale).

Movatterモバイル変換

flexsurv: Flexible parametric survival and multi-state models

Description

Details

User guide

Author(s)

References

See Also

helper function to safely convert a Hessian matrix to covariance matrix

Description

Usage

Arguments

Numerical evaluation of the hessian of a function using numDeriv::hessian

Description

Usage

Arguments

Akaike's information criterion from a flexible parametric multistate model

Description

Usage

Arguments

Value

Second-order Akaike information criterion

Description

Usage

Arguments

Details

Second-order Akaike information criterion

Description

Usage

Arguments

Details

Value

References

See Also

Bayesian Information Criterion (BIC) for comparison of flexsurvreg models

Description

Usage

Arguments

Details

Value

References

See Also

Generalized F distribution

Description

Usage

Arguments

Details

Value

Note

Author(s)

References

See Also

Generalized F distribution (original parameterisation)

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Generalized gamma distribution

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

Generalized gamma distribution (original parameterisation)

Description

Usage

Arguments

Details

Value

Author(s)

References

See Also

The Gompertz distribution