| Title: | Piece-Wise Exponential Additive Mixed Modeling Tools for Survival Analysis |
| Version: | 0.7.3 |
| Date: | 2025-03-23 |
| Description: | The Piece-wise exponential (Additive Mixed) Model (PAMM; Bender and others (2018) <doi:10.1177/1471082X17748083>) is a powerful model class for the analysis of survival (or time-to-event) data, based on Generalized Additive (Mixed) Models (GA(M)Ms). It offers intuitive specification and robust estimation of complex survival models with stratified baseline hazards, random effects, time-varying effects, time-dependent covariates and cumulative effects (Bender and others (2019)), as well as support for left-truncated data, competing risks, recurrent events and multi-state settings. pammtools provides a tidy workflow for survival analysis with PAMMs, including data simulation, transformation and other functions for data preprocessing and model post-processing as well as visualization. |
| Depends: | R (≥ 4.1.0) |
| Imports: | mgcv, survival (≥ 2.39-5), checkmate, magrittr, rlang, tidyr (≥ 1.0.0), ggplot2 (≥ 3.2.2), dplyr (≥ 1.0.0), purrr (≥ 0.2.3), tibble, lazyeval, Formula, mvtnorm, pec, vctrs (≥ 0.3.0), scam |
| Suggests: | testthat, mstate |
| Config/Needs/website: | coxme, eha, etm, scam, msm, mvna, TBFmultinomial |
| License: | MIT + file LICENSE |
| LazyData: | true |
| URL: | https://adibender.github.io/pammtools/ |
| BugReports: | https://github.com/adibender/pammtools/issues |
| RoxygenNote: | 7.3.2 |
| Encoding: | UTF-8 |
| NeedsCompilation: | no |
| Packaged: | 2025-03-24 14:04:35 UTC; ab |
| Author: | Andreas Bender |
| Maintainer: | Andreas Bender <andreas.bender@stat.uni-muenchen.de> |
| Repository: | CRAN |
| Date/Publication: | 2025-03-24 15:20:02 UTC |
pammtools: Piece-wise exponential Additive Mixed Modeling tools.
Description
pammtools provides functions and utilities that facilitate fitting Piece-wise Exponential Additive Mixed Models (PAMMs), including data transformation and other convenience functions for pre- and post-processing as well as plotting.
Details
The best way to get an overview of the functionality provided and how to fit PAMMs is to view the vignettes available at https://adibender.github.io/pammtools/articles/. A summary of the vignettes' content is given below:
basics: Introduction to PAMMs and basic modeling.
baseline: Shows how to estimate and visualize a baseline model (without covariates) and compares it to the respective Cox-PH model.
convenience: Convenience functions for post-processing and plotting PAMMs.
data-transformation: Transforming data into a format suitable to fit PAMMs.
frailty: Specifying "frailty" terms, i.e., random effects for PAMMs.
splines: Specifying spline smooth terms for PAMMs.
strata: Specifying stratified models in which each level of a grouping variable has a different baseline hazard.
tdcovar: Dealing with time-dependent covariates.
tveffects: Specifying time-varying effects.
left-truncation: Estimation for left-truncated data.
competing-risks: Competing risks analysis.
Author(s)
Maintainer: Andreas Bender andreas.bender@stat.uni-muenchen.de (ORCID)
Authors:
Fabian Scheipl fabian.scheipl@stat.uni-muenchen.de (ORCID)
Johannes Piller johannes.piller@lmu.de (ORCID)
Philipp Kopper philipp.kopper@stat.uni-muenchen.de (ORCID)
Other contributors:
Lukas Burk burk@leibniz-bips.de (ORCID) [contributor]
References
Bender, Andreas, Andreas Groll, and Fabian Scheipl. 2018. “A Generalized Additive Model Approach to Time-to-Event Analysis.” Statistical Modelling, February. https://doi.org/10.1177/1471082X17748083.
Bender, Andreas, Fabian Scheipl, Wolfgang Hartl, Andrew G. Day, and Helmut Küchenhoff. 2019. “Penalized Estimation of Complex, Non-Linear Exposure-Lag-Response Associations.” Biostatistics 20 (2): 315–31. https://doi.org/10.1093/biostatistics/kxy003.
Bender, Andreas, and Fabian Scheipl. 2018. “pammtools: Piece-Wise Exponential Additive Mixed Modeling Tools.” ArXiv:1806.01042 Stat, June. https://arxiv.org/abs/1806.01042.
Ramjith J, Bender A, Roes KCB, Jonker MA. 2022. Recurrent events analysis with piece-wise exponential additive mixed models. Statistical Modelling.
Pipe operator
Description
See magrittr::%>% for details.
Usage
lhs %>% rhs
Add cumulative incidence function to data
Description
Add cumulative incidence function to data
Usage
add_cif(newdata, object, ...)
## Default S3 method:
add_cif(
  newdata, object,
  ci = TRUE, overwrite = FALSE, alpha = 0.05, nsim = 500L,
  cause_var = "cause", time_var = NULL, ...)
Arguments
newdata | A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If |
object | a fitted |
... | Further arguments passed to |
ci |
|
overwrite | Should hazard columns be overwritten if already present in the data set? Defaults to |
alpha | The alpha level for confidence/credible intervals. |
nsim | Number of simulations (draws from the posterior of the estimated coefficients) on which estimation of CIFs and their confidence/credible intervals will be based. |
cause_var | Character. Column name of the 'cause' variable. |
time_var | Name of the variable used for the baseline hazard. If not given, defaults to |
Add counterfactual observations for possible transitions
Description
If data only contains one row per transition that took place, this function adds additional rows for each transition that was possible at that time (for each subject in the data).
Usage
add_counterfactual_transitions(
  data,
  from_to_pairs = list(),
  from_col = "from",
  to_col = "to",
  transition_col = "transition")
Arguments
data | Data set that only contains rows for transitions that took place. |
from_to_pairs | A list with one element for each possible initial state. The values of each list element indicate possible transitions from that state. Will be calculated from the data if unspecified. |
from_col | Name of the column that stores initial state. |
to_col | Name of the column that stores end state. |
transition_col | Name of the column that contains the transition identifier (factor variable). |
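The expansion described above can be illustrated with a minimal base-R sketch. It is not the pammtools implementation: the helper name `expand_transitions`, the `status` column and the `"from->to"` label format are assumptions made only for this illustration.

```r
# Hypothetical helper: for every observed row, emit one row per transition
# that was possible from its "from" state.
expand_transitions <- function(data, from_to_pairs,
                               from_col = "from", to_col = "to") {
  rows <- lapply(seq_len(nrow(data)), function(i) {
    from <- as.character(data[[from_col]][i])
    possible <- from_to_pairs[[from]]
    out <- data[rep(i, length(possible)), , drop = FALSE]
    # status stays 1 only for the transition that actually took place
    # (the "status" column is an assumption of this sketch)
    out$status <- as.integer(possible == data[[to_col]][i]) * data$status[i]
    out[[to_col]] <- possible
    out$transition <- paste0(from, "->", possible)
    out
  })
  res <- do.call(rbind, rows)
  rownames(res) <- NULL
  res
}

obs <- data.frame(id = c(1, 2), from = c(1, 1), to = c(2, 3), status = c(1, 1))
pairs <- list("1" = c(2, 3), "2" = 3)  # possible end states per initial state
cf <- expand_transitions(obs, pairs)
cf
```

Each subject starting in state 1 ends up with two rows, one for the observed transition and one for the transition that was possible but did not occur.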
Add predicted (cumulative) hazard to data set
Description
Add (cumulative) hazard based on the provided data set and model. If ci=TRUE confidence intervals (CI) are also added. Their width can be controlled via the se_mult argument. The method by which the CI are calculated can be specified by ci_type. This is a wrapper around predict.gam. When reference is specified, the (log-)hazard ratio is calculated.
Usage
add_hazard(newdata, object, ...)
## Default S3 method:
add_hazard(
  newdata, object,
  reference = NULL, type = c("response", "link"),
  ci = TRUE, se_mult = 2, ci_type = c("default", "delta", "sim"),
  overwrite = FALSE, time_var = NULL, ...)
add_cumu_hazard(
  newdata, object,
  ci = TRUE, se_mult = 2, overwrite = FALSE,
  time_var = NULL, interval_length = "intlen", ...)
Arguments
newdata | A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If |
object | a fitted |
... | Further arguments passed to |
reference | A data frame with number of rows equal to |
type | Either |
ci |
|
se_mult | Factor by which standard errors are multiplied for calculating the confidence intervals. |
ci_type | The method by which standard errors/confidence intervals will be calculated. Default transforms the linear predictor at respective intervals. |
overwrite | Should hazard columns be overwritten if already present in the data set? Defaults to |
time_var | Name of the variable used for the baseline hazard. If not given, defaults to |
interval_length | The variable in newdata containing the interval lengths. Can be either bare unquoted variable name or character. Defaults to |
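As a rough illustration of how se_mult and the default ci_type interact, the following base-R sketch transforms a linear predictor and its standard error to the hazard scale. The numbers are made up, and the code only mirrors the idea of transforming the linear predictor per interval; it is not the package's internal code.

```r
# Assumed inputs: log-hazard (linear predictor) and its standard error per
# interval, e.g. as obtained from predict(..., type = "link", se.fit = TRUE).
eta     <- c(-2.0, -1.5, -1.0)
se      <- c(0.10, 0.20, 0.15)
se_mult <- 2

hazard   <- exp(eta)
ci_lower <- exp(eta - se_mult * se)  # CI built on the link scale, then transformed
ci_upper <- exp(eta + se_mult * se)

# For piece-wise constant hazards the cumulative hazard is the running sum
# of hazard times interval length.
intlen      <- c(100, 100, 100)
cumu_hazard <- cumsum(hazard * intlen)

data.frame(hazard, ci_lower, ci_upper, cumu_hazard)
```

Because the interval is built on the log scale, it is asymmetric around the hazard but always positive.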
See Also
Examples
ped <- tumor[1:50,] %>% as_ped(Surv(days, status)~ age)
pam <- mgcv::gam(ped_status ~ s(tend)+age, data = ped, family=poisson(), offset=offset)
ped_info(ped) %>% add_hazard(pam, type="link")
ped_info(ped) %>% add_hazard(pam, type = "response")
ped_info(ped) %>% add_cumu_hazard(pam)
Add survival probability estimates
Description
Given suitable data (i.e. data with all columns used for estimation of the model), this function adds a column surv_prob containing survival probabilities for the specified covariate and follow-up information (and CIs surv_lower, surv_upper if ci=TRUE).
Usage
add_surv_prob(
  newdata, object,
  ci = TRUE, se_mult = 2, overwrite = FALSE,
  time_var = NULL, interval_length = "intlen", ...)
Arguments
newdata | A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If |
object | a fitted |
ci |
|
se_mult | Factor by which standard errors are multiplied for calculating the confidence intervals. |
overwrite | Should hazard columns be overwritten if already present in the data set? Defaults to |
time_var | Name of the variable used for the baseline hazard. If not given, defaults to |
interval_length | The variable in newdata containing the interval lengths. Can be either bare unquoted variable name or character. Defaults to |
... | Further arguments passed to |
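The relation between piece-wise constant hazards, interval lengths and survival probabilities can be sketched in base R. The hazard values are hypothetical; the sketch only shows the standard identity S(t) = exp(-Lambda(t)) and makes no claim about how the package computes its confidence intervals.

```r
hazard <- c(0.002, 0.004, 0.003)  # hypothetical piece-wise constant hazards
intlen <- c(100, 150, 250)        # interval lengths (cf. the "intlen" column)
tend   <- cumsum(intlen)          # interval end points

cumu_hazard <- cumsum(hazard * intlen)  # Lambda(t) at each interval end
surv_prob   <- exp(-cumu_hazard)        # S(t) = exp(-Lambda(t))

data.frame(tend, cumu_hazard, surv_prob)
```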
See Also
Examples
ped <- tumor[1:50,] %>% as_ped(Surv(days, status)~ age)
pam <- mgcv::gam(ped_status ~ s(tend)+age, data=ped, family=poisson(), offset=offset)
ped_info(ped) %>% add_surv_prob(pam, ci=TRUE)
Add time-dependent covariate to a data set
Description
Given a data set in standard format (with one row per subject/observation), this function adds a column with the specified exposure time points and a column with respective exposures, created from rng_fun. This function should usually only be used to create data sets passed to sim_pexp.
Usage
add_tdc(data, tz, rng_fun, ...)
Arguments
data | A data set with variables specified in |
tz | A numeric vector of exposure times (relative to the beginning of the follow-up time |
rng_fun | A random number generating function that creates the time-dependent covariates at time points |
... | Currently not used. |
Embeds the data set with the specified (relative) term contribution
Description
Adds the contribution of a specific term to the linear predictor to the data specified by newdata. Essentially a wrapper to predict.gam, with type="terms". Thus most arguments and their documentation below are from predict.gam.
Usage
add_term(newdata, object, term, reference = NULL, ci = TRUE, se_mult = 2, ...)
Arguments
newdata | A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If |
object | a fitted |
term | A character (vector) or regular expression indicating for which term(s) information should be extracted and added to data set. |
reference | A data frame with number of rows equal to |
ci |
|
se_mult | The factor by which standard errors are multiplied to form confidence intervals. |
... | Further arguments passed to |
Examples
library(ggplot2)
ped <- as_ped(tumor, Surv(days, status)~ age, cut = seq(0, 2000, by = 100))
pam <- mgcv::gam(ped_status ~ s(tend) + s(age), family = poisson(), offset = offset, data = ped)
# term contribution for sequence of ages
s_age <- ped %>% make_newdata(age = seq_range(age, 50)) %>%
  add_term(pam, term = "age")
ggplot(s_age, aes(x = age, y = fit)) + geom_line() +
  geom_ribbon(aes(ymin = ci_lower, ymax = ci_upper), alpha = .3)
# term contribution relative to mean age
s_age2 <- ped %>% make_newdata(age = seq_range(age, 50)) %>%
  add_term(pam, term = "age", reference = list(age = mean(.$age)))
ggplot(s_age2, aes(x = age, y = fit)) + geom_line() +
  geom_ribbon(aes(ymin = ci_lower, ymax = ci_upper), alpha = .3)
Add transition probabilities confidence intervals
Description
Add transition probabilities confidence intervals
Usage
add_trans_ci(newdata, object, nsim = 100L, alpha = 0.05, ...)
Add transition probabilities
Description
Add (cumulative) hazard based on the provided data set and model. If ci=TRUE confidence intervals (CI) are also added. Their width can be controlled via the se_mult argument. The method by which the CI are calculated can be specified by ci_type. This is a wrapper around predict.gam. When reference is specified, the (log-)hazard ratio is calculated.
Usage
add_trans_prob(
  newdata, object,
  overwrite = FALSE, ci = FALSE, alpha = 0.05, nsim = 100L,
  time_var = NULL, interval_length = "intlen", ...)
Arguments
newdata | A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If |
object | a fitted |
overwrite | Should hazard columns be overwritten if already present in the data set? Defaults to |
ci |
|
alpha | The alpha level for confidence/credible intervals. |
nsim | Number of simulations (draws from the posterior of the estimated coefficients) on which estimation of CIFs and their confidence/credible intervals will be based. |
time_var | Name of the variable used for the baseline hazard. If not given, defaults to |
interval_length | The variable in newdata containing the interval lengths. Can be either bare unquoted variable name or character. Defaults to |
... | Further arguments passed to |
See Also
Examples
ped <- tumor[1:50,] %>% as_ped(Surv(days, status)~ age)
pam <- mgcv::gam(ped_status ~ s(tend)+age, data = ped, family=poisson(), offset=offset)
ped_info(ped) %>% add_hazard(pam, type="link")
ped_info(ped) %>% add_hazard(pam, type = "response")
ped_info(ped) %>% add_cumu_hazard(pam)
Transform crps object to data.frame
Description
An as.data.frame S3 method for objects of class crps.
Usage
## S3 method for class 'crps'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
Arguments
x | An object of class |
row.names |
|
optional | logical. If |
... | additional arguments to be passed to or from methods. |
Transform data to Piece-wise Exponential Data (PED)
Description
This is the general data transformation function provided by the pammtools package. Three main applications must be distinguished:
Transformation of standard time-to-event data.
Transformation of left-truncated time-to-event data.
Transformation of time-to-event data with time-dependent covariates (TDC).
For the latter, the type of effect one wants to estimate is also important for the data transformation step. In any case, the data transformation is specified by a two sided formula. In case of TDCs, the right-hand-side of the formula can contain formula specials concurrent and cumulative. See the data-transformation vignette for details.
Usage
as_ped(data, ...)
## S3 method for class 'data.frame'
as_ped(
  data, formula,
  cut = NULL, max_time = NULL,
  tdc_specials = c("concurrent", "cumulative"),
  censor_code = 0L, transition = character(),
  timescale = c("gap", "calendar"), min_events = 1L, ...)
## S3 method for class 'nested_fdf'
as_ped(data, formula, ...)
## S3 method for class 'list'
as_ped(
  data, formula,
  tdc_specials = c("concurrent", "cumulative"),
  censor_code = 0L, ...)
is.ped(x)
## S3 method for class 'ped'
as_ped(data, newdata, ...)
## S3 method for class 'pamm'
as_ped(data, newdata, ...)
as_ped_multistate(
  data, formula,
  cut = NULL, max_time = NULL,
  tdc_specials = c("concurrent", "cumulative"),
  censor_code = 0L, transition = character(),
  timescale = c("gap", "calendar"), min_events = 1L, ...)
Arguments
data | Either an object inheriting from data frame or in case of time-dependent covariates a list of data frames (of length 2), where the first data frame contains the time-to-event information and static covariates while the second (and potentially further data frames) contain information on time-dependent covariates and the times at which they have been observed. |
... | Further arguments passed to the |
formula | A two sided formula with a |
cut | Split points, used to partition the follow up into intervals. If unspecified, all unique event times will be used. |
max_time | If |
tdc_specials | A character vector. Names of potential specials in |
censor_code | Specifies the value of the status variable that indicates censoring. Often this will be |
x | any R object. |
newdata | A new data set ( |
Value
A data frame of class ped in piece-wise exponential data format.
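To make the piece-wise exponential data format concrete, here is a minimal base-R sketch that splits the follow-up of a single subject at given cut points. The helper `to_ped_one` is hypothetical and handles only right-censored data without covariates; the column names follow those used in the examples of this manual (tstart, tend, offset, ped_status), and the offset is taken to be the log of the time at risk in each interval.

```r
# Hypothetical helper: split one subject's follow-up (time, status) at `cut`.
to_ped_one <- function(id, time, status, cut) {
  cut    <- sort(unique(cut))
  tstart <- cut[-length(cut)]
  tend   <- cut[-1]
  keep   <- tstart < time               # intervals the subject entered
  tstart <- tstart[keep]
  tend   <- tend[keep]
  at_risk <- pmin(tend, time) - tstart  # time spent at risk in each interval
  data.frame(
    id         = id,
    tstart     = tstart,
    tend       = tend,
    offset     = log(at_risk),
    # event indicator is 1 only in the interval that contains the event time
    ped_status = as.integer(status == 1 & time <= tend)
  )
}

ped_one <- to_ped_one(id = 1, time = 700, status = 1, cut = c(0, 500, 1000))
ped_one
```

A subject with an event at day 700 contributes two rows: a full interval (0, 500] without event and a partial interval (500, 1000] with 200 days at risk and ped_status equal to 1.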
Examples
tumor[1:3, ]
tumor[1:3, ] %>% as_ped(Surv(days, status)~ age + sex, cut = c(0, 500, 1000))
tumor[1:3, ] %>% as_ped(Surv(days, status)~ age + sex)
## Not run:
data("cgd", package = "frailtyHL")
cgd2 <- cgd %>%
  select(id, tstart, tstop, enum, status, age) %>%
  filter(enum %in% c(1:2))
ped_re <- as_ped_multistate(
  formula = Surv(tstart, tstop, status) ~ age + enum,
  data = cgd2,
  transition = "enum",
  timescale = "calendar")
## End(Not run)
Competing risks trafo
Description
This is the general data transformation function provided by the pammtools package. Three main applications must be distinguished:
Transformation of standard time-to-event data.
Transformation of left-truncated time-to-event data.
Transformation of time-to-event data with time-dependent covariates (TDC).
For the latter, the type of effect one wants to estimate is also important for the data transformation step. In any case, the data transformation is specified by a two sided formula. In case of TDCs, the right-hand-side of the formula can contain formula specials concurrent and cumulative. See the data-transformation vignette for details.
Usage
as_ped_cr(
  data, formula,
  cut = NULL, max_time = NULL,
  tdc_specials = c("concurrent", "cumulative"),
  censor_code = 0L, combine = TRUE, ...)
Arguments
data | Either an object inheriting from data frame or in case of time-dependent covariates a list of data frames (of length 2), where the first data frame contains the time-to-event information and static covariates while the second (and potentially further data frames) contain information on time-dependent covariates and the times at which they have been observed. |
formula | A two sided formula with a |
cut | Split points, used to partition the follow up into intervals. If unspecified, all unique event times will be used. |
max_time | If |
tdc_specials | A character vector. Names of potential specials in |
censor_code | Specifies the value of the status variable that indicates censoring. Often this will be |
... | Further arguments passed to the |
Value
A data frame of class ped in piece-wise exponential data format.
Examples
tumor[1:3, ]
tumor[1:3, ] %>% as_ped(Surv(days, status)~ age + sex, cut = c(0, 500, 1000))
tumor[1:3, ] %>% as_ped(Surv(days, status)~ age + sex)
## Not run:
data("cgd", package = "frailtyHL")
cgd2 <- cgd %>%
  select(id, tstart, tstop, enum, status, age) %>%
  filter(enum %in% c(1:2))
ped_re <- as_ped_multistate(
  formula = Surv(tstart, tstop, status) ~ age + enum,
  data = cgd2,
  transition = "enum",
  timescale = "calendar")
## End(Not run)
Calculate confidence intervals
Description
Given a two-column matrix or data frame, returns a three-column data.frame with the coefficient estimates plus lower and upper bounds of the 95% confidence intervals.
Usage
calc_ci(ftab)
Arguments
ftab | A table with two columns, containing coefficients in the first column and standard-errors in the second column. |
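The computation described here amounts to a Wald-type interval. The base-R sketch below uses the normal quantile qnorm(0.975) as multiplier, which is an assumption; the package may use a slightly different factor (for example 2). The function name `ci_sketch` and the output column names are hypothetical.

```r
# Hypothetical re-implementation: estimate +/- multiplier * standard error.
ci_sketch <- function(ftab, mult = qnorm(0.975)) {
  ftab <- as.data.frame(ftab)
  names(ftab)[1:2] <- c("coef", "se")
  data.frame(
    coef     = ftab$coef,
    ci_lower = ftab$coef - mult * ftab$se,
    ci_upper = ftab$coef + mult * ftab$se
  )
}

ci_tab <- ci_sketch(cbind(c(0.5, -1.2), c(0.1, 0.3)))
ci_tab
```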
Create a data frame from all combinations of data frames
Description
Works like expand.grid but for data frames.
Usage
combine_df(...)
Arguments
... | Data frames that should be combined to one data frame. Elements of first df vary fastest, elements of last df vary slowest. |
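The "expand.grid for data frames" idea can be sketched in base R by expanding row indices instead of values. The helper `combine_sketch` is hypothetical and only meant to show the ordering convention (first data frame varies fastest); it is not the package's implementation.

```r
combine_sketch <- function(...) {
  dfs <- list(...)
  # one index vector per data frame; expand.grid varies the first one fastest
  idx <- expand.grid(lapply(dfs, function(d) seq_len(nrow(d))))
  # pick the indexed rows from each data frame and bind them column-wise
  parts <- Map(function(d, i) d[i, , drop = FALSE], dfs, idx)
  res <- do.call(cbind, parts)
  rownames(res) <- NULL
  res
}

comb <- combine_sketch(
  data.frame(x = 1:3, y = 3:1),
  data.frame(z = c("a", "b")))
comb
```

The result has 3 x 2 = 6 rows, with the rows of the first data frame cycling within each row of the second.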
Examples
combine_df(
  data.frame(x=1:3, y=3:1),
  data.frame(x1=c("a", "b"), x2=c("c", "d")),
  data.frame(z=c(0, 1)))
Calculate difference in cumulative hazards and respective standard errors
Description
CIs are calculated by sampling coefficients from their posterior and calculating the cumulative hazard difference nsim times. The CI are obtained by the 2.5\
Usage
compute_cumu_diff(d1, d2, model, alpha = 0.05, nsim = 100L)
Arguments
d1 | A data set used as |
d2 | See |
model | A model object for which a predict method is implemented which returns the design matrix (e.g., |
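The simulation approach described above can be sketched in base R under simplifying assumptions: a log-linear hazard model, design matrices X1 and X2 for the two data sets (one row per interval), an estimated coefficient vector and its covariance matrix. All names and numbers are hypothetical, and quantile-based limits at alpha/2 and 1 - alpha/2 are assumed.

```r
set.seed(1)

sim_cumu_diff <- function(X1, X2, coefs, V, intlen, alpha = 0.05, nsim = 100L) {
  # draws from N(coefs, V) via the Cholesky factor of V
  R     <- chol(V)
  draws <- matrix(rnorm(nsim * length(coefs)), nrow = nsim) %*% R
  draws <- sweep(draws, 2, coefs, "+")
  # cumulative hazard difference for one coefficient vector
  cumu_diff <- function(b) {
    cumsum(intlen * exp(X1 %*% b)) - cumsum(intlen * exp(X2 %*% b))
  }
  diffs <- apply(draws, 1, cumu_diff)  # one column per draw
  data.frame(
    cumu_diff  = cumu_diff(coefs),
    cumu_lower = apply(diffs, 1, quantile, probs = alpha / 2),
    cumu_upper = apply(diffs, 1, quantile, probs = 1 - alpha / 2)
  )
}

X1 <- cbind(1, rep(1, 3))  # intercept and x = 1 in three intervals
X2 <- cbind(1, rep(0, 3))  # intercept and x = 0
cd <- sim_cumu_diff(X1, X2, coefs = c(-5, 0.5),
                    V = diag(c(0.01, 0.04)), intlen = rep(100, 3))
cd
```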
Formula specials for defining time-dependent covariates
Description
So far, two specials are implemented. concurrent is used when the goal is to estimate a concurrent effect of the TDC. cumulative is used when the goal is to estimate a cumulative effect of the TDC. These should usually not be called directly but rather as part of the formula argument to as_ped. See the vignette on data transformation for details.
Usage
cumulative(..., tz_var, ll_fun = function(t, tz) t >= tz, suffix = NULL)
concurrent(..., tz_var, lag = 0, suffix = NULL)
has_special(formula, special = "cumulative")
Arguments
... | For |
tz_var | The name of the variable that stores information on the times at which the TDCs specified in this term were observed. |
ll_fun | Function that specifies how the lag-lead matrix should be constructed. The first argument is the follow-up time, the second argument is the time of exposure. |
lag | a single positive number giving the time lag for a concurrent effect to occur (i.e., the TDC at time of exposure |
formula | A two sided formula with a |
special | The name of the special whose existence in the |
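How ll_fun defines a lag-lead matrix can be illustrated with base R's outer(). The time points below are made up, and `ll_window` is a hypothetical alternative window; only the default function(t, tz) t >= tz is taken from the usage above.

```r
ll_fun <- function(t, tz) t >= tz  # default: all past exposures contribute

t  <- c(1, 2, 3, 4)    # follow-up times
tz <- c(0.5, 2, 3.5)   # exposure times

# rows correspond to follow-up times, columns to exposure times;
# a 1 means the exposure at tz may affect the hazard at time t
LL <- outer(t, tz, ll_fun) * 1L
LL

# hypothetical alternative: exposures only act between 1 and 2 time units later
ll_window <- function(t, tz) t >= tz + 1 & t <= tz + 2
LL_window <- outer(t, tz, ll_window) * 1L
LL_window
```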
Time-dependent covariates of the patient data set.
Description
This data set contains the time-dependent covariates (TDCs) for the patient data set. Note that nutrition was protocoled for at most 12 days after ICU admission. The data set includes:
- CombinedID
Unique patient identifier. Can be used to merge with patient data
- Study_Day
The calendar (!) day at which calories (or proteins) were administered
- caloriesPercentage
The percentage of target calories supplied to the patient by the ICU staff
- proteinGproKG
The amount of protein supplied to the patient by the ICU staff
Usage
daily
Format
An object of class tbl_df (inherits from tbl, data.frame) with 18797 rows and 4 columns.
dplyr Verbs for ped-Objects
Description
See dplyr documentation of the respective functions for description and examples.
Usage
## S3 method for class 'ped'
arrange(.data, ...)
## S3 method for class 'ped'
group_by(.data, ..., .add = FALSE)
## S3 method for class 'ped'
ungroup(x, ...)
## S3 method for class 'ped'
distinct(.data, ..., .keep_all = FALSE)
## S3 method for class 'ped'
filter(.data, ...)
## S3 method for class 'ped'
sample_n(tbl, size, replace = FALSE, weight = NULL, .env = NULL, ...)
## S3 method for class 'ped'
sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, .env = NULL, ...)
## S3 method for class 'ped'
slice(.data, ...)
## S3 method for class 'ped'
select(.data, ...)
## S3 method for class 'ped'
mutate(.data, ...)
## S3 method for class 'ped'
rename(.data, ...)
## S3 method for class 'ped'
summarise(.data, ...)
## S3 method for class 'ped'
summarize(.data, ...)
## S3 method for class 'ped'
transmute(.data, ...)
## S3 method for class 'ped'
inner_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)
## S3 method for class 'ped'
full_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)
## S3 method for class 'ped'
left_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)
## S3 method for class 'ped'
right_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)
Arguments
.data | an object of class |
... | see |
x | an object of class |
tbl | an object of class |
size | < |
replace | Sample with or without replacement? |
weight | < |
.env | DEPRECATED. |
by | A join specification created with If To join on different variables between To join by multiple variables, use a
For simple equality joins, you can alternatively specify a character vectorof variable names to join by. For example, To perform a cross-join, generating all combinations of |
copy | If |
suffix | If there are non-joined duplicate variables in |
Value
a modified ped object (except for do)
A formula special used to handle cumulative effect specifications
Description
Can be used in the second part of the formula specification provided to sim_pexp and should only be used in this context.
Usage
fcumu(..., by = NULL, f_xyz, ll_fun)
Extract transition information from different objects
Description
Extract transition information from different objects
Usage
from_to_pairs(t_mat, ...)
from_to_pairs2(t_mat, ...)
## S3 method for class 'data.frame'
from_to_pairs(t_mat, from_col = "from", to_col = "to", ...)
Arguments
t_mat | an object that contains information about possible transitions. |
from_col | The name of the column in the data frame that contains "from" states. |
to_col | The name of the column in the data frame that contains "to" states. |
Examples
## Not run:
df <- data.frame(
  id = c(1, 1, 2, 2),
  from = c(1, 1, 2, 2),
  to = c(2, 3, 2, 2))
from_to_pairs(df)
## End(Not run)
(Cumulative) (Step-) Hazard Plots.
Description
geom_hazard is an extension of the geom_line, and is optimized for (cumulative) hazard plots. Essentially, it adds a (0,0) row to the data, if not already present. Stolen from the RcmdrPlugin.KMggplot2 (slightly modified).
Usage
geom_hazard(
  mapping = NULL, data = NULL,
  stat = "identity", position = "identity",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
geom_stephazard(
  mapping = NULL, data = NULL,
  stat = "identity", position = "identity", direction = "vh",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
geom_surv(
  mapping = NULL, data = NULL,
  stat = "identity", position = "identity",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
Arguments
mapping | Set of aesthetic mappings created by |
data | The data to be displayed in this layer. There are three options: If A A |
stat | The statistical transformation to use on the data for this layer. When using a
|
position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The
|
na.rm | If |
show.legend | logical. Should this layer be included in the legends? |
inherit.aes | If |
... | Other arguments passed on to
|
direction | direction of stairs: 'vh' for vertical then horizontal, 'hv' for horizontal then vertical, or 'mid' for step half-way between adjacent x-values. |
See Also
Examples
library(ggplot2)
library(pammtools)
ped <- tumor[10:50,] %>% as_ped(Surv(days, status)~1)
pam <- mgcv::gam(ped_status ~ s(tend), data=ped, family = poisson(), offset = offset)
ndf <- make_newdata(ped, tend = unique(tend)) %>% add_hazard(pam)
# piece-wise constant hazards
ggplot(ndf, aes(x = tend, y = hazard)) +
  geom_vline(xintercept = c(0, ndf$tend[c(1, (nrow(ndf)-2):nrow(ndf))]), lty = 3) +
  geom_hline(yintercept = c(ndf$hazard[1:3], ndf$hazard[nrow(ndf)]), lty = 3) +
  geom_stephazard() +
  geom_step(col=2) +
  geom_step(col=2, lty = 2, direction="vh")
# cumulative hazard
ndf <- ndf %>% add_cumu_hazard(pam)
ggplot(ndf, aes(x = tend, y = cumu_hazard)) +
  geom_hazard() +
  geom_line(col=2) # doesn't start at (0, 0)
# survival probability
ndf <- ndf %>% add_surv_prob(pam)
ggplot(ndf, aes(x = tend, y = surv_prob)) +
  geom_surv() +
  geom_line(col=2) # doesn't start at c(0,1)
Step ribbon plots.
Description
geom_stepribbon is an extension of the geom_ribbon, and is optimized for Kaplan-Meier plots with pointwise confidence intervals or a confidence band. The default direction-argument "hv" is appropriate for right-continuous step functions like the hazard rates etc. returned by pammtools.
Usage
geom_stepribbon(
  mapping = NULL, data = NULL,
  stat = "identity", position = "identity", direction = "hv",
  na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
Arguments
mapping | Set of aesthetic mappings created by |
data | The data to be displayed in this layer. There are three options: If A A |
stat | The statistical transformation to use on the data for this layer. When using a
|
position | A position adjustment to use on the data for this layer. This can be used in various ways, including to prevent overplotting and improving the display. The
|
direction | direction of stairs: 'vh' for vertical then horizontal, 'hv' for horizontal then vertical, or 'mid' for step half-way between adjacent x-values. |
na.rm | If |
show.legend | logical. Should this layer be included in the legends? |
inherit.aes | If |
... | Other arguments passed on to
|
See Also
geom_ribbon, geom_stepribbon
Examples
library(ggplot2)
huron <- data.frame(year = 1875:1972, level = as.vector(LakeHuron))
h <- ggplot(huron, aes(year))
h + geom_stepribbon(aes(ymin = level - 1, ymax = level + 1), fill = "grey70") +
  geom_step(aes(y = level))
h + geom_ribbon(aes(ymin = level - 1, ymax = level + 1), fill = "grey70") +
  geom_line(aes(y = level))
Calculate CIF for one cause
Description
Calculate CIF for one cause
Usage
get_cif(newdata, object, ...)
## Default S3 method:
get_cif(
  newdata, object,
  ci, time_var, alpha, nsim, cause_var,
  coefs, V, sim_coef_mat, ...)
Extract cumulative coefficients (cumulative hazard differences)
Description
These functions are designed to extract (or mimic) the cumulative coefficients usually used in additive hazards models (Aalen model) to depict (time-varying) covariate effects. For PAMMs, these are the differences between the cumulative hazard rates where all covariates except one have identical values. For a numeric covariate of interest, this calculates \Lambda(t|x+1) - \Lambda(t|x). For non-numeric covariates the cumulative hazard of the reference level is subtracted from the cumulative hazards evaluated at all non reference levels. Standard errors are calculated using the delta method.
Usage
get_cumu_coef(model, data = NULL, terms, ...)
## S3 method for class 'gam'
get_cumu_coef(model, data, terms, ...)
## S3 method for class 'aalen'
get_cumu_coef(model, data = NULL, terms, ci = TRUE, ...)
## S3 method for class 'cox.aalen'
get_cumu_coef(model, data = NULL, terms, ci = TRUE, ...)
Arguments
model | Object from which to extract cumulative coefficients. |
data | Additional data if necessary. |
terms | A character vector of variables for which the cumulative coefficient should be calculated. |
... | Further arguments passed to methods. |
ci | Logical. Indicates if confidence intervals should be returned as well. |
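For a numeric covariate, the cumulative coefficient \Lambda(t|x+1) - \Lambda(t|x) can be sketched in base R for a simple piece-wise exponential model with a log-linear covariate effect. The baseline values, the effect size beta and the helper `cumu_haz` are hypothetical and serve only to illustrate the definition.

```r
baseline <- c(-6, -5.5, -5)    # hypothetical log-baseline hazard per interval
beta     <- 0.3                # hypothetical log-hazard ratio for x
intlen   <- c(100, 100, 100)   # interval lengths

# cumulative hazard Lambda(t | x) at the interval end points
cumu_haz <- function(x) cumsum(intlen * exp(baseline + beta * x))

# cumulative coefficient: Lambda(t | x + 1) - Lambda(t | x), here with x = 0
cumu_coef <- cumu_haz(x = 1) - cumu_haz(x = 0)
cumu_coef
```

Under proportional hazards this difference equals (exp(beta) - 1) times the baseline cumulative hazard, so it grows over time even though the hazard ratio is constant.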
Calculate (or plot) cumulative effect for all time-points of the follow-up
Description
Calculate (or plot) cumulative effect for all time-points of the follow-up
Usage
get_cumu_eff(data, model, term, z1, z2 = NULL, se_mult = 2)
gg_cumu_eff(data, model, term, z1, z2 = NULL, se_mult = 2, ci = TRUE)
Arguments
data | Data used to fit the |
model | A suitable model object which will be used to estimate the partial effect of |
term | A character string indicating the model term for which partial effects should be plotted. |
z1 | The exposure profile for which to calculate the cumulative effect. Can be either a single number or a vector of same length as unique observation time points. |
z2 | If provided, calculated cumulative effect is for the difference between the two exposure profiles (g(z1,t)-g(z2,t)). |
se_mult | Multiplicative factor used to calculate confidence intervals (e.g., lower = fit - 2*se). |
ci | Logical. Indicates if confidence intervals for the |
Calculate cumulative hazard
Description
Calculate cumulative hazard
Usage
get_cumu_hazard(
  newdata, object,
  ci = TRUE, ci_type = c("default", "delta", "sim"),
  time_var = NULL, se_mult = 2,
  interval_length = "intlen", nsim = 100L, ...)
Arguments
newdata | A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If |
object | a fitted |
ci |
|
ci_type | The method by which standard errors/confidence intervals will be calculated. Default transforms the linear predictor at respective intervals. |
time_var | Name of the variable used for the baseline hazard. If not given, defaults to |
se_mult | Factor by which standard errors are multiplied for calculating the confidence intervals. |
interval_length | The variable in newdata containing the interval lengths. Can be either bare unquoted variable name or character. Defaults to |
... | Further arguments passed to |
Expand time-dependent covariates to functionals
Description
Given a formula specification of how time-dependent covariates affect the outcome, creates the respective functional covariates as well as auxiliary matrices for time/latency etc.
Usage
get_cumulative(data, formula)expand_cumulative(data, func, n_func)Arguments
data | Data frame (or similar) in which variables specified in ...will be looked for |
formula | A formula containing |
func | Single evaluated |
Obtain interval break points
Description
Default method works for data frames. The list method applies the default method to each data set within the list.
Usage
get_cut(data, formula, cut = NULL, ...)
## Default S3 method:
get_cut(data, formula, cut = NULL, max_time = NULL, event = 1L, ...)
## S3 method for class 'list'
get_cut(data, formula, cut = NULL, max_time = NULL, event = 1L, timescale = "gap", ...)
Extract event types
Description
Given a formula that specifies the status variable of the outcome, this function extracts the different event types (except for censoring, specified by censor_code).
Usage
get_event_types(data, formula, censor_code)
Arguments
data | Either an object inheriting from data frame or in case of time-dependent covariates a list of data frames (of length 2), where the first data frame contains the time-to-event information and static covariates while the second (and potentially further data frames) contain information on time-dependent covariates and the times at which they have been observed. |
formula | A two sided formula with a |
censor_code | Specifies the value of the status variable that indicates censoring. Often this will be |
Calculate predicted hazard
Description
Calculate predicted hazard
Usage
get_hazard(object, newdata, ...)
## Default S3 method:
get_hazard(object, newdata, reference = NULL, ci = TRUE, type = c("response", "link"), ci_type = c("default", "delta", "sim"), time_var = NULL, se_mult = 2, ...)
Arguments
object | a fitted |
newdata | A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If |
... | Further arguments passed to |
reference | A data frame with number of rows equal to |
ci |
|
type | Either |
ci_type | The method by which standard errors/confidence intervals will be calculated. Default transforms the linear predictor at respective intervals. |
time_var | Name of the variable used for the baseline hazard. If not given, defaults to |
se_mult | Factor by which standard errors are multiplied for calculating the confidence intervals. |
Information on intervals in which times fall
Description
Information on intervals in which times fall
Usage
get_intervals(x, times, ...)
## Default S3 method:
get_intervals(x, times, left.open = TRUE, rightmost.closed = TRUE, ...)
Arguments
x | An object from which interval information can be obtained, see |
times | A vector of times for which corresponding interval information should be returned. |
... | Further arguments passed to |
left.open | logical; if true all the intervals are open at left and closed at right; in the formulas below, |
rightmost.closed | logical; if true, the rightmost interval, |
Value
A data.frame containing information on intervals in which values of times fall.
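The arguments left.open and rightmost.closed have the same names as those of base R's findInterval. As a rough sketch of the lookup (an illustration only, not the code of get_intervals; the output column names are chosen for this example), times can be matched to left-open intervals like this:

```r
brks  <- c(0, 4.5, 5, 10, 30)
times <- c(4.5, 7, 30)

# With left.open = TRUE the intervals are (a, b]. Per ?findInterval,
# rightmost.closed then refers to the leftmost interval being closed.
idx <- findInterval(times, brks, left.open = TRUE, rightmost.closed = TRUE)

# Look up the borders of the interval each time falls into.
res <- data.frame(times = times, tstart = brks[idx], tend = brks[idx + 1])
res
```

Note that 4.5 is assigned to (0, 4.5] rather than (4.5, 5], because the intervals are closed on the right.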
See Also
Examples
set.seed(111018)
brks <- c(0, 4.5, 5, 10, 30)
int_info(brks)
x <- runif(3, 0, 30)
x
get_intervals(brks, x)
Construct or extract data that represents a lag-lead window
Description
Constructs lag-lead window data set from raw inputs or from data objects with suitable information stored in attributes, e.g., objects created by as_ped.
Usage
get_laglead(x, ...)
## Default S3 method:
get_laglead(x, tz, ll_fun, ...)
## S3 method for class 'data.frame'
get_laglead(x, ...)
Arguments
x | Either a numeric vector of follow-up cut points or a suitable object. |
... | Further arguments passed to methods. |
tz | A vector of exposure times |
ll_fun | Function that specifies how the lag-lead matrix should be constructed. First argument is the follow up time, second argument is the time of exposure. |
Examples
get_laglead(0:10, tz=-5:5, ll_fun=function(t, tz) { t >= tz + 2 & t <= tz + 2 + 3})
gg_laglead(0:10, tz=-5:5, ll_fun=function(t, tz) { t >= tz + 2 & t <= tz + 2 + 3})
Extract variables from the left-hand-side of a formula
Description
Extract variables from the left-hand-side of a formula
Extract variables from the right-hand side of a formula
Usage
get_lhs_vars(formula)
get_rhs_vars(formula)
Arguments
formula | A |
Extract variables from the left-hand-side of a formula
Description
Extract variables from the left-hand-side of a formula
Extract variables from the right-hand side of a formula
Usage
get_ped_form(formula, data = NULL, tdc_specials = c("concurrent", "cumulative"))
Arguments
formula | A |
Extract plot information for all special model terms
Description
Given a mgcv gamObject, returns the information used for the default plots produced by plot.gam.
Usage
get_plotinfo(x, ...)
Arguments
x | a fitted |
... | Further arguments passed to |
Calculate simulation based confidence intervals
Description
Calculate simulation based confidence intervals
Usage
get_sim_ci(newdata, object, alpha = 0.05, nsim = 100L, ...)
helper function for add_trans_ci
Description
helper function for add_trans_ci
Usage
get_sim_cumu(newdata, ...)
Calculate survival probabilities
Description
Calculate survival probabilities
Usage
get_surv_prob(newdata, object, ci = TRUE, ci_type = c("default", "delta", "sim"), se_mult = 2L, time_var = NULL, interval_length = "intlen", nsim = 100L, ...)
Arguments
newdata | A data frame or list containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If |
object | a fitted |
ci |
|
se_mult | Factor by which standard errors are multiplied for calculating the confidence intervals. |
time_var | Name of the variable used for the baseline hazard. If not given, defaults to |
interval_length | The variable in newdata containing the interval lengths. Can be either bare unquoted variable name or character. Defaults to |
... | Further arguments passed to |
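Survival probabilities follow from the cumulative hazard via S(t) = exp(-H(t)), where H(t) accumulates hazard times interval length. The base R sketch below uses made-up hazards and interval lengths; it only illustrates the transformation and is not the code behind get_surv_prob.

```r
# Hypothetical piece-wise constant hazards and the lengths of their intervals.
hazard <- c(0.10, 0.20, 0.05)
intlen <- c(1, 1, 2)

# Cumulative hazard at interval ends, then survival probability.
cumu_hazard <- cumsum(hazard * intlen)
surv_prob   <- exp(-cumu_hazard)
surv_prob
```

Because hazards are non-negative, the resulting survival curve is non-increasing over the intervals.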
Extract variables from the left-hand-side of a formula
Description
Extract variables from the left-hand-side of a formula
Extract variables from the right-hand side of a formula
Usage
get_tdc_form(formula, data = NULL, tdc_specials = c("concurrent", "cumulative"), invert = FALSE)
Arguments
formula | A |
Extract variables from the left-hand-side of a formula
Description
Extract variables from the left-hand-side of a formula
Extract variables from the right-hand side of a formula
Usage
get_tdc_vars(formula, specials = "cumulative", data = NULL)
Arguments
formula | A |
Extract partial effects for specified model terms
Description
Extract partial effects for specified model terms
Usage
get_term(data, fit, term, n = 100, ...)
Arguments
data | A data frame containing variables used to fit the model. Only first row will be used. |
fit | A fitted object of class |
term | The (non-linear) model term of interest. |
... | Further arguments passed to |
Extract the partial effects of non-linear model terms
Description
This function basically creates a new df from data for each term in terms, creating a range from minimum and maximum of the predict(fit, newdata=df, type="terms"). Terms are then stacked to a tidy data frame.
Usage
get_terms(data, fit, terms, ...)
Arguments
data | A data frame containing variables used to fit the model. Only first row will be used. |
fit | A fitted object of class |
terms | A character vector (can be length one). Specifies the terms for which partial effects will be returned |
... | Further arguments passed to |
Value
A tibble with 5 columns.
Examples
library(survival)
fit <- coxph(Surv(time, status) ~ pspline(karno) + pspline(age), data=veteran)
terms_df <- veteran %>% get_terms(fit, terms = c("karno", "age"))
head(terms_df)
tail(terms_df)
Forest plot of fixed coefficients
Description
Given a model object, returns a data frame with columns variable, coef (coefficient), ci_lower (lower 95% CI) and ci_upper (upper 95% CI).
Usage
gg_fixed(x, intercept = FALSE, ...)
Arguments
x | A model object. |
intercept | Logical, indicating whether intercept term should be included. Defaults to |
... | Currently not used. |
See Also
Examples
g <- mgcv::gam(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species, data=iris)
gg_fixed(g, intercept=TRUE)
gg_fixed(g)
Plot Lag-Lead windows
Description
Given data defining a Lag-lead window, returns respective plot as a ggplot2 object.
Usage
gg_laglead(x, ...)
## Default S3 method:
gg_laglead(x, tz, ll_fun, ...)
## S3 method for class 'LL_df'
gg_laglead(x, high_col = "grey20", low_col = "whitesmoke", grid_col = "lightgrey", ...)
## S3 method for class 'nested_fdf'
gg_laglead(x, ...)
Arguments
x | Either a numeric vector of follow-up cut points or a suitable object. |
... | Further arguments passed to methods. |
tz | A vector of exposure times |
ll_fun | Function that specifies how the lag-lead matrix should be constructed. First argument is the follow up time, second argument is the time of exposure. |
high_col | Color used to highlight exposure times within the lag-lead window. |
low_col | Color of exposure times outside the lag-lead window. |
grid_col | Color of grid lines. |
See Also
get_laglead
Examples
## Example 1: supply t, tz, ll_fun directly
gg_laglead(1:10, tz=-5:5, ll_fun=function(t, tz) { t >= tz + 2 & t <= tz + 2 + 3})
## Example 2: extract information on t, tz, ll_fun from data with respective attributes
data("simdf_elra", package = "pammtools")
gg_laglead(simdf_elra)
Visualize effect estimates for specific covariate combinations
Description
Depending on the plot function and input, creates either 1-dimensional slices, a bivariate surface or a (1D) cumulative effect.
Usage
gg_partial(data, model, term, ..., reference = NULL, ci = TRUE)
gg_partial_ll(data, model, term, ..., reference = NULL, ci = FALSE, time_var = "tend")
get_partial_ll(data, model, term, ..., reference = NULL, ci = FALSE, time_var = "tend")
Arguments
data | Data used to fit the |
model | A suitable model object which will be used to estimate the partial effect of |
term | A character string indicating the model term for which partial effects should be plotted. |
... | Covariate specifications (expressions) that will be evaluated by looking for variables in |
reference | If specified, should be a list with covariate value pairs, e.g. |
ci | Logical. Indicates if confidence intervals for the |
time_var | The name of the variable that was used in |
Plot Normal QQ plots for random effects
Description
Plot Normal QQ plots for random effects
Usage
gg_re(x, ...)
Arguments
x | a fitted |
... | Further arguments passed to |
See Also
Examples
library(pammtools)
data("patient")
ped <- patient %>% dplyr::slice(1:100) %>% as_ped(Surv(Survdays, PatientDied)~ ApacheIIScore + CombinedicuID,)
pam <- mgcv::gam(ped_status ~ s(tend) + ApacheIIScore + s(CombinedicuID, bs="re"), data=ped, family=poisson(), offset=offset)
gg_re(pam)
plot(pam, select = 2)
Plot 1D (smooth) effects
Description
Flexible, high-level plotting function for (non-linear) effects conditional on further covariate specifications and potentially relative to a comparison specification.
Usage
gg_slice(data, model, term, ..., reference = NULL, ci = TRUE)
Arguments
data | Data used to fit the |
model | A suitable model object which will be used to estimate the partial effect of |
term | A character string indicating the model term for which partial effects should be plotted. |
... | Covariate specifications (expressions) that will be evaluated by looking for variables in |
reference | If specified, should be a list with covariate value pairs, e.g. |
ci | Logical. Indicates if confidence intervals for the |
Examples
ped <- tumor[1:200, ] %>% as_ped(Surv(days, status) ~ . )
model <- mgcv::gam(ped_status~s(tend) + s(age, by = complications), data=ped, family = poisson(), offset=offset)
make_newdata(ped, age = seq_range(age, 20), complications = levels(complications))
gg_slice(ped, model, "age", age=seq_range(age, 20), complications=levels(complications))
gg_slice(ped, model, "age", age=seq_range(age, 20), complications=levels(complications), ci = FALSE)
gg_slice(ped, model, "age", age=seq_range(age, 20), complications=levels(complications), reference=list(age = 50))
Plot smooth 1d terms of gam objects
Description
Given a gam model this convenience function returns a plot of all smooth terms contained in the model. If more than one smooth is present, the different smooths are faceted.
Usage
gg_smooth(x, ...)
## Default S3 method:
gg_smooth(x, fit, ...)
Arguments
x | A data frame or object of class |
... | Further arguments passed to |
fit | A model object. |
Value
A ggplot object.
See Also
get_terms
Examples
g1 <- mgcv::gam(Sepal.Length ~ s(Sepal.Width) + s(Petal.Length), data=iris)
gg_smooth(iris, g1, terms=c("Sepal.Width", "Petal.Length"))
Plot tensor product effects
Description
Given a gam model this convenience function returns a ggplot2 object depicting 2d smooth terms specified in the model as heat/contour plots. If more than one 2d smooth term is present individual terms are faceted.
Usage
gg_tensor(x, ci = FALSE, ...)
Arguments
x | a fitted |
ci | A logical value indicating whether confidence intervals should be calculated and returned. Defaults to |
... | Further arguments passed to |
See Also
Examples
g <- mgcv::gam(Sepal.Length ~ te(Sepal.Width, Petal.Length), data=iris)
gg_tensor(g)
gg_tensor(g, ci=TRUE)
gg_tensor(update(g, .~. + te(Petal.Width, Petal.Length)))
Checks if data contains time-dependent covariates
Description
Checks if data contains time-dependent covariates
Usage
has_tdc(data, id_var)
Arguments
data | A data frame (potentially) containing time-dependent covariates. |
id_var | A character indicating the grouping variable. For each covariate it will be checked if their values change within a group specified by |
Value
Logical. TRUE if data contains time-dependent covariates, else FALSE.
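The check described above (do covariate values change within a group?) can be sketched in base R as follows. The function has_tdc_sketch and the toy data are hypothetical and only illustrate the idea; the package function may treat special columns differently.

```r
# Hypothetical helper: TRUE if any covariate varies within a level of id_var.
has_tdc_sketch <- function(data, id_var) {
  covars <- setdiff(names(data), id_var)
  varies <- vapply(covars, function(v) {
    # number of distinct values per group; > 1 means the covariate changes
    any(tapply(data[[v]], data[[id_var]], function(x) length(unique(x)) > 1))
  }, logical(1))
  any(varies)
}

df <- data.frame(
  id   = c(1, 1, 2, 2),
  age  = c(50, 50, 60, 60),  # constant within id
  dose = c(1, 2, 1, 1)       # changes within id 1
)
has_tdc_sketch(df, "id")
has_tdc_sketch(df[, c("id", "age")], "id")
```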
Create start/end times and interval information
Description
Given interval break points, returns data frame with information on interval start time, interval end time, interval length and a factor variable indicating the interval (left open intervals). If an object of class ped is provided, extracts unique interval information from object.
Usage
int_info(x, ...)
## Default S3 method:
int_info(x, min_time = 0L, ...)
## S3 method for class 'data.frame'
int_info(x, min_time = 0L, ...)
## S3 method for class 'ped'
int_info(x, ...)
## S3 method for class 'pamm'
int_info(x, ...)
Arguments
x | A numeric vector of cut points in which the follow-up should be partitioned in or object of class |
... | Currently ignored. |
min_time | Only intervals that have lower borders larger than this value will be included in the resulting data frame. |
Value
A data frame containing the start and end times of the intervals specified by the x argument. Additionally, the interval length, interval mid-point and a factor variable indicating the intervals.
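As a rough illustration of the kind of table described in Value, interval information can be derived from cut points in base R. The column names (tstart, tend, intlen, intmid, interval) and the prepended 0 are assumptions made for this sketch, not a statement about the exact output of int_info.

```r
cut  <- c(1, 2.3, 5)
brks <- sort(unique(c(0, cut)))  # assumption: follow-up starts at 0

tstart <- head(brks, -1)
tend   <- brks[-1]
labels <- paste0("(", tstart, ",", tend, "]")  # left-open interval labels

ints <- data.frame(
  tstart   = tstart,
  tend     = tend,
  intlen   = tend - tstart,
  intmid   = tstart + (tend - tstart) / 2,
  interval = factor(labels, levels = labels)
)
ints
```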
See Also
as_ped ped_info
Examples
## create interval information from cut points
int_info(c(1, 2.3, 5))
## extract interval information used to create ped object
tdf <- data.frame(time=c(1, 2.3, 5), status=c(0, 1, 0))
ped <- tdf %>% as_ped(Surv(time, status)~.,)
int_info(ped)
Create design matrix from a suitable object
Description
Create design matrix from a suitable object
Usage
make_X(object, ...)
## Default S3 method:
make_X(object, newdata, ...)
## S3 method for class 'gam'
make_X(object, newdata, ...)
Arguments
object | A suitable object from which a design matrix can be generated. Often a model object. |
newdata | A data frame from which design matrix will be constructed |
Create design matrix from a suitable object
Description
Create design matrix from a suitable object
Usage
## S3 method for class 'scam'
make_X(object, newdata, ...)
Arguments
object | A suitable object from which a design matrix can be generated. Often a model object. |
newdata | A data frame from which design matrix will be constructed |
Construct a data frame suitable for prediction
Description
This function provides a flexible interface to create a data set that can be plugged in as the newdata argument to a suitable predict function (or similar). The function is particularly useful in combination with one of the add_* functions, e.g., add_term, add_hazard, etc.
Usage
make_newdata(x, ...)
## Default S3 method:
make_newdata(x, ...)
## S3 method for class 'ped'
make_newdata(x, ...)
## S3 method for class 'fped'
make_newdata(x, ...)
Arguments
x | A data frame (or object that inherits from |
... | Covariate specifications (expressions) that will be evaluated by looking for variables in |
Details
Depending on the type of variables in x, mean or modus values will be used for variables not specified in ellipsis (see also sample_info). If x is an object that inherits from class ped, useful data set completion will be attempted depending on variables specified in ellipsis. This is especially useful when creating a data set with different time points, e.g. to calculate survival probabilities over time (add_surv_prob) or to calculate time-varying covariate effects (add_term). To do so, the time variable has to be specified in ..., e.g., tend = seq_range(tend, 20). The problem with this specification is that not all values produced by seq_range(tend, 20) will be actual values of tend used at the stage of estimation (and in general, it will often be tedious to specify exact tend values). make_newdata therefore finds the correct interval and sets tend to the respective interval endpoint. For example, if the intervals of the PED object are (0,1], (1,2] then tend = 1.5 will be set to 2 and the remaining time-varying information (e.g. offset) completed accordingly. See examples below.
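The snapping of requested tend values to interval end points can be illustrated in base R with findInterval. This is a sketch of the idea for the intervals (0,1], (1,2] mentioned above; the variable names are made up and this is not the code used by make_newdata.

```r
brks           <- c(0, 1, 2)      # break points of the intervals (0,1], (1,2]
tend_requested <- c(0.5, 1, 1.5)  # values a user might pass for tend

# Index of the left-open interval (a, b] that contains each requested value.
idx <- findInterval(tend_requested, brks, left.open = TRUE,
                    rightmost.closed = TRUE)

# Replace each requested value by the end point of its interval.
tend_used <- brks[idx + 1]
tend_used
```

Here 0.5 and 1 both map to the end point 1, while 1.5 maps to 2, matching the example in Details.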
Examples
# General functionality
tumor %>% make_newdata()
tumor %>% make_newdata(age=c(50))
tumor %>% make_newdata(days=seq_range(days, 3), age=c(50, 55))
tumor %>% make_newdata(days=seq_range(days, 3), status=unique(status), age=c(50, 55))
# mean/modus values of unspecified variables are calculated over whole data
tumor %>% make_newdata(sex=unique(sex))
tumor %>% group_by(sex) %>% make_newdata()
# Examples for PED data
ped <- tumor %>% slice(1:3) %>% as_ped(Surv(days, status)~., cut = c(0, 500, 1000))
ped %>% make_newdata(age=c(50, 55))
# if time information is specified, other time variables will be specified
# accordingly and offset calculated correctly
ped %>% make_newdata(tend = c(1000), age = c(50, 55))
ped %>% make_newdata(tend = unique(tend))
ped %>% group_by(sex) %>% make_newdata(tend = unique(tend))
# tend is set to the end point of respective interval:
ped <- tumor %>% as_ped(Surv(days, status)~.)
seq_range(ped$tend, 3)
make_newdata(ped, tend = seq_range(tend, 3))
Create matrix components for cumulative effects
Description
These functions are called internally by get_cumulative and should usually not be called directly.
Usage
make_time_mat(data, nz)
make_latency_mat(data, tz)
make_lag_lead_mat(data, tz, ll_fun = function(t, tz) t >= tz)
make_z_mat(data, z_var, nz, ...)
Arguments
data | A data set (or similar) from which meta information on cut-points, interval-specific time, covariates etc. can be obtained. |
z_var | Which should be transformed into functional covariate format suitable to fit cumulative effects in |
Calculate the modus
Description
Calculate the modus
Usage
modus(var)
Arguments
var | An atomic vector |
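The modus (mode) is the most frequent value of a vector. A minimal base R sketch, using a hypothetical helper modus_sketch that returns the value as a character string and resolves ties by taking the first of the sorted values (both are choices made for this illustration, not documented behaviour of modus):

```r
# Hypothetical helper: most frequent value of an atomic vector (as character).
modus_sketch <- function(var) {
  freqs <- table(var)             # counts per distinct value
  names(freqs)[which.max(freqs)]  # first value with the highest count
}

modus_sketch(c("a", "b", "b", "c"))
modus_sketch(factor(c("no", "yes", "no")))
```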
Create nested data frame from data with time-dependent covariates
Description
Provides methods to nest data with time-dependent covariates (TDCs). A formula must be provided where the right hand side (RHS) contains the structure of the TDCs.
Usage
nest_tdc(data, formula, ...)
## Default S3 method:
nest_tdc(data, formula, ...)
## S3 method for class 'list'
nest_tdc(data, formula, ...)
Arguments
data | A suitable data structure (e.g. unnested data frame with concurrent TDCs or a list where each element is a data frame, potentially containing TDCs as specified in the RHS of |
formula | A two sided formula with a two part RHS, where the second part indicates the structure of the TDCs. |
... | Further arguments passed to methods. |
Fit a piece-wise exponential additive model
Description
A thin wrapper around gam; however, some arguments are prespecified: family=poisson() and offset=data$offset. These two cannot be overwritten. In many cases it will also be advisable to set method="REML".
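The prespecified family and offset reflect the fact that a piece-wise exponential model can be fitted as a Poisson regression on interval-split data, with the log time at risk as offset. The base R sketch below shows this for a constant hazard using stats::glm instead of gam; the toy data, the helper split_one and the hand-written split are hypothetical and stand in for what as_ped and pamm would normally do.

```r
# Hypothetical toy data: follow-up time and event indicator (1 = event).
df  <- data.frame(id = 1:4, time = c(0.5, 1.5, 2.5, 3), status = c(1, 0, 1, 1))
cut <- c(0, 1, 2, 3)  # interval break points

# Split the follow-up of subject i into one row per interval at risk.
split_one <- function(i) {
  tstart  <- head(cut, -1)
  tend    <- cut[-1]
  at_risk <- tstart < df$time[i]
  t_exit  <- pmin(tend[at_risk], df$time[i])  # end of time at risk per row
  n_rows  <- sum(at_risk)
  data.frame(
    id         = df$id[i],
    tstart     = tstart[at_risk],
    tend       = tend[at_risk],
    offset     = log(t_exit - tstart[at_risk]),       # log time at risk
    ped_status = c(rep(0, n_rows - 1), df$status[i])  # event only in last row
  )
}
ped <- do.call(rbind, lapply(seq_len(nrow(df)), split_one))

# Constant-hazard model: Poisson GLM with the log time at risk as offset.
fit <- glm(ped_status ~ 1, family = poisson(), data = ped, offset = offset)
hazard_hat <- unname(exp(coef(fit)))  # equals events / total time at risk
hazard_hat
```

In this toy example there are 3 events over 7.5 units of time at risk, so the estimated hazard is 0.4. Replacing the intercept-only formula by smooth terms of tend is what turns this into an additive model.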
Usage
pamm(formula, data = list(), ..., trafo_args = NULL, engine = "gam")
is.pamm(x)
## S3 method for class 'pamm'
print(x, ...)
## S3 method for class 'pamm'
summary(object, ...)
## S3 method for class 'pamm'
plot(x, ...)
Arguments
formula | A GAM formula, or a list of formulae (see |
data | A data frame or list containing the model response variable and covariates required by the formula. By default the variables are taken from |
... | Further arguments passed to |
trafo_args | A named list. If data is not in PED format, |
engine | Character name of the function that will be called to fit the model. The intended entries are either |
x | Any R object. |
object | An object of class |
See Also
Examples
ped <- tumor[1:100, ] %>% as_ped(Surv(days, status) ~ complications, cut = seq(0, 3000, by = 50))
pam <- pamm(ped_status ~ s(tend) + complications, data = ped)
summary(pam)
## Alternatively
pamm(ped_status ~ s(tend) + complications, data = tumor[1:100, ], trafo_args = list(formula = Surv(days, status)~complications))
Survival data of critically ill ICU patients
Description
A data set containing the survival time (or hospital release time) among other covariates. The full data is available here. The following variables are provided:
- Year
The year of ICU Admission
- CombinedicuID
Intensive Care Unit (ICU) ID
- CombinedID
Patient identifier
- Survdays
Survival time of patients. Here it is assumed that patients survive until t=30 if released from hospital.
- PatientDied
Status indicator; 1=death, 0=censoring
- survhosp
Survival time in hospital. Here it is assumed that patients are censored at time of hospital release (potentially informative)
- Gender
Male or female
- Age
The patient's age at Admission
- AdmCatID
Admission category: medical, surgical elective or surgical emergency
- ApacheIIScore
The patient's Apache II Score at Admission
- BMI
Patient's Body Mass Index
- DiagID2
Diagnosis at admission in 9 categories
Usage
patient
Format
An object of class data.frame with 2000 rows and 12 columns.
Extract interval information and median/modus values for covariates
Description
Given an object of class ped, returns data frame with one row for each interval containing interval information, mean values for numerical variables and modus for non-numeric variables in the data set.
Usage
ped_info(ped)
## S3 method for class 'ped'
ped_info(ped)
Arguments
ped | An object of class |
Value
A data frame with one row for each unique interval in ped.
See Also
Examples
ped <- tumor[1:4,] %>% as_ped(Surv(days, status)~ sex + age)
ped_info(ped)
S3 method for pamm objects for compatibility with package pec
Description
S3 method for pamm objects for compatibility with package pec
Usage
## S3 method for class 'pamm'
predictSurvProb(object, newdata, times, ...)
Arguments
object | A fitted model from which to extract predicted survival probabilities |
newdata | A data frame containing predictor variable combinations for which to compute predicted survival probabilities. |
times | A vector of times in the range of the response variable, e.g. times when the response is a survival object, at which to return the survival probabilities. |
... | Additional arguments that are passed on to the current method. |
Extract information on concurrent effects
Description
Extract information on concurrent effects
Usage
prep_concurrent(x, formula, ...)
## S3 method for class 'list'
prep_concurrent(x, formula, ...)
Arguments
x | A suitable object from which variables contained in |
... | Further arguments passed to methods. |
Draw random numbers from piece-wise exponential distribution.
Description
This is a copy of the function rpexp from package msm, copied here to reduce dependencies.
Usage
rpexp(n = 1, rate = 1, t = 0)
Arguments
n | number of observations. If |
rate | vector of rates. |
t | vector of the same length as |
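One common way to draw from a piece-wise exponential distribution is to invert the cumulative hazard: draw a unit exponential variate and find the time at which the cumulative hazard reaches it. The base R sketch below follows that idea. The helper rpexp_sketch is hypothetical, it assumes that t holds the times at which the rate changes (starting at 0) and that all rates are positive, and it is not a copy of the msm or pammtools code.

```r
# Hypothetical sampler: rate[i] applies from t[i] until t[i + 1] (last: to Inf).
rpexp_sketch <- function(n, rate, t) {
  stopifnot(length(rate) == length(t), t[1] == 0, all(rate > 0))
  # cumulative hazard accumulated at each change point t[1], ..., t[k]
  H <- c(0, cumsum(rate[-length(rate)] * diff(t)))
  e <- rexp(n)             # target cumulative hazards (unit exponential)
  i <- findInterval(e, H)  # piece in which the target is reached
  t[i] + (e - H[i]) / rate[i]
}

set.seed(1)
x <- rpexp_sketch(1e4, rate = c(1, 0.1), t = c(0, 2))
mean(x <= 2)  # should be close to 1 - exp(-2)
```

The check in the last line uses the fact that the survival probability at time 2 is exp(-1 * 2) for a rate of 1 on the first piece.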
Extract information of the sample contained in a data set
Description
Given a data set and grouping variables, this function returns mean values for numeric variables and modus for characters and factors. Usually this function should not be called directly but will rather be called as part of a call to make_newdata.
Usage
sample_info(x)
## S3 method for class 'data.frame'
sample_info(x)
## S3 method for class 'ped'
sample_info(x)
## S3 method for class 'fped'
sample_info(x)
Arguments
x | A data frame (or object that inherits from |
Value
A data frame containing sample information (for each group). If applied to an object of class ped, the sample means of the original data are returned. Note: When applied to a ped object that doesn't contain covariates (only interval information), returns data frame with 0 columns.
Generate a sequence over the range of a vector
Description
Stolen from here
Usage
seq_range(x, n, by, trim = NULL, expand = NULL, pretty = FALSE)
Arguments
x | A numeric vector |
n,by | Specify the output sequence either by supplying the length of the sequence with n, or the spacing between values with by. I recommend that you name these arguments in order to make it clear to the reader. |
trim | Optionally, trim values off the tails. |
expand | Optionally, expand the range by |
pretty | If |
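Ignoring trim, expand and pretty, the two ways of specifying the sequence correspond to the length.out and by arguments of base R's seq applied to the range of x. The following sketch shows that correspondence on a small made-up vector; it is an illustration, not the implementation of seq_range.

```r
x   <- c(3, 10, 7, 1)
rng <- range(x)  # minimum and maximum of x

# Fixed number of points across the range (analogous to n = 4).
by_length <- seq(rng[1], rng[2], length.out = 4)

# Fixed spacing across the range (analogous to by = 3).
by_step <- seq(rng[1], rng[2], by = 3)

by_length
by_step
```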
Examples
x <- rcauchy(100)
seq_range(x, n = 10)
seq_range(x, n = 10, trim = 0.1)
seq_range(x, by = 1, trim = 0.1)
# Make pretty sequences
y <- runif(100)
seq_range(y, n = 10)
seq_range(y, n = 10, pretty = TRUE)
seq_range(y, n = 10, expand = 0.5, pretty = TRUE)
seq_range(y, by = 0.1)
seq_range(y, by = 0.1, pretty = TRUE)
Simulate survival times from the piece-wise exponential distribution
Description
Simulate survival times from the piece-wise exponential distribution
Usage
sim_pexp(formula, data, cut)
Arguments
formula | An extended formula that specifies the linear predictor. If you want to include a smooth baseline or time-varying effects, use |
data | A data set with variables specified in |
cut | A sequence of time-points starting with 0. |
Examples
library(survival)
library(dplyr)
library(pammtools)
# set number of observations/subjects
n <- 250
# create data set with variables which will affect the hazard rate.
df <- cbind.data.frame(x1 = runif(n, -3, 3), x2 = runif(n, 0, 6)) %>% as_tibble()
# the formula which specifies how covariates affect the hazard rate
f0 <- function(t) { dgamma(t, 8, 2) * 6 }
form <- ~ -3.5 + f0(t) - 0.5*x1 + sqrt(x2)
set.seed(24032018)
sim_df <- sim_pexp(form, df, 1:10)
head(sim_df)
plot(survfit(Surv(time, status)~1, data = sim_df))
# for control, estimate with Cox PH
mod <- coxph(Surv(time, status) ~ x1 + pspline(x2), data=sim_df)
coef(mod)[1]
layout(matrix(1:2, nrow=1))
termplot(mod, se = TRUE)
# and using PAMs
layout(1)
ped <- sim_df %>% as_ped(Surv(time, status)~., max_time=10)
library(mgcv)
pam <- gam(ped_status ~ s(tend) + x1 + s(x2), data=ped, family=poisson, offset=offset)
coef(pam)[2]
plot(pam, page=1)
## Not run:
# Example 2: Functional covariates/cumulative coefficients
# function to generate one exposure profile, tz is a vector of time points
# at which TDC z was observed
rng_z = function(nz) { as.numeric(arima.sim(n = nz, list(ar = c(.8, -.6)))) }
# two different exposure times for two different exposures
tz1 <- 1:10
tz2 <- -5:5
# generate exposures and add to data set
df <- df %>% add_tdc(tz1, rng_z) %>% add_tdc(tz2, rng_z)
df
# define tri-variate function of time, exposure time and exposure z
ft <- function(t, tmax) { -1*cos(t/tmax*pi) }
fdnorm <- function(x) (dnorm(x,1.5,2)+1.5*dnorm(x,7.5,1))
wpeak2 <- function(lag) 15*dnorm(lag,8,10)
wdnorm <- function(lag) 5*(dnorm(lag,4,6)+dnorm(lag,25,4))
f_xyz1 <- function(t, tz, z) { ft(t, tmax=10) * 0.8*fdnorm(z)* wpeak2(t - tz) }
f_xyz2 <- function(t, tz, z) { wdnorm(t-tz) * z }
# define lag-lead window function
ll_fun <- function(t, tz) {t >= tz}
ll_fun2 <- function(t, tz) {t - 2 >= tz}
# simulate data with cumulative effect
sim_df <- sim_pexp(
  formula = ~ -3.5 + f0(t) -0.5*x1 + sqrt(x2) |
    fcumu(t, tz1, z.tz1, f_xyz=f_xyz1, ll_fun=ll_fun) +
    fcumu(t, tz2, z.tz2, f_xyz=f_xyz2, ll_fun=ll_fun2),
  data = df, cut = 0:10)
## End(Not run)
Simulate data for competing risks scenario
Description
Simulate data for competing risks scenario
Usage
sim_pexp_cr(formula, data, cut)
Simulated data with cumulative effects
Description
This is data simulated using the sim_pexp function. It contains two time-constant and two time-dependent covariates (observed on different exposure time grids). The code used for simulation is contained in the examples of ?sim_pexp.
Usage
simdf_elra
Format
An object of class nested_fdf (inherits from sim_df, tbl_df, tbl, data.frame) with 250 rows and 9 columns.
New basis for penalized lag selection
Description
Originally proposed in Obermeier et al., 2015, Flexible Distributed Lags for Modelling Earthquake Data, Journal of the Royal Statistical Society: Series C (Applied Statistics), 10.1111/rssc.12077. Here extended in order to penalize lead times in addition to lag times. Ideally the lag-lead window would then be selected in a data-driven fashion. Treat as experimental.
Usage
## S3 method for class 'fdl.smooth.spec'
smooth.construct(object, data, knots)
Arguments
object | An object handled by mgcv |
data | The data set |
knots | A vector of knots |
Function to transform data without time-dependent covariates into piece-wise exponential data format
Description
Function to transform data without time-dependent covariates into piece-wise exponential data format
Usage
split_data(formula, data, cut = NULL, max_time = NULL, multiple_id = FALSE, ...)
Arguments
formula | A two sided formula with a |
data | Either an object inheriting from data frame or in case of time-dependent covariates a list of data frames (of length 2), where the first data frame contains the time-to-event information and static covariates while the second (and potentially further data frames) contain information on time-dependent covariates and the times at which they have been observed. |
cut | Split points, used to partition the follow up into intervals. If unspecified, all unique event times will be used. |
max_time | If |
multiple_id | Are occurrences of same id allowed (per transition). Defaults to |
... | Further arguments passed to the |
See Also
Split data to obtain recurrent event data in PED format
Description
Currently, the input data must be in start-stop notation for each spell and contain a column that indicates the spell (event number).
Usage
split_data_multistate(formula, data, transition = character(), cut = NULL, max_time = NULL, event = 1L, min_events = 1L, timescale = c("gap", "calendar"), ...)
Arguments
formula | A two sided formula with a |
data | Either an object inheriting from data frame or in case of time-dependent covariates a list of data frames (of length 2), where the first data frame contains the time-to-event information and static covariates while the second (and potentially further data frames) contain information on time-dependent covariates and the times at which they have been observed. |
transition | A character indicating the column in data that indicates the event/episode number for recurrent events. |
cut | Split points, used to partition the follow up into intervals. If unspecified, all unique event times will be used. |
max_time | If |
event | The value that encodes the occurrence of an event in the data set. |
min_events | Minimum number of events for each event number. |
timescale | Defines the timescale for the recurrent event data transformation. Defaults to |
... | Further arguments passed to the |
See Also
Time until Staphylococcus aureus infection in children, with possible recurrence
Description
This dataset originates from the Drakenstein child health study. The data contains the following variables:
- id
Randomly generated unique child ID
- t.start
The time at which the child enters the risk set for the k-th event
- t.stop
Time of k-th infection or censoring.
- enum
Event number. Maximum of 6.
- hiv
Usage
staph
Format
An object of class tbl_df (inherits from tbl, data.frame) with 374 rows and 6 columns.
Extract fixed coefficient table from model object
Description
Given a model object, returns a data frame with columns variable, coef (coefficient), ci_lower (lower 95% CI) and ci_upper (upper 95% CI).
Usage
tidy_fixed(x, ...)
## S3 method for class 'gam'
tidy_fixed(x, intercept = FALSE, ...)
## S3 method for class 'coxph'
tidy_fixed(x, ...)
Arguments
x | A model object. |
... | Currently not used. |
intercept | Should intercept also be returned? Defaults to |
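A table of this shape can be sketched by hand for a Cox model. The snippet below is a minimal illustration, not the package's implementation: it assumes Wald-type intervals (estimate plus or minus the normal quantile times the standard error) and uses the lung data shipped with the survival package. The object names fit and tab are hypothetical.

```r
library(survival)

# Cox model on the lung data from the survival package
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

est <- coef(fit)               # log-hazard ratios
se  <- sqrt(diag(vcov(fit)))   # standard errors
z   <- qnorm(0.975)            # normal quantile for a 95% interval

# One row per fixed coefficient, with the column names used in the description
tab <- data.frame(
  variable = names(est),
  coef     = unname(est),
  ci_lower = unname(est - z * se),
  ci_upper = unname(est + z * se)
)
tab
```

Whether tidy_fixed uses exactly this multiplier for its intervals is an assumption here, so small numerical differences from its output are possible.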
Examples
library(survival)
library(pammtools) # provides the tumor data and tidy_fixed()
gc <- coxph(Surv(days, status) ~ age + sex, data = tumor)
tidy_fixed(gc)
Extract random effects in tidy data format.
Description
Extract random effects in tidy data format.
Usage
tidy_re(x, keep = c("fit", "main", "xlab", "ylab"), ...)
Arguments
x | a fitted |
keep | A vector of variables to keep. |
... | Further arguments passed to |
See Also
Extract 1d smooth objects in tidy data format.
Description
Extract 1d smooth objects in tidy data format.
Usage
tidy_smooth(x, keep = c("x", "fit", "se", "xlab", "ylab"), ci = TRUE, ...)
Arguments
x | a fitted |
keep | A vector of variables to keep. |
ci | A logical value indicating whether confidence intervals should be calculated and returned. Defaults to |
... | Further arguments passed to |
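The kind of output described here (one row per grid point, with the smooth's estimate, its standard error and optional interval bounds) can be sketched with mgcv alone. The snippet below is an illustration under stated assumptions, not the method used by tidy_smooth: it evaluates the smooth through predict() with type = "terms", uses simulated data, and builds approximate 95% intervals as fit plus or minus 2 standard errors. The names dat, grid, smooth_df, ci_lower and ci_upper are hypothetical.

```r
library(mgcv)

# Simulated data with one nonlinear covariate effect
set.seed(1)
dat <- data.frame(x = runif(200))
dat$y <- sin(2 * pi * dat$x) + rnorm(200, sd = 0.3)

fit <- gam(y ~ s(x), data = dat)

# Evaluate the smooth term on a regular grid of covariate values
grid <- data.frame(x = seq(0.01, 0.99, length.out = 50))
pr <- predict(fit, newdata = grid, type = "terms", se.fit = TRUE)

# Tidy representation: one row per grid point
smooth_df <- data.frame(
  x   = grid$x,
  fit = pr$fit[, "s(x)"],
  se  = pr$se.fit[, "s(x)"]
)

# Approximate pointwise 95% interval (assumed multiplier of 2)
smooth_df$ci_lower <- smooth_df$fit - 2 * smooth_df$se
smooth_df$ci_upper <- smooth_df$fit + 2 * smooth_df$se

head(smooth_df)
```

A data frame in this long format can be passed directly to ggplot2, for example with geom_line() for fit and geom_ribbon() for the interval.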
Extract 2d smooth objects in tidy format.
Description
Extract 2d smooth objects in tidy format.
Usage
tidy_smooth2d(
  x,
  keep = c("x", "y", "fit", "se", "xlab", "ylab", "main"),
  ci = FALSE,
  ...
)
Arguments
x | a fitted |
keep | A vector of variables to keep. |
ci | A logical value indicating whether confidence intervals should be calculated and returned. Defaults to |
... | Further arguments passed to |
Stomach area tumor data
Description
Information on patients treated for a cancer disease located in the stomach area. The data set includes:
- days
Time from operation until death in days.
- status
Event indicator (0 = censored, 1 = death).
- age
The subject's age.
- sex
The subject's sex (male/female).
- charlson_score
Charlson comorbidity score, 1-6.
- transfusion
Has subject received transfusions (no/yes).
- complications
Did major complications occur during operation (no/yes).
- metastases
Did the tumor develop metastases? (no/yes).
- resection
Was the operation accompanied by a major resection (no/yes).
Usage
tumor
Format
An object of class tbl_df (inherits from tbl, data.frame) with 776 rows and 9 columns.
Warn if new t_j are used
Description
Warn if new t_j are used
Usage
warn_about_new_time_points(object, newdata, ...)
## S3 method for class 'pamm'
warn_about_new_time_points(object, newdata, ...)
Warn if new t_j are used
Description
Warn if new t_j are used
Usage
## S3 method for class 'glm'
warn_about_new_time_points(object, newdata, time_var, ...)