| Type: | Package |
| Title: | Direct Parametric Inference for the Cumulative Incidence Function in Competing Risks |
| Version: | 0.0.2 |
| Date: | 2025-04-16 |
| Description: | Implements parametric (Direct) regression methods for modeling cumulative incidence functions (CIFs) in the presence of competing risks. Methods include the direct Gompertz-based approach and generalized regression models as described in Jeong and Fine (2006) <doi:10.1111/j.1467-9876.2006.00532.x> and Jeong and Fine (2007) <doi:10.1093/biostatistics/kxj040>. The package facilitates maximum likelihood estimation and variance computation, with applications to clinical trials and survival analysis. |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| Imports: | ggplot2, dplyr, tidyr, numDeriv, cmprsk, tidyselect, stats, Rcpp |
| LinkingTo: | Rcpp, RcppEigen |
| Depends: | R (≥ 4.1.2) |
| LazyData: | true |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Suggests: | roxygen2 |
| NeedsCompilation: | yes |
| Packaged: | 2025-05-02 18:22:34 UTC; habib |
| Author: | Habib Ezzatabadipour [aut, cre] |
| Maintainer: | Habib Ezzatabadipour <habibezati@outlook.com> |
| Repository: | CRAN |
| Date/Publication: | 2025-05-02 22:00:10 UTC |
Direct Parametric Inference for the Cumulative Incidence Function in Competing Risks
Description
The cmpp package provides parametric (Direct) modeling methods for analyzing cumulative incidence functions (CIFs) in the context of competing risks. It includes Gompertz-based models, regression techniques, and parametric (Direct) approaches such as the Generalized odds rate (GOR), Proportional Odds Model (POM), and Proportional Hazards Model (PHM). The package enables users to estimate and compare CIFs using maximum likelihood estimation, perform regression analysis, and visualize CIFs with confidence intervals. It also supports covariate adjustment and bootstrap variance estimation.
Details
The cmpp package offers functions for modeling cumulative incidence functions (CIFs) directly using the Gompertz distribution and generalized regression models.
Key features include:
Direct parametric modeling for cumulative incidence functions.
Maximum likelihood estimation of parameters.
Regression analysis with covariates, including treatment effects.
Visualization of CIFs with confidence intervals.
Covariate-adjusted CIF estimation.
Bootstrap variance estimation for model parameters.
Commonly used functions include:
Initialize: Initializes the data for the Cmpp model.
LogLike1: Computes the negative log-likelihood for the model without covariate effect.
compute_grad: Computes the gradient of the log-likelihood.
compute_hessian: Computes the Hessian matrix of the log-likelihood.
estimate_parameters_GOR: Estimates parameters using the Generalized odds rate (GOR).
estimate_parameters_POM: Estimates parameters using the Proportional Odds Model (POM).
estimate_parameters_PHM: Estimates parameters using the Proportional Hazards Model (PHM).
CIF_res1: Computes CIF results for competing risks without covariates.
CIF_Figs: Plots CIFs with confidence intervals (without covariate effect).
Cmpp_CIF: Computes and plots CIFs for competing risks using GOR, POM, and PHM.
FineGray_Model: Fits a Fine-Gray regression model for competing risks data and visualizes the resulting CIFs using cmprsk::cuminc and cmprsk::crr.
bootstrap_variance: Estimates variance of parameters using the bootstrap method.
GetData: Retrieves initialized data from the Cmpp model.
Cleanup: Cleans up memory by deleting the Cmpp instance.
Author(s)
Habib Ezzatabadipour <habibezati@outlook.com>
References
Jeong, J.-H., & Fine, J. (2006). Direct parametric inference for the cumulative incidence function. Applied Statistics, 55(2), 187-200.
Jeong, J.-H., & Fine, J. (2007). Parametric regression on cumulative incidence function. Biostatistics, 8(2), 184-196.
See Also
Initialize, LogLike1, compute_grad, compute_hessian, estimate_parameters_GOR, estimate_parameters_POM, estimate_parameters_PHM, CIF_res1, CIF_Figs, Cmpp_CIF, FineGray_Model, bootstrap_variance, GetData, Cleanup
Examples
## Example: Initialize the Cmpp model and compute CIFs
library(cmpp)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
timee <- rexp(100, rate = 1/10)
Initialize(features, timee, delta1, delta2, h = 1e-5) # Initialize the Cmpp model
# Estimate parameters using the Generalized odds rate (GOR)
initial_params <- rep(0.001, 2 * (ncol(features) + 3))
result <- estimate_parameters_GOR(initial_params)
print(result)
# Compute CIFs for competing risks (without covariate effect | Not Regression model)
cif_results <- CIF_res1()
print(cif_results)
# Plot CIFs with confidence intervals
plot <- CIF_Figs(rep(0.01, 4), timee)
print(plot)
# Compute and plot adjusted CIFs
result_cif <- Cmpp_CIF(
  featureID = c(1, 2),
  featureValue = c(0.5, 1.2),
  RiskNames = c("Event1", "Event2"),
  TypeMethod = "GOR",
  predTime = seq(0, 10, by = 0.5)
)
print(result_cif$Plot$Plot_InputModel) # Plot for the specified model
print(result_cif$CIF$CIFAdjusted) # Adjusted CIF values
# Fit a Fine-Gray model for competing risks
result_fg <- FineGray_Model(
  CovarNames = c("Covar1", "Covar2", "Covar3"),
  Failcode = 1,
  RiskNames = c("Event1", "Event2")
)
print(result_fg$Results) # Summary of the Fine-Gray model
print(result_fg$Plot) # Plot of the CIFs
# Clean up memory
Cleanup()
Plot Cumulative Incidence Functions (CIF) with Confidence Intervals
Description
This function plots the cumulative incidence functions (CIF) for two competing risks based on the estimated parameters and their variances. It includes confidence intervals for the CIFs.
Usage
CIF_Figs(initial_params, TimeFailure, OrderType = c(2, 1), RiskNames = NULL)
Arguments
initial_params | A numeric vector of initial parameter values to start the optimization. |
TimeFailure | A numeric vector of failure times corresponding to observations. |
OrderType | A numeric vector indicating the order of the competing risks. Default is c(2, 1). |
RiskNames | A character vector of names for the competing risks. Default is NULL. |
Details
This function performs the following steps:
Estimates the model parameters using the estimate_parameters function.
Computes the Hessian matrix using the compute_hessian function.
Ensures that the diagonal elements of the covariance matrix are positive.
Computes the cumulative incidence functions (CIF) for two competing risks.
Plots the CIFs along with their confidence intervals.
Value
A ggplot object showing the CIFs and their confidence intervals.
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
initial_params <- c(0.001, 0.001, 0.001, 0.001)
result <- CIF_res1(initial_params)
print(result)
initial_params <- c(0.01, 0.01, 0.01, 0.01)
TimeFailure <- seq(0, 10, by = 0.1)
plot <- CIF_Figs(initial_params, TimeFailure)
print(plot)
Compute Cumulative Incidence Function (CIF) Results
Description
This function estimates the parameters of the model, computes the Hessian matrix, and calculates the variances and p-values for the parameters. It ensures that the diagonal elements of the covariance matrix are positive.
Usage
CIF_res1(initial_params = rep(0.001, 4))
Arguments
initial_params | A numeric vector of initial parameter values to start the optimization. Default is rep(0.001, 4). |
Details
This function performs the following steps:
Estimates the model parameters using the estimate_parameters function.
Computes the Hessian matrix using the compute_hessian function.
Ensures that the diagonal elements of the covariance matrix are positive.
Calculates the variances and p-values for the parameters.
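The steps above follow the usual Wald recipe: invert the Hessian of the negative log-likelihood at the optimum, take the diagonal as variances, and turn estimate/standard-error ratios into two-sided p-values. The sketch below illustrates that recipe in base R on a toy exponential model. The data, the nll function, and the theta1/theta2 names are hypothetical and are not part of cmpp, and the package's internal computation may differ in detail.

```r
# Toy data (hypothetical): exponential failure times whose rate depends on a group.
set.seed(1)
g <- rep(0:1, each = 100)
y <- rexp(200, rate = exp(0.5 + 0.3 * g))

# Negative log-likelihood of the toy model (log-rate is linear in the group).
nll <- function(theta) {
  rate <- exp(theta[1] + theta[2] * g)
  -sum(dexp(y, rate = rate, log = TRUE))
}

# Minimize with BFGS and ask optim() for a numerical Hessian at the optimum.
fit <- optim(c(0, 0), nll, method = "BFGS", hessian = TRUE)

covmat <- solve(fit$hessian)     # inverse observed information
vars <- diag(covmat)             # these must be positive to be valid variances
std <- sqrt(vars)
z <- fit$par / std               # Wald statistics
pval <- 2 * pnorm(-abs(z))       # two-sided p-values

wald <- data.frame(
  Params = c("theta1", "theta2"),
  Estimate = fit$par,
  STD = std,
  Pvalue = pval
)
print(wald)
```

A non-positive entry in vars usually signals that the optimizer stopped away from a proper minimum, which is why a positivity check on the covariance diagonal is a sensible safeguard.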
Value
A data frame containing:
Params | The parameter names ("alpha1", "beta1", "alpha2", "beta2"). |
STD | The standard deviations of the parameters. |
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
initial_params <- c(0.001, 0.001, 0.001, 0.001)
result <- CIF_res1(initial_params)
print(result)
Clean up memory by deleting the pointer to the Cmpp instance
Description
This function cleans up and deletes the instance of the Cmpp class in the C++ code. It ensures proper memory management and prevents memory leaks by deleting the pointer to the Cmpp object when it is no longer needed. Call this function once you are done with the Cmpp object so that no memory is leaked.
Usage
Cleanup()
Details
The Cleanup function must be called after using the Cmpp object to clean up the allocated memory in C++. Failure to call this function may result in memory leaks, as the memory allocated for the Cmpp object is not automatically freed.
Value
No return value. Called for side effects.
Examples
# Assuming you have previously initialized the Cmpp object with `Initialize()`
Cleanup()
Compute and Plot Cumulative Incidence Functions (CIF) for Competing Risks
Description
This function computes and plots the cumulative incidence functions (CIF) for competing risks using three parametric models: Generalized odds rate (GOR), Proportional Odds Model (POM), and Proportional Hazards Model (PHM). It allows for adjusted CIFs based on specific covariate values and provides visualizations for all models.
Usage
Cmpp_CIF(featureID = NULL, featureValue = NULL, RiskNames = NULL, TypeMethod = "GOR", predTime = NULL)
Arguments
featureID | A numeric vector of indices specifying the features to adjust. Default is NULL. |
featureValue | A numeric vector of values corresponding to the features specified in featureID. Default is NULL. |
RiskNames | A character vector specifying the names of the competing risks. Default is NULL. |
TypeMethod | A character string specifying the model to use for plotting. Must be one of "GOR", "POM", or "PHM". Default is "GOR". |
predTime | A numeric vector of time points for which CIFs are computed. Default is NULL. |
Details
This function performs the following steps:
Estimates the model parameters for GOR, POM, and PHM using the estimate_parameters_GOR, estimate_parameters_POM, and estimate_parameters_PHM functions.
Computes the CIFs for the specified time points and covariate values.
Generates plots for the CIFs, including adjusted CIFs based on specific covariate values.
Provides separate plots for each model and a combined plot for all models.
If featureID and featureValue are provided, the function adjusts the CIFs based on the specified covariate values. If RiskNames is not provided, the default names "Risk1" and "Risk2" are used. The TypeMethod parameter determines which model's CIF plot is returned in the output.
Value
A list containing:
Time | A list with the input time points, time points for adjusted plots, and time points for null plots. |
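CIF | A list with the computed CIF values, including CIFAdjusted (the adjusted CIF values). |
Plot | A list of plots, including Plot_InputModel (the plot for the model selected by TypeMethod) and PlotAdjusted_AllModels (adjusted CIFs for all models). |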
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
result <- Cmpp_CIF(
  featureID = c(1, 2),
  featureValue = c(0.5, 1.2),
  RiskNames = c("Event1", "Event2"),
  TypeMethod = "GOR",
  predTime = seq(0, 10, by = 0.5)
)
print(result$Plot$Plot_InputModel) # Plot for the specified model
print(result$Plot$PlotAdjusted_AllModels) # Adjusted CIFs for all models
print(result$CIF$CIFAdjusted) # Adjusted CIF values
Compute the CDF of the Parametric Generalized odds rate (GOR)
Description
This function computes the cumulative distribution function (CDF) of the parametric model (GOR Approach).
Arguments
Params | A numeric vector of parameters. |
Z | A numeric vector of covariates. |
x | A numeric value representing the time point. |
Value
A numeric value representing the CDF.
Examples
library(cmpp)
set.seed(321)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/2)
Initialize(features, x, delta1, delta2, h = 1e-3)
params <- rep(0.001, (ncol(features) + 3))
y <- 0.07
z <- features[1, ]
(cdf_value <- F_cdf_rcpp(params, z, y))
Compute the CDF of the Parametric Proportional Odds Model (POM)
Description
This function computes the cumulative distribution function (CDF) of the parametric model (POM Approach).
Arguments
Params | A numeric vector of parameters. |
Z | A numeric vector of covariates. |
x | A numeric value representing the time point. |
Value
A numeric value representing the CDF.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/2)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, (ncol(features) + 2))
x <- 2
cdf_value <- F_cdf_rcpp2(params, features[1, ], x)
print(cdf_value)
Compute the CDF of the Parametric Proportional Hazards Model (PHM)
Description
This function computes the cumulative distribution function (CDF) of the parametric model (PHM Approach).
Arguments
Params | A numeric vector of parameters. |
Z | A numeric vector of covariates. |
x | A numeric value representing the time point. |
Value
A numeric value representing the CDF.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/10)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, (ncol(features) + 2))
x <- 5
cdf_value <- F_cdf_rcpp3(params, features[1, ], x)
print(cdf_value)
Fine-Gray Model for Competing Risks Data
Description
This function fits a Fine-Gray model for competing risks data using the cmprsk package. It estimates the subdistribution hazard model parameters, computes cumulative incidence functions (CIFs), and provides a summary of the results along with a plot of the CIFs.
Usage
FineGray_Model(CovarNames = NULL, Failcode = 1, RiskNames = NULL)
Arguments
CovarNames | A character vector of names for the covariates. Default is NULL. |
Failcode | An integer specifying the event of interest (default is 1). |
RiskNames | A character vector specifying the names of the competing risks. Default is NULL. |
Details
This function retrieves the data initialized in the Cmpp model using the GetData function. It uses the crr function from the cmprsk package to fit the Fine-Gray model for competing risks. The function also computes cumulative incidence functions (CIFs) using the cuminc function and generates a plot of the CIFs for the competing risks.
Value
A list containing:
Results | A summary of the Fine-Gray model fit. |
Plot | A ggplot object showing the cumulative incidence functions (CIFs) for the competing risks. |
CIF_Results | A data frame containing the CIFs for the competing risks, along with their corresponding time points. |
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
result <- FineGray_Model(
  CovarNames = c("Covar1", "Covar2", "Covar3"),
  Failcode = 1,
  RiskNames = c("Event1", "Event2")
)
print(result$Results) # Summary of the Fine-Gray model
#print(result$Plot) # Plot of the CIFs
print(result$CIF_Results) # CIF data
Retrieve Initialized Data from the Cmpp Model
Description
This function retrieves the data initialized in the Cmpp model, including the feature matrix, failure times, and competing risks indicators (delta1 and delta2).
Details
This function requires the Cmpp model to be initialized using the Initialize function. It retrieves the data stored in the Cmpp object, which includes the feature matrix, failure times, and the binary indicators for competing risks. If the Cmpp object is not initialized, the function will throw an error.
Value
A list containing:
features | A numeric matrix of predictor variables. Each row corresponds to an observation. |
timee | A numeric vector of failure times corresponding to observations. |
delta1 | A binary vector indicating the occurrence of the first competing event (1 for observed). |
delta2 | A binary vector indicating the occurrence of the second competing event (1 for observed). |
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
data <- GetData()
print(data$features) # Feature matrix
print(data$timee) # Failure times
print(data$delta1) # Indicator for the first competing event
print(data$delta2) # Indicator for the second competing event
Get Dimensions of the Cmpp Object
Description
This function returns the number of samples and features stored in the Cmpp object. It is primarily used to retrieve the dimensions of the data within the class.
Usage
GetDim()
Details
The GetDim function allows users to access the internal dimensions of the Cmpp class instance, such as the number of samples (Nsamp) and the number of features (Nfeature). This is useful when working with large datasets, especially for checking the size of the data without needing to manually access the underlying Eigen::MatrixXd or Eigen::VectorXd objects directly.
Value
A list containing:
Nsamp | Number of samples (rows in the feature matrix). |
Nfeature | Number of features (columns in the feature matrix). |
Examples
# Initialize Cmpp object
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
# Get dimensions
dims <- GetDim()
dims$Nsamp # Number of samples
dims$Nfeature # Number of features
Initialize Data for the Cmpp Model
Description
This function initializes the data used in the Cmpp model by storing the feature matrix, failure times, and the competing risks indicators in the model environment. These are required for subsequent computations.
Usage
Initialize(features, x, delta1, delta2, h)
Arguments
features | A numeric matrix of predictor variables. Each row corresponds to an observation. |
x | A numeric vector of failure times corresponding to observations. |
delta1 | A binary vector indicating the occurrence of the first competing event (1 for observed). |
delta2 | A binary vector indicating the occurrence of the second event (1 for observed). |
h | A numeric value specifying the step size for numerical gradient computations. |
Details
This function does not return any value but sets up internal data structures required for model computation. Ensure that features, x, delta1, and delta2 have matching lengths or dimensions.
Value
This function returns NULL. The initialized data is stored in the package environment.
Examples
library(cmpp)
features <- matrix(rnorm(100), ncol = 5)
x <- rexp(20) # failure times must be non-negative, so draw them from an exponential
delta1 <- sample(0:1, 20, replace = TRUE)
delta2 <- 1 - delta1
Initialize(features, x, delta1, delta2, h = 1e-5)
Compute the Log-Likelihood for the Model
Description
Computes the negative log-likelihood of the Cmpp model given parameters and the initialized data. The log-likelihood considers Gompertz distributions for competing risks.
Usage
LogLike1(param)
Arguments
param | A numeric vector of model parameters: alpha1, beta1, alpha2, beta2. |
Details
This function requires the data to be initialized using Initialize before being called. The log-likelihood is based on survival probabilities derived from the Gompertz distributions.
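To make the structure of such a likelihood concrete, here is a minimal base-R sketch. It assumes the Gompertz CIF parametrization and the likelihood form of Jeong and Fine (2006), in which an event of type k contributes the sub-density f_k(t) and a censored observation contributes 1 - F_1(t) - F_2(t). The helper names gomp_cif, gomp_pdf, and neg_loglik are hypothetical, and the package's C++ implementation may differ.

```r
# Gompertz cumulative incidence function (assumed parametrization).
gomp_cif <- function(t, alpha, beta) {
  1 - exp(beta * (1 - exp(alpha * t)) / alpha)
}

# Its derivative with respect to t (the sub-density).
gomp_pdf <- function(t, alpha, beta) {
  beta * exp(alpha * t) * exp(beta * (1 - exp(alpha * t)) / alpha)
}

# Negative log-likelihood for two competing risks;
# param = c(alpha1, beta1, alpha2, beta2).
neg_loglik <- function(param, time, delta1, delta2) {
  F1 <- gomp_cif(time, param[1], param[2])
  f1 <- gomp_pdf(time, param[1], param[2])
  F2 <- gomp_cif(time, param[3], param[4])
  f2 <- gomp_pdf(time, param[3], param[4])
  # Probability of no event by time t; guarded so the log stays finite.
  surv <- pmax(1 - F1 - F2, .Machine$double.eps)
  -sum(delta1 * log(f1) + delta2 * log(f2) +
         (1 - delta1 - delta2) * log(surv))
}

# Toy data (hypothetical): status 0 = censored, 1 = event 1, 2 = event 2.
set.seed(2)
time <- runif(60, 0.1, 5)
status <- sample(0:2, 60, replace = TRUE)
delta1 <- as.integer(status == 1)
delta2 <- as.integer(status == 2)

nll_value <- neg_loglik(c(0.1, 0.05, 0.1, 0.05), time, delta1, delta2)
print(nll_value)
```

The result is a single number, so a function of this shape can be passed straight to an optimizer such as optim().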
Value
A single numeric value representing the negative log-likelihood.
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
param <- c(0.01, 0.01, 0.01, 0.01)
LogLike1(param)
Estimate Variance of Parameters Using Bootstrap Method
Description
This function estimates the variance of model parameters using the bootstrap method. It repeatedly samples the data with replacement, estimates the parameters for each sample, and computes the variance of the estimated parameters.
Usage
bootstrap_variance(features, x, delta1, delta2, initial_params, n_bootstrap, optimMethod)
Arguments
features | A numeric matrix of predictor variables. Each row corresponds to an observation. |
x | A numeric vector of failure times corresponding to observations. |
delta1 | A binary vector indicating the occurrence of the first competing event (1 for observed). |
delta2 | A binary vector indicating the occurrence of the second event (1 for observed). |
initial_params | A numeric vector of initial parameter values to start the optimization. |
n_bootstrap | An integer specifying the number of bootstrap samples. |
optimMethod | A character string specifying the optimization method to use. Default is |
Details
This function performs bootstrap sampling to estimate the variance of the model parameters. It requires the data to be initialized using Initialize before being called.
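The resampling idea can be sketched in a few lines of base R. The example below is an illustration under simplifying assumptions, not the package's implementation: it uses a toy estimator (the exponential rate and the implied median) in place of the Cmpp model fit, and the names estimator and boot_out are hypothetical.

```r
set.seed(42)
time <- rexp(80, rate = 0.5)   # hypothetical failure times
n_bootstrap <- 200

# Toy estimator standing in for a model fit: exponential MLE and implied median.
estimator <- function(t) {
  c(rate = 1 / mean(t), median = log(2) * mean(t))
}

# Resample rows with replacement, re-estimate, and stack one row per replicate.
bootstrap_estimates <- t(replicate(n_bootstrap, {
  idx <- sample.int(length(time), replace = TRUE)
  estimator(time[idx])
}))

# The bootstrap variance is the column-wise variance of the replicate estimates.
variances <- apply(bootstrap_estimates, 2, var)

boot_out <- list(variances = variances,
                 bootstrap_estimates = bootstrap_estimates)
print(boot_out$variances)
```

With a real model, the observations (features, failure time, and both indicators) would be resampled together by row so that each replicate keeps every subject's data intact.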
Value
A list containing:
variances: A numeric vector representing the variance of the estimated parameters.
bootstrap_estimates: A matrix of parameter estimates for each bootstrap sample.
Examples
library(cmpp)
features <- matrix(rnorm(100), ncol = 5)
x <- rexp(20) # failure times must be non-negative, so draw them from an exponential
delta1 <- sample(0:1, 20, replace = TRUE)
delta2 <- 1 - delta1
initial_params <- c(0.01, 0.01, 0.01, 0.01)
n_bootstrap <- 100
results <- bootstrap_variance(features, x, delta1, delta2, initial_params, n_bootstrap, optimMethod = "BFGS")
print(results$variances)
print(results$bootstrap_estimates)
Compute the CDF of the Gompertz Distribution
Description
Calculates the cumulative distribution function (CDF) of the Gompertz distribution for given input values and parameters.
Usage
cdf_gomp(x, alpha, beta)
Arguments
x | A numeric vector of non-negative input values (e.g., failure times). |
alpha | A positive numeric value representing the shape parameter. |
beta | A positive numeric value representing the scale parameter. |
Details
The Gompertz distribution is commonly used in survival analysis and reliability studies. Ensure that alpha and beta are positive for meaningful results.
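For orientation, the sketch below evaluates a Gompertz CDF in base R using the parametrization of Jeong and Fine (2006), F(x) = 1 - exp(beta * (1 - exp(alpha * x)) / alpha). That cdf_gomp uses exactly this form is an assumption, and gompertz_cdf is a hypothetical helper, not a package function.

```r
# Gompertz CDF under the assumed Jeong-Fine parametrization.
gompertz_cdf <- function(x, alpha, beta) {
  stopifnot(all(x >= 0), alpha > 0, beta > 0)  # inputs as described in Arguments
  1 - exp(beta * (1 - exp(alpha * x)) / alpha)
}

x <- c(0, 1, 2, 3)
cdf_vals <- gompertz_cdf(x, alpha = 0.5, beta = 0.1)
print(cdf_vals)
```

With positive alpha and beta the curve starts at 0 for x = 0 and increases with x, which is the behavior expected of a CDF on non-negative times.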
Value
A numeric vector of the CDF values for each input in x.
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
x <- c(1, 2, 3)
alpha <- 0.5
beta <- 0.1
lapply(x, cdf_gomp, alpha = alpha, beta = beta) |> unlist()
Compute the Numerical Gradient of the Log-Likelihood
Description
Calculates the gradient of the negative log-likelihood using finite differences. The function uses a small step size (h) defined during initialization.
Usage
compute_grad(param)
Arguments
param | A numeric vector of parameters for which the gradient is calculated. |
Details
This function approximates the gradient using central finite differences. Ensure that h is appropriately set to avoid numerical instability.
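A central finite difference approximates each partial derivative as (f(p + h e_i) - f(p - h e_i)) / (2h), where e_i is the i-th unit vector. The base-R sketch below shows the technique on a toy function with a known gradient. The helper central_grad and the test function f are hypothetical and independent of the package's C++ code.

```r
# Central finite-difference gradient of f at param with step size h.
central_grad <- function(f, param, h = 1e-5) {
  vapply(seq_along(param), function(i) {
    step <- replace(numeric(length(param)), i, h)  # h in coordinate i, 0 elsewhere
    (f(param + step) - f(param - step)) / (2 * h)
  }, numeric(1))
}

# Toy function with a closed-form gradient for comparison.
f <- function(p) sum(p^2) + p[1] * p[2]
p0 <- c(0.5, 0.1, 0.6, 0.2)

grad_num <- central_grad(f, p0)
grad_exact <- 2 * p0 + c(p0[2], p0[1], 0, 0)
grad_err <- max(abs(grad_num - grad_exact))
print(grad_err)
```

The step size is a trade-off: a large h increases truncation error, while a very small h amplifies floating-point rounding error because two nearly equal function values are subtracted and divided by 2h.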
Value
A numeric vector of the same length as param, representing the gradient at the specified parameters.
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
param <- c(0.5, 0.1, 0.6, 0.2)
compute_grad(param)
Compute the Hessian Matrix of the Log-Likelihood
Description
Calculates the Hessian matrix of the negative log-likelihood function using finite differences. This function is useful for understanding the curvature of the log-likelihood surface and for optimization purposes.
Usage
compute_hessian(param)
Arguments
param | A numeric vector of parameters for which the Hessian matrix is calculated. |
Details
This function approximates the Hessian matrix using central finite differences. Ensure that the step size h is appropriately set during initialization to avoid numerical instability. The function requires the data to be initialized using Initialize before being called.
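One common central-difference scheme for second derivatives evaluates the function at the four corners p ± h e_i ± h e_j and combines them as (f(++) - f(+-) - f(-+) + f(--)) / (4h^2). The base-R sketch below applies that scheme to a toy function with a known Hessian. The helper central_hessian is hypothetical, and the package's C++ routine may use a different stencil.

```r
# Central finite-difference Hessian of f at param with step size h.
central_hessian <- function(f, param, h = 1e-4) {
  n <- length(param)
  H <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      ei <- replace(numeric(n), i, h)
      ej <- replace(numeric(n), j, h)
      H[i, j] <- (f(param + ei + ej) - f(param + ei - ej) -
                    f(param - ei + ej) + f(param - ei - ej)) / (4 * h^2)
    }
  }
  H
}

# Toy function: the exact Hessian is 2 * I plus 1 in positions (1, 2) and (2, 1).
f <- function(p) sum(p^2) + p[1] * p[2]
p0 <- c(0.5, 0.1, 0.6)

H_num <- central_hessian(f, p0)
H_exact <- diag(2, 3)
H_exact[1, 2] <- H_exact[2, 1] <- 1
hess_err <- max(abs(H_num - H_exact))
print(hess_err)
```

Because the difference is divided by h squared, second derivatives are more sensitive to rounding error than gradients, so a step that works for a gradient can be too small for a Hessian.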
Value
A numeric matrix representing the Hessian matrix at the specified parameters.
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
# Estimate model parameters using default initial values and the BFGS method
result <- estimate_parameters()
print(result)
param <- c(0.5, 0.1, 0.6, 0.2)
hessian <- compute_hessian(param)
print(hessian)
Compute the Gradient of the Log-Likelihood Function Generalized odds rate (GOR)
Description
This function computes the gradient of the log-likelihood function for the parametric model (GOR Approach).
Arguments
Params | A numeric vector of parameters. |
Value
A numeric vector representing the gradient of the log-likelihood.
Examples
library(cmpp)
set.seed(1984) # set the seed before any random draws so the example is reproducible
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/3)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, 2 * (ncol(features) + 3))
gradient <- compute_log_f_gradient_rcpp(params)
print(gradient)
Compute the Gradient of the Log-Likelihood Function Proportional Odds Model (POM)
Description
This function computes the gradient of the log-likelihood function for the parametric model (POM Approach).
Arguments
Params | A numeric vector of parameters. |
Value
A numeric vector representing the gradient of the log-likelihood.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/5)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, 2 * (ncol(features) + 2))
gradient <- compute_log_f_gradient_rcpp2(params)
print(gradient)
Compute the Gradient of the Log-Likelihood Function Proportional Hazards Model (PHM)
Description
This function computes the gradient of the log-likelihood function for the parametric model (PHM Approach).
Arguments
Params | A numeric vector of parameters. |
Value
A numeric vector representing the gradient of the log-likelihood.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/10)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, 2 * (ncol(features) + 2))
gradient <- compute_log_f_gradient_rcpp3(params)
print(gradient)
Compute the Hessian Matrix of the Log-Likelihood Function Generalized odds rate (GOR)
Description
This function computes the Hessian matrix of the log-likelihood function for the parametric model (GOR Approach).
Arguments
Params | A numeric vector of parameters. |
Value
A numeric matrix representing the Hessian matrix of the log-likelihood.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/7)
Initialize(features, x, delta1, delta2, h = 1e-4)
params <- rep(0.001, 2 * (ncol(features) + 3))
hessian <- compute_log_f_hessian_rcpp(params)
print(hessian)
Estimate Model Parameters Using Optimization
Description
This function estimates the parameters of a model by minimizing the negative log-likelihood function using the specified optimization method. It utilizes the optim() function in R, with the provided initial parameter values and gradient computation. The optimization method can be specified, with "BFGS" being the default.
Usage
estimate_parameters(initial_params = rep(0.01, 4), optimMethod = 'BFGS')
Arguments
initial_params | A numeric vector of initial parameter values to start the optimization. Default is a vector of four values, all set to 0.01. |
optimMethod | A character string specifying the optimization method to use. The default is "BFGS". |
Details
The estimate_parameters function performs parameter estimation by minimizing the negative log-likelihood function using the chosen optimization method. The function requires an initial guess for the parameters (a numeric vector) and will optimize the log-likelihood function. The optimization also takes into account the gradient of the log-likelihood function, which is computed using the compute_grad function. The result of the optimization is an object of class optim containing the estimated parameters and other details of the optimization process.
The optimization method can be specified via the optimMethod argument. The default method is "BFGS", but any method supported by R's optim() function (such as "Nelder-Mead", "CG", etc.) can be used.
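The pattern described here, a negative log-likelihood plus its gradient handed to optim(), can be sketched in base R as follows. The toy exponential model and the names nll and nll_grad are hypothetical stand-ins for the package's likelihood and compute_grad. Only the optim() call itself mirrors the documented behavior.

```r
set.seed(7)
y <- rexp(200, rate = 2)   # hypothetical failure times

# Negative log-likelihood of an exponential model, with theta = log(rate).
nll <- function(theta) -sum(dexp(y, rate = exp(theta), log = TRUE))

# Analytic gradient of nll with respect to theta.
nll_grad <- function(theta) -(length(y) - exp(theta) * sum(y))

# Minimize with BFGS, supplying the gradient explicitly.
fit_bfgs <- optim(par = 0, fn = nll, gr = nll_grad, method = "BFGS")

rate_hat <- exp(fit_bfgs$par)   # back-transform to the rate scale
print(c(estimate = rate_hat, closed_form = 1 / mean(y)))
print(fit_bfgs$convergence)     # 0 indicates successful convergence
```

Swapping method = "BFGS" for another optim() method changes only that one argument. Methods that do not use gradients, such as "Nelder-Mead", simply ignore gr.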
Value
An optim object containing the optimization results, including the estimated parameters, value of the objective function at the optimum, and other optimization details.
See Also
stats::optim for more details on optimization methods and usage.
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
# Estimate model parameters using default initial values and the BFGS method
result <- estimate_parameters()
print(result)
Estimate Parameters for the Generalized odds rate (GOR)
Description
This function estimates the parameters of the Generalized odds rate (GOR) using maximum likelihood estimation. It computes the Hessian matrix, calculates standard errors, and derives p-values for the estimated parameters. The function ensures that the diagonal elements of the covariance matrix are positive for valid variance estimates.
Usage
estimate_parameters_GOR(initial_params, FeaturesNames = NULL)
Arguments
initial_params | A numeric vector of initial parameter values to start the optimization. Default is |
FeaturesNames | A character vector specifying the names of the features (covariates). If |
Details
This function performs the following steps:
- Estimates the model parameters using the optim function with the BFGS method.
- Computes the gradient of the log-likelihood using the compute_log_f_gradient_rcpp function.
- Computes the Hessian matrix numerically using the hessian function from the numDeriv package.
- Ensures that the diagonal elements of the covariance matrix are positive to avoid invalid variance estimates.
- Calculates standard errors and p-values for the estimated parameters.
The Generalized odds rate (GOR) is a parametric model for cumulative incidence functions in competing risks analysis. It uses Gompertz distributions to model the failure times for competing events.
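The steps above (optimize, take a numerical Hessian, invert it, then derive standard errors and p-values) can be sketched on a toy normal model in base R. This is an illustration under stated assumptions, not the package's implementation: stats::optimHess stands in for numDeriv::hessian, the p-values are assumed to come from a two-sided Wald z-test, and non-positive variances are simply set to NA.

```r
set.seed(2)
y <- rnorm(150, mean = 3, sd = 2)

# Negative log-likelihood of a normal model; par = c(mean, log(sd)).
negloglik <- function(par) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Step 1: maximum likelihood estimates via BFGS.
fit <- optim(c(0, 0), negloglik, method = "BFGS")

# Step 2: numerical Hessian of the negative log-likelihood at the optimum
# (observed information); optimHess is used here instead of numDeriv::hessian.
H <- optimHess(fit$par, negloglik)

# Step 3: invert to get the covariance matrix and guard against
# non-positive variances on the diagonal.
vcov_hat <- solve(H)
variances <- diag(vcov_hat)
variances[variances <= 0] <- NA_real_

# Step 4: standard errors and two-sided Wald p-values (an assumed test).
se <- sqrt(variances)
z <- fit$par / se
p_value <- 2 * pnorm(-abs(z))

result <- data.frame(
  Parameter = c("mean", "log_sd"),
  Estimate = fit$par,
  S.E = se,
  PValue = p_value
)
print(result)
```

The column names mirror the data frame described under Value, but the parameter labels here are specific to the toy model.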
Value
A data frame containing:
Parameter | The parameter names, including |
Estimate | The estimated parameter values. |
S.E | The standard errors of the estimated parameters. |
PValue | The p-values for the estimated parameters. |
See Also
stats::optim, compute_log_f_gradient_rcpp, log_f_rcpp, compute_log_f_hessian_rcpp.
Examples
library(cmpp)
# Example data
set.seed(371)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/4)
# Initialize the Cmpp model
Initialize(features, x, delta1, delta2, h = 1e-5)
# Define initial parameter values
initial_params <- rep(0.001, 2 * (ncol(features) + 3))
# Estimate parameters using the GOR
result <- estimate_parameters_GOR(initial_params)
print(result)
Estimate Parameters for the Proportional Hazards Model (PHM)
Description
This function estimates the parameters of the Proportional Hazards Model (PHM) using maximum likelihood estimation. It computes the Hessian matrix, calculates standard errors, and derives p-values for the estimated parameters. The function ensures that the diagonal elements of the covariance matrix are positive for valid variance estimates.
Usage
estimate_parameters_PHM(initial_params, FeaturesNames = NULL)
Arguments
initial_params | A numeric vector of initial parameter values to start the optimization. Default is |
FeaturesNames | A character vector specifying the names of the features (covariates). If |
Details
This function performs the following steps:
- Estimates the model parameters using the optim function with the BFGS method.
- Computes the gradient of the log-likelihood using the compute_log_f_gradient_rcpp3 function.
- Computes the Hessian matrix numerically using the hessian function from the numDeriv package.
- Ensures that the diagonal elements of the covariance matrix are positive to avoid invalid variance estimates.
- Calculates standard errors and p-values for the estimated parameters.
The Proportional Hazards Model (PHM) is a parametric model for cumulative incidence functions in competing risks analysis. It uses Gompertz distributions to model the failure times for competing events.
Value
A data frame containing:
Parameter | The parameter names, including |
Estimate | The estimated parameter values. |
S.E | The standard errors of the estimated parameters. |
PValue | The p-values for the estimated parameters. |
See Also
stats::optim, compute_log_f_gradient_rcpp3, log_f_rcpp3.
Examples
library(cmpp)
set.seed(1984)
# Example data
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/10)
# Initialize the Cmpp model
Initialize(features, x, delta1, delta2, h = 1e-5)
# Define initial parameter values
initial_params <- rep(0.001, 2 * (ncol(features) + 2))
# Estimate parameters using the PHM
result <- estimate_parameters_PHM(initial_params)
print(result)
Estimate Parameters for the Proportional Odds Model (POM)
Description
This function estimates the parameters of the Proportional Odds Model (POM) using maximum likelihood estimation. It computes the Hessian matrix, calculates standard errors, and derives p-values for the estimated parameters. The function ensures that the diagonal elements of the covariance matrix are positive for valid variance estimates.
Usage
estimate_parameters_POM(initial_params, FeaturesNames = NULL)
Arguments
initial_params | A numeric vector of initial parameter values to start the optimization. Default is |
FeaturesNames | A character vector specifying the names of the features (covariates). If |
Details
This function performs the following steps:
- Estimates the model parameters using the optim function with the BFGS method.
- Computes the gradient of the log-likelihood using the compute_log_f_gradient_rcpp2 function.
- Computes the Hessian matrix numerically using the hessian function from the numDeriv package.
- Ensures that the diagonal elements of the covariance matrix are positive to avoid invalid variance estimates.
- Calculates standard errors and p-values for the estimated parameters.
The Proportional Odds Model (POM) is a parametric model for cumulative incidence functions in competing risks analysis. It uses Gompertz distributions to model the failure times for competing events.
Value
A data frame containing:
Parameter | The parameter names, including |
Estimate | The estimated parameter values. |
S.E | The standard errors of the estimated parameters. |
PValue | The p-values for the estimated parameters. |
See Also
stats::optim, compute_log_f_gradient_rcpp2, log_f_rcpp2.
Examples
library(cmpp)
set.seed(1984)
# Example data
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/10)
# Initialize the Cmpp model
Initialize(features, x, delta1, delta2, h = 1e-5)
# Define initial parameter values
initial_params <- rep(0.001, 2 * (ncol(features) + 2))
# Estimate parameters using the POM
result <- estimate_parameters_POM(initial_params)
print(result)
Compute the PDF of the Parametric Generalized odds rate (GOR)
Description
This function computes the probability density function (PDF) of the parametric model (GOR Approach).
Arguments
Params | A numeric vector of parameters. |
Z | A numeric vector of covariates. |
x | A numeric value representing the time point. |
Value
A numeric value representing the PDF.
Examples
library(cmpp)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/10)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, (ncol(features) + 3))
pdf_value <- f_pdf_rcpp(params, features[1, ], x[3])
print(pdf_value)
Compute the PDF of the Parametric Proportional Odds Model (POM)
Description
This function computes the probability density function (PDF) of the parametric model (POM Approach).
Arguments
Params | A numeric vector of parameters. |
Z | A numeric vector of covariates. |
x | A numeric value representing the time point. |
Value
A numeric value representing the PDF.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/9)
Initialize(features, x, delta1, delta2, h = 1e-4)
params <- rep(0.001, (ncol(features) + 2))
pdf_value <- f_pdf_rcpp2(params, features[1, ], x[3])
print(pdf_value)
Compute the PDF of the Parametric Proportional Hazards Model (PHM)
Description
This function computes the probability density function (PDF) of the parametric model (PHM Approach).
Arguments
Params | A numeric vector of parameters. |
Z | A numeric vector of covariates. |
x | A numeric value representing the time point. |
Value
A numeric value representing the PDF.
Examples
library(cmpp)
set.seed(21)
features <- matrix(rnorm(300, -1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.0001, (ncol(features) + 2))
pdf_value <- f_pdf_rcpp3(params, features[4, ], x[4])
print(pdf_value)
Fertility History of Rural Women in Shiraz
Description
This dataset includes fertility history information from a cross-sectional study of 652 women aged 15–49 years in rural areas of Shiraz, Iran. It was used in the article "A parametric method for cumulative incidence modeling with a new four-parameter log-logistic distribution" to model the cumulative incidence of live births and competing risks (stillborn fetus or abortion).
Usage
fertility_data
Format
A data frame with 652 rows and the following variables:
- id
Unique identifier for each case.
- time
Time from marriage to event (live birth, competing event, or censoring).
- Event
Event indicator: 0 = censored, 1 = live birth, 2 = stillborn fetus or abortion.
- age
Age of the woman at the time of the event or censoring.
- Education
Education level: 1 = Illiterate, 2 = Primary/Secondary, 3 = Higher Education.
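As a small illustration of the Event coding above, the two cause-specific indicators used in competing risks analysis can be derived directly from a column coded 0/1/2. The toy data frame below is hypothetical and only follows the documented coding; it is not taken from fertility_data.

```r
# Hypothetical toy rows following the documented coding of Event.
toy <- data.frame(
  time  = c(2.5, 1.0, 4.2, 3.3),
  Event = c(1, 0, 2, 1)
)

# One 0/1 indicator per cause; censored rows are 0 in both.
delta1 <- as.integer(toy$Event == 1)  # live birth
delta2 <- as.integer(toy$Event == 2)  # stillborn fetus or abortion
censored <- (delta1 + delta2) == 0

print(cbind(toy, delta1, delta2, censored))
```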
Note
To view the article, follow this link: https://tbiomed.biomedcentral.com/articles/10.1186/1742-4682-8-43
Source
doi:10.1186/1742-4682-8-43, https://doi.org/10.1186/1742-4682-8-43
References
Shayan, Z., Ayatollahi, S. M. T., & Zare, N. (2011). "A parametric method for cumulative incidence modeling with a new four-parameter log-logistic distribution." Theoretical Biology and Medical Modelling, 8:43. doi:10.1186/1742-4682-8-43
Examples
data(fertility_data)
head(fertility_data)
Compute the Log-Likelihood Function Generalized odds rate (GOR)
Description
This function computes the log-likelihood function for the parametric model (GOR Approach).
Arguments
Params | A numeric vector of parameters. |
Value
A numeric value representing the log-likelihood.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/4)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, 2 * (ncol(features) + 3))
log_likelihood <- log_f_rcpp(params)
print(log_likelihood)
Compute the Log-Likelihood Function Proportional Odds Model (POM)
Description
This function computes the log-likelihood function for the parametric model (POM Approach).
Arguments
Params | A numeric vector of parameters. |
Value
A numeric value representing the log-likelihood.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/8)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, 2 * (ncol(features) + 2))
log_likelihood <- log_f_rcpp2(params)
print(log_likelihood)
Compute the Log-Likelihood Function Proportional Hazards Model (PHM)
Description
This function computes the log-likelihood function for the parametric model (PHM Approach).
Arguments
Params | A numeric vector of parameters. |
Value
A numeric value representing the log-likelihood.
Examples
library(cmpp)
set.seed(1984)
features <- matrix(rnorm(300, 1, 2), nrow = 100, ncol = 3)
delta1 <- sample(c(0, 1), 100, replace = TRUE)
delta2 <- 1 - delta1
x <- rexp(100, rate = 1/10)
Initialize(features, x, delta1, delta2, h = 1e-5)
params <- rep(0.001, 2 * (ncol(features) + 2))
log_likelihood <- log_f_rcpp3(params)
print(log_likelihood)
Create a matrix of given size filled with a constant value
Description
This function creates an n x m matrix of type Eigen::MatrixXd, where each element is set to the specified constant value. This is useful for generating matrices with uniform values for testing, initialization, or other purposes in computational tasks where a matrix filled with a constant is needed.
Usage
makeMat(n, m, value)
Arguments
n | An integer representing the number of rows in the matrix. |
m | An integer representing the number of columns in the matrix. |
value | A numeric value that will be used to fill the matrix. |
Details
The makeMat function generates a matrix with the given dimensions n x m where all elements are initialized to the same constant value. It is useful in scenarios where a specific value needs to be assigned to all elements of the matrix, for example in machine learning algorithms, matrix manipulations, or tests.
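For comparison, a constant-filled matrix of the same shape can be built in base R with matrix(). This is only a plain-R sketch of the described behavior; it does not call makeMat, and whether makeMat's return value matches it in every attribute is an assumption.

```r
# Dimensions and fill value chosen for illustration.
n <- 3
m <- 4
value <- 5

# matrix() recycles the single value across all n * m cells.
mat <- matrix(value, nrow = n, ncol = m)
print(mat)
```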
Value
A numeric matrix of dimensions n x m filled with the specified value.
Examples
library(cmpp)
# Create a 3x3 matrix filled with 5
mat <- makeMat(3, 3, 5)
print(mat)
Create Dummy Variables
Description
This function creates dummy variables for specified features in a dataset.
Usage
make_Dummy(Data = dat, features = c("sex", "cause_burn"), reff = "first")
Arguments
Data | A data frame containing the data. |
features | A character vector of feature names for which dummy variables are to be created. |
reff | A character string indicating the reference level. It can be either "first" or "last". |
Value
A list containing two elements:
New_Data | A data frame with the original data and the newly created dummy variables. |
Original_Data | The original data frame. |
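The idea of dummy coding with a reference level can be sketched in base R with factor() and model.matrix(). This illustrates the general technique only; it does not reproduce make_Dummy, and the column names it produces (for example sexM) differ from the names make_Dummy generates.

```r
dat <- data.frame(sex = c("M", "F", "M"), cause_burn = c("A", "B", "A"))

# Treat the column as a factor; levels sort alphabetically, so "F" comes first.
sex <- factor(dat$sex)

# Reference = first level: model.matrix() drops it, and the intercept
# column is removed by hand, leaving one 0/1 column per remaining level.
dummies_first <- model.matrix(~ sex)[, -1, drop = FALSE]

# Reference = last level: relevel so that "M" becomes the baseline instead.
sex_ref_last <- relevel(sex, ref = "M")
dummies_last <- model.matrix(~ sex_ref_last)[, -1, drop = FALSE]

# Append the dummy columns to the original data.
new_data <- cbind(dat, dummies_first)
print(new_data)
```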
Examples
dat <- data.frame(sex = c('M', 'F', 'M'), cause_burn = c('A', 'B', 'A'))
result <- make_Dummy(Data = dat, features = c('sex', 'cause_burn'), reff = "first")
print(result$New_Data)
Compute the PDF of the Gompertz Distribution
Description
Calculates the probability density function (PDF) of the Gompertz distribution for given input values and parameters.
Usage
pdf_gomp(x, alpha, beta)
Arguments
x | A numeric vector of non-negative input values (e.g., failure times). |
alpha | A positive numeric value representing the shape parameter. |
beta | A positive numeric value representing the scale parameter. |
Details
The PDF provides the relative likelihood of a failure or event occurring at specific time points. Ensure that alpha and beta are positive for meaningful results.
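For intuition, a Gompertz density can be written in a few lines of base R. The sketch below assumes one common parametrization, with hazard beta * exp(alpha * x); the parametrization actually used by pdf_gomp is not stated here and may differ, so gompertz_pdf and gompertz_cdf are hypothetical helpers rather than replacements for the package function.

```r
# Hypothetical helper; the parametrisation is an assumption, not taken from cmpp.
# Density: f(x) = beta * exp(alpha * x) * exp(-(beta / alpha) * (exp(alpha * x) - 1))
gompertz_pdf <- function(x, alpha, beta) {
  stopifnot(all(x >= 0), alpha > 0, beta > 0)
  # Work on the log scale to avoid overflow for large x.
  exp(log(beta) + alpha * x - (beta / alpha) * expm1(alpha * x))
}

# Matching CDF under the same assumed parametrisation.
gompertz_cdf <- function(x, alpha, beta) {
  1 - exp(-(beta / alpha) * expm1(alpha * x))
}

x <- c(1, 2, 3)
dens <- gompertz_pdf(x, alpha = 0.5, beta = 0.1)
print(dens)

# Sanity check: the density should match a central difference of the CDF.
h <- 1e-5
num_deriv <- (gompertz_cdf(x + h, 0.5, 0.1) - gompertz_cdf(x - h, 0.5, 0.1)) / (2 * h)
print(max(abs(dens - num_deriv)))
```

Because the function is vectorized over x, no lapply() is needed to evaluate it at several time points.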
Value
A numeric vector of the PDF values for each input in x.
Examples
library(cmpp)
data("fertility_data")
Nam <- names(fertility_data)
fertility_data$Education
datt <- make_Dummy(fertility_data, features = c("Education"))
datt <- datt$New_Data
datt['Primary_Secondary'] <- datt$`Education:2`
datt['Higher_Education'] <- datt$`Education:3`
datt$`Education:2` <- datt$`Education:3` <- NULL
datt2 <- make_Dummy(datt, features = 'Event')$New_Data
d1 <- datt2$`Event:2`
d2 <- datt2$`Event:3`
feat <- datt2[c('age', 'Primary_Secondary', 'Higher_Education')] |> data.matrix()
timee <- datt2[['time']]
Initialize(feat, timee, d1, d2, 1e-10)
x <- c(1, 2, 3)
alpha <- 0.5
beta <- 0.1
lapply(x, pdf_gomp, alpha = alpha, beta = beta) |> unlist()