The goal of Colossus is to provide an open-source means of performing survival analysis on big data with complex risk formulas. Colossus is designed to perform Cox Proportional Hazard regressions and Poisson regressions on datasets loaded as data.tables or data.frames. The risk models allowed are sums or products of linear, log-linear, or several other radiation dose response formulas highlighted in the vignettes. Additional plotting capabilities are available.
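To give a feel for what "sums or products" of terms means, here is a minimal base R sketch. It makes no Colossus calls, and the covariates and coefficients are hypothetical values chosen only for illustration. The exact parameterization Colossus uses for each subterm type is described in the vignettes, so treat this as the general idea rather than the package's implementation.

```r
# Illustrative only: plain base R, no Colossus functions.
# Hypothetical covariates and coefficients.
x1 <- c(0, 1, 1)        # covariate entering a log-linear subterm
x2 <- c(1.0, 1.1, 2.1)  # covariate entering a linear subterm
beta1 <- 0.5
beta2 <- 0.2

loglinear_term <- exp(beta1 * x1)  # exp(beta * x)
linear_term <- beta2 * x2          # beta * x

# The same two terms combined as a product or as a sum.
risk_product <- loglinear_term * linear_term
risk_sum <- loglinear_term + linear_term

print(data.frame(x1, x2, risk_product, risk_sum))
```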
By default, a fully portable version of the code is compiled, which does not support OpenMP on every system. Note that Colossus requires OpenMP support to perform parallel calculations. The environment variable “R_COLOSSUS_NOT_CRAN” is checked to determine whether OpenMP should be disabled when compiling with clang on Linux. The number of cores is set to 1 if the environment variable is empty, the operating system is detected as Linux, and the default compiler or R compiler is clang. Colossus testing checks the “NOT_CRAN” variable to determine whether additional tests should be run. Setting “NOT_CRAN” to “false” will disable the longer tests. Currently, OpenMP support is not configured for compiling with clang on Linux.
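The single-core rule above can be restated as a small sketch. The helper `limits_to_one_core()` is hypothetical and not part of Colossus; it only encodes the three conditions as written, and how Colossus actually detects the compiler is not shown here. The value "true" used for the environment variable is an assumption: the text only says the variable is checked for being empty.

```r
# Hypothetical helper, not a Colossus function: restates the rule as a pure
# function of the three conditions named in the text.
limits_to_one_core <- function(env_value, os_name, compiler) {
  identical(env_value, "") &&
    identical(os_name, "linux") &&
    identical(compiler, "clang")
}

# Sys.getenv() returns "" for an unset variable, which matches "empty".
env_value <- Sys.getenv("R_COLOSSUS_NOT_CRAN")
os_name <- tolower(Sys.info()[["sysname"]])

limits_to_one_core("", "linux", "clang")      # all three conditions hold
limits_to_one_core("true", "linux", "clang")  # variable set (assumed value)
limits_to_one_core("", "linux", "gcc")        # different compiler

# Disable the longer tests for the current R session, as described above.
Sys.setenv(NOT_CRAN = "false")
```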
Note: From versions 1.3.1 to 1.4.1 the expected inputs changed. Regressions are now run with CoxRun and PoisRun, which take formula inputs. Please see the “Unified Equation Representation” vignette for more details.
library(data.table)
library(parallel)
library(Colossus)
## basic example code reproduced from the starting-description vignette

df <- data.table(
  "UserID" = c(112, 114, 213, 214, 115, 116, 117),
  "Starting_Age" = c(18, 20, 18, 19, 21, 20, 18),
  "Ending_Age" = c(30, 45, 57, 47, 36, 60, 55),
  "Cancer_Status" = c(0, 0, 1, 0, 1, 0, 0),
  "a" = c(0, 1, 1, 0, 1, 0, 1),
  "b" = c(1, 1.1, 2.1, 2, 0.1, 1, 0.2),
  "c" = c(10, 11, 10, 11, 12, 9, 11),
  "d" = c(0, 0, 0, 1, 1, 1, 1)
)

model <- Cox(Starting_Age, Ending_Age, Cancer_Status) ~ loglinear(a, 0) +
  linear(b, c, 1) + plinear(d, 2) + multiplicative()

a_n <- c(0.1, 0.1, 0.1, 0.1)
keep_constant <- c(0, 0, 0, 0)

control <- list(
  "lr" = 0.75, "maxiter" = 100, "halfmax" = 5, "epsilon" = 1e-9,
  "deriv_epsilon" = 1e-9, "step_max" = 1.0,
  "verbose" = 2, "ties" = "breslow"
)

e <- CoxRun(model, df, a_n = a_n, control = control)
print(e)
#> |-------------------------------------------------------------------|
#> Final Results
#>    Covariate Subterm Term Number Central Estimate Standard Error 2-tail p-value
#>       <char>  <char>       <int>            <num>          <num>          <num>
#> 1:         a  loglin           0         21.67085            NaN            NaN
#> 2:         b     lin           1          0.10000            NaN            NaN
#> 3:         c     lin           1          0.10000            NaN            NaN
#> 4:         d    plin           2          0.10000            Inf              1
#>
#> Cox Model Used
#> -2*Log-Likelihood: 2.64,  AIC: 10.64
#> Iterations run: 27
#> maximum step size: 7.50e-01, maximum first derivative: 5.49e-10
#> Analysis converged
#> Run finished in 0.04 seconds
#> |-------------------------------------------------------------------|
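As a quick check on the printed summary, the reported AIC is consistent with the usual definition, AIC = -2*log-likelihood + 2*k, where k is the number of parameters. The sketch below copies the two numbers from the output above and assumes all four parameters (a, b, c, d) are counted; the variable names are illustrative.

```r
# Values copied from the printed summary in the example above.
neg2_loglik <- 2.64  # "-2*Log-Likelihood"
n_params <- 4        # assumed: one parameter each for a, b, c, d

aic <- neg2_loglik + 2 * n_params
print(aic)  # 10.64, matching the reported AIC
```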