In this vignette we briefly compare themmrm::mmrm,SAS’sPROC GLIMMIX,nlme::gls,lme4::lmer, andglmmTMB::glmmTMB functions forfitting mixed models for repeated measures (MMRMs). A primary differencein these implementations lies in the covariance structures that aresupported “out of the box”. In particular,PROC GLIMMIX andmmrm are the only procedures which provide support for manyof the most common MMRM covariance structures. Most covariancestructures can be implemented ingls, though users arerequired to define them manually.lmer andglmmTMB are more limited. We find thatmmmrmconverges more quickly than other R implementations while also producingestimates that are virtually identical toPROC GLIMMIX’s.
Two datasets are used to illustrate model fitting with themmrm,lme4,nlme,glmmTMB R packages as well asPROC GLIMMIX.These data are also used to compare these implementations’ operatingcharacteristics.
The FEV dataset contains measurements of FEV1 (forced expired volumein one second), a measure of how quickly the lungs can be emptied. Lowlevels of FEV1 may indicate chronic obstructive pulmonary disease(COPD). It is summarized below.
Stratified by ARMCD Overall PBO TRT n 800 420 380 USUBJID (%) PT[1-200] 200 105 (52.5) 95 (47.5) AVISIT VIS1 200 105 95 VIS2 200 105 95 VIS3 200 105 95 VIS4 200 105 95 RACE (%) Asian 280 (35.0) 152 (36.2) 128 (33.7) Black or African American 300 (37.5) 184 (43.8) 116 (30.5) White 220 (27.5) 84 (20.0) 136 (35.8) SEX = Female (%) 424 (53.0) 220 (52.4) 204 (53.7) FEV1_BL (mean (SD)) 40.19 (9.12) 40.46 (8.84) 39.90 (9.42) FEV1 (mean (SD)) 42.30 (9.32) 40.24 (8.67) 44.45 (9.51) WEIGHT (mean (SD)) 0.52 (0.23) 0.52 (0.23) 0.51 (0.23) VISITN (mean (SD)) 2.50 (1.12) 2.50 (1.12) 2.50 (1.12) VISITN2 (mean (SD)) -0.02 (1.03) 0.01 (1.07) -0.04 (0.98)The BCVA dataset contains data from a randomized longitudinalophthalmology trial evaluating the change in baseline corrected visualacuity (BCVA) over the course of 10 visits. BCVA corresponds to thenumber of letters read from a visual acuity chart. A summary of the datais given below:
Stratified by ARMCD Overall CTL TRT n 8605 4123 4482 USUBJID (%) PT[1-1000] 1000 494 (49.4) 506 (50.6) AVISIT VIS1 983 482 501 VIS2 980 481 499 VIS3 960 471 489 VIS4 946 458 488 VIS5 925 454 471 VIS6 868 410 458 VIS7 816 388 428 VIS8 791 371 420 VIS9 719 327 392 VIS10 617 281 336 RACE (%) Asian 297 (29.7) 151 (30.6) 146 (28.9) Black or African American 317 (31.7) 149 (30.1) 168 (33.2) White 386 (38.6) 194 (39.3) 192 (37.9) BCVA_BL (mean (SD)) 75.12 (9.93) 74.90 (9.76) 75.40 (10.1) BCVA_CHG (mean (SD)) VIS1 5.59 (1.31) 5.32 (1.23) 5.86 (1.33) VIS10 9.18 (2.91) 7.49 (2.58) 10.60 (2.36)Listed below are some of the most commonly used covariance structuresused when fitting MMRMs. We indicate which matrices are available “outof the box” for each implementation considered in this vignette. Notethat this table is not exhaustive;PROC GLIMMIX andglmmTMB support additional spatial covariancestructures.
| Covariance structures | mmrm | PROC GLIMMIX | gls | lmer | glmmTMB |
|---|---|---|---|---|---|
| Ante-dependence (heterogeneous) | X | X | |||
| Ante-dependence (homogeneous) | X | ||||
| Auto-regressive (heterogeneous) | X | X | X | ||
| Auto-regressive (homogeneous) | X | X | X | X | |
| Compound symmetry (heterogeneous) | X | X | X | X | |
| Compound symmetry (homogeneous) | X | X | X | ||
| Spatial exponential | X | X | X | X | |
| Toeplitz (heterogeneous) | X | X | X | ||
| Toeplitz (homogeneous) | X | X | |||
| Unstructured | X | X | X | X | X |
Code for fitting MMRMs to the FEV data using each of the consideredfunctions and covariance structures are provided below. Fixed effectsfor the visit number, treatment assignment and the interaction betweenthe two are modeled.
PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = AVISIT|ARMCD / ddfm=satterthwaite solution chisq;RANDOM AVISIT / subject=USUBJID type=ANTE(1);mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +adh(VISITN | USUBJID), data = fev_data)mmrmmmrm( formula =FEV1 ~ ARMCD * AVISIT +ad(VISITN | USUBJID), data = fev_data)PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = AVISIT|ARMCD / ddfm=satterthwaite solution chisq;RANDOM AVISIT / subject=USUBJID type=ARH(1);mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +ar1h(VISITN | USUBJID), data = fev_data)glsgls( FEV1 ~ ARMCD * AVISIT, data = fev_data, correlation =corAR1(form = ~AVISIT | USUBJID), weights =varIdent(form = ~1|AVISIT), na.action = na.omit)PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = ARMCD|AVISIT / ddfm=satterthwaite solution chisq;RANDOM AVISIT / subject=USUBJID type=AR(1);mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +ar1(VISITN | USUBJID), data = fev_data)glsgls( FEV1 ~ ARMCD * AVISIT, data = fev_data, correlation =corAR1(form = ~AVISIT | USUBJID), na.action = na.omit)glmmTMBglmmTMB( FEV1 ~ ARMCD * AVISIT +ar1(0 + AVISIT | USUBJID),dispformula = ~ 0, data = fev_data)PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = AVISIT|ARMCD / ddfm=satterthwaite solution chisq;RANDOM AVISIT / subject=USUBJID type=CSH;mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +csh(VISITN | USUBJID), data = fev_data)glsgls( FEV1 ~ ARMCD * AVISIT, data = fev_data, correlation =corCompSymm(form = ~AVISIT | USUBJID), weights =varIdent(form = ~1|AVISIT), na.action = na.omit)glmmTMBglmmTMB( FEV1 ~ ARMCD * AVISIT +cs(0 + AVISIT | USUBJID),dispformula = ~ 0, data = fev_data)PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = AVISIT|ARMCD / ddfm=satterthwaite solution chisq;RANDOM AVISIT / subject=USUBJID type=CS;mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +cs(VISITN | USUBJID), data = fev_data)glsgls( FEV1 ~ ARMCD * AVISIT, data = fev_data, correlation =corCompSymm(form = ~AVISIT | USUBJID), na.action = na.omit)PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = AVISIT|ARMCD / ddfm=satterthwaite solution chisq;RANDOM / subject=USUBJID type=sp(exp)(visitn) rcorr;mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +sp_exp(VISITN | USUBJID), data = fev_data)glsgls( FEV1 ~ ARMCD * AVISIT, data = fev_data, correlation =corExp(form = ~AVISIT | USUBJID), weights = varIdent(form = ~1|AVISIT), na.action = na.omit)glmmTMB# NOTE: requires use of coordinatesglmmTMB( FEV1 ~ ARMCD * AVISIT +exp(0 + AVISIT | USUBJID),dispformula = ~ 0, data = fev_data)PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = AVISIT|ARMCD / ddfm=satterthwaite solution chisq;RANDOM AVISIT / subject=USUBJID type=TOEPH;mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +toeph(AVISIT | USUBJID), data = fev_data)glmmTMB glmmTMB( FEV1 ~ ARMCD * AVISIT +toep(0 + AVISIT | USUBJID),dispformula = ~ 0, data = fev_data)PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = AVISIT|ARMCD / ddfm=satterthwaite solution chisq;RANDOM AVISIT / subject=USUBJIDtype=TOEP;mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +toep(AVISIT | USUBJID), data = fev_data)PROC GLIMMIXPROC GLIMMIX DATA = fev_data;CLASS AVISIT(ref = 'VIS1') ARMCD(ref = 'PBO') USUBJID;MODEL FEV1 = ARMCD|AVISIT / ddfm=satterthwaite solution chisq;RANDOM AVISIT / subject=USUBJIDtype=un;mmrmmmrm( formula = FEV1 ~ ARMCD * AVISIT +us(AVISIT | USUBJID), data = fev_data)glsgls( FEV1 ~ ARMCD * AVISIT, data = fev_data, correlation =corSymm(form = ~AVISIT | USUBJID), weights = varIdent(form = ~1|AVISIT), na.action = na.omit)lmerlmer( FEV1 ~ ARMCD * AVISIT +(0 + AVISIT | USUBJID), data = fev_data, control = lmerControl(check.nobs.vs.nRE = "ignore"), na.action = na.omit)glmmTMBglmmTMB( FEV1 ~ ARMCD * AVISIT +us(0 + AVISIT | USUBJID),dispformula = ~ 0, data = fev_data)Next, the MMRM fitting procedures are compared using the FEV and BCVAdatasets. FEV1 measurements are modeled as a function of race, treatmentarm, visit number, and the interaction between the treatment arm and thevisit number. Change in BCVA is assumed to be a function of race,baseline BCVA, treatment arm, visit number, and the treatment–visitinteraction. In both datasets, repeated measures are modeled using anunstructured covariance matrix. The implementations’ convergence timesare evaluated first, followed by a comparison of their estimates.Finally, we fit these procedures on simulated BCVA-like data to assessthe impact of missingness on convergence rates.
Themmrm,PROC GLIMMIX,gls,lmer, andglmmTMB functions are applied to theFEV dataset 10 times. The convergence times are recorded for eachreplicate and are reported in the table below.
| Implementation | Median | First Quartile | Third Quartile |
|---|---|---|---|
| mmrm | 56.15 | 55.76 | 56.30 |
| PROC GLIMMIX | 100.00 | 100.00 | 100.00 |
| lmer | 247.02 | 245.25 | 257.46 |
| gls | 687.63 | 683.50 | 692.45 |
| glmmTMB | 715.90 | 708.70 | 721.57 |
It is clear from these results thatmmrm convergessignificantly faster than other R functions. Though not demonstratedhere, this is generally true regardless of the sample size andcovariance structure used.mmrm is faster thanPROC GLIMMIX.
The MMRM implementations are now applied to the BCVA dataset 10times. The convergence times are presented below.
| Implementation | Median | First Quartile | Third Quartile |
|---|---|---|---|
| mmrm | 3.36 | 3.32 | 3.46 |
| glmmTMB | 18.65 | 18.14 | 18.87 |
| PROC GLIMMIX | 36.25 | 36.17 | 36.29 |
| gls | 164.36 | 158.61 | 165.93 |
| lmer | 165.26 | 157.46 | 166.42 |
We again find thatmmrm produces the fastest convergencetimes on average.
We next estimate the marginal mean treatment effects for each visitin the FEV and BCVA datasets using the MMRM fitting procedures. All Rimplementations’ estimates are reported relative toPROC GLIMMIX’s estimates. Convergence status is alsoreported.
The R procedures’ estimates are very similar to those output byPROC GLIMMIX, thoughmmrm andglsgenerate the estimates that are closest to those produced when usingSAS. All methods converge using their default optimizationarguments.
mmrm,gls andlmer produceestimates that are virtually identical toPROC GLIMMIX’s,whileglmmTMB does not. This is likely explained byglmmTMB’s failure to converge. Note too thatlmer fails to converge.
The results of the previous benchmark suggest that the amount ofpatients missing from later time points affect certain implementations’capacity to converge. We investigate this further by simulating datausing a data-generating process similar to that of the BCVA datasets,though with various rates of patient dropout.
Ten datasets of 200 patients are generated each of the followinglevels of missingness: none, mild, moderate, and high. In all scenarios,observations are missing at random. The number patients observed at eachvisit is obtained for one replicated dataset at each level ofmissingness is presented in the table below.
| none | mild | moderate | high | |
|---|---|---|---|---|
| VIS01 | 200 | 196.7 | 197.6 | 188.1 |
| VIS02 | 200 | 195.4 | 194.4 | 182.4 |
| VIS03 | 200 | 195.1 | 190.7 | 175.2 |
| VIS04 | 200 | 194.1 | 188.4 | 162.8 |
| VIS05 | 200 | 191.6 | 182.5 | 142.7 |
| VIS06 | 200 | 188.2 | 177.3 | 125.4 |
| VIS07 | 200 | 184.6 | 168.0 | 105.9 |
| VIS08 | 200 | 178.5 | 155.4 | 82.6 |
| VIS09 | 200 | 175.3 | 139.9 | 58.1 |
| VIS10 | 200 | 164.1 | 124.0 | 39.5 |
The convergence rates of all implementations for stratified bymissingness level is presented in the plot below.
mmrm,gls, andPROC GLIMMIXare resilient to missingness, only exhibiting some convergence problemsin the scenarios with the most missingness. These implementationsconverged in all the other scenarios’ replicates.glmmTMB,on the other hand, has convergence issues in the no-, mild-, andhigh-missingness datasets, with the worst convergence rate occurring inthe datasets with the most dropout. Finally,lmer isunreliable in all scenarios, suggesting that it’s convergence issuesstem from something other than the missing observations.
Note that the default optimization schemes are used for each method;these schemes can be modified to potentially improve convergencerates.
A more comprehensive simulation study using data-generating processessimilar to the one used here is outlined in thesimulations/missing-data-benchmarkssubdirectory. In addition to assessing the effect of missing data onsoftware convergence rates, we also evaluate these methods’ fit timesand empirical bias, variance, 95% coverage rates, type I error rates andtype II error rates.mmrm is found to be the most mostrobust software for fitting MMRMs in scenarios where a large proportionof patients are missing from the last time points. Additionally,mmrm has the fastest average fit times regardless of theamount of missingness. All implementations considered produce similarempirical biases, variances, 95% coverage rates, type I error rates andtype II error rates.
#> R version 4.5.1 (2025-06-13 ucrt)#> Platform: x86_64-w64-mingw32/x64#> Running under: Windows 11 x64 (build 26200)#> #> Matrix products: default#> LAPACK version 3.12.1#> #> locale:#> [1] LC_COLLATE=C LC_CTYPE=German_Switzerland.utf8 #> [3] LC_MONETARY=German_Switzerland.utf8 LC_NUMERIC=C #> [5] LC_TIME=German_Switzerland.utf8 #> #> time zone: Asia/Taipei#> tzcode source: internal#> #> attached base packages:#> [1] stats graphics grDevices utils datasets methods base #> #> other attached packages:#> [1] knitr_1.50 sasr_0.1.5 glmmTMB_1.1.13 #> [4] nlme_3.1-168 lme4_1.1-37 Matrix_1.7-3 #> [7] stringr_1.6.0 microbenchmark_1.5.0 clusterGeneration_1.3.8#> [10] MASS_7.3-65 yardstick_1.3.2 workflowsets_1.1.1 #> [13] workflows_1.3.0 tune_2.0.1 tidyr_1.3.1 #> [16] tailor_0.1.0 rsample_1.3.1 recipes_1.3.1 #> [19] purrr_1.1.0 parsnip_1.3.3 modeldata_1.5.1 #> [22] infer_1.0.9 ggplot2_4.0.1 dplyr_1.1.4 #> [25] dials_1.4.2 scales_1.4.0 broom_1.0.11 #> [28] tidymodels_1.4.1 car_3.1-3 carData_3.0-5 #> [31] emmeans_1.11.2-8 mmrm_0.3.16 #> #> loaded via a namespace (and not attached):#> [1] Rdpack_2.6.4 sandwich_3.1-1 rlang_1.1.6 #> [4] magrittr_2.0.4 multcomp_1.4-29 furrr_0.3.1 #> [7] compiler_4.5.1 mgcv_1.9-3 png_0.1-8 #> [10] vctrs_0.6.5 lhs_1.2.0 pkgconfig_2.0.3 #> [13] fastmap_1.2.0 backports_1.5.0 labeling_0.4.3 #> [16] utf8_1.2.6 rmarkdown_2.30 prodlim_2025.04.28 #> [19] nloptr_2.2.1.9000 xfun_0.54 cachem_1.1.0 #> [22] jsonlite_2.0.0 parallel_4.5.1 R6_2.6.1 #> [25] bslib_0.9.0 stringi_1.8.7 RColorBrewer_1.1-3 #> [28] reticulate_1.43.0 parallelly_1.45.1 boot_1.3-31 #> [31] rpart_4.1.24 numDeriv_2016.8-1.1 lubridate_1.9.4 #> [34] jquerylib_0.1.4 estimability_1.5.1 Rcpp_1.1.0 #> [37] future.apply_1.20.0 zoo_1.8-14 splines_4.5.1 #> [40] nnet_7.3-20 timechange_0.3.0 tidyselect_1.2.1 #> [43] rstudioapi_0.17.1 abind_1.4-8 yaml_2.3.10 #> [46] timeDate_4051.111 TMB_1.9.18 codetools_0.2-20 #> [49] listenv_0.10.0 lattice_0.22-7 tibble_3.3.0 #> [52] withr_3.0.2 S7_0.2.0 coda_0.19-4.1 #> [55] evaluate_1.0.5 future_1.68.0 survival_3.8-3 #> [58] pillar_1.11.1 checkmate_2.3.3 reformulas_0.4.2 #> [61] generics_0.1.4 minqa_1.2.8 globals_0.18.0 #> [64] xtable_1.8-4 class_7.3-23 glue_1.8.0 #> [67] tools_4.5.1 data.table_1.17.8 gower_1.0.2 #> [70] mvtnorm_1.3-3 grid_4.5.1 rbibutils_2.4 #> [73] ipred_0.9-15 Formula_1.2-5 cli_3.6.5 #> [76] DiceDesign_1.10 lava_1.8.1 gtable_0.3.6 #> [79] GPfit_1.0-9 sass_0.4.10 digest_0.6.37 #> [82] TH.data_1.1-4 farver_2.1.2 htmltools_0.5.8.1 #> [85] lifecycle_1.0.4 hardhat_1.4.2 sparsevctrs_0.3.4