Movatterモバイル変換


[0]ホーム

URL:


Simulation-based calibration checking

1 About

This vignette shows the results of a simulation-based calibration(SBC) checking study to validate the implementation of the models inbrms.mmrm. SBC checking tests the ability of a Bayesianmodel to recapture the parameters used to simulate prior predictivedata. For details on SBC checking, please readModrák et al. (2024) and theSBC R package(Kim et al. 2022). This particular SBCchecking study uses thetargetspipeline in theviggnettes/sbc/subdirectory of thebrms.mmrmpackage source code.

2 Conclusion

From the results below, the SBC rank statistics are approximatelyuniformly distributed. In other words, the posterior distribution fromthebrms/Stan MMRM modeling code matches the prior fromwhich the datasets were simulated. This is evidence that both thesubgroup and non-subgroup models inbrms.mmrm areimplemented correctly.

3 Setup

To show the SBC checking results in this vignette, we first load codefrom the SBC checking study, and we use custom functions to read andplot rank statistics:

library(dplyr)library(ggplot2)library(tibble)library(tidyr)source("sbc/R/prior.R")source("sbc/R/response.R")source("sbc/R/scenarios.R")read_ranks<-function(path) {  fst::read_fst(path)|>    tibble::as_tibble()|>pivot_longer(cols =everything(),names_to ="parameter",values_to ="rank"    )}plot_ranks<-function(ranks) {ggplot(ranks)+geom_histogram(aes(x = rank),breaks =seq(from =0,to =max(ranks$rank),length.out =10)    )+facet_wrap(~parameter)}

Each section below is its own SBC checking study based on a givenmodeling scenario. Each scenario shows results from 1000 independentsimulations from the prior.

4 Subgroup scenario

The subgroup scenario distinguishes itself from the others by thepresence of a subgroup factor. Assumptions:

Model formula:

#> response ~ group + group:subgroup + group:subgroup:time + group:time + subgroup + subgroup:time + time + continuous1 + continuous2 + balanced + unbalanced + unstr(time = time, gr = patient) #> sigma ~ 0 + time

The prior was randomly generated and used for both simulation andanalysis:

setup_prior(subgroup)|>select(prior, class, coef, dpar)|>as.data.frame()#>                      prior     class                                       coef#> 1   normal(0.0267, 0.9478) Intercept#> 3    normal(0.2253, 2.218)         b                             balancedlevel2#> 4  normal(-0.2149, 1.1265)         b                             balancedlevel3#> 5   normal(0.2277, 2.1154)         b                                continuous1#> 6   normal(0.2009, 2.0754)         b                                continuous2#> 7    normal(0.2442, 0.694)         b                               groupgroup_2#> 8   normal(0.1478, 2.1664)         b            groupgroup_2:subgroupsubgroup_2#> 9  normal(-0.0688, 0.3147)         b groupgroup_2:subgroupsubgroup_2:timetime_2#> 10   normal(0.0493, 1.656)         b groupgroup_2:subgroupsubgroup_2:timetime_3#> 11 normal(-0.1995, 1.0938)         b                    groupgroup_2:timetime_2#> 12 normal(-0.2115, 0.5998)         b                    groupgroup_2:timetime_3#> 13  normal(0.1978, 2.6558)         b                         subgroupsubgroup_2#> 14  normal(0.0661, 0.6617)         b              subgroupsubgroup_2:timetime_2#> 15  normal(0.1861, 2.2234)         b              subgroupsubgroup_2:timetime_3#> 16   normal(-0.08, 0.6091)         b                                 timetime_2#> 17  normal(0.1083, 1.7101)         b                                 timetime_3#> 18  normal(0.0165, 2.5845)         b                           unbalancedlevel2#> 19  normal(0.0535, 0.4968)         b                           unbalancedlevel3#> 20               lkj(1.12)   cortime#> 22    normal(0.08, 1.3733)         b                                 timetime_1#> 23  normal(0.1067, 1.4465)         b                                 timetime_2#> 24 normal(-0.0887, 0.8012)         b                                 timetime_3#>     dpar#> 1#> 3#> 4#> 5#> 6#> 7#> 8#> 9#> 10#> 11#> 12#> 13#> 14#> 15#> 16#> 17#> 18#> 19#> 20#> 22 sigma#> 23 sigma#> 24 sigma

The following histograms show the SBC rank statistics which comparethe prior parameter draws draws to the posterior draws. If the datasimulation code and modeling code are both correct and consistent, thenthe rank statistics should be approximately uniformly distributed.

ranks_subgroup<-read_ranks("sbc/results/subgroup.fst")

Fixed effect parameter ranks:

ranks_subgroup|>filter(grepl("^b_", parameter))|>filter(!grepl("^b_sigma", parameter))|>plot_ranks()

Standard deviation parameter ranks:

ranks_subgroup|>filter(grepl("b_sigma", parameter))|>plot_ranks()

Correlation parameter ranks:

ranks_subgroup|>filter(grepl("cortime_", parameter))|>plot_ranks()

5 Unstructuredscenario

This scenario uses unstructured correlation and does not use asubgroup variable. Assumptions:

Model formula:

#> response ~ 0 + group + time + unstr(time = time, gr = patient) #> sigma ~ 0 + time

The prior was randomly generated and used for both simulation andanalysis:

setup_prior(unstructured)|>select(prior, class, coef, dpar)|>as.data.frame()#>                      prior   class         coef  dpar#> 2    normal(0.1433, 0.403)       b groupgroup_1#> 3   normal(0.2197, 2.4527)       b groupgroup_2#> 4  normal(-0.0344, 2.3087)       b groupgroup_3#> 5   normal(0.0434, 0.5924)       b   timetime_2#> 6   normal(-0.1839, 2.639)       b   timetime_3#> 7   normal(0.2186, 0.9006)       b   timetime_4#> 8              lkj(1.4196) cortime#> 10  normal(-0.1644, 2.893)       b   timetime_1 sigma#> 11 normal(-0.2093, 2.7022)       b   timetime_2 sigma#> 12 normal(-0.0947, 0.4162)       b   timetime_3 sigma#> 13  normal(0.0528, 1.1022)       b   timetime_4 sigma

SBC checking rank statistics:

ranks_unstructured<-read_ranks("sbc/results/unstructured.fst")

Fixed effect parameter ranks:

ranks_unstructured|>filter(grepl("^b_", parameter))|>filter(!grepl("^b_sigma", parameter))|>plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_unstructured|>filter(grepl("b_sigma", parameter))|>plot_ranks()

Correlation parameter ranks:

ranks_unstructured|>filter(grepl("cortime_", parameter))|>plot_ranks()

6 Autoregressive movingaverage scenario

This scenario uses an autoregressive moving average (ARMA) model withautoregressive order 1 and moving average order 1. Assumptions:

Model formula:

#> response ~ 0 + group + time + arma(time = time, gr = patient, p = 1L, q = 1L, cov = FALSE) #> sigma ~ 0 + time

The prior was randomly generated and used for both simulation andanalysis:

setup_prior(autoregressive_moving_average)|>select(prior, class, coef, dpar)|>as.data.frame()#>                      prior class         coef  dpar#> 1        uniform(0.1, 0.9)    ar#> 3  normal(-0.0789, 0.6718)     b groupgroup_1#> 4    normal(0.078, 2.5557)     b groupgroup_2#> 5    normal(0.046, 0.7607)     b   timetime_2#> 6   normal(0.0105, 1.5174)     b   timetime_3#> 7        uniform(0.1, 0.9)    ma#> 9    normal(0.1917, 0.647)     b   timetime_1 sigma#> 10 normal(-0.1232, 1.2625)     b   timetime_2 sigma#> 11  normal(0.1052, 1.2691)     b   timetime_3 sigma

SBC checking rank statistics:

ranks_autoregressive_moving_average<-read_ranks("sbc/results/autoregressive_moving_average.fst")

Fixed effect parameter ranks:

ranks_autoregressive_moving_average|>filter(grepl("^b_", parameter))|>filter(!grepl("^b_sigma", parameter))|>plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_autoregressive_moving_average|>filter(grepl("b_sigma", parameter))|>plot_ranks()

Correlation parameter ranks:

ranks_autoregressive_moving_average|>filter(parameter%in%c("ar[1]","ma[1]"))|>plot_ranks()

7 Autoregressivescenario

This scenario is the same as above, but the correlation structure isautoregressive with order 2. Model formula:

#> response ~ 0 + group + time + ar(time = time, gr = patient, p = 2L, cov = FALSE) #> sigma ~ 0 + time

The prior was randomly generated and used for both simulation andanalysis:

setup_prior(autoregressive)|>select(prior, class, coef, dpar)|>as.data.frame()#>                      prior class         coef  dpar#> 1        uniform(0.1, 0.9)    ar#> 3  normal(-0.1299, 2.7448)     b groupgroup_1#> 4   normal(-0.062, 2.4358)     b groupgroup_2#> 5    normal(0.1115, 2.184)     b   timetime_2#> 6   normal(0.0616, 0.9486)     b   timetime_3#> 8   normal(0.0484, 1.2511)     b   timetime_1 sigma#> 9   normal(0.1327, 1.5393)     b   timetime_2 sigma#> 10   normal(0.1074, 1.795)     b   timetime_3 sigma

SBC checking rank statistics:

ranks_autoregressive<-read_ranks("sbc/results/autoregressive.fst")

Fixed effect parameter ranks:

ranks_autoregressive|>filter(grepl("^b_", parameter))|>filter(!grepl("^b_sigma", parameter))|>plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_autoregressive|>filter(grepl("b_sigma", parameter))|>plot_ranks()

Correlation parameter ranks:

ranks_autoregressive|>filter(parameter%in%c("ar[1]","ar[2]"))|>plot_ranks()

8 Moving averagescenario

This scenario is the same as above, but it uses a moving averagecorrelation structure with order 2. Model formula:

#> response ~ 0 + group + time + ma(time = time, gr = patient, q = 2L, cov = FALSE) #> sigma ~ 0 + time

The prior was randomly generated and used for both simulation andanalysis:

setup_prior(moving_average)|>select(prior, class, coef, dpar)|>as.data.frame()#>                      prior class         coef  dpar#> 2   normal(0.2346, 0.4199)     b groupgroup_1#> 3   normal(0.2109, 1.8085)     b groupgroup_2#> 4   normal(0.1955, 1.3087)     b   timetime_2#> 5  normal(-0.1645, 2.5088)     b   timetime_3#> 6        uniform(0.1, 0.9)    ma#> 8   normal(0.1624, 2.1076)     b   timetime_1 sigma#> 9   normal(0.1165, 0.6724)     b   timetime_2 sigma#> 10 normal(-0.0014, 1.0286)     b   timetime_3 sigma

SBC checking rank statistics:

ranks_moving_average<-read_ranks("sbc/results/moving_average.fst")

Fixed effect parameter ranks:

ranks_moving_average|>filter(grepl("^b_", parameter))|>filter(!grepl("^b_sigma", parameter))|>plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_moving_average|>filter(grepl("b_sigma", parameter))|>plot_ranks()

Correlation parameter ranks:

ranks_moving_average|>filter(parameter%in%c("ma[1]","ma[2]"))|>plot_ranks()

9 Compound symmetryscenario

This scenario is the same as above, but it uses a compound symmetrycorrelation structure. Model formula:

#> response ~ 0 + group + time + cosy(time = time, gr = patient) #> sigma ~ 0 + time

The prior was randomly generated and used for both simulation andanalysis:

setup_prior(compound_symmetry)|>select(prior, class, coef, dpar)|>as.data.frame()#>                      prior class         coef  dpar#> 2  normal(-0.0199, 2.3874)     b groupgroup_1#> 3  normal(-0.1953, 2.3133)     b groupgroup_2#> 4   normal(0.1091, 1.0249)     b   timetime_2#> 5   normal(0.0072, 2.6298)     b   timetime_3#> 6        uniform(0.1, 0.9)  cosy#> 8   normal(-0.1885, 0.251)     b   timetime_1 sigma#> 9   normal(0.2114, 2.6677)     b   timetime_2 sigma#> 10  normal(0.0093, 1.0933)     b   timetime_3 sigma

SBC checking rank statistics:

ranks_compound_symmetry<-read_ranks("sbc/results/compound_symmetry.fst")

Fixed effect parameter ranks:

ranks_compound_symmetry|>filter(grepl("^b_", parameter))|>filter(!grepl("^b_sigma", parameter))|>plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_compound_symmetry|>filter(grepl("b_sigma", parameter))|>plot_ranks()

Correlation parameter ranks:

ranks_compound_symmetry|>filter(parameter=="cosy")|>plot_ranks()

10 Diagonal scenario

This scenario is the same as above, but it uses a diagonalcorrelation structure (independent time points within patients). Modelformula:

#> response ~ 0 + group + time #> sigma ~ group + group:time + time

The prior was randomly generated and used for both simulation andanalysis:

setup_prior(diagonal)|>select(prior, class, coef, dpar)|>as.data.frame()#>                      prior     class                    coef  dpar#> 2   normal(0.1858, 1.6609)         b            groupgroup_1#> 3   normal(0.1542, 1.0235)         b            groupgroup_2#> 4     normal(0.1078, 2.43)         b              timetime_2#> 5   normal(0.1545, 1.4261)         b              timetime_3#> 6   normal(0.2037, 1.2606) Intercept                         sigma#> 8   normal(-0.2353, 2.852)         b            groupgroup_2 sigma#> 9   normal(0.2399, 1.0988)         b groupgroup_2:timetime_2 sigma#> 10 normal(-0.2134, 0.5962)         b groupgroup_2:timetime_3 sigma#> 11  normal(0.2072, 0.7107)         b              timetime_2 sigma#> 12  normal(0.1294, 1.1781)         b              timetime_3 sigma

SBC checking rank statistics:

ranks_diagonal<-read_ranks("sbc/results/diagonal.fst")

Fixed effect parameter ranks:

ranks_diagonal|>filter(grepl("^b_", parameter))|>filter(!grepl("^b_sigma", parameter))|>plot_ranks()

Log-scale standard deviation parameter ranks:

ranks_diagonal|>filter(grepl("b_sigma", parameter))|>plot_ranks()

References

Kim, S., Moon, H., Modrák, M., and Säilynoja, T. (2022),SBC:Simulation based calibration for rstan/cmdstanr models.
Modrák, M., Moon, A. H., Kim, S., Bürkner, P., Huurre, N., Faltejsková,K., Gelman, A., and Vehtari, A. (2024),Simulation-BasedCalibrationChecking forBayesianComputation:TheChoice ofTestQuantitiesShapesSensitivity,”Bayesian Analysis,forthcoming.https://doi.org/10.1214/23-BA1404.

[8]ページ先頭

©2009-2025 Movatter.jp