Title: Detect Clinical Trial Sites Over- or Under-Reporting Clinical Events
Version: 1.0.0
Description: Monitoring reporting rates of subject-level clinical events (e.g. adverse events, protocol deviations) reported by clinical trial sites is an important aspect of risk-based quality monitoring strategy. Sites that are under-reporting or over-reporting events can be detected using bootstrap simulations during which patients are redistributed between sites. Site-specific distributions of event reporting rates are generated that are used to assign probabilities to the observed reporting rates. (Koneswarakantha 2024 <doi:10.1007/s43441-024-00631-8>).
URL: https://openpharma.github.io/simaerep/, https://github.com/openpharma/simaerep/
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (≥ 4.0), ggplot2
Imports: dplyr (≥ 1.1.0), tidyr (≥ 1.1.0), magrittr, purrr, rlang, stringr, forcats, cowplot, RColorBrewer, furrr (≥ 0.2.1), progressr, knitr, tibble, dbplyr, glue
Suggests: testthat, devtools, pkgdown, spelling, haven, vdiffr, lintr, DBI, duckdb, ggExtra
RoxygenNote: 7.3.2
Language: en-US
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2025-10-28 11:22:11 UTC; koneswab
Author: Bjoern Koneswarakantha [aut, cre, cph], F. Hoffmann-La Roche Ltd [cph]
Maintainer: Bjoern Koneswarakantha <bjoern.koneswarakantha@roche.com>
Repository: CRAN
Date/Publication: 2025-10-28 11:40:02 UTC

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Value

returns output of rhs function


Aggregate duplicated visits.

Description

Internal function called by check_df_visit().

Usage

aggr_duplicated_visits(df_visit, event_names = "ae")

Arguments

df_visit

dataframe with columns: study_id, site_number, patnum, visit, n_ae

event_names

vector, contains the event names, default = "ae"

Value

df_visit corrected


Integrity check for df_visit.

Description

Internal function used by all functions that accept df_visit as a parameter. Checks for NA columns, numeric visits and AEs, implicitly missing and duplicated visits.

Usage

check_df_visit(df_visit, event_names = c("event"))

Arguments

df_visit

dataframe with columns: study_id, site_number, patnum, visit, n_ae

event_names

vector, contains the event names, default = "event"

Value

corrected df_visit

See Also

simaerep

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = 0.6
  ) %>%
  # internal functions require internal column names
  dplyr::rename(
    site_number = site_id,
    patnum = patient_id
  )

df_visit_filt <- df_visit %>%
  dplyr::filter(visit != 3)

df_visit_corr <- check_df_visit(df_visit_filt)
3 %in% df_visit_corr$visit
nrow(df_visit_corr) == nrow(df_visit)

df_visit_corr <- check_df_visit(dplyr::bind_rows(df_visit, df_visit))
nrow(df_visit_corr) == nrow(df_visit)

Evaluate sites.

Description

Correct under-reporting probabilities using p.adjust.

Usage

eval_sites(
  df_sim_sites,
  method = "BH",
  under_only = TRUE,
  visit_med75 = TRUE,
  ...
)

Arguments

df_sim_sites

dataframe generated by sim_sites or sim_inframe()

method

character, passed to stats::p.adjust(), if NULL no multiplicity correction will be made.

under_only

Logical, compute under-reporting probabilities only. Only applies to the classic algorithm, in which a one-sided evaluation can save computation time. Default: TRUE

visit_med75

Logical, should evaluation point visit_med75 be used. Compatible with inframe and classic version of the algorithm. Default: TRUE

...

use to pass r_sim_sites parameter to eval_sites_deprecated()

Value

dataframe with the following columns:

study_id

study identification

site_number

site identification

visit_med75

median(max(visit)) * 0.75

mean_ae_site_med75

mean AE at visit_med75 site level

mean_ae_study_med75

mean AE at visit_med75 study level

pval

p-value as returned by poisson.test

prob

bootstrapped probability

See Also

site_aggr, sim_sites, sim_inframe, p.adjust

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = 0.6
  ) %>%
  # internal functions require internal column names
  dplyr::rename(
    n_ae = n_event,
    site_number = site_id,
    patnum = patient_id
  )

df_site <- site_aggr(df_visit)

df_sim_sites <- sim_sites(df_site, df_visit, r = 100)

df_eval <- eval_sites(df_sim_sites)
df_eval

Expose implicitly missing visits.

Description

Internal function called by check_df_visit().

Usage

exp_implicit_missing_visits(df_visit, event_names = "ae")

Arguments

df_visit

dataframe with columns: study_id, site_number, patnum, visit, n_ae

event_names

vector, contains the event names, default = "ae"

Value

df_visit corrected
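
As an illustration of what exposing implicitly missing visits can look like, the following base R sketch fills a gap in one patient's visit sequence. The data are made up, and the fill rule (carrying the last cumulative count forward) is an assumption for illustration; the internal implementation may differ.

```r
# Hypothetical cumulative event counts for one patient; visit 3 is missing
df <- data.frame(visit = c(1L, 2L, 4L), n_ae = c(0, 1, 3))

# Build the complete visit grid and left-join the observed rows onto it
grid <- data.frame(visit = seq_len(max(df$visit)))
full <- merge(grid, df, by = "visit", all.x = TRUE)

# Carry the last observed cumulative count forward (assumed fill rule)
for (i in seq_len(nrow(full))) {
  if (is.na(full$n_ae[i]) && i > 1) {
    full$n_ae[i] <- full$n_ae[i - 1]
  }
}

full
```

Because n_ae is a cumulative count, carrying the previous value forward is equivalent to assuming no new events were reported at the missing visit.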


Get cumulative mean event development

Description

Calculate average increase of events per visit and cumulative average increase.

Usage

get_cum_mean_event_dev(
  df_visit,
  group = c("site_number", "study_id"),
  event_names = c("ae")
)

Arguments

df_visit

Data frame with columns: study_id, site_number, patnum, visit, n_ae.

group

character, grouping variable, one of: c("site_number", "study_id")

event_names

vector, contains the event names, default = "ae"

Details

This is more stable than using the mean cumulative patient count per visit, as only a few patients will contribute to later visits. Here the impact of the later visits is reduced, as they can only add to or subtract from the results of earlier visits and cannot shift the mean independently.
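
The difference between the two approaches can be sketched in base R. The data and column handling below are hypothetical and only illustrate the idea of averaging per-visit increases before accumulating them; this is not the package's internal code.

```r
# Hypothetical cumulative event counts: patient P1 has 3 visits, P2 has 2
df <- data.frame(
  patnum = c("P1", "P1", "P1", "P2", "P2"),
  visit  = c(1, 2, 3, 1, 2),
  n_ae   = c(1, 2, 4, 0, 1)
)

# Per-patient increase at each visit (the first visit counts from zero);
# rows are already ordered by visit within each patient
df$n_ae_delta <- ave(df$n_ae, df$patnum, FUN = function(x) diff(c(0, x)))

# Average the increase per visit, then accumulate it over visits
mean_delta <- tapply(df$n_ae_delta, df$visit, mean)
cum_mean_dev <- cumsum(mean_delta)

# Naive alternative: mean of the cumulative counts per visit
naive_mean <- tapply(df$n_ae, df$visit, mean)

cum_mean_dev
naive_mean
```

At visit 3 only P1 contributes. The naive mean jumps to that patient's cumulative count (4), whereas the cumulative mean development adds P1's increase (2) to the level reached at visit 2 (1.5), giving 3.5.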

Examples

df_visit <- sim_test_data_study(n_pat = 1000, n_sites = 10) %>%
  dplyr::rename(
    site_number = site_id,
    patnum = patient_id,
    n_ae = n_event
  )

get_cum_mean_event_dev(df_visit)
get_cum_mean_event_dev(df_visit, group = "study_id")

Get df_visit_test

Description

Get df_visit_test

Usage

get_df_visit_test()

Get df_visit_test mapped

Description

Get df_visit_test mapped

Usage

get_df_visit_test_mapped()

Replace cowplot::get_legend to silence the warning "Multiple components found; returning the first one. To return all, use 'return_all = TRUE'"

Description

Replaces cowplot::get_legend to silence the warning "Multiple components found; returning the first one. To return all, use 'return_all = TRUE'".

Usage

get_legend(p)

Get Portfolio Configuration

Description

Get portfolio configuration from a df_visit input dataframe. Will filter studies with only a few sites and patients and will anonymize IDs. The portfolio configuration can be used by sim_test_data_portfolio to generate data for an artificial portfolio.

Usage

get_portf_config(
  df_visit,
  check = TRUE,
  min_pat_per_study = 100,
  min_sites_per_study = 10,
  anonymize = TRUE,
  pad_width = 4
)

Arguments

df_visit

input dataframe with columns study_id, site_id, patient_id, visit, n_events. Can also be a lazy database table.

check

logical, perform standard checks on df_visit, Default: TRUE

min_pat_per_study

minimum number of patients per study, Default: 100

min_sites_per_study

minimum number of sites per study, Default: 10

anonymize

logical, Default: TRUE

pad_width

padding width for newly created IDs, Default: 4

Value

dataframe with the following columns:

study_id

study identification

event_per_visit_mean

mean event per visit per study

site_id

site

max_visit_sd

standard deviation of maximum patient visits per site

max_visit_mean

mean of maximum patient visits per site

n_pat

number of patients

See Also

sim_test_data_study, get_portf_config, sim_test_data_portfolio

Examples

df_visit1 <- sim_test_data_study(n_pat = 100, n_sites = 10,
                                 ratio_out = 0.4, factor_event_rate = -0.6,
                                 study_id = "A")

df_visit2 <- sim_test_data_study(n_pat = 100, n_sites = 10,
                                 ratio_out = 0.2, factor_event_rate = -0.1,
                                 study_id = "B")

df_visit <- dplyr::bind_rows(df_visit1, df_visit2)

get_portf_config(df_visit)

# Database example
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:")
dplyr::copy_to(con, df_visit, "visit")
tbl_visit <- dplyr::tbl(con, "visit")
get_portf_config(tbl_visit)
DBI::dbDisconnect(con)

Get Portfolio Event Rates

Description

Get portfolio event rates. Calculates mean event rates per study and visit in a df_visit simaerep input dataframe.

Usage

get_portf_event_rates(df_visit, check = TRUE, anonymize = TRUE, pad_width = 4)

Arguments

df_visit

input dataframe with columns study_id, site_id, patient_id, visit, n_events. Can also be a lazy database table.

check

logical, perform standard checks on df_visit, Default: TRUE

anonymize

logical, Default: TRUE

pad_width

padding width for newly created IDs, Default: 4

Examples

df_visit1 <- sim_test_data_study(n_pat = 100, n_sites = 10,
                                 ratio_out = 0.4, factor_event_rate = -0.6,
                                 study_id = "A")

df_visit2 <- sim_test_data_study(n_pat = 100, n_sites = 10,
                                 ratio_out = 0.2, factor_event_rate = -0.1,
                                 study_id = "B")

df_visit <- dplyr::bind_rows(df_visit1, df_visit2)

get_portf_event_rates(df_visit)

# Database example
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:")
dplyr::copy_to(con, df_visit, "visit")
tbl_visit <- dplyr::tbl(con, "visit")
get_portf_event_rates(tbl_visit)
DBI::dbDisconnect(con)

Get site mean ae development.

Description

Internal function used by site_aggr(), returns mean AE development from visit 0 to visit_med75.

Usage

get_site_mean_ae_dev(df_visit, df_pat, df_site, event_names = c("ae"))

Arguments

df_visit

dataframe

df_pat

dataframe as returned by pat_aggr()

df_site

dataframe as returned by site_aggr()

event_names

vector, contains the event names, default = "ae"

Value

dataframe


Get visit_med75.

Description

Internal function used by site_aggr().

Usage

get_visit_med75(df_pat, method = "med75_adj", min_pat_pool = 0.2)

Arguments

df_pat

dataframe as returned by pat_aggr()

method

character, one of c("med75", "med75_adj", "max"), the method for defining the evaluation point visit_med75 (see details), Default: "med75_adj"

min_pat_pool

double, minimum ratio of patients available for sampling. Determines the maximum visit_med75 value, see Details. Default: 0.2

Value

dataframe
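
A minimal sketch of the basic evaluation point, using the definition median(max(visit)) * 0.75 given elsewhere in this manual. The data are made up; any rounding and the adjustments made by the "med75_adj" method or by min_pat_pool are not shown and may change the value the package actually returns.

```r
# Hypothetical maximum visit per patient at one site
max_visit <- c(10, 12, 16, 20, 24)

# Evaluation point: 75% of the median maximum visit
visit_med75 <- median(max_visit) * 0.75

# Patients that reached the evaluation point and would enter the analysis
n_pat_with_med75 <- sum(max_visit >= visit_med75)

visit_med75
n_pat_with_med75
```

Here the median maximum visit is 16, so the evaluation point is visit 12, and four of the five patients have been on the study long enough to reach it.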


is orivisit class

Description

internal function

Usage

is_orivisit(x)

Arguments

x

object

Value

logical


is simaerep class

Description

internal function

Usage

is_simaerep(x)

Arguments

x

object

Value

logical


Calculate Max Rank

Description

like rank() with ties.method = "max", works on tbl objects

Usage

max_rank(df, col, col_new)

Arguments

df

dataframe

col

character, name of the column to rank by

col_new

character column name for rankings

Details

This is needed for the Benjamini-Hochberg p-value adjustment. We need to assign the higher rank when multiple sites have the same p-value.
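
One way to see why a max rank can be computed with plain table operations: the max rank of a value equals the number of values less than or equal to it. The base R sketch below checks this against rank(); it illustrates the idea and is not the package's internal code.

```r
s <- c(1, 2, 2, 2, 5, 10)

# Max rank of a value = number of values less than or equal to it.
# This needs only comparisons and counts, which can also be expressed
# as a join followed by an aggregation on a database table.
max_rank_count <- vapply(s, function(v) sum(s <= v), numeric(1))

# Same result as base R ranking with ties.method = "max"
identical(max_rank_count, as.numeric(rank(s, ties.method = "max")))

max_rank_count
```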

Examples

df <- tibble::tibble(s = c(1, 2, 2, 2, 5, 10)) %>%
  dplyr::mutate(
    rank = rank(s, ties.method = "max")
  )

df %>%
  simaerep:::max_rank("s", "max_rank")

# Database
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:")
dplyr::copy_to(con, df, "df")
simaerep:::max_rank(dplyr::tbl(con, "df"), "s", "max_rank")
DBI::dbDisconnect(con)

create orivisit object

Description

Internal S3 object, stores lazy reference to original visit data.

Usage

orivisit(
  df_visit,
  call = NULL,
  env = parent.frame(),
  event_names = c("event"),
  col_names = list(study_id = "study_id", site_id = "site_id", patient_id = "patient_id",
    visit = "visit")
)

Arguments

df_visit

Data frame with columns: study_id, site_number, patnum, visit, n_ae.

call

optional, provide call, Default: NULL

env

Optional, provide environment of original visit data. Default: parent.frame().

event_names

vector, contains the event names, default = "event"

col_names

named list, indicate study_id, site_id, patient_id and visit column in df_visit input dataframe. Default: list(study_id = "study_id", site_id = "site_id", patient_id = "patient_id", visit = "visit")

Details

Saves variable name of original visit data, checks whether it can be retrieved from parent environment and stores summary. Original data can be retrieved using as.data.frame(x).

Value

orivisit object

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = -0.6
)

visit <- orivisit(df_visit)

object.size(df_visit)
object.size(visit)

as.data.frame(visit)

Benjamini-Hochberg p-value correction using table operations

Description

Benjamini-Hochberg p-value correction using table operations

Usage

p_adjust_bh_inframe(df_eval, cols)
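
To illustrate how the Benjamini-Hochberg correction can be built from ranks, comparisons and minima (the kind of operations available on tables), here is a base R sketch checked against stats::p.adjust(). It is an illustration under that assumption, not the code of p_adjust_bh_inframe().

```r
p <- c(0.01, 0.04, 0.03, 0.20, 0.03)
n <- length(p)

# Tied p-values share the highest rank among them
r <- rank(p, ties.method = "max")

# Raw Benjamini-Hochberg term for every p-value
bh_raw <- p * n / r

# Enforce monotonicity: each adjusted value is the smallest raw term among
# all p-values that are at least as large, capped at 1
p_adj <- vapply(p, function(p_i) min(1, bh_raw[p >= p_i]), numeric(1))

all.equal(p_adj, p.adjust(p, method = "BH"))
```

Using the max rank for ties matters here: with any lower rank the raw term for tied p-values would be too large before the minimum is taken.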

Aggregate visit to patient level.

Description

Internal function used by site_aggr() and plot_visit_med75(), adds the maximum visit for each patient.

Usage

pat_aggr(df_visit)

Arguments

df_visit

dataframe

Value

dataframe


Create a study specific patient pool for sampling

Description

Internal function for sim_sites. Filters all visits greater than max_visit_med75_study and returns a dataframe with one column for studies and one column with nested patient data.

Usage

pat_pool(df_visit, df_site)

Arguments

df_visit

dataframe, created by sim_sites

df_site

dataframe created by site_aggr

Value

dataframe with nested pat_pool column

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = 0.6
  ) %>%
  # internal functions require internal column names
  dplyr::rename(
    n_ae = n_event,
    site_number = site_id,
    patnum = patient_id
  )

df_site <- site_aggr(df_visit)

df_pat_pool <- simaerep:::pat_pool(df_visit, df_site)

df_pat_pool

plot AE under-reporting simulation results

Description

generic plot function for simaerep objects

Usage

## S3 method for class 'simaerep'
plot(
  x,
  ...,
  study = NULL,
  what = c("prob", "med75"),
  n_sites = 16,
  df_visit = NULL,
  env = parent.frame(),
  plot_event = x$event_names[1]
)

Arguments

x

simaerep object

...

additional parameters passed to plot_study() or plot_visit_med75()

study

character specifying study to be plotted, Default: NULL

what

one of c("prob", "med75"), specifying whether to plot site AE under-reporting probabilities or visit_med75 values, Default: "prob"

n_sites

number of sites to plot, Default: 16

df_visit

optional, pass original visit data if it cannot be retrieved from parent environment, Default: NULL

env

optional, pass environment from which to retrieve original visit data, Default: parent.frame()

plot_event

vector containing the events that should be plotted, default = x$event_names[1]

Details

see plot_study() and plot_visit_med75()

Value

ggplot object

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = -0.6
)

evrep <- simaerep(df_visit)

plot(evrep, what = "prob", study = "A")
plot(evrep, what = "med75", study = "A")

Plots AE per site as dots.

Description

This plot is meant to supplement the package documentation.

Usage

plot_dots(
  df,
  nrow = 10,
  ncols = 10,
  col_group = "site",
  thresh = NULL,
  color_site_a = "#BDBDBD",
  color_site_b = "#757575",
  color_site_c = "gold3",
  color_high = "#00695C",
  color_low = "#25A69A",
  size_dots = 10
)

Arguments

df

dataframe, cols = c('site', 'patients', 'n_ae')

nrow

integer, number of rows, Default: 10

ncols

integer, number of columns, Default: 10

col_group

character, grouping column, Default: 'site'

thresh

numeric, threshold to determine color of mean_ae annotation, Default: NULL

color_site_a

character, hex color value, Default: '#BDBDBD'

color_site_b

character, hex color value, Default: '#757575'

color_site_c

character, hex color value, Default: 'gold3'

color_high

character, hex color value, Default: '#00695C'

color_low

character, hex color value, Default: '#25A69A'

size_dots

integer, Default: 10

Value

ggplot object

Examples

study <- tibble::tibble(
  site = LETTERS[1:3],
  patients = c(list(seq(1, 50, 1)), list(seq(1, 40, 1)), list(seq(1, 10, 1)))
) %>%
  tidyr::unnest(patients) %>%
  dplyr::mutate(n_ae = as.integer(runif(min = 0, max = 10, n = nrow(.))))

plot_dots(study)

Plot simulation example.

Description

This plot supplements the package documentation.

Usage

plot_sim_example(
  substract_ae_per_pat = 0,
  size_dots = 10,
  size_raster_label = 12,
  color_site_a = "#BDBDBD",
  color_site_b = "#757575",
  color_site_c = "gold3",
  color_high = "#00695C",
  color_low = "#25A69A",
  title = TRUE,
  legend = TRUE,
  seed = 5
)

Arguments

substract_ae_per_pat

integer, subtract aes from patients at site C, Default: 0

size_dots

integer, Default: 10

size_raster_label

integer, Default: 12

color_site_a

character, hex color value, Default: '#BDBDBD'

color_site_b

character, hex color value, Default: '#757575'

color_site_c

character, hex color value, Default: 'gold3'

color_high

character, hex color value, Default: '#00695C'

color_low

character, hex color value, Default: '#25A69A'

title

logical, include title, Default: TRUE

legend

logical, include legend, Default: TRUE

seed

pass seed for simulations, Default: 5

Details

Uses plot_dots() and adds 2 simulation panels. Uses a made-up site config with three sites A, B, C, simulating site C.

Value

ggplot

See Also

get_legend, plot_grid

Examples

plot_sim_example(size_dots = 5)

Plot multiple simulation examples.

Description

This plot is meant to supplement the package documentation.

Usage

plot_sim_examples(substract_ae_per_pat = c(0, 1, 3), ...)

Arguments

substract_ae_per_pat

integer, Default: c(0, 1, 3)

...

parameters passed to plot_sim_example()

Details

This function is a wrapper for plot_sim_example()

Value

ggplot

See Also

ggdraw, draw_label, plot_grid

Examples

plot_sim_examples(size_dots = 3, size_raster_label = 10)
plot_sim_examples()

Plot ae development of study and sites highlighting at risk sites.

Description

Most suitable visual representation of the AE under-reporting statistics.

Usage

plot_study(
  df_visit,
  df_site,
  df_eval,
  study,
  n_sites = 16,
  prob_col = "prob",
  event_names = c("ae"),
  plot_event = "ae",
  mult_corr = FALSE,
  delta = TRUE
)

Arguments

df_visit

dataframe, created by sim_sites()

df_site

dataframe created by site_aggr()

df_eval

dataframe created by eval_sites()

study

study

n_sites

integer number of most at risk sites, Default: 16

prob_col

character, denotes probability column, Default: "prob"

event_names

vector, contains the event names, default = "ae"

plot_event

vector containing the events that should be plotted, default = "ae"

mult_corr

Logical, multiplicity correction, Default: FALSE

delta

logical, show delta events on plot

Details

Left panel shows mean AE reporting per site (light blue and dark blue lines) against mean AE reporting of the entire study (golden line). Single sites are plotted in descending order by AE under-reporting probability on the right panel, in which grey lines denote the cumulative AE count of single patients. Grey dots in the left panel indicate sites that were picked for single plotting. The AE under-reporting probability of dark blue lines crossed the threshold of 95%. Numbers in the upper left corner indicate the ratio of patients that have been used for the analysis against the total number of patients. Patients that have not been on the study long enough to reach the evaluation point (visit_med75) will be ignored.

Value

ggplot

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = 0.6
  ) %>%
  # internal functions require internal column names
  dplyr::rename(
    n_ae = n_event,
    site_number = site_id,
    patnum = patient_id
  )

df_site <- site_aggr(df_visit)

df_sim_sites <- sim_sites(df_site, df_visit, r = 100)

df_eval <- eval_sites(df_sim_sites)

simaerep:::plot_study(df_visit, df_site, df_eval, study = "A")

Plot patient visits against visit_med75.

Description

Plots cumulative AEs against visits for patients at sites of given study and compares against visit_med75.

Usage

plot_visit_med75(
  df_visit,
  df_site = NULL,
  study_id_str,
  n_sites = 6,
  min_pat_pool = 0.2,
  verbose = TRUE,
  event_names = "ae",
  plot_event = "ae",
  ...
)

Arguments

df_visit

dataframe

df_site

dataframe, as returned by site_aggr()

study_id_str

character, specify study in study_id column

n_sites

integer, Default: 6

min_pat_pool

double, minimum ratio of patients available for sampling. Determines the maximum visit_med75 value, see Details. Default: 0.2

verbose

logical, Default: TRUE

event_names

vector, contains the event names, default = "ae"

plot_event

vector containing the events that should be plotted, default = "ae"

...

not used

Value

ggplot

Examples

df_visit <- sim_test_data_study(
  n_pat = 120,
  n_sites = 6,
  ratio_out = 0.4,
  factor_event_rate = -0.6
  ) %>%
  dplyr::rename(
    site_number = site_id,
    patnum = patient_id,
    n_ae = n_event
  )

df_site <- site_aggr(df_visit)

simaerep:::plot_visit_med75(df_visit, df_site, study_id_str = "A", n_sites = 6)

Poisson test for vector with site AEs vs vector with study AEs.

Description

Internal function used by simaerep.

Usage

poiss_test_site_ae_vs_study_ae(site_ae, study_ae, visit_med75)

Arguments

site_ae

vector with AE numbers

study_ae

vector with AE numbers

visit_med75

integer

Details

Sets pval = 1 if the mean AE count of the site is greater than the mean AE count of the study, or if the test gives an error.
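
The rule above can be sketched with stats::poisson.test(). The helper name poisson_site_vs_study and the way exposure is derived (patients times visit_med75) are assumptions made for illustration; this is not the package's internal code.

```r
# Hypothetical helper, not the package's internal code
poisson_site_vs_study <- function(site_ae, study_ae, visit_med75) {
  # One-sided shortcut: a site above the study mean is not flagged
  if (mean(site_ae) > mean(study_ae)) {
    return(1)
  }

  # Assumed exposure: number of patients times visits up to visit_med75
  exposure <- c(length(site_ae), length(study_ae)) * visit_med75

  # Two-sample comparison of Poisson rates; fall back to 1 on error
  tryCatch(
    stats::poisson.test(c(sum(site_ae), sum(study_ae)), exposure)$p.value,
    error = function(e) 1
  )
}

# Site reporting fewer events than the study: small p-value
pval_low <- poisson_site_vs_study(
  site_ae = c(5, 3, 3, 2, 1, 6),
  study_ae = c(9, 8, 7, 9, 6, 7, 8),
  visit_med75 = 10
)

# Site reporting more events than the study: shortcut returns 1
pval_high <- poisson_site_vs_study(
  site_ae = c(11, 9, 8, 9, 10),
  study_ae = c(9, 8, 7, 9, 6, 7, 8),
  visit_med75 = 10
)

pval_low
pval_high
```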

Value

pval

See Also

sim_sites()

Examples

simaerep:::poiss_test_site_ae_vs_study_ae(
   site_ae = c(5, 3, 3, 2, 1, 6),
   study_ae = c(9, 8, 7, 9, 6, 7, 8),
   visit_med75 = 10
)

simaerep:::poiss_test_site_ae_vs_study_ae(
   site_ae = c(11, 9, 8, 6, 3),
   study_ae = c(9, 8, 7, 9, 6, 7, 8),
   visit_med75 = 10
)

Prepare data for simulation.

Description

Internal function called by sim_sites. Collect AEs per patient at visit_med75 for site and study as a vector of integers.

Usage

prep_for_sim(df_site, df_visit)

Arguments

df_site

dataframe created by site_aggr

df_visit

dataframe, created by sim_sites

Value

dataframe

See Also

sim_sites, sim_after_prep

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = 0.6
  ) %>%
  # internal functions require internal column names
  dplyr::rename(
    n_ae = n_event,
    site_number = site_id,
    patnum = patient_id
  )

df_site <- site_aggr(df_visit)

df_prep <- simaerep:::prep_for_sim(df_site, df_visit)

df_prep

Print method for orivisit objects

Description

Print method for orivisit objects

Usage

## S3 method for class 'orivisit'
print(x, ..., n = 10)

Arguments

x

An object of class 'orivisit'

...

Additional arguments passed to print (not used)

n

Number of rows to display from the data frame (default: 10)


Print method for simaerep objects

Description

Print method for simaerep objects

Usage

## S3 method for class 'simaerep'
print(x, ..., n = 10)

Arguments

x

An object of class 'simaerep'

...

Additional arguments passed to print (not used)

n

Number of rows to display from df_eval (default: 10)


Calculate bootstrapped probability for obtaining a lower site mean AE number.

Description

Internal function used by sim_sites()

Usage

prob_lower_site_ae_vs_study_ae(site_ae, study_ae, r = 1000, under_only = TRUE)

Arguments

site_ae

vector with AE numbers

study_ae

vector with AE numbers

r

integer, denotes number of simulations, default = 1000

under_only

compute under-reporting probabilities only, default = TRUE

Details

Sets pval = 1 if the mean AE count of the site is greater than the mean AE count of the study.
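
A minimal base R sketch of a bootstrapped probability of this kind. The helper name prob_lower_sketch and the choice to pool site and study patients before resampling are assumptions for illustration; the internal function may sample differently.

```r
# Hypothetical helper, not the package's internal code
prob_lower_sketch <- function(site_ae, study_ae, r = 1000) {
  # Sites reporting above the study mean are not under-reporting candidates
  if (mean(site_ae) > mean(study_ae)) {
    return(1)
  }

  # Pool all patients and repeatedly draw a simulated site of the same size
  pool <- c(site_ae, study_ae)
  sim_means <- replicate(
    r,
    mean(sample(pool, length(site_ae), replace = TRUE))
  )

  # Share of simulated sites with a mean at or below the observed site mean
  mean(sim_means <= mean(site_ae))
}

set.seed(1)
prob_low <- prob_lower_sketch(
  site_ae = c(5, 3, 3, 2, 1, 6),
  study_ae = c(9, 8, 7, 9, 6, 7, 8)
)

prob_low
```

A low value means that a randomly composed site of the same size rarely reports as few events as the observed site, which is the signal for potential under-reporting.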

Value

pval

See Also

sim_sites()

Examples

simaerep:::prob_lower_site_ae_vs_study_ae(
  site_ae = c(5, 3, 3, 2, 1, 6),
  study_ae = c(9, 8, 7, 9, 6, 7, 8)
)

prune visits to visit_med75 using table operations

Description

prune visits to visit_med75 using table operations

Usage

prune_to_visit_med75_inframe(df_visit, df_site)

Arguments

df_visit

Data frame with columns: study_id, site_number, patnum, visit, n_ae.

df_site

dataframe, as returned by site_aggr()


Execute a purrr or furrr function with a progressbar.

Description

Internal utility function.

Usage

purrr_bar(
  ...,
  .purrr,
  .f,
  .f_args = list(),
  .purrr_args = list(),
  .steps,
  .slow = FALSE,
  .progress = TRUE
)

Arguments

...

iterable arguments passed to .purrr

.purrr

purrr or furrr function

.f

function to be executed over iterables

.f_args

list of arguments passed to .f, Default: list()

.purrr_args

list of arguments passed to .purrr, Default: list()

.steps

integer number of iterations

.slow

logical slows down execution, Default: FALSE

.progress

logical, show progress bar, Default: TRUE

Details

Call still needs to be wrapped in with_progress or with_progress_cnd()

Value

result of function passed to .f

Examples

# purrr::map
progressr::with_progress(
  purrr_bar(rep(0.25, 5), .purrr = purrr::map, .f = Sys.sleep, .steps = 5)
)

# purrr::walk
progressr::with_progress(
  purrr_bar(rep(0.25, 5), .purrr = purrr::walk, .f = Sys.sleep, .steps = 5)
)

# progress bar off
progressr::with_progress(
  purrr_bar(
    rep(0.25, 5), .purrr = purrr::walk, .f = Sys.sleep, .steps = 5, .progress = FALSE
  )
)

# purrr::map2
progressr::with_progress(
  purrr_bar(
    rep(1, 5), rep(2, 5),
    .purrr = purrr::map2,
    .f = `+`,
    .steps = 5,
    .slow = TRUE
  )
)

# purrr::pmap
progressr::with_progress(
  purrr_bar(
    list(rep(1, 5), rep(2, 5)),
    .purrr = purrr::pmap,
    .f = `+`,
    .steps = 5,
    .slow = TRUE
  )
)

# define function within purrr_bar() call
progressr::with_progress(
  purrr_bar(
    list(rep(1, 5), rep(2, 5)),
    .purrr = purrr::pmap,
    .f = function(x, y) {
      paste0(x, y)
    },
    .steps = 5,
    .slow = TRUE
  )
)

# with mutate
progressr::with_progress(
  tibble::tibble(x = rep(0.25, 5)) %>%
    dplyr::mutate(x = purrr_bar(x, .purrr = purrr::map, .f = Sys.sleep, .steps = 5))
)

renames internal simaerep col_names to externally applied colnames

Description

renames internal simaerep col_names to externally applied colnames

Usage

remap_col_names(df, col_names)

Start simulation after preparation.

Description

Internal function called by sim_sites after prep_for_sim

Usage

sim_after_prep(
  df_sim_prep,
  r = 1000,
  poisson_test = FALSE,
  prob_lower = TRUE,
  progress = FALSE,
  under_only = TRUE
)

Arguments

df_sim_prep

dataframe as returned by prep_for_sim

r

integer, denotes number of simulations, default = 1000

poisson_test

logical, calculates poisson.test pvalue

prob_lower

logical, calculates probability for getting a lower value

progress

logical, display progress bar, Default = FALSE

under_only

compute under-reporting probabilities only, default = TRUE

Value

dataframe

See Also

sim_sites, prep_for_sim

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = 0.6
  ) %>%
  # internal functions require internal column names
  dplyr::rename(
    n_ae = n_event,
    site_number = site_id,
    patnum = patient_id
  )

df_site <- site_aggr(df_visit)

df_prep <- simaerep:::prep_for_sim(df_site, df_visit)

df_sim <- simaerep:::sim_after_prep(df_prep)

df_sim

Calculate prob for study sites using table operations

Description

Calculate prob for study sites using table operations

Usage

sim_inframe(df_visit, r = 1000, df_site = NULL, event_names = c("ae"))

Arguments

df_visit

Data frame with columns: study_id, site_number, patnum, visit, n_ae.

r

Integer or tbl_object, number of repetitions for bootstrap simulation. Pass a tbl object referring to a table with one column and as many rows as desired repetitions. Default: 1000.

df_site

dataframe as returned by site_aggr(). Will switch to visit_med75. Default: NULL

event_names

vector, contains the event names, default = "ae"

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = -0.6
) %>%
  dplyr::rename(
    site_number = site_id,
    patnum = patient_id,
    n_ae = n_event
  )

df_sim <- simaerep:::sim_inframe(df_visit)

simulate under-reporting

Description

we remove a fraction of events from a specific site

Usage

sim_out(df_visit, study_id, site_id, factor_event)

Arguments

df_visit

dataframe

study_id

character

site_id

character

factor_event

double, negative values for under-reporting, positive values for over-reporting.

Details

We determine the absolute number of events per patient for removal, then remove them at the first visit. We intentionally allow fractions.

Examples

df_visit <- sim_test_data_study(n_pat = 100, n_sites = 10)

df_ur <- sim_out(df_visit, "A", site_id = "S0001", factor_event = -0.35)

# Example cumulated event for first patient with 35% under-reporting
df_ur[df_ur$site_id == "S0001" & df_ur$patient_id == "P000001",]$n_event

# Example cumulated event for first patient with no under-reporting
df_visit[df_visit$site_id == "S0001" & df_visit$patient_id == "P000001",]$n_event

Simulate patients and events for sites. Supports constant and non-constant event rates.

Description

Simulate patients and events for sites. Supports constant and non-constant event rates.

Usage

sim_pat(vs_max, vs_sd, is_out, event_rates, event_names, factor_event_rate)

Calculate prob_lower and poisson.test pvalue for study sites.

Description

Collects the number of AEs of all eligible patients that meet visit_med75 criteria of site. Then calculates poisson.test pvalue and bootstrapped probability of having a lower mean value. Used by simaerep_classic()

Usage

sim_sites(
  df_site,
  df_visit,
  r = 1000,
  poisson_test = TRUE,
  prob_lower = TRUE,
  progress = TRUE,
  under_only = TRUE
)

Arguments

df_site

dataframe created by site_aggr

df_visit

dataframe, created by sim_sites

r

integer, denotes number of simulations, default = 1000

poisson_test

logical, calculates poisson.test pvalue

prob_lower

logical, calculates probability for getting a lower value

progress

logical, display progress bar, Default = TRUE

under_only

compute under-reporting probabilities only, default = TRUE

Value

dataframe with the following columns:

study_id

study identification

site_number

site identification

n_pat

number of patients at site

visit_med75

median(max(visit)) * 0.75

n_pat_with_med75

number of patients at site with med75

mean_ae_site_med75

mean AE at visit_med75 site level

mean_ae_study_med75

mean AE at visit_med75 study level

n_pat_with_med75_study

number of patients at study with med75 excl. site

pval

p-value as returned by poisson.test

prob_low

bootstrapped probability for having mean_ae_site_med75 or lower

See Also

sim_sites, site_aggr, pat_pool, prob_lower_site_ae_vs_study_ae, poiss_test_site_ae_vs_study_ae, prep_for_sim, simaerep_classic

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = 0.6
  ) %>%
  # internal functions require internal column names
  dplyr::rename(
    n_ae = n_event,
    site_number = site_id,
    patnum = patient_id
  )

df_site <- site_aggr(df_visit)

df_sim_sites <- sim_sites(df_site, df_visit, r = 100)

df_sim_sites %>%
  knitr::kable(digits = 2)

simulate test data events

Description

generates multi-event data using sim_test_data_study()

Usage

sim_test_data_events(
  n_pat = 100,
  n_sites = 5,
  event_rates = c(NULL),
  event_names = list("event")
)

Arguments

n_pat

integer, number of patients, Default: 100

n_sites

integer, number of sites, Default: 5

event_rates

vector with visit-specific event rates, Default: NULL

event_names

vector, contains the event names, default = "event"

Value

tibble with columns site_id, patient_id, is_ur, max_visit_mean, max_visit_sd, visit, and event data (events_per_visit_mean and n_events)


simulate patient event reporting test data

Description

helper function for sim_test_data_study()

Usage

sim_test_data_patient(
  .f_sample_max_visit = function() rnorm(1, mean = 20, sd = 4),
  .f_sample_event_per_visit = function(max_visit) rpois(max_visit, 0.5)
)

Arguments

.f_sample_max_visit

function used to sample the maximum number of events,Default: function() rnorm(1, mean = 20, sd = 4)

.f_sample_event_per_visit

function used to sample the events for each visit, Default: function(max_visit) rpois(max_visit, 0.5)

Value

vector containing cumulative events

Examples

replicate(5, sim_test_data_patient())

replicate(5, sim_test_data_patient(
    .f_sample_event_per_visit = function(x) rpois(x, 1.2))
  )

replicate(5, sim_test_data_patient(
    .f_sample_max_visit = function() rnorm(1, mean = 5, sd = 5))
  )

Simulate Portfolio Test Data

Description

Simulate visit level data from a portfolio configuration.

Usage

sim_test_data_portfolio(
  df_config,
  df_event_rates = NULL,
  progress = TRUE,
  parallel = TRUE
)

Arguments

df_config

dataframe as returned by get_portf_config

df_event_rates

dataframe with event rates. Default: NULL

progress

logical, Default: TRUE

parallel

logical, activate parallel processing, see Details, Default: TRUE

Details

uses sim_test_data_study. We use the furrr package to implement parallel processing as these simulations can take a long time to run. For this to work we need to specify the plan for how the code should run, e.g. plan(multisession, workers = 3)
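
A minimal sketch of setting such a plan, assuming the future package (on which furrr builds) is installed. The number of workers is only an example value, and the commented call uses a df_config object that is not created here.

```r
# Assumption: the future package is available; furrr relies on its plan().
# Store the previous plan so that it can be restored afterwards.
old_plan <- future::plan(future::multisession, workers = 3)

# A call such as the following would then run in parallel sessions:
# df_portf <- sim_test_data_portfolio(df_config, parallel = TRUE)

# Restore the previous plan once the simulation has finished.
future::plan(old_plan)
```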

Value

dataframe with the following columns:

study_id

study identification

event_per_visit_mean

mean event per visit per study

site_id

site

max_visit_sd

standard deviation of maximum patient visits per site

max_visit_mean

mean of maximum patient visits per site

patient_id

number of patients

visit

visit number

n_event

cumulative sum of events

See Also

sim_test_data_study, get_portf_config, sim_test_data_portfolio

Examples

df_visit1 <- sim_test_data_study(n_pat = 100, n_sites = 10,
                                 ratio_out = 0.4, factor_event_rate = 0.6,
                                 study_id = "A")

df_visit2 <- sim_test_data_study(n_pat = 100, n_sites = 10,
                                 ratio_out = 0.2, factor_event_rate = 0.1,
                                 study_id = "B")

df_visit <- dplyr::bind_rows(df_visit1, df_visit2)

df_config <- get_portf_config(df_visit)

df_config

df_portf <- sim_test_data_portfolio(df_config)

df_portf

simulate study test data

Description

evenly distributes a number of given patients across a number of given sites. Then simulates event reporting of each patient reducing the number of reported events for patients distributed to event-under-reporting sites.

Usage

sim_test_data_study(
  n_pat = 1000,
  n_sites = 20,
  ratio_out = 0,
  factor_event_rate = 0,
  max_visit_mean = 20,
  max_visit_sd = 4,
  event_rates = dgamma(seq(1, 20, 0.5), shape = 5, rate = 2) * 5 + 0.1,
  event_names = c("event"),
  study_id = "A"
)

Arguments

n_pat

integer, number of patients, Default: 1000

n_sites

integer, number of sites, Default: 20

ratio_out

ratio of sites with outlier, Default: 0

factor_event_rate

event reporting rate factor for site outlier, will modify mean event per visit rate used for outlier sites. Negative values will simulate under-reporting, positive values over-reporting, e.g. -0.4 -> 40% under-reporting, +0.4 -> 40% over-reporting, Default: 0

max_visit_mean

mean of the maximum number of visits of each patient,Default: 20

max_visit_sd

standard deviation of maximum number of visits of each patient, Default: 4

event_rates

list or vector with visit-specific event rates. Use list for multiple event names, Default: dgamma(seq(1, 20, 0.5), shape = 5, rate = 2) * 5 + 0.1

event_names

vector, contains the event names, default = "event"

study_id

character, Default: "A"

Details

maximum visit number will be sampled from a normal distribution with characteristics derived from max_visit_mean and max_visit_sd, while the events per visit will be sampled from a Poisson distribution described by events_per_visit_mean.
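
The sampling scheme described above can be sketched for a single patient in base R. The helper sim_patient_sketch() is hypothetical and not part of the package. In particular, the way factor_event_rate scales the event rate and the rounding of the sampled maximum visit are assumptions made for illustration.

```r
# Hypothetical helper illustrating the sampling scheme; not the package code.
sim_patient_sketch <- function(max_visit_mean = 20, max_visit_sd = 4,
                               event_rate = 0.5, factor_event_rate = 0) {
  # maximum visit: one draw from a normal distribution, at least 1 visit
  max_visit <- max(1L, as.integer(round(rnorm(1, max_visit_mean, max_visit_sd))))
  # assumption: the factor scales the rate, e.g. -0.4 leaves 60% of the rate
  rate <- event_rate * (1 + factor_event_rate)
  # events per visit from a Poisson distribution, returned as cumulative sum
  cumsum(rpois(max_visit, lambda = rate))
}

set.seed(1)
n_event_regular <- sim_patient_sketch()
n_event_under <- sim_patient_sketch(factor_event_rate = -0.4)
n_event_regular
n_event_under
```

Each returned vector has one entry per visit and never decreases, which matches the cumulative n_event column described under Value.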

Value

tibble with columns site_id, patient_id, is_out, max_visit_mean, max_visit_sd, event_per_visit_mean, visit, n_event

Examples

set.seed(1)

# no outlier
df_visit <- sim_test_data_study(n_pat = 100, n_sites = 5)
df_visit[which(df_visit$patient_id == "P000001"),]

# under-reporting outlier
df_visit <- sim_test_data_study(n_pat = 100, n_sites = 5,
    ratio_out = 0.2, factor_event_rate = -0.5)
df_visit[which(df_visit$patient_id == "P000001"),]

# constant event rates
sim_test_data_study(n_pat = 100, n_sites = 5, event_rates = 0.5)

# non-constant event rates for two event types
event_rates_ae <- c(0.7, rep(0.5, 8), rep(0.3, 5))
event_rates_pd <- c(0.3, rep(0.4, 6), rep(0.1, 5))

sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  event_names = c("ae", "pd"),
  event_rates = list(event_rates_ae, event_rates_pd)
)

Create simaerep object

Description

Simulate AE under-reporting probabilities.

Usage

simaerep(
  df_visit,
  r = 1000,
  check = TRUE,
  under_only = FALSE,
  visit_med75 = FALSE,
  inframe = TRUE,
  progress = TRUE,
  mult_corr = TRUE,
  poisson_test = FALSE,
  env = parent.frame(),
  event_names = c("event"),
  col_names = list(study_id = "study_id", site_id = "site_id", patient_id = "patient_id",
    visit = "visit")
)

simaerep_inframe(
  df_visit,
  r = 1000,
  under_only = FALSE,
  visit_med75 = FALSE,
  check = TRUE,
  env = parent.frame(),
  event_names = c("event"),
  mult_corr = FALSE,
  col_names = list(study_id = "study_id", site_id = "site_id", patient_id = "patient_id",
    visit = "visit")
)

simaerep_classic(
  df_visit,
  check = TRUE,
  progress = TRUE,
  env = parent.frame(),
  under_only = TRUE,
  r = 1000,
  mult_corr = FALSE,
  poisson_test = FALSE,
  event_names = "event",
  col_names = list(study_id = "study_id", site_id = "site_id", patient_id = "patient_id",
    visit = "visit")
)

Arguments

df_visit

Data frame with columns: study_id, site_number, patnum, visit, n_ae.

r

Integer or tbl_object, number of repetitions for bootstrap simulation. Pass a tbl object referring to a table with one column and as many rows as desired repetitions. Default: 1000.

check

Logical, perform data check and attempt repair with check_df_visit(). Computationally expensive on large data sets. Default: TRUE.

under_only

Logical, compute under-reporting probabilities only. Only applies to the classic algorithm, in which a one-sided evaluation can save computation time. Default: FALSE

visit_med75

Logical, should evaluation point visit_med75 be used. Compatible with inframe and classic version of the algorithm. Default: FALSE

inframe

Logical, when FALSE classic simaerep algorithm will be used. The default inframe method uses only table operations and is compatible with dbplyr supported database backends. Default: TRUE

progress

Logical, display progress bar. Default: TRUE.

mult_corr

Logical, multiplicity correction, Default: TRUE

poisson_test

logical, compute p-value with poisson test, only supported by the classic algorithm using visit_med75. Default: FALSE

env

Optional, provide environment of original visit data. Default: parent.frame().

event_names

vector, contains the event names, default = "event"

col_names

named list, indicate study_id, site_id, patient_id and visit column in df_visit input dataframe. Default: list(study_id = "study_id", site_id = "site_id", patient_id = "patient_id", visit = "visit")

Details

Executes site_aggr(), sim_sites(), and eval_sites() on original visit data and stores all intermediate results. Stores lazy reference to original visit data for facilitated plotting using generic plot(x).

Value

A simaerep object. Results are contained in the attached df_eval dataframe.

Column Name                Description                                    Type
study_id                   The study ID                                   Character
site_id                    The site ID                                    Character
(event)_count              Site event count                               Numeric
(event)_per_visit_site     Site ratio of event count divided by visits    Numeric
visits                     Site visit count                               Numeric
n_pat                      Site patient count                             Numeric
(event)_per_visit_study    Simulated study ratio                          Numeric
(event)_prob               Site event ratio probability from -1 to 1      Numeric
(event)_delta              Difference expected vs reported events         Numeric

See Also

site_aggr, sim_sites, eval_sites, orivisit, plot.simaerep, print.simaerep, simaerep_inframe

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = - 0.6
)

evrep <- simaerep(df_visit)
evrep
str(evrep)

# simaerep classic algorithm
evrep <- simaerep(df_visit, inframe = FALSE, under_only = TRUE, mult_corr = TRUE)
evrep

# multiple events
df_visit_events_test <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = - 0.6,
  event_rates = list(0.5, 0.3),
  event_names = c("ae", "pd")
)

evsrep <- simaerep(df_visit_events_test, inframe = TRUE, event_names = c("ae", "pd"))
evsrep

# Database example
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:")
df_r <- tibble::tibble(rep = seq(1, 1000))

dplyr::copy_to(con, df_visit, "visit")
dplyr::copy_to(con, df_r, "r")

tbl_visit <- dplyr::tbl(con, "visit")
tbl_r <- dplyr::tbl(con, "r")

simaerep(tbl_visit, r = tbl_r)

DBI::dbDisconnect(con)

Aggregate from visit to site level.

Description

Calculates visit_med75, n_pat_with_med75 and mean_ae_site_med75. Used by simaerep_classic()

Usage

site_aggr(
  df_visit,
  method = "med75_adj",
  min_pat_pool = 0.2,
  event_names = c("ae")
)

Arguments

df_visit

dataframe with columns: study_id, site_number, patnum, visit, n_ae

method

character, one of c("med75", "med75_adj", "max"), method for determining the evaluation point visit_med75 (see Details), Default: "med75_adj"

min_pat_pool

double, minimum ratio of patients available for sampling. Determines maximum visit_med75 value, see Details. Default: 0.2

event_names

vector, contains the event names, default = "ae"

Details

For determining the visit number at which we are going to evaluate AE reporting we take the maximum visit of each patient at the site and take the median. Then we multiply with 0.75, which will give us a cut-off point determining which patients will be evaluated. Of those patients we will evaluate we take the minimum of all maximum visits, hence ensuring that we take the highest visit number possible without excluding more patients from the analysis. In order to ensure that the sampling pool for that visit is large enough we limit the visit number by the 80% quantile of maximum visits of all patients in the study. "max" will determine site max visit, flag patients that concluded max visit and count patients and patients that concluded max visit.
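
The steps described above can be sketched in base R for a single site. The helper visit_med75_sketch() and the example vectors are hypothetical and are not the package implementation. Rounding the quantile down with floor() is an assumption made for illustration.

```r
# Hypothetical helper, not part of simaerep: derives the evaluation point
# from the maximum visit of each patient at the site and in the study.
visit_med75_sketch <- function(max_visit_site, max_visit_study,
                               min_pat_pool = 0.2) {
  # cut-off: 75% of the median maximum visit at the site
  cutoff <- median(max_visit_site) * 0.75
  # patients that reached the cut-off are evaluated
  evaluated <- max_visit_site[max_visit_site >= cutoff]
  # highest visit that all evaluated patients have reached
  visit <- min(evaluated)
  # cap by the (1 - min_pat_pool) quantile of maximum visits in the study
  # (assumption: rounded down to a whole visit number)
  cap <- floor(quantile(max_visit_study, probs = 1 - min_pat_pool, names = FALSE))
  min(visit, cap)
}

max_visit_site <- c(10, 12, 16, 18, 20)
max_visit_study <- c(max_visit_site, 8, 9, 11, 13, 14, 15, 15, 17, 19, 22)
visit_med75 <- visit_med75_sketch(max_visit_site, max_visit_study)
visit_med75
```

In this example the site median is 16, so the cut-off is 12. Four of the five patients reach it and the lowest maximum visit among them is 12, which lies below the study cap and is therefore kept.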

Value

dataframe with the following columns:

study_id

study identification

site_number

site identification

n_pat

number of patients, site level

visit_med75

adjusted median(max(visit)) * 0.75 see Details

n_pat_with_med75

number of patients that meet visit_med75 criterion, site level

mean_ae_site_med75

mean AE at visit_med75, site level

See Also

simaerep_classic()

Examples

df_visit <- sim_test_data_study(
  n_pat = 100,
  n_sites = 5,
  ratio_out = 0.4,
  factor_event_rate = 0.6
  ) %>%
  # internal functions require internal column names
  dplyr::rename(
    n_ae = n_event,
    site_number = site_id,
    patnum = patient_id
  )

df_site <- site_aggr(df_visit)

df_site %>%
  knitr::kable(digits = 2)

Conditional with_progress.

Description

Internal function. Use instead of with_progress within custom functions with progress bars.

Usage

with_progress_cnd(ex, progress = TRUE)

Arguments

ex

expression

progress

logical, Default: TRUE

Details

This wrapper adds a progress parameter to with_progress so that we can control the progress bar in the user facing functions. The progress bar only shows in interactive mode.

Value

No return value, called for side effects

See Also

with_progress

Examples

if (interactive()) {

  with_progress_cnd(
    purrr_bar(rep(0.25, 5), .purrr = purrr::map, .f = Sys.sleep, .steps = 5),
    progress = TRUE
  )

  with_progress_cnd(
    purrr_bar(rep(0.25, 5), .purrr = purrr::map, .f = Sys.sleep, .steps = 5),
    progress = FALSE
  )

  # wrap a function with progress bar with another call with progress bar
  f1 <- function(x, progress = TRUE) {
    with_progress_cnd(
      purrr_bar(x, .purrr = purrr::walk, .f = Sys.sleep, .steps = length(x), .progress = progress),
      progress = progress
    )
  }

  # inner progress bar blocks outer progress bar
  progressr::with_progress(
    purrr_bar(
      rep(rep(1, 3),3), .purrr = purrr::walk, .f = f1, .steps = 3,
      .f_args = list(progress = TRUE)
    )
  )

  # inner progress bar turned off
  progressr::with_progress(
    purrr_bar(
      rep(list(rep(0.25, 3)), 5), .purrr = purrr::walk, .f = f1, .steps = 5,
      .f_args = list(progress = FALSE)
    )
  )
}
