| Type: | Package |
| Title: | Disproportionality Functions for Pharmacovigilance |
| Version: | 0.0.4 |
| Description: | Tools for performing disproportionality analysis using the information component, proportional reporting rate and the reporting odds ratio. The anticipated use is passing data to the da() function, which executes the disproportionality analysis. See Norén et al (2011) <doi:10.1177/0962280211403604> and Montastruc et al (2011) <doi:10.1111/j.1365-2125.2011.04037.x> for further details. |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| LazyData: | true |
| Suggests: | knitr (≥ 1.43), rmarkdown (≥ 2.24), testthat (≥ 3.1.10), writexl (≥ 1.4.2) |
| Config/testthat/edition: | 3 |
| BuildVignettes: | true |
| VignetteBuilder: | knitr |
| RoxygenNote: | 7.3.2 |
| Imports: | checkmate (≥ 2.1.0), cli (≥ 3.6.3), data.table (≥ 1.14.6), dplyr (≥ 1.0.10), dtplyr (≥ 1.2.2), glue (≥ 1.6.2), purrr (≥ 0.3.5), Rdpack (≥ 2.4), rlang (≥ 1.0.6), stats (≥ 4.1.3), stringr (≥ 1.5.0), tibble (≥ 3.1.8), tidyr (≥ 1.3.0), tidyselect (≥ 1.2.0), utils (≥ 4.1.3) |
| Depends: | R (≥ 2.10) |
| URL: | https://oskargauffin.github.io/pvda/ |
| BugReports: | https://github.com/OskarGauffin/pvda/issues |
| RdMacros: | Rdpack |
| NeedsCompilation: | no |
| Packaged: | 2025-01-16 07:25:57 UTC; OskarG |
| Author: | Oskar Gauffin |
| Maintainer: | Michele Fusaroli <michele.fusaroli@who-umc.org> |
| Repository: | CRAN |
| Date/Publication: | 2025-01-17 09:10:14 UTC |
Add disproportionality estimates to data frame with expected counts
Description
Add disproportionality estimates to data frame with expected counts
Usage
add_disproportionality(df = NULL, df_syms = NULL, da_estimators = c("ic", "prr", "ror"), rule_of_N = 3, conf_lvl = 0.95)
Arguments
df | Intended use is on the output tibble from |
df_syms | A list built from df_colnames through conversion to symbols. |
da_estimators | Character vector specifying which disproportionality estimators to use, in case you don't need all implemented options. Defaults to c("ic", "prr", "ror"). |
rule_of_N | Numeric value. Sets estimates for ROR and PRR to NA when observed counts are strictly less than the passed value of |
conf_lvl | Confidence level of confidence or credibility intervals. Default is 0.95 (i.e. 95 % confidence interval). |
Value
The passed data frame with disproportionality point and interval estimates.
Produces expected counts
Description
Produces various counts used in disproportionality analysis.
Usage
add_expected_counts(df = NULL, df_colnames = NULL, df_syms = NULL, expected_count_estimators = c("rrr", "prr", "ror"))
Arguments
df | An object possible to convert to a data table, e.g. a tibble or data.frame, containing patient level reported drug-event-pairs. See header 'The df object' below for further details. |
df_colnames | A list of column names to use in |
df_syms | A list built from df_colnames through conversion to symbols. |
expected_count_estimators | A character vector containing the desired expected count estimators. Defaults to c("rrr", "prr", "ror"). |
Value
A tibble containing the various counts.
The df object
The passed df should be (convertible to) a data table and at least contain three columns: report_id, drug and event. The data table should contain one row per reported drug-event-combination, i.e. receiving a single additional report for drug X and event Y would add one row to the table. If the single report contained drug X for event Y and event Z, two rows would be added, with the same report_id and drug on both rows. Column report_id must be of type numeric or character. Columns drug and event must be of type character. If column group_by is provided, it can be either numeric or character. You can use a df with column names of your choosing, as long as you connect role and name in the df_colnames parameter.
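To illustrate the layout described above, here is a minimal sketch in base R. The report ids, drug names and event names are made up for the example and are not taken from the package data.

```r
# Hypothetical example data: report 1 lists Drug_X with two events,
# so it contributes two rows sharing the same report_id and drug.
df <- data.frame(
  report_id = c(1, 1, 2),
  drug      = c("Drug_X", "Drug_X", "Drug_X"),
  event     = c("Event_Y", "Event_Z", "Event_Y"),
  stringsAsFactors = FALSE
)

# Column types required by the description above
stopifnot(
  is.numeric(df$report_id) || is.character(df$report_id),
  is.character(df$drug),
  is.character(df$event)
)

# A df with custom column names: roles are connected to names via df_colnames
renamed_df <- df
names(renamed_df)[names(renamed_df) == "report_id"] <- "ReportID"
df_colnames <- list(
  report_id = "ReportID",
  drug      = "drug",
  event     = "event",
  group_by  = NULL
)
```

The df_colnames list mirrors the default shown in the Usage of da, with only the report_id entry changed.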
apply_rule_of_N
Description
Internal function to set disproportionality cells for ROR and PRR to NA when observed count < 3
Usage
apply_rule_of_N(da_df = NULL, da_estimators = c("ic", "prr", "ror"), rule_of_N = NULL)
Arguments
da_df | See the intermediate object da_df in add_disproportionality |
da_estimators | Default is c("ic", "prr", "ror"). |
rule_of_N | A length-one integer between 0 and 10. |
Details
Sometimes, you want to protect yourself from spurious findings based on small observed counts combined with infinitesimal expected counts.
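The idea can be sketched in base R as below. This is an illustration under assumptions, not the package's internal code: the table and its column names (obs, prr, prr025, ic) are hypothetical.

```r
# Hypothetical intermediate table; column names are assumptions for illustration.
da_df <- data.frame(
  obs    = c(1, 2, 3, 10),
  prr    = c(8.0, 5.5, 3.2, 2.1),
  prr025 = c(1.1, 1.3, 1.0, 1.4),
  ic     = c(0.9, 1.2, 1.1, 0.8)
)
rule_of_N <- 3

# Rows where the observed count is strictly below rule_of_N
too_few <- da_df$obs < rule_of_N

# Set the PRR columns to NA on those rows; the IC column is left untouched
prr_cols <- grep("^prr", names(da_df), value = TRUE)
da_df[too_few, prr_cols] <- NA
```

With rule_of_N = 3, the rows with one and two observed reports lose their PRR estimates, while the row with exactly three reports keeps them.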
Value
The input data frame (da_df) with potentially some cells set to NA.
An internal function creating colnames for da confidence/credibility bounds
Description
Given the output from quantile_prob, and a da_name string, create column names such as PRR025, ROR025 and IC025
Usage
build_colnames_da(quantile_prob = list(lower = 0.025, upper = 0.975), da_name = NULL)
Arguments
quantile_prob | A list with two parameters, lower and upper. Default: list(lower = 0.025, upper = 0.975) |
da_name | A string, such as "ic", "prr" or "ror". Default: NULL |
Value
A list with two symbols, to be inserted in the dtplyr-chain
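As a rough sketch of the naming scheme, the following base R helper builds such names as strings. The helper name sketch_colnames, the upper-case prefix and the way the probability is turned into a suffix are assumptions based on the examples PRR025, ROR025 and IC025; the package function returns symbols rather than strings.

```r
# Sketch only: returns strings, whereas the package function returns symbols.
sketch_colnames <- function(quantile_prob = list(lower = 0.025, upper = 0.975),
                            da_name = "ic") {
  # Drop the leading "0." so that 0.025 becomes "025"
  suffix <- function(p) sub("^0\\.", "", format(p))
  list(
    lower = paste0(toupper(da_name), suffix(quantile_prob$lower)),
    upper = paste0(toupper(da_name), suffix(quantile_prob$upper))
  )
}

prr_names <- sketch_colnames(da_name = "prr")
prr_names$lower  # "PRR025"
```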
Confidence intervals for Information Component (IC)
Description
Mainly used in function ic. Produces quantiles of the posterior gamma distribution. Called twice in ic to create credibility intervals.
Usage
ci_for_ic(obs, exp, conf_lvl_probs, shrinkage)
Arguments
obs | A numeric vector with observed counts, i.e. number of reports for the selected drug-event-combination. Note that shrinkage (e.g. +0.5) is added inside the function and should not be included here. |
exp | A numeric vector with expected counts, i.e. number of reports to be expected given a comparator or background. Note that shrinkage (e.g. +0.5) is added inside the function and should not be included here. |
conf_lvl_probs | The probabilities of the posterior, based on a passed confidence level ( |
shrinkage | A non-negative numeric value, to be added to observed and expected count. Default is 0.5. |
Value
The credibility interval specified by input parameters.
Confidence intervals for Proportional Reporting Rate
Description
Mainly for use in prr. Produces (symmetric, normality based) confidence bounds for the PRR, for a passed probability. Called twice in prr to create confidence intervals.
Usage
ci_for_prr(obs = NULL, n_drug = NULL, n_event_prr = NULL, n_tot_prr = NULL, conf_lvl_probs = 0.95)
Arguments
obs | Number of reports for the specific drug and event (i.e. the observed count). |
n_drug | Number of reports with the drug of interest. |
n_event_prr | Number of reports with the event in the background. |
n_tot_prr | Number of reports in the background. |
conf_lvl_probs | The probabilities of the normal distribution, based on a passed confidence level ( |
Value
The confidence interval specified by input parameters.
Confidence intervals for Reporting Odds Ratio
Description
Mainly for use in ror. Produces (symmetric, normality based) confidence bounds for the ROR, for a passed probability. Called twice in ror to create confidence intervals.
Usage
ci_for_ror(a, b, c, d, conf_lvl_probs)
Arguments
a | Number of reports for the specific drug and event (i.e. the observed count). |
b | Number of reports with the drug, without the event |
c | Number of reports without the drug, with the event |
d | Number of reports without the drug, without the event |
conf_lvl_probs | The probabilities of the normal distribution, based on a passed confidence level ( |
Value
The confidence interval specified by input parameters.
Quantile probabilities from confidence level
Description
Calculates equi-tailed quantile probabilities from a confidence level
Usage
conf_lvl_to_quantile_prob(conf_lvl = 0.95)
Arguments
conf_lvl | Confidence level of confidence or credibility intervals. Default is 0.95 (i.e. 95 % confidence interval). |
Value
A list with two numerical vectors, "lower" and "upper".
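The equi-tailed split amounts to placing half of the remaining probability mass in each tail. A minimal base R sketch follows; the function name sketch_quantile_prob is hypothetical and not part of the package.

```r
# Sketch of an equi-tailed split: alpha = 1 - conf_lvl, with alpha / 2 in each tail.
sketch_quantile_prob <- function(conf_lvl = 0.95) {
  stopifnot(conf_lvl > 0, conf_lvl < 1)
  alpha <- 1 - conf_lvl
  list(lower = alpha / 2, upper = 1 - alpha / 2)
}

probs_95 <- sketch_quantile_prob(0.95)
probs_90 <- sketch_quantile_prob(0.90)
```

For a 95 % level this gives the probabilities 0.025 and 0.975, matching the defaults shown for build_colnames_da.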
Examples
conf_lvl_to_quantile_prob(0.95)
Count expected for Proportional Reporting Rate
Description
Internal function to provide expected counts related to the PRR
Usage
count_expected_prr(count_dt)
Arguments
count_dt | A data table, output from count_expected_rrr |
Value
A data table with added columns for n_event_prr, n_tot_prr and expected_prr
Count expected for Reporting Odds Ratio
Description
Internal function to provide expected counts related to the ROR
Usage
count_expected_ror(count_dt)
Arguments
count_dt | A data table, output from count_expected_rrr |
Value
A data table with added columns for n_event_prr, n_tot_prr and expected_prr
Count Expected for Relative Reporting Rate
Description
Internal function to provide expected counts related to the RRR
Usage
count_expected_rrr(df, df_colnames, df_syms)
Arguments
df | See documentation for add_expected_counts |
df_colnames | See documentation for da |
df_syms | A list built from df_colnames through conversion to symbols. |
Value
A data frame with columns for obs, n_drug, n_event, n_tot and (RRR) expected
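A rough base R sketch of how such counts could be derived from a df in the layout described under 'The df object'. The example data are made up, the output column name exp_rrr is borrowed from the tiny_dataset description, and counting distinct report_ids (rather than rows) is an assumption about how the counts are defined, not a description of the package internals.

```r
# Hypothetical input: one row per reported drug-event combination
df <- data.frame(
  report_id = c(1, 1, 2, 3, 3, 4),
  drug      = c("Drug_A", "Drug_A", "Drug_A", "Drug_B", "Drug_B", "Drug_B"),
  event     = c("Event_1", "Event_2", "Event_1", "Event_1", "Event_2", "Event_2"),
  stringsAsFactors = FALSE
)

# Total number of reports
n_tot <- length(unique(df$report_id))

# Number of distinct reports per drug and per event
n_drug  <- table(unique(df[, c("report_id", "drug")])$drug)
n_event <- table(unique(df[, c("report_id", "event")])$event)

# Observed count: distinct reports per drug-event combination
pairs  <- unique(df[, c("report_id", "drug", "event")])
counts <- aggregate(report_id ~ drug + event, data = pairs, FUN = length)
names(counts)[names(counts) == "report_id"] <- "obs"

# Attach the marginal counts and the RRR expected count
counts$n_drug  <- as.integer(n_drug[as.character(counts$drug)])
counts$n_event <- as.integer(n_event[as.character(counts$event)])
counts$n_tot   <- n_tot
counts$exp_rrr <- counts$n_drug * counts$n_event / counts$n_tot
```

In this toy data, Drug_A and Event_1 co-occur on two of four reports, with two reports mentioning Drug_A and three mentioning Event_1.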
Disproportionality Analysis
Description
The function da executes disproportionality analyses, i.e. compares the proportion of reports with a specific adverse event for a drug, against an event proportion from a comparator based on the passed data frame. See the vignette for a brief introduction to disproportionality analysis. Furthermore, da supports three estimators: Information Component (IC), Proportional Reporting Rate (PRR) and the Reporting Odds Ratio (ROR).
Usage
da(df = NULL, df_colnames = list(report_id = "report_id", drug = "drug", event = "event", group_by = NULL), da_estimators = c("ic", "prr", "ror"), sort_by = "ic", number_of_digits = 2, rule_of_N = 3, conf_lvl = 0.95, excel_path = NULL)
Arguments
df | An object possible to convert to a data table, e.g. a tibble or data.frame, containing patient level reported drug-event-pairs. See header 'The df object' below for further details. |
df_colnames | A list of column names to use in |
da_estimators | Character vector specifying which disproportionality estimators to use, in case you don't need all implemented options. Defaults to c("ic", "prr", "ror"). |
sort_by | The output is sorted in descending order of the lower bound of the confidence/credibility interval for a passed da estimator. Any of the passed strings in "da_estimators" is accepted, the default is "ic". If a grouping variable is passed, sorting is made by the sample average across each drug-event-combination (ignoring NAs). |
number_of_digits | Round decimal columns to specified precision, default is two decimals. |
rule_of_N | Numeric value. Sets estimates for ROR and PRR to NA when observed counts are strictly less than the passed value of |
conf_lvl | Confidence level of confidence or credibility intervals. Default is 0.95 (i.e. 95 % confidence interval). |
excel_path | Intended for users who prefer to work in Excel with minimal work in R. To write the output of |
Value
da returns a data frame (invisibly) containing counts and estimates related to supported disproportionality estimators. Each row corresponds to a drug-event pair.
The df object
The passed df should be (convertible to) a data table and at least contain three columns: report_id, drug and event. The data table should contain one row per reported drug-event-combination, i.e. receiving a single additional report for drug X and event Y would add one row to the table. If the single report contained drug X for event Y and event Z, two rows would be added, with the same report_id and drug on both rows. Column report_id must be of type numeric or character. Columns drug and event must be of type character. If column group_by is provided, it can be either numeric or character. You can use a df with column names of your choosing, as long as you connect role and name in the df_colnames parameter.
Examples
### Run a disproportionality analysis
da_1 <- tiny_dataset |> da()
### Run a disproportionality analysis across subgroups
list_of_colnames <- list(
  report_id = "report_id",
  drug = "drug",
  event = "event",
  group_by = "group"
)
da_2 <- tiny_dataset |> da(df_colnames = list_of_colnames)
# If columns in your df have different names than the default ones,
# you can specify the column names in the df_colnames parameter list:
renamed_df <- tiny_dataset |> dplyr::rename(ReportID = report_id)
list_of_colnames$report_id <- "ReportID"
da_3 <- renamed_df |> da(df_colnames = list_of_colnames)
A simulated ICSR database
Description
drug_event_df is a simulated dataset, slightly larger than the "tiny_dataset" which is also contained in this package.
Usage
drug_event_df
Format
'drug_event_df'
A data frame with 3,971 rows and 3 columns. In total 1000 unique report_ids, i.e. the same report_id can have several drugs and events.
Number of drugs per report_id is sampled as 1 + Pois(3), with increasing probability as the drug letter closes in on Z. Every drug is assigned an event, with decreasing probability as the event index number increases towards 1000. See the DATASET.R file in the data-raw folder for details.
- report_id
A patient or report identifier
- drug
One of 26 fake drugs (Drug_A - Drug_Z)
- event
Sampled events (Event_1 - Event_1000)
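A loose base R sketch of a simulation in this spirit is given below. It does not reproduce DATASET.R: the seed, the linear drug weights and the 1/index event weights are assumptions chosen only to show the mechanics, so the resulting row count will differ from 3,971.

```r
set.seed(1)  # arbitrary seed, for reproducibility of this sketch only
n_reports <- 1000

# Number of drugs per report: 1 + Pois(3), so every report has at least one drug
n_drugs <- 1 + rpois(n_reports, lambda = 3)
n_rows  <- sum(n_drugs)

drug_names  <- paste0("Drug_", LETTERS)
event_names <- paste0("Event_", 1:1000)

# Assumed weights: drugs become more likely towards Z,
# events become less likely as the index grows
drug_weights  <- seq_along(drug_names)
event_weights <- 1 / seq_along(event_names)

sim_df <- data.frame(
  report_id = rep(seq_len(n_reports), times = n_drugs),
  drug      = sample(drug_names, size = n_rows, replace = TRUE, prob = drug_weights),
  event     = sample(event_names, size = n_rows, replace = TRUE, prob = event_weights),
  stringsAsFactors = FALSE
)
```

Since 1 + Pois(3) has mean 4, a simulation of 1000 reports produces roughly 4000 rows, which is in line with the size stated above.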
Source
Simulated data.
Disproportionality Analysis by Subgroups
Description
A package internal wrapper for executing da across subgroups
Usage
grouped_da(df = NULL, df_colnames = NULL, df_syms = NULL, expected_count_estimators = NULL, da_estimators = NULL, sort_by = NULL, conf_lvl = NULL, rule_of_N = NULL, number_of_digits = NULL)
Arguments
df | See the da function |
df_colnames | See the da function |
df_syms | A list built from df_colnames through conversion to symbols. |
expected_count_estimators | See the da function |
da_estimators | See the da function |
sort_by | See the da function |
conf_lvl | See the da function |
rule_of_N | See the da function |
number_of_digits | See the da function |
Details
See the da documentation
Value
See the da function
Information component
Description
Calculates the information component ("IC") and credibility interval, used in disproportionality analysis.
Usage
ic(obs = NULL, exp = NULL, shrinkage = 0.5, conf_lvl = 0.95)
Arguments
obs | A numeric vector with observed counts, i.e. number of reports for the selected drug-event-combination. Note that shrinkage (e.g. +0.5) is added inside the function and should not be included here. |
exp | A numeric vector with expected counts, i.e. number of reports to be expected given a comparator or background. Note that shrinkage (e.g. +0.5) is added inside the function and should not be included here. |
shrinkage | A non-negative numeric value, to be added to observed and expected count. Default is 0.5. |
conf_lvl | Confidence level of confidence or credibility intervals. Default is 0.95 (i.e. 95 % confidence interval). |
Details
The IC is a log2-transformed observed-to-expected ratio, based on the relative reporting rate (RRR) for counts, but modified with an addition of "shrinkage" to protect against spurious associations.
\hat{IC} = \log_{2}\left(\frac{\hat{O}+k}{\hat{E}+k}\right)
where \hat{O} = observed number of reports, k is the shrinkage (typically +0.5), and expected \hat{E} is (for RRR, and using the entire database as comparator or background) estimated as
\hat{E} = \frac{\hat{N}_{drug} \times \hat{N}_{event}}{\hat{N}_{TOT}}
where \hat{N}_{drug}, \hat{N}_{event} and \hat{N}_{TOT} are the number of reports with the drug, the event, and in the whole database respectively.
The credibility interval is created from the quantiles of the posterior gamma distribution with shape (\hat{S}) and rate (\hat{R}) parameters as
\hat{S} = \hat{O} + k
\hat{R} = \hat{E} + k
using the stats::qgamma function. Parameter k is the shrinkage defined earlier. For completeness, a credibility interval of the gamma distributed X (i.e. X \sim \Gamma(\hat{S}, \hat{R}) where \hat{S} and \hat{R} are shape and rate parameters) with associated quantile function Q_X(p) for a significance level \alpha is constructed as
[Q_X(\alpha/2), Q_X(1-\alpha/2)]
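The formulas above can be sketched in base R as follows. This is an illustration, not the package implementation: the function name sketch_ic and the output column names are hypothetical, and taking log2 of the gamma quantiles to put the bounds on the IC scale is an assumption that follows from the IC being a log2-transformed ratio.

```r
# Sketch of the IC point estimate and credibility bounds from the formulas above.
sketch_ic <- function(obs, exp, shrinkage = 0.5, conf_lvl = 0.95) {
  alpha <- 1 - conf_lvl
  shape <- obs + shrinkage  # S = O + k
  rate  <- exp + shrinkage  # R = E + k
  data.frame(
    ic_lower = log2(stats::qgamma(alpha / 2, shape = shape, rate = rate)),
    ic       = log2(shape / rate),
    ic_upper = log2(stats::qgamma(1 - alpha / 2, shape = shape, rate = rate))
  )
}

# Expected count for the RRR from hypothetical marginal counts
n_drug <- 100; n_event <- 50; n_tot <- 1000
exp_rrr <- n_drug * n_event / n_tot  # 5

# Vectorized over obs and exp, mirroring the inputs used in the Examples
ic_example <- sketch_ic(obs = c(20, 30), exp = c(10, 10))
```

The point estimate for obs = 20 and exp = 10 is log2(20.5 / 10.5), and the bounds bracket it because the posterior mean shape / rate lies between the two quantiles.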
Value
A tibble with three columns (point estimate and credibility bounds).
Further details
From a Bayesian point-of-view, the credibility interval of the IC is constructed from the Poisson-gamma conjugacy. The shrinkage constitutes a prior of observed and expected of 0.5. A shrinkage of +0.5 with a gamma-quantile based 95 % credibility interval cannot have lower bound above 0 unless the observed count exceeds 3. One benefit of \log_{2} is to provide a log-scale for convenient plotting of multiple IC values side-by-side.
References
Norén GN, Hopstadius J, Bate A (2011). “Shrinkage observed-to-expected ratios for robust and transparent large-scale pattern discovery.” Statistical Methods in Medical Research, 22(1), 57–69. doi:10.1177/0962280211403604, https://doi.org/10.1177/0962280211403604.
Examples
ic(obs = 20, exp = 10)
# Note that obs and exp can be vectors (of equal length, no recycling allowed)
ic(obs = c(20, 30), exp = c(10, 10))
print function for da objects
Description
print function for da objects
Usage
## S3 method for class 'da'
print(x, n = 10, ...)
Arguments
x | A S3 obj of class "da", output from |
n | Control the number of rows to print. |
... | For passing additional parameters to extended classes. |
Value
Nothing, but prints the tibble da_df in the da object.
Examples
da_1 <- tiny_dataset |> da()
print(da_1)
Proportional Reporting Rate
Description
Calculates Proportional Reporting Rate ("PRR") with confidence intervals, used in disproportionality analysis.
Usage
prr(obs = NULL, n_drug = NULL, n_event_prr = NULL, n_tot_prr = NULL, conf_lvl = 0.95)
Arguments
obs | Number of reports for the specific drug and event (i.e. the observed count). |
n_drug | Number of reports with the drug of interest. |
n_event_prr | Number of reports with the event in the background. |
n_tot_prr | Number of reports in the background. |
conf_lvl | Confidence level of confidence or credibility intervals. Default is 0.95 (i.e. 95 % confidence interval). |
Details
The PRR is the proportion of reports with an event in a set of exposed cases, divided by the proportion of reports with the event in a background or comparator, which does not include the exposed.
The PRR is estimated from an observed-to-expected ratio, similar to the RRR and IC, but excludes the exposure of interest from the comparator.
\hat{PRR} = \frac{\hat{O}}{\hat{E}}
where \hat{O} is the observed number of reports, and expected \hat{E} is estimated as
\hat{E} = \frac{\hat{N}_{drug} \times (\hat{N}_{event} - \hat{O})}{\hat{N}_{TOT}-\hat{N}_{drug}}
where \hat{N}_{drug}, \hat{N}_{event}, \hat{O} and \hat{N}_{TOT} are the number of reports with the drug, the event, the drug and event, and in the whole database respectively.
A confidence interval is derived in Gravel (2009) using the delta method:
\hat{s} = \sqrt{ 1/\hat{O} - 1/(\hat{N}_{drug}) + 1/(\hat{N}_{event} - \hat{O}) - 1/(\hat{N}_{TOT} - \hat{N}_{drug})}
and
[\hat{CI}_{\alpha/2}, \hat{CI}_{1-\alpha/2}] =
[\frac{\hat{O}}{\hat{E}} \times \exp(Q_{\alpha/2} \times \hat{s}),\frac{\hat{O}}{\hat{E}} \times \exp(Q_{1-\alpha/2} \times \hat{s})]
where Q_{\alpha} denotes the quantile function of a standard Normal distribution at significance level \alpha.
Note: For historical reasons, another version of this standard deviation is sometimes used where the last fraction under the square root is added rather than subtracted, with negligible practical implications in large databases. This function uses the version declared above, i.e. with subtraction.
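The PRR formulas above can be sketched in base R as follows. The function name sketch_prr and the output column names are hypothetical, and the sketch is parameterised by the whole-database counts \hat{N}_{event} and \hat{N}_{TOT} used in the formulas, not by the background counts n_event_prr and n_tot_prr that prr itself accepts.

```r
# Sketch of the PRR with delta-method confidence bounds (subtraction variant).
# n_event and n_tot are whole-database counts, as in the formulas above.
sketch_prr <- function(obs, n_drug, n_event, n_tot, conf_lvl = 0.95) {
  alpha    <- 1 - conf_lvl
  expected <- n_drug * (n_event - obs) / (n_tot - n_drug)
  prr      <- obs / expected
  s <- sqrt(1 / obs - 1 / n_drug + 1 / (n_event - obs) - 1 / (n_tot - n_drug))
  data.frame(
    prr_lower = prr * exp(stats::qnorm(alpha / 2) * s),
    prr       = prr,
    prr_upper = prr * exp(stats::qnorm(1 - alpha / 2) * s)
  )
}

# 5 of 10 drug reports have the event; 20 of the 10000 remaining reports have it
prr_example <- sketch_prr(obs = 5, n_drug = 10, n_event = 25, n_tot = 10010)
```

In this example the background counts are n_event - obs = 20 and n_tot - n_drug = 10000, so the point estimate is (5 / 10) / (20 / 10000) = 250.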
Value
A tibble with three columns (point estimate and confidence bounds). Number of rows equals length of inputs obs, n_drug, n_event_prr and n_tot_prr.
References
Montastruc J, Sommet A, Bagheri H, Lapeyre-Mestre M (2011). “Benefits and strengths of the disproportionality analysis for identification of adverse drug reactions in a pharmacovigilance database.” British Journal of Clinical Pharmacology, 72(6), 905–908. doi:10.1111/j.1365-2125.2011.04037.x, https://doi.org/10.1111/j.1365-2125.2011.04037.x.
Gravel C (2009). “Statistical Methods for Signal Detection in Pharmacovigilance.” https://repository.library.carleton.ca/downloads/jd472x08w.
Examples
prr(obs = 5, n_drug = 10, n_event_prr = 20, n_tot_prr = 10000)
# Note that input parameters can be vectors (of equal length, no recycling)
pvda::prr(obs = c(5, 10), n_drug = c(10, 20), n_event_prr = c(15, 30), n_tot_prr = c(10000, 10000))
Reporting Odds Ratio
Description
Calculates Reporting Odds Ratio ("ROR") and confidence intervals, used in disproportionality analysis.
Usage
ror(a = NULL, b = NULL, c = NULL, d = NULL, conf_lvl = 0.95)
Arguments
a | Number of reports for the specific drug and event (i.e. the observed count). |
b | Number of reports with the drug, without the event |
c | Number of reports without the drug, with the event |
d | Number of reports without the drug, without the event |
conf_lvl | Confidence level of confidence or credibility intervals. Default is 0.95 (i.e. 95 % confidence interval). |
Details
The ROR is an odds ratio calculated from reporting counts. The R for Reporting in ROR is meant to emphasize an interpretation of reporting, as the ROR is calculated from a reporting database. Note: the function is vectorized, i.e. a, b, c and d can be vectors, see the examples.
A reporting odds ratio is simply an odds ratio based on adverse event reports.
\hat{ROR} = \frac{a/b}{c/d}
where a = observed count (i.e. number of reports with exposure and outcome), b = number of reports with the drug and without the event, c = number of reports without the drug with the event and d = number of reports with neither the drug nor the event.
A confidence interval for the ROR can be derived through the delta method,with a standard deviation:
\hat{s} = \sqrt{1/a + 1/b + 1/c + 1/d}
with the resulting confidence interval for significance level \alpha
[\hat{ROR} \times \exp(\Phi_{\alpha/2} \times \hat{s}), \hat{ROR} \times \exp(\Phi_{1-\alpha/2} \times \hat{s})]
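A base R sketch of the ROR formulas above, reading \Phi_{p} as the standard Normal quantile at probability p (an interpretation based on the analogous PRR interval). The function name sketch_ror and the output column names are hypothetical and not part of the package.

```r
# Sketch of the ROR with delta-method confidence bounds from a 2x2 table:
# a = drug & event, b = drug & no event, c = no drug & event, d = neither.
sketch_ror <- function(a, b, c, d, conf_lvl = 0.95) {
  alpha <- 1 - conf_lvl
  ror   <- (a / b) / (c / d)
  s     <- sqrt(1 / a + 1 / b + 1 / c + 1 / d)
  data.frame(
    ror_lower = ror * exp(stats::qnorm(alpha / 2) * s),
    ror       = ror,
    ror_upper = ror * exp(stats::qnorm(1 - alpha / 2) * s)
  )
}

# Same cell counts as the first call in the Examples
ror_example <- sketch_ror(a = 5, b = 10, c = 20, d = 10000)
```

With these counts the point estimate is (5 / 10) / (20 / 10000) = 250, and the interval is symmetric around it on the log scale.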
Value
A tibble with three columns (point estimate and confidence bounds). Number of rows equals length of inputs a, b, c, d.
References
Montastruc J, Sommet A, Bagheri H, Lapeyre-Mestre M (2011). “Benefits and strengths of the disproportionality analysis for identification of adverse drug reactions in a pharmacovigilance database.” British Journal of Clinical Pharmacology, 72(6), 905–908. doi:10.1111/j.1365-2125.2011.04037.x, https://doi.org/10.1111/j.1365-2125.2011.04037.x.
Examples
ror(a = 5, b = 10, c = 20, d = 10000)
# Note that a, b, c and d can be vectors (of equal length, no recycling)
pvda::ror(a = c(5, 10), b = c(10, 20), c = c(15, 30), d = c(10000, 10000))
Sort a disproportionality analysis by the lower da conf. or cred. limit
Description
Sorts the output by the mean lower limit of a passed da estimator
Usage
round_and_sort_by_lower_da_limit(df = NULL, df_colnames = NULL, df_syms = NULL, conf_lvl = NULL, sort_by = NULL, da_estimators = NULL, number_of_digits = 2)
Arguments
df | See add_disproportionality |
df_colnames | See add_disproportionality |
df_syms | See add_disproportionality |
conf_lvl | See add_disproportionality |
sort_by | See add_disproportionality |
da_estimators | See add_disproportionality |
number_of_digits | Numeric value. Set the number of digits to show in output by passing an integer. Default value is 2 digits. Set to NULL to avoid rounding. |
Value
The df object, sorted.
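The sorting rule can be sketched in base R as below. The data, the column name ic025 and the helper column mean_ic025 are hypothetical; the sketch only illustrates sorting in descending order of the mean lower limit per drug-event combination, ignoring NAs, as described for the sort_by argument of da.

```r
# Hypothetical grouped output: one row per drug-event combination and subgroup
grouped <- data.frame(
  drug  = rep(c("Drug_A", "Drug_B"), each = 2),
  event = "Event_1",
  group = rep(c("Female", "Male"), times = 2),
  ic025 = c(0.2, 0.6, NA, 1.0),
  stringsAsFactors = FALSE
)

# Mean lower limit per drug-event combination, ignoring NAs
grouped$mean_ic025 <- ave(
  grouped$ic025, grouped$drug, grouped$event,
  FUN = function(x) mean(x, na.rm = TRUE)
)

# Descending order of the mean lower limit; rounding is applied afterwards
grouped_sorted <- grouped[order(grouped$mean_ic025, decreasing = TRUE), ]
grouped_sorted$ic025 <- round(grouped_sorted$ic025, digits = 2)
```

Here Drug_B sorts first, since its mean lower limit is computed from the single non-missing subgroup value.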
Rounds columns in da_df with many decimals
Description
Internal function containing a mutate + across
Usage
round_columns_with_many_decimals(da_df = NULL, da_estimators = NULL, number_of_digits = NULL)
Arguments
da_df | See add_disproportionality |
da_estimators | See add_disproportionality |
number_of_digits | See add_disproportionality |
Value
A df with rounded columns
Summary function for disproportionality objects
Description
Provides summary counts of SDRs and shows the top five DECs
Usage
## S3 method for class 'da'
summary(object, print = TRUE, ...)
Arguments
object | A S3 obj of class "da", output from |
print | Whether to print the output to the console. Defaults to TRUE. |
... | For passing additional parameters to extended classes. |
Value
Passes a tibble with the SDR counts invisibly.
A 110 reports big, simulated ICSR database
Description
The data frame tiny_dataset is used to demonstrate the functionality of the package in examples. The larger drug_event_df dataset can also be used.
Usage
tiny_dataset
Format
'tiny_dataset'
A data frame with 110 rows and 3 columns. In total 110 unique report_ids. In particular, for Drug A and Event 1 the observed count will be 4 and exp_rrr = 1.1
- report_id
A report identifier, 1-110.
- drug
Drugs named as Drug_A - Drug_Z.
- event
Events named as Event_1 - Event_97
- group
In this example, sex of the patient, i.e. Male or Female.
Source
Simulated data.
Write to excel
Description
Writes output from a disproportionality analysis to an Excel file
Usage
write_to_excel(df, write_path = NULL)
Arguments
df | The data frame to export. See '?da' for details. |
write_path | A string giving the file path |
Value
Nothing.