| Type: | Package |
| Title: | Computationally-Efficient Confidence Intervals for Mean Shift from Permutation Methods |
| Version: | 0.2.3 |
| Date: | 2022-06-21 |
| Description: | Implements computationally-efficient construction of confidence intervals from permutation or randomization tests for simple differences in means, based on Nguyen (2009) <doi:10.15760/etd.7798>. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.1.2 |
| Imports: | matrixStats |
| Suggests: | knitr, rmarkdown |
| VignetteBuilder: | knitr |
| Language: | en-US |
| URL: | https://github.com/ColbyStatSvyRsch/CIPerm/ |
| BugReports: | https://github.com/ColbyStatSvyRsch/CIPerm/issues |
| NeedsCompilation: | no |
| Packaged: | 2022-06-21 12:51:24 UTC; jawieczo |
| Author: | Emily Tupaj [aut], Jerzy Wieczorek |
| Maintainer: | Jerzy Wieczorek <jawieczo@colby.edu> |
| Repository: | CRAN |
| Date/Publication: | 2022-06-21 13:10:10 UTC |
CIPerm: Computationally-Efficient Confidence Intervals for Mean Shift from Permutation Methods
Description
Implements computationally-efficient construction of confidence intervals from permutation tests or randomization tests for simple differences in means. The method is based on Minh D. Nguyen's 2009 MS thesis, "Nonparametric Inference using Randomization and Permutation Reference Distribution and their Monte-Carlo Approximation," <doi:10.15760/etd.7798>. See the nguyen vignette for a brief summary of the method. First use dset to tabulate summary statistics for each permutation. Then pass the results into cint to compute a confidence interval, or into pval to calculate p-values.
Details
Our R function arguments and outputs are structured differently than the similarly-named R functions in Nguyen (2009), but the results are equivalent. In the nguyen vignette we use our functions to replicate Nguyen's results.
Following Ernst (2004) and Nguyen (2009), we use "permutation methods" to include both randomization tests and permutation tests. In the simple settings in this R package, the randomization and permutation test mechanics are identical, but their interpretations may differ.
We say "randomization test" under the model where the units are not necessarily a random sample, but the treatment assignment was random. The null hypothesis is that the treatment has no effect. In this case we can make causal inferences about the treatment effect (difference between groups) for this set of individuals, but cannot necessarily generalize to other populations.
By contrast, we say "permutation test" under the model where the units were randomly sampled from two distinct subpopulations. The null hypothesis is that the two groups have identical CDFs. In this case we can make inferences about differences between subpopulations, but there is not necessarily any "treatment" to speak of, and causal inferences may not be relevant.
References
Ernst, M.D. (2004). "Permutation Methods: A Basis for Exact Inference," Statistical Science, vol. 19, no. 4, 676-685, <doi:10.1214/088342304000000396>.
Nguyen, M.D. (2009). "Nonparametric Inference using Randomization and Permutation Reference Distribution and their Monte-Carlo Approximation" [unpublished MS thesis; Mara Tableman, advisor], Portland State University. Dissertations and Theses. Paper 5927. <doi:10.15760/etd.7798>.
Permutation-methods confidence interval for difference in means
Description
Calculate a confidence interval for a simple difference in means from a two-sample permutation or randomization test. In other words, we set up a permutation or randomization test to evaluate H_0: \mu_A - \mu_B = 0, then use those same permutations to construct a CI for the parameter \delta = (\mu_A - \mu_B).
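One way to see how the same combinations can be reused for a CI is the base-R sketch below. It is our own illustration under stated assumptions, not the package's internal code, and all variable names (pooled, idx, diff_perm, k, w) are hypothetical. For a combination that moves k values from group 2 into group 1, subtracting a shift \delta from group 1 changes the permuted difference in means by \delta * (1 - k/m - k/n) and the observed difference by \delta, so the two tie at w = (diff_obs - diff_perm) / (k * (1/m + 1/n)). With only one combination allowed in each tail, the two-sided interval is then the range of these w values.

```r
# Sketch only: base R, small example data from the Examples section.
x <- c(19, 22, 25, 26)   # group 1
y <- c(23, 33, 40)       # group 2
m <- length(x)
n <- length(y)
pooled <- c(x, y)

# Every way to choose which m of the m + n pooled values form group 1.
idx <- combn(m + n, m)

# Difference in means for each combination.
diff_perm <- apply(idx, 2, function(i) mean(pooled[i]) - mean(pooled[-i]))

# k = number of original group-2 values placed into group 1.
k <- apply(idx, 2, function(i) sum(i > m))

diff_obs <- mean(x) - mean(y)

# Shift at which each combination ties with the observed statistic.
# The observed arrangement itself (k = 0) ties at every shift, so skip it.
keep <- k > 0
w <- (diff_obs - diff_perm[keep]) / (k[keep] * (1 / m + 1 / n))

# With one combination in each tail (level 1 - 2/35), the interval is:
ci <- range(w)
ci
```

For these data the sketch gives endpoints of about -21 and 3. Other confidence levels would use interior order statistics of w rather than its range.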
Usage
cint(dset, conf.level = 0.95, tail = c("Two", "Left", "Right"))
Arguments
dset | The output of |
conf.level | Confidence level (default 0.95 corresponds to 95% confidence level). |
tail | Which tail? Either "Two"- or "Left"- or "Right"-tailed interval. |
Details
If the desired conf.level is not exactly feasible, the achieved confidence level will be slightly anti-conservative. We use the default numeric tolerance in all.equal to check if (1-conf.level) * nrow(dset) is an integer for one-tailed CIs, or if (1-conf.level)/2 * nrow(dset) is an integer for two-tailed CIs. If so, conf.level.achieved will be the desired conf.level. Otherwise, we will use the next feasible integer, thus slightly reducing the confidence level. For example, in the example below the randomization test has 35 combinations, and a two-sided CI must have at least one combination value in each tail, so the largest feasible confidence level for a two-sided CI is 1-(2/35) or around 94.3%. If we request a 95% or 99% CI, we will have to settle for a 94.3% CI instead.
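The arithmetic in the paragraph above can be sketched in base R as follows. The helper achieved_level is hypothetical (it is not a CIPerm function); it only mirrors the rule as described: check for an integer tail count with all.equal, otherwise round up to the next integer.

```r
# Hypothetical helper, not part of CIPerm: achieved confidence level
# given the requested level and the number of combinations.
achieved_level <- function(conf.level, n_comb, two_sided = TRUE) {
  tails <- if (two_sided) 2 else 1
  # Number of combinations we would like to cut off in each tail.
  cut <- (1 - conf.level) / tails * n_comb
  # Use it if it is (numerically) an integer; otherwise round up.
  cut_int <- if (isTRUE(all.equal(cut, round(cut)))) round(cut) else ceiling(cut)
  1 - tails * cut_int / n_comb
}

n_comb <- choose(7, 4)          # 4 + 3 observations, 4 in group 1: 35 combinations
achieved_level(0.95, n_comb)    # 1 - 2/35
achieved_level(0.99, n_comb)    # also 1 - 2/35
```

Both requests end up at 1 - 2/35, about 94.3%, matching the example described above.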
Value
A list containing the following components:
conf.int | Numeric vector with the CI's two endpoints. |
conf.level.achieved | Numeric value of the achieved confidence level. |
Examples
x <- c(19, 22, 25, 26)
y <- c(23, 33, 40)
demo <- dset(x, y)
cint(dset = demo, conf.level = .95, tail = "Two")
Permutation-methods summary statistics
Description
Calculate a table of differences in means, medians, etc. for each combination (or permutation, if using Monte Carlo approx.), as needed in order to compute a confidence interval using cint and/or a p-value using pval.
Usage
dset(group1, group2, nmc = 10000, returnData = FALSE)
Arguments
group1 | Vector of numeric values for first group. |
group2 | Vector of numeric values for second group. |
nmc | Threshold for whether to use Monte Carlo draws or complete enumeration. If the number of all possible combinations |
returnData | Whether the returned dataframe should include columns for the permuted data itself (if TRUE), or only the derived columns that are needed for confidence intervals and p-values (if FALSE, default). |
Value
A data frame ready to be used in cint() or pval().
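As a rough illustration of what tabulating one row per combination can look like under complete enumeration, here is a base-R sketch. It does not reproduce the actual columns or ordering of the dset output; the column names diffmean and k are assumptions chosen for this example.

```r
# Sketch only: one row of summary statistics per combination, in base R.
x <- c(19, 22, 25, 26)   # group 1
y <- c(23, 33, 40)       # group 2
pooled <- c(x, y)
m <- length(x)

# Columns of idx are the index sets assigned to group 1.
idx <- combn(length(pooled), m)

tab <- data.frame(
  # Difference in means, group 1 minus group 2, for each combination.
  diffmean = apply(idx, 2, function(i) mean(pooled[i]) - mean(pooled[-i])),
  # How many original group-2 values ended up in group 1.
  k = apply(idx, 2, function(i) sum(i > m))
)

nrow(tab)   # number of combinations, choose(7, 4)
head(tab)
```

The first column of combn's output is the original assignment, so the first row holds the observed difference in means with k equal to 0.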
Examples
x <- c(19, 22, 25, 26)
y <- c(23, 33, 40)
demo <- dset(x, y, returnData = TRUE)
knitr::kable(demo, digits = 2)
Permutation-methods p-values for difference in means, medians, or Wilcoxon rank sum test
Description
Calculate p-values for a two-sample permutation or randomization test. In other words, we set up a permutation or randomization test to evaluate the null hypothesis that groups A and B have the same distribution, then calculate p-values for several alternatives: a difference in means (value="m"), a difference in medians (value="d"), or the Wilcoxon rank sum test (value="w").
Usage
pval(dset, tail = c("Two", "Left", "Right"), value = c("m", "s", "d", "w", "a"))
Arguments
dset | The output of |
tail | Which tail? Either "Two"- or "Left"- or "Right"-tailed test. |
value | Either "m" for difference in means (default); "s" for sum of Group 1 values [equivalent to "m" and included only for sake of checking results against Nguyen (2009) and Ernst (2004)]; "d" for difference in medians; "w" for Wilcoxon rank sum statistic; or "a" for a named vector of all four p-values. |
Value
Numeric p-value for the selected type of test, or a named vector of all four p-values if value="a".
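To illustrate why the "s" and "m" statistics are equivalent, the base-R sketch below computes a left-tailed p-value both ways by complete enumeration. It is not the package's implementation. It assumes that "Left" means small values of the group 1 statistic count as extreme and that the observed arrangement is included in the count. The tolerance tol is our own guard against floating-point ties.

```r
# Sketch only: left-tailed p-values from complete enumeration, in base R.
x <- c(19, 22, 25, 26)   # group 1
y <- c(23, 33, 40)       # group 2
pooled <- c(x, y)
m <- length(x)
idx <- combn(length(pooled), m)

# Two statistics for each combination.
sum1 <- apply(idx, 2, function(i) sum(pooled[i]))
diffmean <- apply(idx, 2, function(i) mean(pooled[i]) - mean(pooled[-i]))

tol <- 1e-9   # guard against floating-point ties

# Proportion of combinations at least as small as the observed statistic.
p_left_sum <- mean(sum1 <= sum(x) + tol)
p_left_mean <- mean(diffmean <= (mean(x) - mean(y)) + tol)

c(sum = p_left_sum, mean = p_left_mean)
```

Because the pooled total is fixed, the difference in means is an increasing function of the group 1 sum, so both statistics rank the combinations identically and give the same p-value (3 of the 35 combinations here).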
Examples
x <- c(19, 22, 25, 26)
y <- c(23, 33, 40)
demo <- dset(x, y)
pval(dset = demo, tail = "Left", value = "s")
pval(dset = demo, tail = "Left", value = "a")