'R6'-Based Flexible Framework for Permutation Tests
qddyy/LearnNonparam
This R package implements several non-parametric tests in chapters 1-5 of Higgins (2004), including tests for one sample, two samples, k samples, paired comparisons, blocked designs, trends and association. Built with Rcpp for efficiency and R6 for flexible, object-oriented design, it provides a unified framework for performing or creating custom permutation tests.
Install the stable version from CRAN:
install.packages("LearnNonparam")
Install the development version from GitHub:
# install.packages("remotes")
remotes::install_github("qddyy/LearnNonparam")
library(LearnNonparam)
Construct a test object
- from some R6 class directly
t <- Wilcoxon$new(n_permu = 1e6)
- using the pmt (permutation test) wrapper
# recommended for a unified API
t <- pmt("twosample.wilcoxon", n_permu = 1e6)
Provide it with samples
set.seed(-1)
t$test(rnorm(10, 1), rnorm(10, 0))
Check the results
t$statistic
t$p_value
options(digits = 3)
t$print()
ggplot2::theme_set(ggplot2::theme_minimal())
t$plot(style = "ggplot2", binwidth = 1)
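To make the reported statistic and p-value more concrete, here is a minimal base-R sketch of a Monte Carlo permutation test built on the rank-sum statistic. The helper `perm_rank_sum_test` is hypothetical and is not part of LearnNonparam; the package's own conventions (for example how the statistic is centred or how ties are handled) may differ.

```r
# Hypothetical helper for illustration only; not the package's implementation.
perm_rank_sum_test <- function(x, y, n_permu = 1e4) {
  m <- length(x)
  ranks <- rank(c(x, y))              # pooled ranks (midranks if there are ties)
  N <- length(ranks)
  observed <- sum(ranks[seq_len(m)])  # rank sum of the first sample
  center <- m * (N + 1) / 2           # expected rank sum under the null
  # rank sum of the first group after randomly relabelling the pooled ranks
  permuted <- replicate(n_permu, sum(sample(ranks, m)))
  # two-sided p-value: share of relabellings at least as far from the centre
  p_value <- mean(abs(permuted - center) >= abs(observed - center))
  list(statistic = observed, p_value = p_value)
}

set.seed(1)
res <- perm_rank_sum_test(rnorm(10, 1), rnorm(10, 0), n_permu = 2000)
res$statistic
res$p_value
```

Increasing `n_permu` reduces the Monte Carlo error of the estimated p-value at the cost of run time.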
Modify some settings and observe the change
t$type <- "asymp"
t$p_value
See pmts() for tests implemented in this package.
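For intuition about what an asymptotic p-value can look like for a rank-sum statistic, the textbook normal approximation is sketched below in base R. `asymp_rank_sum` is a hypothetical helper; whether the package uses exactly this formula (for example with or without a continuity correction or tie adjustment) is an assumption.

```r
# Hypothetical helper: textbook normal approximation to the rank-sum statistic.
asymp_rank_sum <- function(x, y) {
  m <- length(x)
  n <- length(y)
  w <- sum(rank(c(x, y))[seq_len(m)])      # rank sum of the first sample
  mu <- m * (m + n + 1) / 2                # null mean of the rank sum
  sigma <- sqrt(m * n * (m + n + 1) / 12)  # null standard deviation (no ties assumed)
  z <- (w - mu) / sigma
  list(statistic = w, z = z, p_value = 2 * pnorm(-abs(z)))
}

x_demo <- c(1.1, 2.3, 3.8, 4.0, 5.6)
y_demo <- c(0.2, 0.9, 1.5, 2.0, 2.9)
approx_res <- asymp_rank_sum(x_demo, y_demo)
approx_res$p_value
```

Without ties, this should agree with `stats::wilcox.test(x_demo, y_demo, exact = FALSE, correct = FALSE)`, which applies the same approximation to a shifted version of the statistic.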
key | class | test |
---|---|---|
onesample.quantile | Quantile | Quantile Test |
onesample.cdf | CDF | Inference on Cumulative Distribution Function |
twosample.difference | Difference | Two-Sample Test Based on Mean or Median |
twosample.wilcoxon | Wilcoxon | Two-Sample Wilcoxon Test |
twosample.scoresum | ScoreSum | Two-Sample Test Based on Sum of Scores |
twosample.ansari | AnsariBradley | Ansari-Bradley Test |
twosample.siegel | SiegelTukey | Siegel-Tukey Test |
twosample.rmd | RatioMeanDeviance | Ratio Mean Deviance Test |
twosample.ks | KolmogorovSmirnov | Two-Sample Kolmogorov-Smirnov Test |
ksample.oneway | OneWay | One-Way Test for Equal Means |
ksample.kw | KruskalWallis | Kruskal-Wallis Test |
ksample.jt | JonckheereTerpstra | Jonckheere-Terpstra Test |
multcomp.studentized | Studentized | Multiple Comparison Based on Studentized Statistic |
paired.sign | Sign | Two-Sample Sign Test |
paired.difference | PairedDifference | Paired Comparison Based on Differences |
rcbd.oneway | RCBDOneWay | One-Way Test for Equal Means in RCBD |
rcbd.friedman | Friedman | Friedman Test |
rcbd.page | Page | Page Test |
association.corr | Correlation | Test for Association Between Paired Samples |
table.chisq | ChiSquare | Chi-Square Test on Contingency Table |
define_pmt allows users to define new permutation tests. Take the two-sample Wilcoxon test as an example:
t_custom <- define_pmt(
    # this is a two-sample permutation test
    inherit = "twosample",
    statistic = function(x, y) {
        # (optional) pre-calculate certain constants that remain invariant during permutation
        m <- length(x)
        n <- length(y)
        # return a closure to calculate the test statistic
        function(x, y) sum(x) / m - sum(y) / n
    },
    # reject the null hypothesis when the test statistic is too large or too small
    rejection = "lr", n_permu = 1e5
)
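The two-level shape of `statistic` (an outer function that returns a closure) can be illustrated with a small base-R driver. `run_permutation` and `mean_diff` below are hypothetical names written only for this sketch; they show one way such a closure could be consumed and are not the package's internals.

```r
# Hypothetical driver illustrating the closure pattern; not LearnNonparam code.
run_permutation <- function(x, y, statistic, n_permu = 1000) {
  stat <- statistic(x, y)      # outer call: invariant constants are computed once
  pooled <- c(x, y)
  m <- length(x)
  observed <- stat(x, y)
  permuted <- replicate(n_permu, {
    idx <- sample(length(pooled), m)   # random relabelling of the pooled data
    stat(pooled[idx], pooled[-idx])    # inner call: reuses the cached m and n
  })
  # two-sided rejection: large absolute values count as extreme
  list(statistic = observed, p_value = mean(abs(permuted) >= abs(observed)))
}

# Same statistic as in the define_pmt() call above: a difference of sample means.
mean_diff <- function(x, y) {
  m <- length(x)
  n <- length(y)
  function(x, y) sum(x) / m - sum(y) / n
}

set.seed(0)
x_demo <- rnorm(10, mean = 0)
y_demo <- rnorm(10, mean = 5)
demo_res <- run_permutation(x_demo, y_demo, mean_diff)
demo_res$statistic
demo_res$p_value
```

The point of the pattern is that `length(x)` and `length(y)` are evaluated once rather than on every permutation, which matters when the inner function is called many thousands of times.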
Also, the statistic can be written in C++. Leveraging Rcpp sugars and C++14 features, only minor modifications are needed to make it compatible with C++ syntax.
t_cpp <- define_pmt(
    inherit = "twosample", rejection = "lr", n_permu = 1e5,
    statistic = "[](const auto& x, const auto& y) {
        auto m = x.length();
        auto n = y.length();
        return [=](const auto& x, const auto& y) {
            return sum(x) / m - sum(y) / n;
        };
    }"
)
It’s easy to check that t_custom and t_cpp are equivalent:
x <- rnorm(10, mean = 0)
y <- rnorm(10, mean = 5)
set.seed(0)
t_custom$test(x, y)$print()
set.seed(0)
t_cpp$test(x, y)$print()
coin is a commonly used R package for performing permutation tests. Below is a benchmark:
library(coin)

data <- c(x, y)
group <- factor(c(rep("x", length(x)), rep("y", length(y))))

options(LearnNonparam.pmt_progress = FALSE)
benchmark <- microbenchmark::microbenchmark(
    R = t_custom$test(x, y),
    Rcpp = t_cpp$test(x, y),
    coin = wilcox_test(
        data ~ group,
        distribution = approximate(nresample = 1e5, parallel = "no")
    )
)
benchmark
The benchmark shows that the C++ statistic performs significantly better than the pure R one, even surpassing the coin package (under sequential execution). However, all tests in this package are currently written in R, with no plans to migrate them to C++. This is because the primary goal of this package is not to maximize performance but to offer a flexible framework for permutation tests.
Higgins, J. J. 2004. An Introduction to Modern Nonparametric Statistics. Duxbury Advanced Series. Brooks/Cole.