
The test control arguments:

Argument	Description
n_peld_mc_samples	Number of samples to be used in approximating the estimated limiting distribution of the parameter estimate under the null. Increasing this value reduces the approximation error of the test statistic.
nrm_type	The type of norm to be used for the test. Generally the l_p norm.
perf_meas	The preferred measure used to generate the test statistic.
pos_lp_norms	The index of the norms to be considered. For example, if we use the l_p norm, pos_lp_norms specifies the different p’s to try.
ld_est_meth	String indicating the method for estimating the limiting distribution of the test statistic: parametric bootstrap or permutation.
ts_ld_bs_samp	The number of test statistic limiting distribution bootstrap samples to be drawn.
other_output	A vector indicating additional data that should be returned. Currently `"var_est"` and `"obs_data"` are supported.
…	Other arguments needed in other places.

Throughout, we will use a simple data generating mechanism:

x_data <- matrix(rnorm(500), ncol = 5)
y_data <- rnorm(100) + 0.02 * x_data[, 2]
obs_data <- data.frame(y_data, x_data)

Test statistic controls

There are multiple options when defining a test statistic beyond the specification of the parameter estimator, \(\hat{\psi}\), and the corresponding IC estimator, \(\hat{IC}\) (both of which are specified in the param_est argument). Four arguments control these options.

perf_meas

The first argument, perf_meas, specifies the performance measure used to define the test statistic. Loosely defined, a performance measure is a function that provides information about the performance of a simple test at a specified alternative. It takes as arguments a norm \(\varphi\), an alternative \(x\), and a limiting distribution \(P_0\), and considers the performance of a test defined by \[ \text{reject if } \varphi\left(\hat{\psi}\right) > c_{\alpha} \] if the parameter value \(\psi\) was equal to \(x\). The perf_meas argument specifies which measure of performance to use. Currently the package has implemented three such measures:

p-value (specified by setting perf_meas = "pval"): The p-value of the test if \(\hat{\psi} = x\), defined by \[\Gamma(x, P_0) := \text{pr}(\varphi(x) < \varphi(Z)) \text{ where } Z \sim P_0\]
acceptance rate (specified by setting perf_meas = "est_acc"): The acceptance rate of the test if \(\hat{\psi}\) is normally distributed and centered at \(x\), defined by \[\Gamma(x, P_0) := \text{pr}(\varphi(x + Z) < c_\alpha) \text{ where } Z \sim P_0 \text{ and } c_\alpha = F_{\varphi(Z)}^{-1}(1 - \alpha)\]
multiplicative distance (specified by setting perf_meas = "mag"): The minimum \(s\) such that \[\text{pr}(\varphi(s x + Z) < c_\alpha) \text{ where } Z \sim P_0 \text{ and } c_\alpha = F_{\varphi(Z)}^{-1}(1 - \alpha)\] is lower than \(0.2\).
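The three measures above can be approximated by Monte Carlo. The following is a minimal base-R sketch, not the package's implementation: it assumes \(P_0\) is a standard normal in three dimensions and \(\varphi\) is the \(\ell_2\) norm, and the helper names (lp_norm_rows, pm_pval, pm_acc, pm_mag) as well as the grid search over \(s\) are hypothetical choices made for illustration.

```r
# Base-R sketch of the three performance measures (not the amp implementation).
set.seed(1)

# Row-wise l_p norm of a matrix; lp_norm_rows is a hypothetical helper.
lp_norm_rows <- function(m, p = 2) rowSums(abs(m)^p)^(1 / p)

d       <- 3
n_mc    <- 5000
alpha   <- 0.05
z_draws <- matrix(rnorm(n_mc * d), ncol = d)     # Z ~ P_0, assumed N(0, I_d)
norm_z  <- lp_norm_rows(z_draws)                 # varphi(Z) for every draw
c_alpha <- unname(quantile(norm_z, 1 - alpha))   # MC estimate of c_alpha

# "pval": pr(varphi(x) < varphi(Z))
pm_pval <- function(x) mean(lp_norm_rows(rbind(x)) < norm_z)

# "est_acc": pr(varphi(x + Z) < c_alpha)
pm_acc <- function(x) {
  shifted <- sweep(z_draws, 2, x, "+")           # add x to every draw of Z
  mean(lp_norm_rows(shifted) < c_alpha)
}

# "mag": smallest s on a grid whose acceptance rate at s * x is below 0.2
pm_mag <- function(x, s_grid = seq(0, 20, by = 0.25)) {
  acc <- vapply(s_grid, function(s) pm_acc(s * x), numeric(1))
  s_grid[which(acc < 0.2)[1]]                    # NA if the grid is too short
}

x_near <- c(0.5, 0, 0)   # alternative close to the null
x_far  <- c(3, 0, 0)     # alternative far from the null
c(pval = pm_pval(x_far), est_acc = pm_acc(x_far), mag = pm_mag(x_far))
```

In this sketch all three measures are smaller at x_far than at x_near, so small values of the measure correspond to alternatives that a simple test would detect easily.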

Recommendation: Based on what we currently know, we recommend that users use the multiplicative distance performance measure. The other measures can have limiting distributions that are highly concentrated near 0, which can cause issues when approximating the p-value of the test.

We will discuss specification of the norm in the next section. For more details on the procedure, including why performance measures are good for defining a test statistic, see A general adaptive framework for multivariate point null testing.

Norm specification

Two arguments are used to specify the norm used in defining the test statistic. The first is nrm_type, which can be either "ssq" or "lp". These norms are defined as:

\(\ell_p\) norm ("lp"): \[\ell_p: (x_1, x_2, \ldots, x_d) \mapsto \sqrt[p]{\sum_{i = 1}^d |x_i|^p}\]
Sum of squares norm ("ssq"): \[\jmath_{p}: (x_1, x_2, \ldots, x_d) \mapsto \left\{\textstyle\sum_{j=1}^{p} x^2_{(d-j+1)}\right\}^{1/2}\]

The choice of \(p\) is specified by the pos_lp_norms argument. If pos_lp_norms is assigned a single value, a non-adaptive version of the test will be performed. If instead pos_lp_norms is assigned multiple values, an adaptive test will be carried out. More information can be found in our paper. For the \(\ell_p\) norm, it is possible to set \(p = \infty\). To make this specification in R, include "max" in the vector of values assigned to pos_lp_norms.
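To make the two definitions concrete, here is a small base-R sketch of both norms. It is an illustration rather than the package's code: the function names lp_norm and ssq_norm are hypothetical, and the sketch assumes that the order statistics in the sum of squares norm are taken over the squared entries, so that the norm sums the \(p\) largest squares.

```r
# l_p norm; p = Inf corresponds to the "max" option (largest absolute entry).
lp_norm <- function(x, p) {
  if (is.infinite(p)) return(max(abs(x)))
  sum(abs(x)^p)^(1 / p)
}

# Sum of squares norm: square root of the sum of the p largest squared entries
# (assumes ordering by squared value).
ssq_norm <- function(x, p) {
  sq_sorted <- sort(x^2, decreasing = TRUE)
  sqrt(sum(sq_sorted[seq_len(min(p, length(x)))]))
}

x <- c(3, -4, 1)
lp_norm(x, 1)     # 8
lp_norm(x, Inf)   # 4
ssq_norm(x, 1)    # 4
ssq_norm(x, 2)    # 5
```

With \(p = d\) the sum of squares norm coincides with the \(\ell_2\) norm, and with \(p = 1\) it coincides with the maximum absolute entry.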

ld_est_meth

The next argument we review specifies the method by which you wish to estimate the limiting distribution of the test statistic (\(\Gamma(\hat{\psi}, \hat{P}_0)\)). There are two options for this argument:

Parametric bootstrap (specified by setting ld_est_meth = "par_boot"): When using the parametric bootstrap version of the test, the estimated limiting distribution of \(\Gamma(\hat{\psi}, \hat{P}_0)\) is approximated by assuming that \(\hat{\psi}\) has a distribution equal to \(\hat{P}_0\) and that \(\hat{P}_0\) is a normal distribution.
Permutation (specified by setting ld_est_meth = "perm"): When using the permutation version of the test, the estimated limiting distribution of \(\Gamma(\hat{\psi}, \hat{P}_0)\) is approximated by repeatedly permuting the data and recalculating \(\hat{\psi}\) using the permuted data. This method may provide better finite sample performance. However, it comes at the cost of computational efficiency. Also note that, depending on the parameter of interest, the permutation based test may not have the same null hypothesis as is desired. Thus, care must be taken when using this method.
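The difference between the two options can be sketched in base R. This is an illustration under stated assumptions, not the package's procedure: the parameter is taken to be the vector of Pearson correlations between the outcome and each covariate, the statistic is a plain \(\ell_2\) norm standing in for \(\Gamma\), and the normal approximation uses a simplified influence function that is only valid under the null. The names est_psi, stat_fun, and ic_null are hypothetical.

```r
set.seed(42)
n <- 100
d <- 5
x <- matrix(rnorm(n * d), ncol = d)
y <- rnorm(n) + 0.02 * x[, 2]

est_psi  <- function(y, x) as.numeric(cor(y, x))   # correlations of y with each column
stat_fun <- function(psi) sqrt(sum(psi^2))         # stand-in for Gamma

obs_stat <- stat_fun(est_psi(y, x))
n_draws  <- 500

# Parametric bootstrap: draw psi* from N(0, Sigma_hat / n), where Sigma_hat is
# the empirical second moment of a null-only approximation to the IC.
ic_null   <- scale(y)[, 1] * scale(x)
sigma_hat <- crossprod(ic_null) / n
psi_star  <- matrix(rnorm(n_draws * d), ncol = d) %*% chol(sigma_hat) / sqrt(n)
boot_stats <- apply(psi_star, 1, stat_fun)

# Permutation: shuffle the outcome and re-estimate psi on the permuted data.
perm_stats <- replicate(n_draws, stat_fun(est_psi(sample(y), x)))

c(par_boot = mean(boot_stats >= obs_stat), perm = mean(perm_stats >= obs_stat))
```

The permutation loop re-runs the estimator once per draw, which is where the extra computational cost comes from. Shuffling the outcome also enforces full independence between the outcome and the covariates, which illustrates how the permutation null can differ from the null of interest.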

Approximation controls

The next two controls specify the accuracy of the approximation of the testing procedure.

n_peld_mc_samples

To understand this control argument, it is important to distinguish between our parameter estimator \(\hat{\psi}\) and our test statistic, which is a function of \(\hat{\psi}\) and the estimated limiting distribution of \(\hat{\psi}\) under the null hypothesis (that \(\psi = 0\)), denoted by \(\hat{P}_0\). Letting \(\Gamma\) denote our performance measure, conditional on our observations, the true value of the test statistic is fixed and equal to \(\Gamma(\hat{\psi}, \hat{P}_0)\).

The n_peld_mc_samples argument determines how accurate the approximation of the test statistic \(\Gamma(\hat{\psi}, \hat{P}_0)\) will be. The performance measure is frequently a function of \(\hat{P}_0\) through some probability statement (see the perf_meas section for examples). To approximate these probabilities, a Monte Carlo (MC) approximation is used, and n_peld_mc_samples determines how many MC draws are taken.
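The effect of the number of MC draws can be seen in a small base-R sketch that does not use the package. It assumes a two-dimensional standard normal \(P_0\) and the \(\ell_2\) norm, for which the "pval" measure at \(x = (1, 1)\) has the closed form \(e^{-1}\); approx_pval is a hypothetical helper.

```r
set.seed(7)
x <- c(1, 1)

# MC approximation of pr(varphi(x) < varphi(Z)) with Z ~ N(0, I_2)
approx_pval <- function(n_mc) {
  z <- matrix(rnorm(n_mc * 2), ncol = 2)
  mean(sqrt(sum(x^2)) < sqrt(rowSums(z^2)))
}

# Spread of the approximation over 200 repetitions, for two MC sample sizes
sd_50   <- sd(replicate(200, approx_pval(50)))
sd_5000 <- sd(replicate(200, approx_pval(5000)))
c(truth = exp(-1), sd_50 = sd_50, sd_5000 = sd_5000)
```

The spread shrinks roughly like one over the square root of the number of draws, so a hundredfold increase in draws buys about one extra digit of accuracy.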

Considering this argument in practice, note that the testing procedure only approximates the test statistic:

tc <- amp::test.control(n_peld_mc_samples = 50, pos_lp_norms = "2")
set.seed(10)
test_1 <- amp::mv_pn_test(obs_data = obs_data, param_est = amp::ic.pearson, control = tc)
set.seed(20)
test_2 <- amp::mv_pn_test(obs_data = obs_data, param_est = amp::ic.pearson, control = tc)
print(c(test_1$test_stat, test_2$test_stat))
#> [1] 0.92 0.94

In order to better approximate the test statistic, one may increase the value of this control argument:

mc_draws <- c(10, 50)
all_res <- list()
# Repeat the test 50 times for each number of MC draws
for (mc_draws in c(10, 50)) {
  set.seed(121)
  tc <- amp::test.control(n_peld_mc_samples = mc_draws, pos_lp_norms = 2, perf_meas = "est_acc")
  test_stat <- replicate(50, amp::mv_pn_test(obs_data = obs_data, param_est = amp::ic.pearson, control = tc)$test_stat)
  all_res[[as.character(mc_draws)]] <- data.frame("mc_draws" = mc_draws, test_stat)
}
# Compare the spread of the approximated test statistic side by side
oldpar <- par(mfrow = c(1, 2))
yl <- 25
hist(all_res[[1]]$test_stat, main = "MC draws = 10", xlab = "Test Statistic", xlim = c(0, 1), ylim = c(0, yl), breaks = seq(0, 1, 0.1))
hist(all_res[[2]]$test_stat, main = "MC draws = 50", xlab = "Test Statistic", xlim = c(0, 1), ylim = c(0, yl), breaks = seq(0, 1, 0.1))

par(oldpar)

ts_ld_bs_samp

The other parameter that determines the approximation accuracy of the testing procedure is ts_ld_bs_samp. This argument determines the number of draws taken from the estimated limiting distribution of \(\Gamma(\hat{\psi}, \hat{P}_0)\). This is different from n_peld_mc_samples, which determines the accuracy of each of these draws and of the test statistic.
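The two controls act at different levels of a nested simulation, which the following base-R sketch tries to make explicit. It is not the package's code: it assumes a two-dimensional standard normal \(\hat{P}_0\), the "pval" measure with the \(\ell_2\) norm, and a made-up estimate psi_hat; gamma_hat is a hypothetical helper. The variable names n_peld_mc_samples and ts_ld_bs_samp are reused only to show which loop each control would govern.

```r
set.seed(3)
d <- 2

# Inner level: approximate Gamma(psi, P_0) with n_mc Monte Carlo draws
gamma_hat <- function(psi, n_mc) {
  z <- matrix(rnorm(n_mc * d), ncol = d)
  mean(sqrt(sum(psi^2)) < sqrt(rowSums(z^2)))
}

n_peld_mc_samples <- 200   # draws used inside each evaluation of Gamma
ts_ld_bs_samp     <- 300   # draws from the null distribution of Gamma

psi_hat   <- c(1.5, -0.5)  # hypothetical (standardized) parameter estimate
test_stat <- gamma_hat(psi_hat, n_peld_mc_samples)

# Outer level: one null draw of psi per replicate, each pushed through Gamma
null_stats <- replicate(ts_ld_bs_samp, gamma_hat(rnorm(d), n_peld_mc_samples))

# For the "pval" measure, small values are evidence against the null
p_value <- mean(null_stats <= test_stat)
```

Increasing ts_ld_bs_samp adds more points to null_stats and so refines the resolution of the final p-value, while increasing n_peld_mc_samples reduces the noise in each individual point and in test_stat.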

Controlling the output of `mv_pn_test`

The last argument determines the output of the mv_pn_test function. The standard output of the test function is a list containing the following:

pvalue: The approximate p-value of the test.
test_stat: The approximate value of the test statistic (\(\Gamma(\hat{\psi}, \hat{P}_0)\)).
test_st_eld: The approximate limiting distribution of the test statistic (with length equal to ts_ld_bs_samp).
chosen_norm: A vector indicating which norm was chosen by the adaptive test.
param_ests: The parameter estimate (\(\hat{\psi}\)).
param_ses: An estimate of the standard error of each element of \(\hat{\psi}\).
oth_ic_inf: Any other information provided by the param_est function when calculating the IC and parameter estimates.

other_output

other_output is a character vector. Currently other_output only provides the option of returning two additional output elements.

If "var_est" is contained in other_output, the test output will contain var_mat, which is the empirical second moment of the IC (asymptotically equal to the variance estimator). However, this matrix can be quite large in higher dimensions, which is why there is a separate control for this option.
If "obs_data" is contained in other_output, the test output will return the data passed to the testing function.
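As a usage sketch, requesting the variance matrix might look like the following. It relies only on the arguments and output names described above (other_output, "var_est", var_mat), leaves every other control at its default, and is guarded so that it runs only when the amp package is installed; the variable name extra_outputs is introduced here for illustration.

```r
extra_outputs <- "var_est"

if (requireNamespace("amp", quietly = TRUE)) {
  set.seed(1)
  # Same data generating mechanism as used throughout
  x_data <- matrix(rnorm(500), ncol = 5)
  y_data <- rnorm(100) + 0.02 * x_data[, 2]
  obs_data <- data.frame(y_data, x_data)

  tc  <- amp::test.control(other_output = extra_outputs)
  res <- amp::mv_pn_test(obs_data = obs_data, param_est = amp::ic.pearson, control = tc)

  print(names(res))        # standard elements plus the requested extras
  print(dim(res$var_mat))  # dimensions of the empirical second moment of the IC
}
```

Because var_mat grows with the square of the parameter dimension, it may be preferable to request it only when the matrix is actually needed downstream.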