What is GBJ?

The Generalized Berk-Jones (GBJ) statistic was developed to perform set-based inference in genetic association studies. It is an alternative to tests such as the Sequence Kernel Association Test (SKAT), Generalized Higher Criticism (GHC), and Minimum p-value (minP).

Why use GBJ?

GBJ is a generalization of the Berk-Jones (BJ) statistic, which offers, in a certain sense, asymptotic power guarantees for detection of rare and weak signals. GBJ modifies BJ to account for correlation between factors in a set. GBJ has been demonstrated to outperform other tests when signals are moderately sparse (more precisely, when the number of signals is between d^(1/4) and d^(1/2), where d is the number of factors in the set).
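To get a feel for the "moderately sparse" range, the bounds d^(1/4) and d^(1/2) can be evaluated directly. The base R sketch below does this for a few set sizes; the values 16 and 256 are chosen purely for illustration, and 50 matches the 50-SNP set used in the example in this README.

```r
# Set sizes to evaluate (16 and 256 are illustrative; 50 matches the example set)
d <- c(16, 50, 256)

# Lower and upper bounds on the number of signals in the "moderately sparse" range
sparsity_range <- data.frame(
  d     = d,
  lower = d^(1/4),
  upper = d^(1/2)
)

print(sparsity_range)
```

For a set of 50 factors, the range works out to roughly 3 to 7 signals.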

Other advantages include:

1. Analytic p-value calculation (no need for permutation inference).
2. Can be applied to individual-level genotype data or GWAS summary statistics.
3. No tuning parameters.
4. Accepts standard inputs (similar to the glm() function).

Example

We show a simple example for testing the association between a set of 50 SNPs (which could be, for example, from the same gene or pathway) and a binary outcome.

    library(GBJ)
    set.seed(1000)

    # Case-control study, 1000 subjects
    cancer_status <- c(rep(1, 500), rep(0, 500))

    # We have 50 SNPs each with minor allele frequency of 0.3 in this example
    genotype_data <- matrix(data = rbinom(n = 1000 * 50, size = 2, prob = 0.3), nrow = 1000)
    age <- round(runif(n = 1000, min = 30, max = 80))
    gender <- rbinom(n = 1000, size = 1, prob = 0.5)

    # Fit the null model, calculate marginal score statistics for each SNP
    # (asymptotically equivalent to those calculated by, for example, PLINK)
    null_mod <- glm(cancer_status ~ age + gender, family = binomial(link = "logit"))
    log_reg_stats <- calc_score_stats(null_model = null_mod,
                                      factor_matrix = genotype_data,
                                      link_function = "logit")

    # Run the test
    GBJ(test_stats = log_reg_stats$test_stats, cor_mat = log_reg_stats$cor_mat)
    #> $GBJ
    #> [1] 1.43984
    #>
    #> $GBJ_pvalue
    #> [1] 0.330911
    #>
    #> $err_code
    #> [1] 0
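For readers curious what "marginal score statistics" means here, the following base R sketch computes the textbook score statistic for each SNP under a logistic null model. It is an illustration under stated assumptions, not necessarily how calc_score_stats() is implemented: the variable names (G, P_G, V, z, cor_z) are made up for this sketch, and the package may estimate the correlation matrix differently.

```r
set.seed(1000)

# Simulated data, mirroring the example above
n <- 1000
y <- c(rep(1, 500), rep(0, 500))
G <- matrix(rbinom(n * 50, size = 2, prob = 0.3), nrow = n)
age <- round(runif(n, min = 30, max = 80))
gender <- rbinom(n, size = 1, prob = 0.5)

# Null model: covariates only, no SNPs
null_mod <- glm(y ~ age + gender, family = binomial(link = "logit"))
mu <- fitted(null_mod)        # fitted probabilities under the null
X  <- model.matrix(null_mod)  # design matrix (intercept, age, gender)
w  <- mu * (1 - mu)           # logistic variance weights, diagonal of W

# P %*% G with P = W - W X (X' W X)^{-1} X' W, without forming the n x n matrix P
WX  <- w * X
P_G <- w * G - WX %*% solve(crossprod(X, WX), crossprod(WX, G))
V   <- crossprod(G, P_G)      # d x d matrix G' P G

# Marginal score statistics: one Z-score per SNP
score <- as.vector(crossprod(G, y - mu))
z <- score / sqrt(diag(V))

# One possible estimate of the correlation between the statistics (an assumption)
cor_z <- cov2cor(V)

head(z)
```

Each entry of z is approximately standard normal under the null hypothesis, which is the form of input the set-based tests expect alongside a correlation matrix.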

What else is in here?

We may not have convinced you that GBJ is the best option for your application. If that is the case, then you may still be interested in trying the Berk-Jones (BJ), Generalized Higher Criticism (GHC), Higher Criticism (HC), or Minimum p-value (minP) tests, which can be run with the same inputs, e.g. GHC(test_stats=score_stats, cor_mat=cor_Z) to run the GHC. We have also developed an omnibus test which combines information from multiple different methods. Please see the vignette for more details.
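As a rough sketch of what "the same inputs" looks like in practice, the snippet below passes one vector of Z-scores and one correlation matrix to each of the tests named above. The inputs are hypothetical (simulated null statistics with an assumed exchangeable correlation of 0.3), and the code assumes that each function accepts the test_stats and cor_mat arguments as described; it is skipped if the GBJ package is not installed.

```r
results <- NULL

# Assumes the GBJ package is installed; the block is skipped otherwise
if (requireNamespace("GBJ", quietly = TRUE)) {
  set.seed(1000)
  d <- 10

  # Hypothetical inputs: exchangeable correlation (rho = 0.3) and matching null Z-scores
  cor_z <- matrix(0.3, nrow = d, ncol = d)
  diag(cor_z) <- 1
  z <- as.vector(t(chol(cor_z)) %*% rnorm(d))

  # Each test is called with the same two arguments
  tests <- list(
    BJ   = GBJ::BJ,
    GBJ  = GBJ::GBJ,
    HC   = GBJ::HC,
    GHC  = GBJ::GHC,
    minP = GBJ::minP
  )
  results <- lapply(tests, function(f) f(test_stats = z, cor_mat = cor_z))

  str(results)
}
```

Keeping the functions in a named list makes it easy to compare p-values across methods on the same set without rewriting the call for each test.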
