czheluo/MBSA
This pipeline uses the MutMap and QTL-seq methods to rapidly map quantitative trait loci in animal or plant species by whole-genome resequencing of DNA from two bulked populations (F1, F2, DH or RIL). For more details, please see the MutMap and QTL-seq papers.
QTL-seq method:
Takagi, H., Abe, A., Yoshida, K., Kosugi, S., Natsume, S., Mitsuoka, C., Uemura, A., Utsushi, H., Tamiru, M., Takuno, S., Innan, H., Cano, L. M., Kamoun, S. and Terauchi, R. (2013), QTL-seq: rapid mapping of quantitative trait loci in rice by whole genome resequencing of DNA from two bulked populations. Plant J, 74: 174–183. doi:10.1111/tpj.12105
G prime method:
Magwene PM, Willis JH, Kelly JK (2011) The Statistics of Bulk Segregant Analysis Using Next Generation Sequencing. PLOS Computational Biology 7(11): e1002255. https://doi.org/10.1371/journal.pcbi.1002255
Euclidean distance calculation (ED -> MMAPPR):
Hill JT, Demarest BL, Bisgrove BW, et al. MMAPPR: Mutation Mapping Analysis Pipeline for Pooled RNA-seq. Genome Research, 2013, 23(4): 687-697.
Bulked Segregant RNA-Seq (BSR-Seq):
Liu S, Yeh CT, Tang HM, Nettleton D, Schnable PS (2012) Gene Mapping via Bulked Segregant RNA-Seq (BSR-Seq). PLOS ONE 7(5): e36406. https://doi.org/10.1371/journal.pone.0036406
The MBSA pipeline depends on several R packages. Before using it, make sure you have already installed them:
```r
# install QTLseqr
# install devtools first to download packages from GitHub
install.packages("devtools")
# use devtools to install QTLseqr
devtools::install_github("bmansfeld/QTLseqr")
# and basic R packages
library(magrittr)
library(dplyr)
library(tidyr)
library(parallel)
library(grid)
library(ggplot2)
library(gridExtra)
```
```
$ perl bsa.pipeline.pl
Contact: czheluo@gmail.com
Script: bsa.pipeline.pl
Description: Multple methods for BSA Pipeline
Usage:
  Options:
  -vcf   pop.final.vcf
  -ann   pop.summary or anno.summary
  -ref   ref.fa file or if you want to use the GATK to generate TABLE (default is NO)
  -pid   parental name A,B
  -bid   bulk name C,D
  -out   out dir
  -bs    bulk size default was 30
  -ws    window size default was 1M
  -pta   pemutation test confidence interval
  -ptb   pemutation test confidence interval
  -tgl   total genetic length (default was 2000)
  -bs    Set N to be the number of individuals in the mutant pool
  -step  which step you want
  -stop  control the step
  -h     Help
```
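As a rough illustration of how the options above fit together, the following Python sketch assembles one possible command line. It only builds and prints the command. The helper name `build_mbsa_command`, the output directory `mbsa_out`, and the assumption that each option takes its value as the next space-separated argument are illustrative guesses, not something the help text confirms.

```python
import shlex


def build_mbsa_command(vcf, ann, parents, bulks, out_dir,
                       bulk_size=30, script="bsa.pipeline.pl"):
    """Assemble an argument list for the pipeline (hypothetical helper).

    Assumption: every option is followed by its value as a separate
    argument, and parent/bulk names are joined with commas as in the
    help text ("A,B", "C,D").
    """
    return [
        "perl", script,
        "-vcf", vcf,                 # joint VCF with parents and bulks
        "-ann", ann,                 # annotation summary file
        "-pid", ",".join(parents),   # parental sample names
        "-bid", ",".join(bulks),     # bulk sample names
        "-out", out_dir,             # output directory (made-up name below)
        "-bs", str(bulk_size),       # bulk size, 30 by default per the help
    ]


cmd = build_mbsa_command("pop.final.vcf", "pop.summary",
                         ("A", "B"), ("C", "D"), "mbsa_out")
# Print the command so it can be inspected before running it by hand.
print(shlex.join(cmd))
```

Printing the command instead of executing it keeps the sketch safe to run on a machine where the pipeline is not installed.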
1. Methods
MBSA provides several different statistics for analysis as follows:
ED^4 = EuclideanDist (default) -> Euclidean Distance
SNPindex -> Delta SNP-index: The analysis calculates the allele frequency differences, or ∆(SNP-index), from the allele depths at each SNP, in order to determine regions of the genome that significantly differ from the expected ∆(SNP-index) of 0.
G’ analysis: An alternative approach to determine the statistical significance of QTL from NGS-BSA was proposed by Magwene et al. (2011): a modified G statistic is calculated for each SNP based on the observed and expected allele depths, and this value is smoothed using a tricube smoothing kernel to give the smoothed G statistic, or G’.
Bulked Segregant RNA-Seq (BSR-Seq): An empirical Bayesian approach is used to estimate, for each SNP, the conditional probability of no recombination between the SNP marker and the causal gene in the mutant pool, given the SNP allele-specific counts.
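To make the per-SNP statistics above more concrete, here is a minimal Python sketch of ∆(SNP-index), the Euclidean distance, the G statistic, and a tricube-weighted smoother. It is not the pipeline's own code. It assumes biallelic SNPs with each bulk given as a `(ref_depth, alt_depth)` pair, reads `ED^4` as the Euclidean distance raised to the fourth power, takes ∆(SNP-index) as the second bulk minus the first, and uses a simple kernel-weighted average for the smoothing step. All function names are made up for illustration.

```python
import math


def snp_index(ref_depth, alt_depth):
    """Fraction of reads at one SNP that carry the alternate allele."""
    return alt_depth / (ref_depth + alt_depth)


def delta_snp_index(bulk_a, bulk_b):
    """Delta(SNP-index): SNP-index of bulk_b minus SNP-index of bulk_a."""
    return snp_index(*bulk_b) - snp_index(*bulk_a)


def euclidean_distance(bulk_a, bulk_b, power=4):
    """Distance between the two allele-frequency vectors, raised to `power`."""
    freq_a = [depth / sum(bulk_a) for depth in bulk_a]
    freq_b = [depth / sum(bulk_b) for depth in bulk_b]
    ed = math.sqrt(sum((a - b) ** 2 for a, b in zip(freq_a, freq_b)))
    return ed ** power


def g_statistic(bulk_a, bulk_b):
    """G = 2 * sum(obs * ln(obs / exp)) over the 2x2 table of allele depths."""
    total = sum(bulk_a) + sum(bulk_b)
    col_totals = [bulk_a[0] + bulk_b[0], bulk_a[1] + bulk_b[1]]
    g = 0.0
    for row in (bulk_a, bulk_b):
        for observed, col_total in zip(row, col_totals):
            expected = sum(row) * col_total / total
            if observed > 0:  # a zero count contributes nothing to the sum
                g += observed * math.log(observed / expected)
    return 2.0 * g


def tricube_smooth(positions, values, window):
    """Tricube-weighted average of `values` around each position.

    `window` is the full window width in bp; SNPs at least half a window
    away from the focal SNP get zero weight.
    """
    half = window / 2
    smoothed = []
    for focal in positions:
        num = den = 0.0
        for pos, val in zip(positions, values):
            d = abs(pos - focal) / half
            if d < 1:
                weight = (1 - d ** 3) ** 3
                num += weight * val
                den += weight
        smoothed.append(num / den)  # den >= 1: the focal SNP has weight 1
    return smoothed


# Toy depths: bulk "low" is mostly reference, bulk "high" mostly alternate.
low, high = (30, 10), (10, 30)
print(delta_snp_index(low, high), euclidean_distance(low, high),
      g_statistic(low, high))
```

Two bulks with identical allele depths give a ∆(SNP-index), Euclidean distance, and G of 0, which is the null expectation described above.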
2. Inputs
The main input file is a VCF file containing the genomic variants for the two bulks and the parental samples, together with your species' genome annotation (BLAST against GO, KEGG, InterProScan, NT/NR, etc.). For variant calling, I recommend GATK for WGS data; for RNA-seq data you can simply use samtools.
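The statistics in this pipeline are computed from per-sample allele depths, so it can help to see where those sit in a VCF record. The sketch below is illustrative only: it assumes a tab-separated VCF whose FORMAT column includes an `AD` field (as GATK writes by default; other callers may need to be asked for it), and the sample names A, B, C, D and the toy record are invented to match the parent and bulk names in the usage text.

```python
def parse_samples(header_line):
    """Return the sample names from the '#CHROM' header line of a VCF."""
    return header_line.rstrip("\n").split("\t")[9:]


def allele_depths(record_line, samples):
    """Return (chrom, pos, {sample: (ref_depth, alt_depth, ...) or None})."""
    fields = record_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    ad_index = format_keys.index("AD")  # raises ValueError if AD is absent
    depths = {}
    for name, column in zip(samples, fields[9:]):
        parts = column.split(":")
        ad = parts[ad_index] if ad_index < len(parts) else "."
        if "." in ad:
            depths[name] = None  # missing call for this sample
        else:
            depths[name] = tuple(int(x) for x in ad.split(","))
    return fields[0], int(fields[1]), depths


# Toy header and record (invented values): parents A and B, bulks C and D.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB\tC\tD"
record = ("chr1\t12345\t.\tA\tG\t50\tPASS\t.\tGT:AD:DP\t"
          "0/0:20,0:20\t1/1:0,18:18\t0/1:30,10:40\t0/1:10,30:40")

samples = parse_samples(header)
chrom, pos, depths = allele_depths(record, samples)
print(chrom, pos, depths["C"], depths["D"])
```

For real data a dedicated VCF library would be more robust than splitting lines by hand; the point here is only to show which fields carry the depths for each bulk.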