vezzi/qaTools
A couple of useful qa tools for sequencing data.
I. Setup:
Use:
make SAMTOOLS=/PATH/TO/SAMTOOLS/SOURCE VERSION=SAMTOOLS_VERSION
If you don't have samtools, download it from http://samtools.sourceforge.net and run make.
Any version should work. However, version 1.3 is verified to work.
II. Tools:
qaCompute
Computes normal and span coverage from a bam/sam file.
Also counts unmapped and sub-par quality reads.
Parameters:
m - Compute median coverage for each contig/chromosome. Will make running a bit slower. Off by default.
q [INT] - Quality threshold. Any read with a mapping quality under INT will be ignored when computing the coverage.
NOTE: bwa outputs mapping quality 0 for reads that map with equal quality in multiple places. If you want to consider these reads, set q to 0.
d - Print a coverage histogram over each individual contig/chromosome. These details will be printed in file .detail
p [INT] - Print coverage profile to bed file, averaged over given window size.
i - Silent run. Will not print running info to stdout.
s [INT] - Compute span coverage (use for mate pair libs). Instead of actual read coverage, this option considers the entire span of the insert as a read, if the insert size is lower than INT. For an accurate estimation of span coverage, I recommend setting an insert size limit INT around 3*std_dev of your lib's insert size distribution.
c [INT] - Maximum X coverage to consider in histogram.
h [STR] - Use a different header. Because mappers sometimes break the headers or simply don't output them, this is provided as a non-kosher way around it. Use with care!
For more info on the parameters, try ./qaCompute
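To make the difference between read coverage and span coverage concrete, here is a minimal Python sketch on toy data. It is not qaCompute's implementation: the function names, the pair layout, and the choice to require both mates to pass the quality threshold for span coverage are assumptions made for illustration only. It mirrors the ideas behind the q, s, m and p options described above.

```python
from statistics import median

def coverage(length, intervals):
    """Per-base depth on one contig for half-open [start, end) intervals."""
    diff = [0] * (length + 1)
    for start, end in intervals:
        start, end = max(start, 0), min(end, length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth

def read_intervals(pairs, min_mapq):
    """Read coverage: each read counts on its own if its MAPQ is >= min_mapq."""
    out = []
    for pair in pairs:
        for start, end, mapq in (pair["read1"], pair["read2"]):
            if mapq >= min_mapq:
                out.append((start, end))
    return out

def span_intervals(pairs, min_mapq, max_insert):
    """Span coverage: the whole insert counts as one read if it is shorter
    than max_insert. Requiring both mates to pass min_mapq is an assumption."""
    out = []
    for pair in pairs:
        (s1, e1, q1), (s2, e2, q2) = pair["read1"], pair["read2"]
        if q1 < min_mapq or q2 < min_mapq:
            continue
        start, end = min(s1, s2), max(e1, e2)
        if end - start < max_insert:
            out.append((start, end))
    return out

def window_means(depth, window):
    """Average depth over consecutive windows, as a coverage profile would."""
    means = []
    for i in range(0, len(depth), window):
        chunk = depth[i:i + window]
        means.append(sum(chunk) / len(chunk))
    return means

# Toy pairs on a 20 bp contig: (start, end, mapping quality) per read.
PAIRS = [
    {"read1": (0, 5, 60), "read2": (10, 15, 60)},
    {"read1": (2, 7, 0), "read2": (12, 17, 60)},   # read1 has MAPQ 0
    {"read1": (0, 4, 60), "read2": (16, 20, 60)},  # insert of 20 bp
]

if __name__ == "__main__":
    read_depth = coverage(20, read_intervals(PAIRS, min_mapq=1))
    span_depth = coverage(20, span_intervals(PAIRS, min_mapq=1, max_insert=18))
    print("read depth:  ", read_depth, "median:", median(read_depth))
    print("span depth:  ", span_depth)
    print("span profile:", window_means(span_depth, 5))
```

With a threshold of 1, the MAPQ 0 read is ignored, and with an insert limit of 18 the 20 bp insert is left out of the span coverage.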
removeUnmapped
Remove unmapped and sub-par quality reads from a bam/sam file.
For more info on the parameters, try ./removeUnmapped
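The kind of filtering described here can be sketched in a few lines of Python over SAM text records. This is an illustration, not removeUnmapped's code: the function names and the `min_mapq` parameter are hypothetical. It relies only on the SAM format itself, where FLAG is the second column, MAPQ is the fifth, and FLAG bit 0x4 marks an unmapped read.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit: segment unmapped

def keep_record(sam_line, min_mapq):
    """True if the record is mapped and its MAPQ is at least min_mapq."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, mapq = int(fields[1]), int(fields[4])
    return not (flag & UNMAPPED_FLAG) and mapq >= min_mapq

def filter_sam(lines, min_mapq):
    """Yield header lines unchanged and only the records that pass the filter."""
    for line in lines:
        if line.startswith("@") or keep_record(line, min_mapq):
            yield line

# Toy SAM content: one header line and three records.
SAM_LINES = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",   # mapped, high quality
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",           # unmapped
    "r3\t16\tchr1\t200\t5\t50M\t*\t0\t0\t*\t*",   # mapped, low quality
]

if __name__ == "__main__":
    for line in filter_sam(SAM_LINES, min_mapq=10):
        print(line)
```

With a threshold of 10, only the header and r1 survive.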
computeInsertSizeHistogram
Compute the insert size distribution from a bam/sam file.
For more info on the parameters, try ./computeInsertSizeHistogram
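As a rough sketch of what an insert size distribution looks like in practice, the Python below bins the TLEN field (ninth SAM column) of toy records and reports the mean and standard deviation. It is not the tool's implementation: the helper names, the bin width, and the choice to count only records with a positive TLEN (so that each pair is counted once) are assumptions. The standard deviation is the quantity the qaCompute s option's recommendation refers to.

```python
from collections import Counter
from statistics import mean, pstdev

def toy_record(name, tlen):
    """Build a minimal 11-column SAM record (hypothetical helper for the demo)."""
    return "\t".join([name, "99", "chr1", "100", "60", "50M",
                      "=", "350", str(tlen), "*", "*"])

def insert_sizes(sam_lines):
    """Collect TLEN from records where it is positive (one value per pair)."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        tlen = int(line.rstrip("\n").split("\t")[8])
        if tlen > 0:
            sizes.append(tlen)
    return sizes

def histogram(sizes, bin_width):
    """Map the lower edge of each bin to the number of inserts falling in it."""
    counts = Counter((size // bin_width) * bin_width for size in sizes)
    return dict(sorted(counts.items()))

# Toy records: three pairs (one positive and one negative TLEN each)
# plus one record without a usable TLEN.
RECORDS = [toy_record(f"r{i}", t)
           for i, t in enumerate([300, -300, 310, -310, 290, -290, 0])]

if __name__ == "__main__":
    sizes = insert_sizes(RECORDS)
    print("histogram:", histogram(sizes, bin_width=10))
    print("mean:", mean(sizes), "std_dev:", round(pstdev(sizes), 2))
```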