- Notifications
You must be signed in to change notification settings - Fork0
chhylp123/ksw2
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
KSW2 is a library to align a pair of biological sequences based on dynamicprogramming (DP). So far it comes with global alignment and alignment extension(no local alignment yet) under an affine gap cost function: gapCost(k) =q+k*e, or a two-piece affine gap cost: gapCost2(k) = min{q+k*e,q2+k*e2}. For the latter cost function, ifq+e<q2+e2 ande>e2,(q,e) is effectively applied to short gaps only, while (q2,e2) appliedto gaps no shorter than ceil((q2-q)/(e-e2)-1). It helps to retain longgaps. The algorithm behind the two-piece cost is close toGotoh(1990).
KSW2 supports fixed banding and optionally produces alignment paths (i.e.CIGARs) with gaps either left- or right-aligned. It provides implementationsusing SSE2 and SSE4.1 intrinsics based onHajime Suzuki's diagonalformulation which enables 16-way SSE parallelization for the most partof the inner loop, regardless of the maximum score of the alignment.
KSW2 implements the Suzuki-Kasahara algorithm and is a component ofminimap2. If you use KSW2 in your work, please cite:
- Suzuki, H. and Kasahara, M. (2018). Introducing difference recurrence relations for faster semi-global alignment of long sequences.BMC Bioinformatics,19:45.
- Li, H (2018) Minimap2: pairwise alignment for nucleotide sequences.Bioinformatics,34:3094-3100.
Eachksw2_*.c
file implements a single function and is independent of eachother. Here are brief descriptions about what each file implements:
- ksw2_gg.c: global alignment; Green's standard formulation
- ksw2_gg2.c: global alignment; Suzuki's diagonal formulation
- ksw2_gg2_sse.c: global alignment with SSE intrinsics; Suzuki's
- ksw2_extz.c: global and extension alignment; Green's formulation
- ksw2_extz2_sse.c: global and extension with SSE intrinsics; Suzuki's
- ksw2_extd.c: global and extension alignment, dual gap cost; Green's formulation
- ksw2_extd2_sse.c: global and extension, dual gap cost, with SSE intrinsics; Suzuki's
Users are encouraged to copy the header fileksw2.h
and relevantksw2_*.c
file to their own source code trees. On x86 CPUs with SSE2intrinsics,ksw2_extz2_sse.c
is recommended in general. It supports globalalignment, alignment extension with Z-drop, score-only alignment, global-onlyalignment and right-aligned CIGARs.ksw2_gg*.c
are mostly for demonstrationand comparison purposes. They are annotated with more comments and easier tounderstand thanksw2_ext*.c
. Header fileksw2.h contains briefdocumentations. TeX fileksw2.tex gives brief derivation.
To compile the test programksw-test
, just typemake
. It takes theadvantage of SSE4.1 when available. To compile with SSE2 only, usemake sse2=1
instead. If you have installedparasail, usemake parasail=prefix
, whereprefix
points to the parasail install directory (e.g./usr/local
).
The following shows a complete example about how to use the library.
#include<string.h>#include<stdio.h>#include"ksw2.h"voidalign(constchar*tseq,constchar*qseq,intsc_mch,intsc_mis,intgapo,intgape){inti,a=sc_mch,b=sc_mis<0?sc_mis :-sc_mis;// a>0 and b<0int8_tmat[25]= {a,b,b,b,0,b,a,b,b,0,b,b,a,b,0,b,b,b,a,0,0,0,0,0,0 };inttl=strlen(tseq),ql=strlen(qseq);uint8_t*ts,*qs,c[256];ksw_extz_tez;memset(&ez,0,sizeof(ksw_extz_t));memset(c,4,256);c['A']=c['a']=0;c['C']=c['c']=1;c['G']=c['g']=2;c['T']=c['t']=3;// build the encoding tablets= (uint8_t*)malloc(tl);qs= (uint8_t*)malloc(ql);for (i=0;i<tl;++i)ts[i]=c[(uint8_t)tseq[i]];// encode to 0/1/2/3for (i=0;i<ql;++i)qs[i]=c[(uint8_t)qseq[i]];ksw_extz(0,ql,qs,tl,ts,5,mat,gapo,gape,-1,-1,0,&ez);for (i=0;i<ez.n_cigar;++i)// print CIGARprintf("%d%c",ez.cigar[i]>>4,"MID"[ez.cigar[i]&0xf]);putchar('\n');free(ez.cigar);free(ts);free(qs);}intmain(intargc,char*argv[]){align("ATAGCTAGCTAGCAT","AGCTAcCGCAT",1,-2,2,1);return0;}
The following table shows timing on two pairs of long sequences (both in the"test" directory).
Data set | Command line options | Time (s) | CIGAR | Ext | SIMD | Source |
---|---|---|---|---|---|---|
50k | -t gg -s | 7.3 | N | N | N | ksw2 |
-t gg2 -s | 19.8 | N | N | N | ksw2 | |
-t extz -s | 9.2 | N | Y | N | ksw2 | |
-t ps_nw | 9.8 | N | N | N | parasail | |
-t ps_nw_striped_sse2_128_32 | 2.9 | N | N | SSE2 | parasail | |
-t ps_nw_striped_32 | 2.2 | N | N | SSE4 | parasail | |
-t ps_nw_diag_32 | 3.0 | N | N | SSE4 | parasail | |
-t ps_nw_scan_32 | 3.0 | N | N | SSE4 | parasail | |
-t extz2_sse -sg | 0.96 | N | N | SSE2 | ksw2 | |
-t extz2_sse -sg | 0.84 | N | N | SSE4 | ksw2 | |
-t extz2_sse -s | 3.0 | N | Y | SSE2 | ksw2 | |
-t extz2_sse -s | 2.7 | N | Y | SSE4 | ksw2 | |
16.5k | -t gg -s | 0.84 | N | N | N | ksw2 |
-t gg | 1.6 | Y | N | N | ksw2 | |
-t gg2 | 3.3 | Y | N | N | ksw2 | |
-t extz | 2.0 | Y | Y | N | ksw2 | |
-t extz2_sse | 0.40 | Y | Y | SSE4 | ksw2 | |
-t extz2_sse -g | 0.18 | Y | N | SSE4 | ksw2 |
The standard DP formulation is about twice as fast as Suzuki's diagonalformulation (-tgg
vs-tgg2
), but SSE-based diagonal formulationis several times faster than the standard DP. If we only want to compute oneglobal alignment score, we can use 16-way parallelization in the entire innerloop. For extension alignment, though, we need to keep an array of 32-bitscores and have to use 4-way parallelization for part of the inner loop. Thissignificantly reduces performance (-sg
vs-s
). KSW2 is faster thanparasail partly because the former uses one score for all matches and anotherscore for all mismatches. For diagonal formulations, vectorization is morecomplex given a generic scoring matrix.
It is possible to further accelerate global alignment with dynamic banding asis implemented inedlib. However, it is not as effective for extensionalignment. Another idea isadaptive banding, which might be worthtrying at some point.
Library | CIGAR | Intra-seq | Affine-gap | Local | Global | Glocal | Extension |
---|---|---|---|---|---|---|---|
edlib | Yes | Yes | No | Very fast | Very fast | Very fast | N/A |
KSW | Yes | Yes | Yes | Fast | Slow | N/A | Slow |
KSW2 | Yes | Yes | Yes/dual | N/A | Fast | N/A | Fast |
libgaba | Yes | Yes | Yes | N/A? | N/A? | N/A? | Fast |
libssa | No | No? | Yes | Fast | Fast | N/A | N/A |
Opal | No | No | Yes | Fast | Fast | Fast | N/A |
Parasail | No | Yes | Yes | Fast | Fast | Fast | N/A |
SeqAn | Yes | Yes | Yes | Slow | Slow | Slow | N/A |
SSW | Yes | Yes | Yes | Fast | N/A | N/A | N/A |
SWIPE | Yes | No | Yes | Fast | N/A? | N/A? | N/A |
SWPS3 | No | Yes | Yes | Fast | N/A? | N/A | N/A |