seqfu deinterleave

deinterleave (or dei) is one of the core subprograms of SeqFu. It is used to produce two separate FASTQ files from an interleaved file.

ilv: interleave FASTQ files

  Usage: dei [options] -o basename <interleaved-fastq>

  -o --output-basename "str"     save output to output_R1.fq and output_R2.fq
  -f --for-ext "R1"              extension for R1 file [default: _R1.fq]
  -r --rev-ext "R2"              extension for R2 file [default: _R2.fq]
  -c --check                     enable careful mode (check sequence names and numbers)
  -v --verbose                   print verbose output
  -s --strip-comments            skip comments
  -p --prefix "string"           rename sequences (append a progressive number)

  notes:
    use "-" as input filename to read from STDIN

  example:
    dei -o newfile file.fq

Streaming

If a program produces interleaved output, seqfu deinterleave can be used in a pipe (specifying "-" as the input):

fu-primers -1 file_R1.fq -2 file_R2.fq | seqfu deinterleave -o fileNoPrimers -
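
To illustrate the streaming idea, here is a minimal Python sketch of how a program might treat "-" as standard input and consume an interleaved stream one pair at a time, without loading the whole file. It is not SeqFu's implementation: the names `open_input` and `read_pairs` are hypothetical, and the sketch assumes plain four-line FASTQ records.

```python
import sys
from typing import Iterator, List, TextIO, Tuple

Record = List[str]  # the four lines of one FASTQ record


def open_input(path: str) -> TextIO:
    """Return standard input when path is "-", otherwise open the named file."""
    return sys.stdin if path == "-" else open(path, "r")


def read_pairs(handle: TextIO) -> Iterator[Tuple[Record, Record]]:
    """Lazily yield (forward, reverse) records from an interleaved stream."""
    while True:
        # One pair is two consecutive records, i.e. eight lines.
        block = [handle.readline().rstrip("\n") for _ in range(8)]
        if block[0] == "":
            return  # clean end of the stream
        if block[4] == "":
            raise ValueError("odd number of records: the last read has no mate")
        yield block[:4], block[4:]
```

Because `read_pairs` is a generator, each pair can be written out as soon as it is read, which is what makes this pattern suitable for use in a pipe.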

What are interleaved files?

Paired-end sequences can be stored in two separate files (usually denoted with the _R1 and _R2 strings) or in a single file where each sequence pair is stored as two consecutive sequences.

A simple example is depicted below:

=======================================================================
File_R1.fq                File_R2.fq                Interleaved.fq
=======================================================================
@seq1                     @seq1                     @seq1
TTTCATTCTGACTGCAACG       GGATTAAAAAAAGAGTGTC       TTTCATTCTGACTGCAACG
+                         +                         +
IIIIIIIIIIIIIIIIIII       IIIIIIIIIIIIIIIIIII       IIIIIIIIIIIIIIIIIII
@seq2                     @seq2                     @seq1
GTGTGGATTAAAAAAAAAA       TTTTTTTTTTTTTTTTTTT       GGATTAAAAAAAGAGTGTC
+                         +                         +
IIIIIIIIIIIIIIIIIII       IIIIIIIIIIIIIIIIIII       IIIIIIIIIIIIIIIIIII
@seq3                     @seq3                     @seq2
AGAGTGTCTGATAGCA          GATAGCAG                  GTGTGGATTAAAAAAAAAA
+                         +                         +
IIIIIIIIIIIIIIII          IIIIIIII                  IIIIIIIIIIIIIIIIIII
                                                    @seq2
                                                    TTTTTTTTTTTTTTTTTTT
                                                    +
                                                    IIIIIIIIIIIIIIIIIII
                                                    @seq3
                                                    AGAGTGTCTGATAGCA
                                                    +
                                                    IIIIIIII
                                                    @seq3
                                                    GATAGCAG
                                                    +
                                                    IIIIIIII
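
The layout above can also be expressed in code. The following Python sketch splits interleaved FASTQ lines back into forward and reverse records by taking every other four-line record. It only illustrates the concept and is not how SeqFu itself is implemented: `deinterleave` and `read_name` are hypothetical names, and the optional `check` argument is merely an assumption about what a careful name comparison could look like.

```python
from typing import List, Tuple

Record = List[str]  # header, sequence, separator, quality


def read_name(header: str) -> str:
    """Return the read name: the header without '@', up to the first whitespace."""
    parts = header[1:].split()
    return parts[0] if parts else ""


def deinterleave(lines: List[str], check: bool = False) -> Tuple[List[Record], List[Record]]:
    """Split interleaved FASTQ lines into forward (R1) and reverse (R2) records."""
    if len(lines) % 8 != 0:
        raise ValueError("expected a whole number of read pairs (8 lines each)")
    # Group the lines into four-line records, then alternate between R1 and R2.
    records = [lines[i:i + 4] for i in range(0, len(lines), 4)]
    forward, reverse = records[0::2], records[1::2]
    if check:
        # Assumed check: both mates of a pair must carry the same read name.
        for fwd, rev in zip(forward, reverse):
            if read_name(fwd[0]) != read_name(rev[0]):
                raise ValueError(f"name mismatch: {fwd[0]} vs {rev[0]}")
    return forward, reverse


# The first two pairs of Interleaved.fq from the example above.
interleaved = """@seq1
TTTCATTCTGACTGCAACG
+
IIIIIIIIIIIIIIIIIII
@seq1
GGATTAAAAAAAGAGTGTC
+
IIIIIIIIIIIIIIIIIII
@seq2
GTGTGGATTAAAAAAAAAA
+
IIIIIIIIIIIIIIIIIII
@seq2
TTTTTTTTTTTTTTTTTTT
+
IIIIIIIIIIIIIIIIIII""".splitlines()

r1, r2 = deinterleave(interleaved, check=True)
```

Here `r1` holds the records that belong in File_R1.fq and `r2` those that belong in File_R2.fq, in the same order as in the interleaved input.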
