Bioinformatics 101 tool for counting unique k-length substrings in DNA
suchapalaver/krust
krust is a k-mer counter, a bioinformatics 101 tool for counting the frequency of substrings of length k within strings of DNA data.

krust is written in Rust and run from the command line. It takes a FASTA file of DNA sequences and outputs all canonical k-mers (the double helix means each k-mer has a reverse complement) and their frequency across all records in the given data.

krust is tested for accuracy against jellyfish.
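To make the idea of canonical k-mers concrete, here is a minimal sketch of counting them in a single sequence using only the standard library. It is not krust's actual implementation: the function names are hypothetical, and it rests on two assumptions, that the canonical form is the lexicographically smaller of a k-mer and its reverse complement (a common convention), and that any window containing a byte other than uppercase A, C, G, or T is skipped.

```rust
use std::collections::HashMap;

/// Reverse complement of a DNA k-mer, or `None` if it contains a byte
/// other than uppercase A, C, G, or T (assumption: such k-mers are skipped).
fn reverse_complement(kmer: &[u8]) -> Option<Vec<u8>> {
    kmer.iter()
        .rev()
        .map(|&b| match b {
            b'A' => Some(b'T'),
            b'C' => Some(b'G'),
            b'G' => Some(b'C'),
            b'T' => Some(b'A'),
            _ => None,
        })
        .collect()
}

/// Canonical form of a k-mer. Assumption: the lexicographically smaller of
/// the k-mer and its reverse complement is the canonical one.
fn canonical(kmer: &[u8]) -> Option<Vec<u8>> {
    let rc = reverse_complement(kmer)?;
    Some(if rc.as_slice() < kmer { rc } else { kmer.to_vec() })
}

/// Counts canonical k-mers over every length-k window of `seq`.
fn count_canonical_kmers(seq: &[u8], k: usize) -> HashMap<Vec<u8>, u64> {
    let mut counts = HashMap::new();
    if k == 0 {
        // `windows(0)` would panic, so an empty map is returned instead.
        return counts;
    }
    for window in seq.windows(k) {
        if let Some(key) = canonical(window) {
            *counts.entry(key).or_insert(0) += 1;
        }
    }
    counts
}
```

For example, calling `count_canonical_kmers(b"ATGCAT", 3)` visits the windows ATG, TGC, GCA, and CAT. ATG and CAT are reverse complements of each other, as are TGC and GCA, so under the assumptions above the result holds two entries, ATG and GCA, each with a count of 2.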
krust: counts k-mers, written in rust

Usage: krust <k> <path>

Arguments:
  <k>     provides k length, e.g. 5
  <path>  path to a FASTA file, e.g. /home/lisa/bio/cerevisiae.pan.fa

Options:
  -h, --help     Print help information
  -V, --version  Print version information
krust supports either rust-bio or needletail to read FASTA records. Use the --features flag to select one.
Run krust with rust-bio's FASTA reader to count 5-mers like this:
cargo run --release --features rust-bio -- 5 your/local/path/to/fasta_data.fa
Or, searching for 21-mers with needletail as the FASTA reader, like this:
cargo run --release --features needletail -- 21 your/local/path/to/fasta_data.fa
krust prints to stdout, writing on alternate lines a k-mer's frequency, prefixed with >, and the k-mer itself:
>114928
ATGCC
>289495
AATCA
...
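If you want to consume this output in another Rust program, the alternating-line layout can be read back into (count, k-mer) pairs. The sketch below assumes each record is a line beginning with > followed by an integer count, then a line holding the k-mer, as in the sample above. The function name `parse_krust_output` is hypothetical and is not part of krust.

```rust
/// Parses text in the alternating-line layout shown above into
/// (count, k-mer) pairs. Blank lines are ignored.
/// Assumption: each header line is '>' followed by an unsigned integer.
fn parse_krust_output(text: &str) -> Result<Vec<(u64, String)>, String> {
    let mut records = Vec::new();
    let mut lines = text.lines().map(str::trim).filter(|l| !l.is_empty());

    while let Some(header) = lines.next() {
        // The header line carries the frequency after a leading '>'.
        let count = header
            .strip_prefix('>')
            .ok_or_else(|| format!("expected '>' at start of {header:?}"))?
            .parse::<u64>()
            .map_err(|e| format!("bad count in {header:?}: {e}"))?;

        // The line after the header carries the k-mer itself.
        let kmer = lines
            .next()
            .ok_or_else(|| format!("missing k-mer after {header:?}"))?;

        records.push((count, kmer.to_string()));
    }
    Ok(records)
}
```

Returning a `Result` rather than panicking lets a caller decide how to handle truncated or malformed output, for instance when the stream is cut off after a header line.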