
Bioinformatics 101 tool for counting unique k-length substrings in DNA


suchapalaver/krust


krust is a k-mer counter - a bioinformatics 101 tool for counting the frequency of substrings of length k within strings of DNA data. krust is written in Rust and run from the command line. It takes a FASTA file of DNA sequences and will output all canonical k-mers (the double helix means each k-mer has a reverse complement) and their frequency across all records in the given data. krust is tested for accuracy against jellyfish.
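To make "canonical k-mer" concrete, here is a minimal sketch of the counting idea using only the Rust standard library. It is not krust's actual code: the function names `reverse_complement` and `count_canonical_kmers` are hypothetical, and it assumes that the canonical form is the lexicographically smaller of a k-mer and its reverse complement, and that windows containing anything other than upper-case A, C, G, or T are skipped.

```rust
use std::collections::HashMap;

/// Returns the reverse complement of a DNA sequence, or `None` if it
/// contains a byte other than upper-case A, C, G, or T.
fn reverse_complement(seq: &[u8]) -> Option<Vec<u8>> {
    seq.iter()
        .rev()
        .map(|&b| match b {
            b'A' => Some(b'T'),
            b'C' => Some(b'G'),
            b'G' => Some(b'C'),
            b'T' => Some(b'A'),
            _ => None,
        })
        .collect()
}

/// Counts canonical k-mers in a single sequence.
/// Assumption: "canonical" is the lexicographically smaller of a k-mer
/// and its reverse complement; windows with non-ACGT bytes are skipped.
fn count_canonical_kmers(seq: &[u8], k: usize) -> HashMap<Vec<u8>, u64> {
    let mut counts = HashMap::new();
    if k == 0 {
        // `windows(0)` would panic, so return an empty map instead.
        return counts;
    }
    for window in seq.windows(k) {
        if let Some(rc) = reverse_complement(window) {
            let canonical = if rc.as_slice() < window {
                rc
            } else {
                window.to_vec()
            };
            *counts.entry(canonical).or_insert(0) += 1;
        }
    }
    counts
}
```

Calling `count_canonical_kmers(b"ATGCAT", 3)` folds `CAT` into `ATG` and `TGC` into `GCA`, so each strand of the double helix contributes to the same entry.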

    krust: counts k-mers, written in rust

    Usage: krust <k> <path>

    Arguments:
      <k>     provides k length, e.g. 5
      <path>  path to a FASTA file, e.g. /home/lisa/bio/cerevisiae.pan.fa

    Options:
      -h, --help     Print help information
      -V, --version  Print version information

krust supports either rust-bio or needletail to read FASTA records. Use the --features flag to select.

Run krust with rust-bio's FASTA reader to count 5-mers like this:

cargo run --release --features rust-bio -- 5 your/local/path/to/fasta_data.fa

or, searching for 21-mers with needletail as the FASTA reader, like this:

cargo run --release --features needletail -- 21 your/local/path/to/fasta_data.fa

krust prints to stdout, writing, on alternate lines, a count prefixed with > and the k-mer it belongs to:

    >114928
    ATGCC
    >289495
    AATCA
    ...
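If you want to consume this output from another program, the alternating layout can be read back two lines at a time. The sketch below is one way to do that, assuming each record is a `>`-prefixed count line followed by a k-mer line, as in the sample above. `parse_counts` is a hypothetical helper, not part of krust.

```rust
/// Parses count/k-mer output of the form shown above into
/// `(count, kmer)` pairs. Assumes each record is a `>`-prefixed
/// count line followed by one k-mer line; blank lines between
/// records are ignored.
fn parse_counts(output: &str) -> Result<Vec<(u64, String)>, String> {
    let mut lines = output.lines();
    let mut records = Vec::new();
    while let Some(header) = lines.next() {
        if header.trim().is_empty() {
            continue;
        }
        let count = header
            .strip_prefix('>')
            .ok_or_else(|| format!("expected '>' at start of line: {header:?}"))?
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("bad count in {header:?}: {e}"))?;
        let kmer = lines
            .next()
            .ok_or_else(|| format!("missing k-mer after {header:?}"))?;
        records.push((count, kmer.trim().to_string()));
    }
    Ok(records)
}
```

Returning a `Result` rather than panicking lets a caller report a truncated or malformed stream, for example when the output was piped through `head`.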
