A secure encryption tool for genomic data

cobilab/cryfa

Cryfa

Cryfa is an ultrafast encryption tool specifically designed for genomic data. Besides providing robust security, it also compresses FASTA/FASTQ sequences by a factor of three, making it an efficient solution for managing genomic data.

Installation

Conda

conda install -c bioconda -y cryfa

Docker

# Pull & run the image
docker pull smortezah/cryfa
docker run -it smortezah/cryfa

Build from source

Linux

# Install git and cmake
sudo apt update
sudo apt install git cmake
# Clone and install Cryfa
git clone https://github.com/cobilab/cryfa.git
cd cryfa
sh install.sh

macOS

# Install Homebrew, git and cmake
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install git cmake
# Clone and install Cryfa
git clone https://github.com/cobilab/cryfa.git
cd cryfa
sh install.sh

Note

Pre-compiled versions of Cryfa, optimized for 64-bit Linux and macOS, can be found in the bin/ directory.

Usage

To run Cryfa in stand-alone mode, use the command below:

./cryfa [OPTION]... -k [KEY_FILE] [-d] [IN_FILE] > [OUT_FILE]

For instance, to compact and encrypt data, execute the following command:

./cryfa -k pass.txt in.fq > comp

To decrypt and unpack the data, execute the command below:

./cryfa -k pass.txt -d comp > orig.fq

A sample file, "in.fq", is available in the example/ directory. Detailed descriptions of the options are provided in the subsequent sections.
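The encrypt and decrypt commands above can be combined into a quick round-trip check. The following is a minimal sketch, not part of Cryfa itself: the function name `roundtrip_check` and the `CRYFA` variable are hypothetical, and it assumes that decryption reproduces the input byte for byte.

```shell
# Path to the cryfa binary (assumption: it sits in the current directory).
CRYFA="${CRYFA:-./cryfa}"

# roundtrip_check KEY_FILE IN_FILE
# Encrypts IN_FILE, decrypts the result, and compares it with the original.
# Returns 0 if both files are identical, non-zero otherwise.
roundtrip_check() {
  local key_file="$1" in_file="$2" tmp_dir rc
  tmp_dir=$(mktemp -d) || return 1

  "$CRYFA" -k "$key_file" "$in_file" > "$tmp_dir/comp" &&
    "$CRYFA" -k "$key_file" -d "$tmp_dir/comp" > "$tmp_dir/orig" &&
    cmp -s "$in_file" "$tmp_dir/orig"
  rc=$?

  rm -rf "$tmp_dir"   # remove the temporary copies in every case
  return "$rc"
}
```

For example, `roundtrip_check pass.txt example/in.fq && echo "round trip OK"` would exercise the sample file.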

Note

Cryfa supports a maximum file size of 64 GB. For larger files, consider splitting them into smaller chunks, e.g. using the split command in Linux, and then encrypt each chunk separately. After decryption, you can reassemble the chunks using the cat command.
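The split, encrypt and reassemble workflow described in this note could look roughly like the sketch below. The function names, the `chunk_` prefix, the `.cryfa` suffix and the `CRYFA` variable are all hypothetical. Splitting by line count rather than by bytes keeps records intact; for FASTQ, the line count should be a multiple of 4. How Cryfa treats a chunk that starts in the middle of a FASTA record is not covered here and would need to be checked.

```shell
# Path to the cryfa binary (assumption: it sits in the current directory).
CRYFA="${CRYFA:-./cryfa}"

# encrypt_in_chunks IN_FILE KEY_FILE LINES_PER_CHUNK OUT_DIR
# Splits IN_FILE into line-based chunks and encrypts each chunk separately.
# Assumes OUT_DIR is new or empty.
encrypt_in_chunks() {
  local in_file="$1" key_file="$2" lines_per_chunk="$3" out_dir="$4" chunk
  mkdir -p "$out_dir" || return 1
  split -l "$lines_per_chunk" "$in_file" "$out_dir/chunk_" || return 1
  for chunk in "$out_dir"/chunk_*; do
    "$CRYFA" -k "$key_file" "$chunk" > "$chunk.cryfa" || return 1
    rm -- "$chunk"   # keep only the encrypted chunk
  done
}

# decrypt_chunks KEY_FILE OUT_DIR OUT_FILE
# Decrypts the chunks in name order and concatenates them into OUT_FILE.
decrypt_chunks() {
  local key_file="$1" out_dir="$2" out_file="$3" enc
  : > "$out_file"
  for enc in "$out_dir"/chunk_*.cryfa; do
    "$CRYFA" -k "$key_file" -d "$enc" >> "$out_file" || return 1
  done
}
```

The default suffixes produced by split (aa, ab, ...) sort in creation order, so the shell glob reassembles the chunks in the right sequence.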

Input file format

Cryfa identifies the format of a genomic data file by examining its content, not its extension. For instance, a FASTA file named "test" can be input into Cryfa with any extension, such as "test", "test.fa", "test.fasta", "test.fas", "test.fsa", etc. Based on this, executing the command

./cryfa -k pass.txt test > comp

is equivalent to running

./cryfa -k pass.txt test.fa > comp
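To illustrate what content-based detection means in general, the sketch below guesses a format from the first characters of a file and ignores the file name entirely. This is not Cryfa's actual detection logic, which is not described here; the function name `guess_format` and the heuristics (a leading ">" for FASTA, a leading "@" plus a "+" on the third line for FASTQ) are assumptions chosen for illustration.

```shell
# guess_format FILE
# Prints "fasta", "fastq" or "other" based only on the file content.
guess_format() {
  local first third
  first=$(head -c 1 "$1")                      # first byte of the file
  third=$(sed -n '3{p;q;}' "$1" | cut -c 1)    # first character of line 3
  case "$first" in
    '>') echo fasta ;;
    '@') # SAM headers also start with "@", so check the FASTQ separator line
         if [ "$third" = '+' ]; then echo fastq; else echo other; fi ;;
    *)   echo other ;;
  esac
}
```

With such a check, `guess_format test` and `guess_format test.fa` give the same answer for the same content.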

Note

The password file extension is not a limiting factor for Cryfa. It can have any extension or even no extension at all. For instance, "pass", "pass.txt", "pass.dat", and so on, are all valid and yield the same result.

Benchmarking Cryfa Against Other Methods

To benchmark Cryfa against other methods, configure the parameters in the run.sh bash script and execute it:

./run.sh

This script automates the process of downloading datasets, installing dependencies, setting up compression and encryption tools, executing these tools, and finally, displaying the results.

Options

To explore the available options in Cryfa, execute the command below:

./cryfa

which will yield the following:

SYNOPSIS
      ./cryfa [OPTION]... -k [KEY_FILE] [-d] [IN_FILE] > [OUT_FILE]

SAMPLE
      Encrypt and compact:  ./cryfa -k pass.txt in.fq > comp
      Decrypt and unpack:   ./cryfa -k pass.txt -d comp > orig.fq

      Encrypt:              ./cryfa -k pass.txt in > enc
      Decrypt:              ./cryfa -k pass.txt -d enc > orig

OPTIONS
      Compact & encrypt FASTA/FASTQ files.
      Encrypt any text-based genomic data, e.g., VCF/SAM/BAM.

      -k [KEY_FILE],  --key [KEY_FILE]
           key file name -- MANDATORY
           The KEY_FILE should contain a password.
           To make a strong password, the "keygen" program can be
           used via the command "./keygen".

      -d,  --dec
           decrypt & unpack

      -f,  --force
           force to consider input as non-FASTA/FASTQ
           Forces Cryfa not to compact, but shuffle and encrypt.
           If the input is FASTA/FASTQ, it is considered as
           non-FASTA/FASTQ; so, compaction will be ignored, but
           shuffling and encryption will be performed.

      -s,  --stop_shuffle
           stop shuffling the input

      -t [NUMBER],  --thread [NUMBER]
           number of threads

      -v,  --verbose
           verbose mode (more information)

      -h,  --help
           usage guide

      --version
           version information
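As an example of combining these options, the sketch below passes a thread count through the -t flag listed above. The wrapper function `encrypt_threaded` and the `CRYFA` variable are hypothetical, and defaulting to the number of online CPUs is an assumption, not a recommendation from the Cryfa authors.

```shell
# Path to the cryfa binary (assumption: it sits in the current directory).
CRYFA="${CRYFA:-./cryfa}"

# encrypt_threaded KEY_FILE IN_FILE OUT_FILE [THREADS]
# Runs Cryfa with an explicit thread count; when THREADS is omitted,
# falls back to the number of online CPUs, or 1 if that cannot be determined.
encrypt_threaded() {
  local key_file="$1" in_file="$2" out_file="$3" threads="$4"
  if [ -z "$threads" ]; then
    threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  fi
  "$CRYFA" -t "$threads" -k "$key_file" "$in_file" > "$out_file"
}
```

For instance, `encrypt_threaded pass.txt in.fq comp 4` would run the equivalent of `./cryfa -t 4 -k pass.txt in.fq > comp`.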

Cryfa writes its result to the standard output stream, so it can be integrated into existing data processing pipelines.
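Because the result arrives on standard output, it can be piped into other tools without an intermediate file. The sketch below stores the encrypted output and records a checksum of it in a single pass. The function name `encrypt_with_checksum`, the `.cksum` suffix and the `CRYFA` variable are hypothetical; `tee` and `cksum` are standard POSIX utilities.

```shell
# Path to the cryfa binary (assumption: it sits in the current directory).
CRYFA="${CRYFA:-./cryfa}"

# encrypt_with_checksum KEY_FILE IN_FILE OUT_FILE
# Writes the encrypted stream to OUT_FILE and its CRC and size to OUT_FILE.cksum.
encrypt_with_checksum() {
  local key_file="$1" in_file="$2" out_file="$3"
  "$CRYFA" -k "$key_file" "$in_file" |
    tee "$out_file" |
    cksum > "$out_file.cksum"
}
```

The recorded value can later be compared against `cksum < comp` to detect a corrupted or truncated transfer before attempting decryption.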

Creating a Key File

There are two approaches to create a "KEY_FILE" that can be used with the -k or --key flags. You can either save a raw password in a file or use the provided "keygen" program to generate a robust password. The latter method is strongly recommended for enhanced security.

To utilize the first method, use the commands below to save a raw password in a file, which can then be passed to Cryfa. In this example, "Such a strong password!" is the raw password and "pass.txt" is the file where the password is stored. Alternatively, you can use a text editor to save the password in a file:

echo "Such a strong password!" > pass.txt
./cryfa -k pass.txt IN_FILE > OUT_FILE
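If you would rather not type a password by hand, a random one can be written to the key file directly. The sketch below is an alternative to both methods described in this section, not something Cryfa provides: the function name `make_key_file` is hypothetical, and it relies on /dev/urandom and the `base64` utility being available. It also restricts the file to its owner, which is a general precaution for files that hold secrets.

```shell
# make_key_file KEY_FILE
# Writes a random 32-character password to KEY_FILE, readable by the owner only.
make_key_file() {
  local key_file="$1"
  # 24 random bytes encode to exactly 32 base64 characters (no padding).
  ( umask 077 && head -c 24 /dev/urandom | base64 > "$key_file" ) || return 1
  chmod 600 "$key_file"   # also tighten permissions if the file already existed
}
```

After `make_key_file pass.txt`, the file can be passed to Cryfa with `-k pass.txt` as in the commands above.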

While the password must contain at least 8 characters, it's highly recommended to use a strong password for better security. A strong password:

  • Is at least 12 characters long
  • Includes a mix of lowercase (a-z) and uppercase (A-Z) letters, digits (0-9), and symbols (e.g., !, #, $, %, and })
  • Is not a simple repetition of characters (e.g., zzzzzz), a keyboard pattern (e.g., qwerty), or a sequence of digits (e.g., 123456)
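The length and character-mix criteria above can be checked mechanically. The following sketch is not part of Cryfa or keygen; the function name `password_strength` and its three result labels are hypothetical. It covers only the first two bullet points and does not detect repeated characters, keyboard patterns or digit sequences.

```shell
# password_strength PASSWORD
# Prints "invalid" (shorter than the 8-character minimum), "weak"
# (valid but missing a recommended property) or "strong".
password_strength() {
  local pw="$1"
  [ "${#pw}" -ge 8 ]  || { echo invalid; return; }
  [ "${#pw}" -ge 12 ] || { echo weak; return; }
  # Require at least one character from each recommended class.
  case "$pw" in *[[:lower:]]*) ;; *) echo weak; return ;; esac
  case "$pw" in *[[:upper:]]*) ;; *) echo weak; return ;; esac
  case "$pw" in *[[:digit:]]*) ;; *) echo weak; return ;; esac
  case "$pw" in *[[:punct:]]*) ;; *) echo weak; return ;; esac
  echo strong
}
```

Note that the example password "Such a strong password!" is reported as weak by this check because it contains no digit.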

Alternatively, you can leverage the "keygen" program to automatically generate a robust password. To do this, execute:

./keygen

You'll be prompted with:

Enter a password, then press 'Enter':

At this point, input a raw password, for example, "A keygen raw pass!", and press "Enter". Subsequently, you'll see:

Enter a file name to save the generated key, then press 'Enter':

The robust password generated by the "keygen" program will be stored in the file you specify, such as "key.txt". Note that "keygen" requires an initial raw password to generate a strong password, but this initial password doesn't need to be particularly strong. Once the key file is created, you can use it with Cryfa as shown below:

./cryfa -k key.txt IN_FILE > OUT_FILE

For a deeper understanding of "key management", which encompasses the generation, exchange, storage, usage, and replacement of keys, consider exploring [1], [2], [3] and [4].

Citation

If you use Cryfa in your research, please cite the following references:

  • M. Hosseini, D. Pratas and A.J. Pinho, "Cryfa: a secure encryption tool for genomic data," Bioinformatics, vol. 35, no. 1, pp. 146-148, 2018. DOI: 10.1093/bioinformatics/bty645
  • [OPTIONAL] D. Pratas, M. Hosseini and A.J. Pinho, "Cryfa: a tool to compact and encrypt FASTA files," 11th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB), Springer, June 2017. DOI: 10.1007/978-3-319-60816-7_37

License

Cryfa is licensed under the GPLv3.

