Snakemake workflow for comparison of differential abundance ranks
biocore/qadabra
Qadabra is a Snakemake workflow for running and comparing several differential abundance (DA) methods on the same microbiome dataset.
Importantly, Qadabra focuses on both FDR-corrected p-values and feature ranks, and generates visualizations of differential abundance results.
Please note this software is currently a work in progress. Your patience is appreciated as we continue to develop and enhance its features. Please leave an issue on GitHub should you run into any errors.
Option 1: Pip install from PyPI
pip install qadabra
Qadabra requires the following dependencies:
- snakemake
- click
- biom-format
- pandas
- numpy
- cython
- iow
Check out the tutorial for more in-depth instructions on installation.
Prerequisites
Before you begin, ensure you have Git and the necessary build tools installed on your system.
Clone the Repository
git clone https://github.com/biocore/qadabra.git
Navigate to the repo root directory, where the setup.py file is located, and then install Qadabra in editable mode:
cd qadabra
pip install -e .
Qadabra can be used on multiple datasets at once. First, create the workflow directory in which differential abundance will be run with all methods:
qadabra create-workflow --workflow-dest <directory_name>
This command will initialize the workflow, but we still need to point to our dataset(s) of interest.
We can add datasets one-by-one with the add-dataset command:
qadabra add-dataset \
  --workflow-dest <directory_name> \
  --table <directory_name>/data/table.biom \
  --metadata <directory_name>/data/metadata.tsv \
  --tree <directory_name>/data/my_tree.nwk \
  --name my_dataset \
  --factor-name case_control \
  --target-level case \
  --reference-level control \
  --confounder <confounding_var> \
  --verbose
Let's walk through the arguments provided here, which represent the inputs to Qadabra:
- workflow-dest: The location of the workflow that we created earlier
- table: Feature table (features by samples) in BIOM format
- metadata: Sample metadata in TSV format
- tree: Phylogenetic tree in .nwk or other tree format (optional)
- name: Name to give this dataset
- factor-name: Metadata column to use for differential abundance
- target-level: The value in the chosen factor to use as the target
- reference-level: The reference level to which we want to compare our target
- confounder: Any confounding variable metadata columns (optional)
- verbose: Flag to show all preprocessing performed by Qadabra
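Before running add-dataset, it can help to confirm that the factor column and both levels actually appear in the metadata. The following is a minimal sketch using only the Python standard library. The helper name check_factor_levels is hypothetical and is not part of Qadabra, and the sketch assumes the metadata is a tab-separated file with a header row.

```python
import csv
from collections import Counter


def check_factor_levels(metadata_path, factor_name, target_level, reference_level):
    """Count samples per level of `factor_name` and verify both levels exist.

    Hypothetical helper for illustration only; not part of Qadabra.
    """
    with open(metadata_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if factor_name not in (reader.fieldnames or []):
            raise ValueError(f"column {factor_name!r} not found in metadata")
        # Tally how many samples fall into each level of the factor column.
        counts = Counter(row[factor_name] for row in reader)

    for level in (target_level, reference_level):
        if counts[level] == 0:
            raise ValueError(f"level {level!r} not found in column {factor_name!r}")
    return counts
```

With the example arguments above, this could be called as check_factor_levels("metadata.tsv", "case_control", "case", "control") to see how many samples fall into each group.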
Your dataset should now be added as a line in my_qadabra/config/datasets.tsv.
You can use qadabra add-dataset --help for more details. To add another dataset, just run this command again with the new dataset information.
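Because add-dataset is run once per dataset, several datasets can be registered in a loop. The sketch below builds one command per dataset from the required flags shown earlier (the optional --tree and --confounder flags are omitted). The helper names, the dictionary keys, and the dry_run switch are assumptions made for illustration and are not part of Qadabra.

```python
import subprocess


def build_add_dataset_command(workflow_dest, dataset):
    """Return the argument list for one `qadabra add-dataset` call."""
    return [
        "qadabra", "add-dataset",
        "--workflow-dest", workflow_dest,
        "--table", dataset["table"],
        "--metadata", dataset["metadata"],
        "--name", dataset["name"],
        "--factor-name", dataset["factor_name"],
        "--target-level", dataset["target_level"],
        "--reference-level", dataset["reference_level"],
        "--verbose",
    ]


def add_datasets(workflow_dest, datasets, dry_run=True):
    """Print (dry run) or execute one add-dataset command per dataset."""
    commands = [build_add_dataset_command(workflow_dest, d) for d in datasets]
    for command in commands:
        if dry_run:
            # Only show what would be run; nothing is executed.
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
    return commands
```

Running with dry_run=True first makes it easy to review the generated commands before anything is written to the workflow directory.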
The previous commands will create a subdirectory, my_qadabra, in which the workflow structure is contained. From the command line, execute the following to start the workflow:
snakemake --use-conda --cores <number of cores preferred> <other options>
Please read the Snakemake documentation for how best to run Snakemake on your system.
When this process is completed, you should have the directories figures, results, and log. Each of these directories will have a separate folder for each dataset you added.
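To check which datasets produced output, the per-dataset folders can be listed with a few lines of Python. This is a sketch only: the helper name list_dataset_outputs is hypothetical, and it assumes the three output directories sit directly inside the workflow directory.

```python
from pathlib import Path


def list_dataset_outputs(workflow_dir, output_dirs=("figures", "results", "log")):
    """Map each output directory name to the sorted dataset folders inside it."""
    workflow_dir = Path(workflow_dir)
    summary = {}
    for name in output_dirs:
        directory = workflow_dir / name
        if directory.is_dir():
            # Each immediate subfolder corresponds to one dataset.
            summary[name] = sorted(p.name for p in directory.iterdir() if p.is_dir())
        else:
            # Directory not created (yet); report it as empty.
            summary[name] = []
    return summary
```

For example, list_dataset_outputs("my_qadabra") returns a dictionary whose values should each contain one entry per added dataset once the workflow has finished.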
After Qadabra has finished running, you can generate a Snakemake report of the workflow with the following command:
snakemake --report report.zip
This will create a zipped directory containing the report. Unzip this file and open the report.html file to view the report containing results and visualizations in your browser.
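The unzip step can also be scripted. The sketch below uses Python's standard zipfile module to extract the archive and locate report.html. The helper name extract_report is hypothetical, and the search is recursive because the exact layout inside the archive is not assumed here.

```python
import zipfile
from pathlib import Path


def extract_report(zip_path, dest_dir):
    """Extract the report archive and return paths to any report.html inside."""
    dest_dir = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    # Search recursively, since the archive may wrap its files in a folder.
    return sorted(dest_dir.rglob("report.html"))
```

A returned path can then be opened in the default browser with webbrowser.open(path.resolve().as_uri()) from the standard library.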
See the tutorial page for a walkthrough on using the Qadabra workflow with a microbiome dataset.
Coming soon: An FAQs page of commonly asked questions on the statistics and code pertaining to Qadabra.
The manuscript for Qadabra is currently in progress. Please cite this GitHub page if Qadabra is used for your analysis. This project is licensed under the BSD-3 License. See the license file for details.