💎 An easy-to-use workflow for generating context specific genome-scale metabolic models and predicting metabolic interactions within microbial communities directly from metagenomic data
joelfnogueira/metaGEM
metaGEM is a Snakemake workflow that integrates an array of existing bioinformatics and metabolic modeling tools, for the purpose of predicting metabolic interactions within bacterial communities of microbiomes. From whole metagenome shotgun datasets, metagenome assembled genomes (MAGs) are reconstructed, which are then converted into genome-scale metabolic models (GEMs) for in silico simulations. Additional outputs include abundance estimates, taxonomic assignment, growth rate estimation, pangenome analysis, and eukaryotic MAG identification.
You can set up and use metaGEM on the cloud by following along with the Google Colab notebook.
Please note that Google Colab does not provide the computational resources necessary to fully run metaGEM on a real dataset. This notebook demonstrates how to set up and use metaGEM by performing the first steps in the workflow on a toy dataset.
You can set up metaGEM on your cluster with just one line of code 😉
git clone https://github.com/franciscozorrilla/metaGEM.git && cd metaGEM && rm -r .git && bash env_setup.sh
Congratulations, you can now start using metaGEM. Verify your installation by using the check task:
bash metaGEM.sh --task check
Please consult the setup page in the wiki for further configuration instructions.
Run metaGEM without any arguments to see usage instructions:
bash metaGEM.sh
    Usage: bash metaGEM.sh [-t|--task TASK]
                           [-j|--nJobs NUMBER OF JOBS]
                           [-c|--cores NUMBER OF CORES]
                           [-m|--mem GB RAM]
                           [-h|--hours MAX RUNTIME]
                           [-l|--local]

    Options:
      -t, --task        Specify task to complete:

                            SETUP
                                createFolders
                                downloadToy
                                organizeData
                                check

                            CORE WORKFLOW
                                fastp
                                megahit
                                crossMap
                                concoct
                                metabat
                                maxbin
                                binRefine
                                binReassemble
                                extractProteinBins
                                carveme
                                memote
                                organizeGEMs
                                smetana
                                extractDnaBins
                                gtdbtk
                                abundance

                            BONUS
                                grid
                                prokka
                                roary
                                eukrep
                                eukcc

                            VISUALIZATION (in development)
                                stats
                                qfilterVis
                                assemblyVis
                                binningVis
                                taxonomyVis
                                modelVis
                                interactionVis
                                growthVis

      -j, --nJobs       Specify number of jobs to run in parallel
      -c, --nCores      Specify number of cores per job
      -m, --mem         Specify memory in GB required for job
      -h, --hours       Specify number of hours to allocated to job runtime
      -l, --local       Run jobs on local machine for non-cluster usage
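As an illustration of how these options combine, the following minimal sketch prints (rather than executes) a cluster-style and a local-style invocation for a single task. The helper name `build_cmd` and all resource values are hypothetical placeholders; only the short flags `-t`, `-j`, `-c`, `-m`, `-h`, and `-l` are taken from the usage message above.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints metaGEM.sh invocations instead of running them.
# build_cmd is a hypothetical helper; resource values are placeholders,
# not recommendations for any real dataset.

# build_cmd TASK JOBS CORES MEM_GB HOURS [local]
build_cmd() {
    local cmd="bash metaGEM.sh -t $1 -j $2 -c $3 -m $4 -h $5"
    # -l runs jobs on the local machine instead of submitting them to a cluster
    if [ "${6:-}" = "local" ]; then
        cmd="$cmd -l"
    fi
    echo "$cmd"
}

build_cmd fastp 2 4 8 1          # cluster-style invocation
build_cmd fastp 1 4 8 1 local    # same task on a local machine
```

Printing the command first makes it easy to check the flags before submitting real jobs; remove the `echo` indirection (or paste the printed line) once the values suit your system.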
metaGEM can be used to explore your own gut microbiome sequencing data from at-home-test-kit services such as unseen bio. The following tutorial showcases the metaGEM workflow on two unseenbio samples.
Refer to the wiki for additional usage tips, frequently asked questions, and implementation details.
- Quality filter reads with fastp
- Assembly with megahit
- Draft bin sets with CONCOCT, MaxBin2, and MetaBAT2
- Refine & reassemble bins with metaWRAP
- Taxonomic assignment with GTDB-tk
- Relative abundances with bwa and samtools
- Reconstruct & evaluate genome-scale metabolic models with CarveMe and memote
- Species metabolic coupling analysis with SMETANA
- Growth rate estimation with GRiD, SMEG or CoPTR
- Pangenome analysis with roary
- Eukaryotic draft bins with EukRep and EukCC
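The steps above correspond to tasks in the CORE WORKFLOW section of the usage message. The sketch below prints one invocation per core task, in the order the usage message lists them. It is a dry run only: the function name `print_core_workflow` and the resource values are hypothetical, and the listed order is assumed to be a reasonable execution order rather than a strict dependency chain. In practice, each task should finish before a dependent task is submitted.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one metaGEM.sh command per core workflow task.
# Task names are copied from the usage message; their order here simply
# mirrors that listing and is an assumption, not a verified dependency graph.

CORE_TASKS="fastp megahit crossMap concoct metabat maxbin binRefine binReassemble extractProteinBins carveme memote organizeGEMs smetana extractDnaBins gtdbtk abundance"

# print_core_workflow JOBS CORES MEM_GB HOURS
print_core_workflow() {
    local step=1
    local task
    for task in $CORE_TASKS; do
        printf '%2d. bash metaGEM.sh -t %s -j %s -c %s -m %s -h %s\n' \
            "$step" "$task" "$1" "$2" "$3" "$4"
        step=$((step + 1))
    done
}

print_core_workflow 2 4 16 2
```

Different tasks typically need different resources (assembly and binning are usually far heavier than read filtering), so a single set of values for every task is a simplification made for the sake of the example.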
If you want to see any additional or alternative tools incorporated into the metaGEM workflow, please raise an issue or create a pull request. Snakemake allows workflows to be very flexible, so adding new rules is as easy as filling out the following template and adding it to the Snakefile:
    rule package-name:
        input:
            rules.rulename.output
        output:
            f'{config["path"]["root"]}/{config["folder"]["X"]}/{{IDs}}/output.file'
        message:
            """
            Helpful and descriptive message detailing goal of this rule/package.
            """
        shell:
            """
            # Well documented command line instructions go here

            # Load conda environment
            set +u;source activate {config[envs][package]};set -u;

            # Run tool
            package-name -i {input} -o {output}
            """
The metaGEM workflow was used in the following publication(s):
Plastic-degrading potential across the global microbiome correlates with recent pollution trends. Jan Zrimec, Mariia Kokina, Sara Jonasson, Francisco Zorrilla, Aleksej Zelezniak. bioRxiv 2020.12.13.422558; doi: https://doi.org/10.1101/2020.12.13.422558
metaGEM: reconstruction of genome scale metabolic models directly from metagenomes. Francisco Zorrilla, Kiran R. Patil, Aleksej Zelezniak. bioRxiv 2020.12.31.424982; doi: https://doi.org/10.1101/2020.12.31.424982
Please reach out with any comments, concerns, or discussions regarding metaGEM.