WES HLA Typing based on multiple alternative tools


lkuchenb/MultiHLA


Workflow Diagram

Scope of this workflow

This workflow enables the concurrent analysis of WES or WGS data using publicly available software to derive HLA haplotypes.

Currently available software tools

  • xHLA

    Xie, C., Yeo, Z. X., Wong, M., Piper, J., Long, T., Kirkness, E. F., ... & Brady, C. (2017). Fast and accurate HLA typing from short-read next-generation sequence data with xHLA. Proceedings of the National Academy of Sciences, 114(30), 8059-8064.

    The workflow maps the reads against hg38 without alt contigs using bwa mem, as instructed by the authors. The mapped reads are then sorted and indexed using samtools.

    The workflow uses the Docker image provided by the authors to perform the actual HLA typing.

  • HLA-VBSeq

    Nariai, N., Kojima, K., Saito, S., Mimori, T., Sato, Y., Kawai, Y., ... & Nagasaki, M. (2015, December). HLA-VBSeq: accurate HLA typing at full resolution from whole-genome sequencing data. In BMC genomics (Vol. 16, No. S2, p. S7). BioMed Central.

    Wang, Y. Y., Mimori, T., Khor, S. S., Gervais, O., Kawai, Y., Hitomi, Y., ... & Nagasaki, M. (2019). HLA-VBSeq v2: improved HLA calling accuracy with full-length Japanese class-I panel. Human Genome Variation, 6(1), 1-5.

    The workflow maps the reads against hg19 without alt contigs. The authors' instructions merely state to "map against hg19" without any further specifics, but mapping against hg19 with alt contigs yielded very poor typing results with missing HLA class I genes, so the workflow uses hg19 without alt contigs.

    HLA-VBSeq released two reference database versions:

    • v1 database based on IMGT/HLA database, Release 3.15.0
    • v2 database based on IMGT/HLA database, Release 3.31.0, and a Japanese HLA reference dataset
  • OptiType

    Szolek, A., Schubert, B., Mohr, C., Sturm, M., Feldhahn, M., & Kohlbacher, O. (2014). OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics, 30(23), 3310-3316.

    The workflow invokes the OptiType snakemake wrapper without prior filtering of reads.

  • HLA-LA

    Dilthey, A. T., Mentzer, A. J., Carapito, R., Cutland, C., Cereb, N., Madhi, S. A., ... & Phillippy, A. M. (2019). HLA*LA - HLA typing from linearly projected graph alignments. Bioinformatics, 35(21), 4394-4396.

    The workflow uses reads mapped against the human genome (hg38) without alt contigs as input for HLA-LA. A corresponding reference txt file for HLA-LA is part of this workflow repository. The preprocessed graph directory PRG_MHC_GRCh38_withIMGT can either be placed manually in typing/hla_la/hla_la.graphs/ or it will be downloaded and preprocessed automatically.

    The workflow uses the HLA-LA bioconda package for graph preprocessing and HLA typing.

  • arcasHLA

    Orenbuch, R., Filip, I., Comito, D., Shaman, J., Pe’er, I., & Rabadan, R. (2020). arcasHLA: high-resolution HLA typing from RNAseq. Bioinformatics, 36(1), 33-40.

    The workflow maps RNAseq reads against the human genome (hg38) without alt contigs using the STAR aligner with default parameters. It then invokes the 'extract' and 'genotype' subtools provided by arcasHLA.
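
The mapping, sorting, and indexing steps described for xHLA above can be sketched as follows. This is a minimal illustration, not the workflow's actual rule: the function name print_mapping_commands, the argument order, the default thread count, and the example paths are assumptions, and the function only prints the commands rather than running them. It also assumes that paths contain no whitespace and that the reference has already been indexed with bwa index.

```shell
# Sketch only: prints the mapping commands instead of executing them.
# Function name, argument order, and default thread count are illustrative
# assumptions and are not taken from the MultiHLA workflow code.
print_mapping_commands() {
    local ref="$1" r1="$2" r2="$3" out_bam="$4" threads="${5:-4}"

    # Align the read pair with bwa mem and coordinate-sort the alignments.
    printf 'bwa mem -t %s %s %s %s | samtools sort -@ %s -o %s -\n' \
        "$threads" "$ref" "$r1" "$r2" "$threads" "$out_bam"

    # Index the sorted BAM file.
    printf 'samtools index %s\n' "$out_bam"
}

# Placeholder paths; replace them with your own reference and FASTQ files.
print_mapping_commands ref/hg38.fa sample_R1.fastq.gz sample_R2.fastq.gz sample.bam 8
```

Printing the commands first makes it easy to review them before piping the output into a shell on a machine where bwa and samtools are installed.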

Usage

  1. Install snakemake

    conda install -c conda-forge mamba
    mamba create -c conda-forge -c bioconda -n snakemake snakemake
    conda activate snakemake
  2. Clone the MultiHLA repository

    git clone https://github.com/lkuchenb/MultiHLA.git hla_typing
    cd hla_typing
  3. Put the input files in place
    MultiHLA comes with a predefined folder structure:

    • dataset/

      A dataset is defined as a set of samples. Place a TSV file here for every dataset with the following three named columns:

       SampleName  FileNameR1                              FileNameR2
       Donor1      SEQ_D1_DAT_01_S53_L001_R1_001.fastq.gz  SEQ_D1_DAT_01_S53_L001_R2_001.fastq.gz
       Donor1      SEQ_D1_DAT_01_S53_L002_R1_001.fastq.gz  SEQ_D1_DAT_01_S53_L002_R2_001.fastq.gz
       Donor2      SEQ_D2_DAT_01_S54_L001_R1_001.fastq.gz  SEQ_D2_DAT_01_S54_L001_R2_001.fastq.gz
       Donor2      SEQ_D2_DAT_01_S54_L002_R1_001.fastq.gz  SEQ_D2_DAT_01_S54_L002_R2_001.fastq.gz
       Donor3      SEQ_D3_DAT_01_S55_L001_R1_001.fastq.gz  SEQ_D3_DAT_01_S55_L001_R2_001.fastq.gz
       Donor3      SEQ_D3_DAT_01_S55_L002_R1_001.fastq.gz  SEQ_D3_DAT_01_S55_L002_R2_001.fastq.gz

      FASTQ files have to come in gzipped pairs and be named {prefix}_R[12]{suffix}.fastq.gz. A sample can be covered by an arbitrary number of FASTQ pairs (at least one).

    • fastq/

      Place the FASTQ files as listed in your dataset sheet here.

    • ref/

      Place or link the required human genome references here as described for each supported method; otherwise they will be downloaded automatically.

    • trim/

      This is an output folder. It will be filled with adapter trimmed versions of the provided FASTQ files.

    • typing/{method}/

      This is an output folder. It will be filled with subfolders for each method.

    • workflow/

      This folder contains the workflow code.

  4. Run the workflow

    Invoke snakemake using snakemake --use-conda --use-singularity. This enables snakemake to automatically install dependencies into conda environments that are created on the fly and also enables the container-based jobs to run. To process all samples of a dataset, for example the dataset dataset_1 described in datasets/dataset_1.tsv, use:

    snakemake --use-conda --use-singularity typing/dataset_1.all.multihla

    Memory and run time requirements for each job are noted in their resources (mem_mb and time).
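
Before starting a run, the dataset sheet described in step 3 can be checked for obvious mistakes. The following is a minimal sketch and not part of MultiHLA: the function name check_dataset_sheet is hypothetical, and it assumes the sheet is tab-separated, starts with the header shown above, and that each R2 file name equals the R1 file name with the first _R1 replaced by _R2. It prints the number of FASTQ pairs per sample and returns a non-zero status if any line fails the check.

```shell
# Sketch only: check_dataset_sheet is a hypothetical helper, not part of MultiHLA.
# Assumes a tab-separated sheet with the columns SampleName, FileNameR1, FileNameR2.
check_dataset_sheet() {
    local sheet="$1"
    awk -F '\t' '
        NR == 1 {
            # The first line must carry the three named columns.
            if ($1 != "SampleName" || $2 != "FileNameR1" || $3 != "FileNameR2") {
                print "line 1: unexpected header" > "/dev/stderr"
                errors++
            }
            next
        }
        NF == 0 { next }                    # tolerate blank lines
        {
            mate = $2
            sub(/_R1/, "_R2", mate)         # R2 name expected for this R1 name
            if ($2 !~ /_R1.*\.fastq\.gz$/ || $3 != mate) {
                print "line " NR ": not a valid R1/R2 pair" > "/dev/stderr"
                errors++
                next
            }
            if (!($1 in pairs)) order[++n] = $1   # remember first appearance
            pairs[$1]++
        }
        END {
            # Report the number of FASTQ pairs per sample in input order.
            for (i = 1; i <= n; i++) print order[i] "\t" pairs[order[i]]
            exit (errors > 0)
        }
    ' "$sheet"
}
```

Called as check_dataset_sheet path/to/dataset_1.tsv, it lists each sample with its pair count, which makes it easy to spot a sample that is missing a lane before snakemake is invoked.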
