Commit dbbd9a3

Merge pull request #339 from bcgsc/release/v3.1.0
Release/v3.1.0
2 parents 479fdeb + 60817e7, commit dbbd9a3

47 files changed: 2795 additions and 1839 deletions

‎.github/workflows/build.yml

Lines changed: 4 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -20,7 +20,9 @@ jobs:
2020
    steps:
2121
      - uses: actions/checkout@v2
2222
      - name: install machine dependencies
23-
        run: sudo apt-get install -y libcurl4-openssl-dev
23+
        run: |
24+
          sudo apt-get update
25+
          sudo apt-get install -y libcurl4-openssl-dev
2426
      - name: Set up Python ${{ matrix.python-version }}
2527
        uses: actions/setup-python@v2
2628
        with:
@@ -93,7 +95,7 @@ jobs:
9395
      - name: Install workflow dependencies
9496
        run: |
9597
          python -m pip install --upgrade pip setuptools wheel
96-
          pip install mavis_config pandas snakemake
98+
          pip install mavis_config pandas
9799
      - uses: eWaterCycle/setup-singularity@v6
98100
        with:
99101
          singularity-version: 3.6.4

‎README.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ The simplest way to use MAVIS is via Singularity. The MAVIS docker container use
4343
by singularity will take care of installing the aligner as well.
4444

4545
```bash
46-
pip install -U setuptools pip
46+
pip install -U setuptools pip wheel
4747
pip install mavis_config  # also installs snakemake
4848
```
4949

‎docs/background/citations.md

Lines changed: 10 additions & 0 deletions
@@ -23,6 +23,11 @@ Chen,X. et al. (2016) Manta: rapid detection of structural variants
2323
and indels for germline and cancer sequencing applications.
2424
Bioinformatics, 32, 1220--1222.
2525

26+
## Chiu-2021
27+
28+
Chiu,R. et al. (2021) Straglr: discovering and genotyping tandem repeat
29+
expansions using whole genome long-read sequences. Genome Biol., 22, 224.
30+
2631
## Haas-2017
2732

2833
Haas,B et al. (2017) STAR-Fusion: Fast and Accurate Fusion
@@ -62,6 +67,11 @@ Saunders,C.T. et al. (2012) Strelka: accurate somatic small-variant
6267
calling from sequenced tumor--normal sample pairs. Bioinformatics,
6368
28, 1811--1817.
6469

70+
## Uhrig-2021
71+
72+
Uhrig,S. et al. (2021) Accurate and efficient detection of gene
73+
fusions from RNA sequencing data. Genome Res., 31, 448--460.
74+
6575
## Yates-2016
6676

6777
Yates,A. et al. (2016) Ensembl 2016. Nucleic Acids Res., 44,

‎docs/glossary.md

Lines changed: 5 additions & 0 deletions
@@ -127,6 +127,11 @@ install instructions.
127127
Community-based standard of recommendations for variant notation.
128128
See [http://varnomen.hgvs.org/](http://varnomen.hgvs.org/)
129129

130+
## Arriba
131+
132+
Arriba is an SV caller. Source for Arriba can be found
133+
[here](https://github.com/suhrig/arriba) [Uhrig-2021](../background/citations#uhrig-2021)
134+
130135
## BreakDancer
131136

132137
BreakDancer is an SV caller. Source for BreakDancer can be found

‎docs/inputs/.pages

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
nav:
2+
- reference.md
3+
- standard.md
4+
- ...
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
# Non-python Dependencies
2+
3+
MAVIS integrates with
4+
[SV callers](./sv_callers.md),
5+
[job schedulers](#job-schedulers), and
6+
[aligners](#aligners). While some of
7+
these dependencies are optional, all currently supported options are
8+
detailed below. The versions column in the tables below lists all the
9+
versions which were tested for each tool. Each version listed is known
10+
to be compatible with MAVIS.
11+
12+
## Job Schedulers
13+
14+
MAVIS v3 uses [snakemake](https://snakemake.readthedocs.io/en/stable/) to handle job scheduling.
15+
16+
## Aligners
17+
18+
Two aligners are supported: [bwa](../../glossary/#bwa) and
19+
[blat](../../glossary/#blat) (default). These are both included in the docker image by default.
20+
21+
| Name | Version(s) | Environment Setting |
22+
| ------------------------------ | ----------------------- | ------------------------- |
23+
| [blat](../../glossary/#blat) | `36x2` `36` | `MAVIS_ALIGNER=blat` |
24+
| [bwa mem](../../glossary/#bwa) | `0.7.15-r1140` `0.7.12` | `MAVIS_ALIGNER='bwa mem'` |
25+
26+
!!! note
27+
When setting the aligner you will also need to set the
28+
[aligner_reference](../../configuration/settings/#aligner_reference) to match

‎docs/inputs/reference.md

Lines changed: 8 additions & 8 deletions
@@ -10,16 +10,16 @@ To improve the install experience for the users, different
1010
configurations of the MAVIS annotations file have been made available.
1111
These files can be downloaded below, or if the required configuration is
1212
not available,
13-
(instructions on generating the annotations file)[/inputs/reference/#generating-the-annotations-from-ensembl] can be found below.
13+
[instructions on generating the annotations file](/inputs/reference/#generating-the-annotations-from-ensembl) can be found below.
1414

15-
| File Name (Type/Format)| Environment Variable| Download|
16-
| ---------------------------------------------------------------------------------------------| -------------------------| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
17-
|[reference genome](../../inputs/reference/#reference-genome) ([fasta](../../glossary/#fasta))|`MAVIS_REFERENCE_GENOME`|[![](../images/get_app-24px.svg) GRCh37/Hg19](http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz) <br>[![](../images/get_app-24px.svg) GRCh38](http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.tar.gz)|
15+
| File Name (Type/Format)| Environment Variable| Download|
16+
| ---------------------------------------------------------------------------------------------| -------------------------| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
17+
|[reference genome](../../inputs/reference/#reference-genome) ([fasta](../../glossary/#fasta))|`MAVIS_REFERENCE_GENOME`|[![](../images/get_app-24px.svg) GRCh37/Hg19](http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz) <br>[![](../images/get_app-24px.svg) GRCh38](http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.tar.gz)|
1818
|[annotations](../../inputs/reference/#annotations) ([JSON](../../glossary/#json))|`MAVIS_ANNOTATIONS`|[![](../images/get_app-24px.svg) GRCh37/Hg19 + Ensembl69](http://www.bcgsc.ca/downloads/mavis/v3/ensembl69_hg19_annotations.v3.json.gz) <br>[![](../images/get_app-24px.svg) GRCh38 + Ensembl79](http://www.bcgsc.ca/downloads/mavis/v3/ensembl79_hg38_annotations.v3.json.gz)|
19-
|[masking](../../inputs/reference/#masking-file) (text/tabbed)|`MAVIS_MASKING`|[![](../images/get_app-24px.svg) GRCh37/Hg19](http://www.bcgsc.ca/downloads/mavis/hg19_masking.tab)<br>[![](../images/get_app-24px.svg) GRCh38](http://www.bcgsc.ca/downloads/mavis/GRCh38_masking.tab)|
20-
|[template metadata](../../inputs/reference/#template-metadata) (text/tabbed)|`MAVIS_TEMPLATE_METADATA`|[![](../images/get_app-24px.svg) GRCh37/Hg19](http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/cytoBand.txt.gz)<br>[![](../images/get_app-24px.svg) GRCh38](http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/cytoBand.txt.gz)|
21-
|[DGV annotations](../../inputs/reference/#dgv-database-of-genomic-variants) (text/tabbed)|`MAVIS_DGV_ANNOTATION`|[![](../images/get_app-24px.svg) GRCh37/Hg19](http://www.bcgsc.ca/downloads/mavis/dgv_hg19_variants.tab)<br>[![](../images/get_app-24px.svg) GRCh38](http://www.bcgsc.ca/downloads/mavis/dgv_hg38_variants.tab)|
22-
|[aligner reference](../../inputs/reference/#aligner-reference)|`MAVIS_ALIGNER_REFERENCE`|[![](../images/get_app-24px.svg) GRCh37/Hg19 2bit (blat)](http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit)<br>[![](../images/get_app-24px.svg) GRCh38 2bit (blat)](http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.2bit)|
19+
|[masking](../../inputs/reference/#masking-file) (text/tabbed)|`MAVIS_MASKING`|[![](../images/get_app-24px.svg) GRCh37/Hg19](http://www.bcgsc.ca/downloads/mavis/hg19_masking.tab)<br>[![](../images/get_app-24px.svg) GRCh38](http://www.bcgsc.ca/downloads/mavis/GRCh38_masking.tab)|
20+
|[template metadata](../../inputs/reference/#template-metadata) (text/tabbed)|`MAVIS_TEMPLATE_METADATA`|[![](../images/get_app-24px.svg) GRCh37/Hg19](http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/cytoBand.txt.gz)<br>[![](../images/get_app-24px.svg) GRCh38](http://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/cytoBand.txt.gz)|
21+
|[DGV annotations](../../inputs/reference/#dgv-database-of-genomic-variants) (text/tabbed)|`MAVIS_DGV_ANNOTATION`|[![](../images/get_app-24px.svg) GRCh37/Hg19](http://www.bcgsc.ca/downloads/mavis/dgv_hg19_variants.tab)<br>[![](../images/get_app-24px.svg) GRCh38](http://www.bcgsc.ca/downloads/mavis/dgv_hg38_variants.tab)|
22+
|[aligner reference](../../inputs/reference/#aligner-reference)|`MAVIS_ALIGNER_REFERENCE`|[![](../images/get_app-24px.svg) GRCh37/Hg19 2bit (blat)](http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.2bit)<br>[![](../images/get_app-24px.svg) GRCh38 2bit (blat)](http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.2bit)|
2323

2424
If the environment variables above are set they will be used as the
2525
default values when any step of the pipeline script is called (including

‎docs/inputs/support.md

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ It should be noted however that the tool tracked will only be listed as
4141

4242
| Name| Version(s)| MAVIS input| Publication|
4343
| ------------------------------------------| ----------------| ---------------------------------------------| -----------------------------------------------------------|
44+
|[Arriba](../../glossary/#arriba)|`2.2.1`|`fusions.tsv`|[Uhrig-2021](../../background/citations#uhrig-2021)|
4445
|[BreakDancer](../../glossary/#breakdancer)|`1.4.5`|`Tools main output file(s)`|[Chen-2009](../../background/citations#chen-2009)|
4546
|[BreakSeq](../../glossary/#breakseq)|`2.2`|`work/breakseq.vcf.gz`|[Abyzov-2015](../../background/citations#abyzov-2015)|
4647
|[Chimerascan](../../glossary/#chimerascan)|`0.4.5`|`*.bedpe`|[Iyer-2011](../../background/citations#Iyer-2011)|

‎docs/inputs/sv_callers.md

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
1+
# SV Callers
2+
3+
MAVIS supports output from a wide variety of SV callers. Assumptions are made for each tool based on interpretation of its output and its publications.
4+
5+
## Configuring Conversions
6+
7+
Adding a conversion step to your MAVIS run is as simple as adding that section to the input JSON config.
8+
9+
The general structure of this section is as follows:
10+
11+
```jsonc
12+
{
13+
    "convert": {
14+
        "<ALIAS>": {
15+
            "file_type": "<TOOL OUTPUT TYPE>",
16+
            "name": "<TOOL NAME>", // optional field for supported tools
17+
            "inputs": [
18+
                "/path/to/tool/output/file"
19+
            ]
20+
        }
21+
    }
22+
}
23+
```
24+
25+
A full version of the input configuration file specification can be found in the [configuration](../configuration/general.md) section.
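
As an illustration only, the `convert` structure described above can also be generated programmatically. The following Python sketch builds such a section and serialises it to JSON. The helper name `build_convert_section` is hypothetical and is not part of MAVIS; the alias, tool name, and path are placeholder values.

```python
import json


def build_convert_section(alias, file_type, inputs, name=None):
    """Return a dict shaped like the "convert" section of the input config.

    Hypothetical helper for illustration; it performs no validation
    against the real MAVIS configuration schema.
    """
    entry = {"file_type": file_type, "inputs": list(inputs)}
    if name is not None:
        # "name" is documented as an optional field
        entry["name"] = name
    return {"convert": {alias: entry}}


config = build_convert_section(
    alias="my_tool_alias",  # placeholder alias
    file_type="vcf",
    inputs=["/path/to/my_tool/output.vcf"],  # placeholder path
    name="my_tool",
)
print(json.dumps(config, indent=4))
```

The printed JSON can then be merged into the rest of the configuration file.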
26+
27+
## Supported Tools
28+
29+
The tools and versions currently supported are given below. Versions listed indicate the version of the tool for which output files have been tested as input into MAVIS. MAVIS also supports a [general VCF input](#general-vcf-inputs).
30+
31+
| SV Caller| Version(s) Tested| Files used as MAVIS input|
32+
| ---------------------------------------------------------------------------| -----------------| ---------------------------------------------|
33+
|[BreakDancer (Chen, 2009)](../../background/citations#chen-2009)|`1.4.5`|`Tools main output file(s)`|
34+
|[BreakSeq (Abyzov, 2015)](../../background/citations#abyzov-2015)|`2.2`|`work/breakseq.vcf.gz`|
35+
|[Chimerascan (Iyer, 2011)](../../background/citations#iyer-2011)|`0.4.5`|`*.bedpe`|
36+
|[CNVnator (Abyzov, 2011)](../../background/citations#abyzov-2011)|`0.3.3`|`Tools main output file(s)`|
37+
|[CuteSV (Jiang, 2020)](../../background/citations#jiang-2020)|`1.0.10`|`*.vcf`|
38+
|[DeFuse (McPherson, 2011)](../../background/citations#mcpherson-2011)|`0.6.2`|`results/results.classify.tsv`|
39+
|[DELLY (Rausch, 2012)](../../background/citations#rausch-2012)|`0.6.1` `0.7.3`|`combined.vcf` (converted from bcf)|
40+
|[Manta (Chen, 2016)](../../background/citations#chen-2016)|`1.0.0`|`{diploidSV,somaticSV}.vcf`|
41+
|[Pindel (Ye, 2009)](../../background/citations#ye-2009)|`0.2.5b9`|`Tools main output file(s)`|
42+
|[Sniffles (Sedlazeck, 2018)](../../background/citations#sedlazeck-2018)|`1.0.12b`|`*.vcf`|
43+
|[STAR-Fusion (Haas, 2017)](../../background/citations#haas-2017)|`1.4.0`|`star-fusion.fusion_predictions.abridged.tsv`|
44+
|[Straglr (Chiu, 2021)](../../background/citations#chiu-2021)|||
45+
|[Strelka (Saunders, 2012)](../../background/citations#saunders-2012)|`1.0.6`|`passed.somatic.indels.vcf`|
46+
|[Trans-ABySS (Robertson, 2010)](../../background/citations/#robertson-2010)|`1.4.8 (custom)`|`{indels/events_novel_exons,fusions/*}.tsv`|`<output_prefix>.bed`|
47+
48+
!!! note
49+
[Trans-ABySS](../../glossary/#trans-abyss): The trans-abyss version
50+
used was an in-house dev version. However, the output columns are
51+
compatible with 1.4.8 as that was the version branched from.
52+
Additionally, although indels can be used from both genome and
53+
transcriptome outputs of Trans-ABySS, it is recommended to only use the
54+
genome indel calls as the transcriptome indel calls (for versions
55+
tested) introduce a very high number of false positives. This will slow
56+
down validation. It is much faster to simply use the genome indels for
57+
both genome and transcriptome.
58+
59+
## [DELLY](../../glossary/#delly) Post-processing
60+
61+
Some post-processing on the DELLY output files is generally done prior
62+
to input. The output BCF files are converted to a VCF file:
63+
64+
```bash
65+
bcftools concat -f /path/to/file/with/vcf/list --allow-overlaps --output-type v --output combined.vcf
66+
```
67+
68+
## General VCF inputs
69+
70+
Assuming that the tool outputting the VCF file follows standard
71+
conventions, then it is possible to use a
72+
[general VCF conversion](../../package/mavis/tools/vcf)
73+
that is not tool-specific. Given the wide variety in content for VCF files,
74+
MAVIS makes a number of assumptions and the VCF conversion may not work
75+
for all VCFs. In general, MAVIS follows the [VCF 4.2
76+
specification](https://samtools.github.io/hts-specs/VCFv4.2.pdf). If the
77+
input tool you are using differs, it would be better to use a
78+
[custom conversion script](#custom-conversions).
79+
80+
Using the general VCF tool with a non-standard tool can be done as follows:
81+
82+
```json
83+
{
83+
    "convert": {
84+
        "my_tool_alias": {
85+
            "file_type": "vcf",
86+
            "name": "my_tool",
87+
            "inputs": ["/path/to/my_tool/output.vcf"]
88+
        }
89+
    }
90+
}
92+
```
93+
94+
### Assumptions on non-standard INFO fields
95+
96+
- `PRECISE`: if given, confidence intervals are ignored in favour of exact breakpoint calls, using POS and END as the breakpoint positions.
97+
- `CT`: if given, the values are taken to represent the breakpoint orientations.
98+
- `CHR2`: assumed to be given for all interchromosomal events.
99+
100+
### Translating BND type Alt fields
101+
102+
There are four possible configurations for the alt field of a BND type structural variant
103+
based on the VCF specification. These correspond one-to-one to the orientation types for MAVIS
104+
translocation structural variants.
105+
106+
```text
107+
r = reference base/seq
108+
u = untemplated sequence/alternate sequence
109+
p = chromosome:position
110+
```
111+
112+
| alt format| orients|
113+
| ----------| -------|
114+
|`ru[p[`| LR|
115+
|`[p[ur`| RR|
116+
|`]p]ur`| RL|
117+
|`ru]p]`| LL|
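
The mapping in the table above can be sketched in Python as follows. This is a minimal illustration and not the MAVIS implementation: the function name `bnd_orientations` and the regular expression are assumptions made for this example, and the mate position is assumed to have the `chromosome:position` form.

```python
import re

# Optional sequence, a bracket, the mate position, the same bracket again,
# then optional sequence. Illustrative pattern only.
BND_PATTERN = re.compile(
    r"^(?P<prefix>[^\[\]]*)(?P<bracket>[\[\]])(?P<mate>[^\[\]]+)"
    r"(?P=bracket)(?P<suffix>[^\[\]]*)$"
)


def bnd_orientations(alt):
    """Return (orients, chromosome, position) for a BND-type ALT string."""
    match = BND_PATTERN.match(alt)
    if match is None:
        raise ValueError(f"not a BND-type ALT field: {alt!r}")
    prefix, suffix = match.group("prefix"), match.group("suffix")
    if bool(prefix) == bool(suffix):
        raise ValueError(f"sequence must be on exactly one side: {alt!r}")
    # Sequence before the brackets (ru...) gives L for the first breakpoint,
    # sequence after the brackets (...ur) gives R.
    first = "L" if prefix else "R"
    # '[' brackets give R for the second breakpoint, ']' brackets give L.
    second = "R" if match.group("bracket") == "[" else "L"
    chromosome, _, position = match.group("mate").rpartition(":")
    return first + second, chromosome, int(position)


print(bnd_orientations("G]17:198982]"))  # ('LL', '17', 198982)
print(bnd_orientations("[17:198983[A"))  # ('RR', '17', 198983)
```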
118+
119+
## Custom Conversions
120+
121+
If there is a tool that is not yet supported by MAVIS and you would like it to be, you can either add a [feature request](https://github.com/bcgsc/mavis/issues) to our GitHub page or tackle writing the conversion script yourself. Either way, there are a few things you will need:
122+
123+
- A sample output from the tool in question
124+
- Tool metadata for the citation, version, etc
125+
126+
### Logic Example - [Chimerascan](../../glossary/#chimerascan)
127+
128+
The following is a description of how the conversion script for
129+
[Chimerascan](../../background/citations/#iyer-2011) was generated.
130+
While this is a built-in conversion command now, the logic could also
131+
have been put in an external script. As mentioned above, there are a
132+
number of assumptions that had to be made about the tool's output to
133+
convert it to the
134+
[standard mavis format](../../inputs/standard/). Assumptions were then verified by reviewing a series of
135+
called events in [IGV](../../glossary/#igv). In the current
136+
example, [Chimerascan](../../background/citations/#iyer-2011) output
137+
has six columns of interest that were used in the conversion:
138+
139+
- start3p
140+
- end3p
141+
- strand3p
142+
- start5p
143+
- end5p
144+
- strand5p
145+
146+
The above columns describe two segments which are joined. MAVIS requires
147+
the position of the join. It was assumed that the segments are always
148+
joined as a [sense fusion](../../glossary/#sense-fusion). Using this
149+
assumption there are four logical cases to determine the position of the
150+
breakpoints.
151+
152+
For example, in the first case, if both strands are positive, then the end
153+
of the five-prime segment (end5p) is the first breakpoint and the start
154+
of the three-prime segment (start3p) is the second breakpoint.
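
The four cases can be sketched in Python as below. Only the both-positive case is stated in the text; the other three are an extrapolation from the same sense-fusion assumption (on the negative strand the roles of start and end are swapped) and should be checked against real calls before use. The function name is hypothetical, and any adjustment between coordinate systems is deliberately ignored.

```python
def sense_fusion_breakpoints(start5p, end5p, strand5p, start3p, end3p, strand3p):
    """Return (first_breakpoint, second_breakpoint) for a sense fusion.

    Hypothetical helper: only the '+'/'+' case comes from the text above,
    the remaining cases are inferred from the sense-fusion assumption.
    """
    for strand in (strand5p, strand3p):
        if strand not in ("+", "-"):
            raise ValueError(f"unexpected strand value: {strand!r}")
    # The five-prime segment is joined at its downstream end, which is the
    # largest coordinate on '+' and the smallest coordinate on '-'.
    first = end5p if strand5p == "+" else start5p
    # The three-prime segment is joined at its upstream end, which is the
    # smallest coordinate on '+' and the largest coordinate on '-'.
    second = start3p if strand3p == "+" else end3p
    return first, second


# The case described in the text: both strands positive
print(sense_fusion_breakpoints(100, 200, "+", 500, 600, "+"))  # (200, 500)
```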
155+
156+
### Calling a Custom Conversion Script
157+
158+
Since MAVIS v3+ is run using [snakemake](https://snakemake.readthedocs.io/en/stable/), the simplest way to incorporate your custom conversion scripts is to modify the Snakefile and add them as rules.

‎setup.cfg

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
11
[metadata]
22
name = mavis
3-
version = 3.0.0
3+
version = 3.1.0
44
url = https://github.com/bcgsc/mavis.git
55
download_url = https://github.com/bcgsc/mavis/archive/v2.2.10.tar.gz
66
description = A Structural Variant Post-Processing Package
@@ -37,7 +37,7 @@ install_requires =
3737
braceexpand==0.1.2
3838
colour
3939
Distance>=0.1.3
40-
mavis_config>=1.1.0, <2.0.0
40+
mavis_config>=1.2.2, <2.0.0
4141
networkx>=2.5,<3
4242
numpy>=1.13.1
4343
pandas>=1.1, <2

‎src/mavis/bam/read.py

Lines changed: 1 addition & 1 deletion
@@ -424,7 +424,7 @@ def sequenced_strand(read: pysam.AlignedSegment, strand_determining_read: int =
424424
        else:
425425
            strand = STRAND.NEG if not read.is_reverse else STRAND.POS
426426
    elif strand_determining_read == 2:
427-
        if read.is_read2:
427+
        if not read.is_read1:
428428
            strand = STRAND.NEG if read.is_reverse else STRAND.POS
429429
        else:
430430
            strand = STRAND.NEG if not read.is_reverse else STRAND.POS
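
To see what this one-line change affects, the sketch below compares the old and new conditions using a small stand-in class rather than `pysam.AlignedSegment`. The class `FakeRead` and both function names are hypothetical, and the strings `"+"` and `"-"` stand in for `STRAND.POS` and `STRAND.NEG`. The two conditions agree for reads flagged as read1 or read2 and differ only when neither flag is set, which is typically the case for unpaired reads.

```python
from dataclasses import dataclass


@dataclass
class FakeRead:
    """Hypothetical stand-in exposing only the flags used in this branch."""
    is_read1: bool
    is_read2: bool
    is_reverse: bool


def strand_before(read):
    # Condition prior to this commit
    if read.is_read2:
        return "-" if read.is_reverse else "+"
    return "-" if not read.is_reverse else "+"


def strand_after(read):
    # Condition introduced by this commit
    if not read.is_read1:
        return "-" if read.is_reverse else "+"
    return "-" if not read.is_reverse else "+"


paired_read2 = FakeRead(is_read1=False, is_read2=True, is_reverse=False)
unpaired = FakeRead(is_read1=False, is_read2=False, is_reverse=False)

print(strand_before(paired_read2), strand_after(paired_read2))  # + +
print(strand_before(unpaired), strand_after(unpaired))  # - +
```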
