Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up

A pipeline to identify differentially-methylated regions from whole-genome bisulfite sequencing data

License

NotificationsYou must be signed in to change notification settings

cemordaunt/DMRfinder

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

64 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

A pipeline to identify differentially-methylated regions from whole-genome bisulfite sequencing data

By Charles Mordaunt

Running DMRfinder:

First prepare WGBS data in DSS file format for at least two groups of two samples. Then run DMRfinder.R with the appropriate arguments as in DMRfinder_example.sh. As part of the pipeline, DMRfinder.R will check and install needed packages.

DSS file format:

Input files should be split by each sample and each chromosome. The format should be in DSS which is a tab-delimited text file with 4 columns: chromosome (chr), position (pos), total reads (N), and methylated reads (X) (see below).

    chr        posNX    chr21501397111    chr21501404611    chr21501405611    chr21501408211    chr21501409710

Required arguments:

These arguments are required for running DMRfinder

    -n, --chrNum = chromosome number (1-25, depending on genome, last 3 are chrX, chrY, chrM)     -d, --setwd = working directory      -c, --numCtrl = number of control samples      -e, --numExp = number of experimental samples      -a, --genome = genome assembly (hg38, hg19, mm10, rn6, rheMac8)      -o, --outprefix = title used in output files

Optional arguments:

These arguments generally don't need to be changed. For parallel processing for smoothing, increase mc.cores. For more stringent DMR criteria, increase pctMinCtrl and pctMinExp to 1 (CpGs have 1 read in all samples) and increase meanDiff_cutoff to 0.1 (DMRs have at least 10% difference between groups).

    --pctMinCtrl = minimum percent of control samples with 1 read at CpG [default = 0.9]      --pctMinExp = minimum percent of experimental samples with 1 read at CpG [default = 0.9]      --mc.cores = cores to use, same as SBATCH -n [default = 1]      --estimate.var = method to estimate variance for t-test (same, paired, group2) [default = same]      --meanDiff_cutoff = minimum difference between group means for DMRs [default = 0.05]      --maxGap = maximum distance between all consecutive CpGs in a DMR [default = 300]      --invdensity_cutoff = maximum average distance between consecutive CpGs in a DMR [default = 300]       --nperm = number of permutations to perform for FWER estimation [default = 1000]

Output arguments:

These arguments control the output files, including the DMR locations, plots, methylation, and coverage.

    --silver_bed = Output bed file of locations for silver DMRs [default = TRUE]      --silver_info = Output txt file of info for silver DMRs [default = TRUE]      --gold_plots = Output pdf file of plots for gold DMRs [default = TRUE]    --background = Output bed file of background DMRs [default = TRUE]      --gold_bed = Output bed file of locations for gold DMRs [default = TRUE]      --silver_plots = Output pdf file of plots for silver DMRs [default = FALSE]      --meth = Output table of smoothed methylation for each sample at each silver DMR       [default = FALSE]      --cov = Output table of total coverage for each sample at each DMR [default = FALSE]      --background_cov = Output table of total coverage for each sample at each background DMR      [default = FALSE]      --CpGs = Output bed file of tested CpGs [default = FALSE]

Add DSS_file prefixes after all other arguments, with all control samples first (DSS_files/CTRL01_)

Example call:

This call is submitted from the shell and runs DMRfinder on chromosome 21, comparing 3 control and 3 experimental samples.

    Rscript --vanilla DMRfinder.R -n 21 -d /share/lasallelab/programs/DMRfinder/test -c 3 -e 3     -a hg38 -o DFtest CTRL1_ CTRL2_ CTRL3_ EXP1_ EXP2_ EXP3_

Also see the example data in the TestData directory of this repository.

Example output:

DMR info

chrstartendCpGswidthinvdensityareaStatmaxStattstat_sdCtrl_meanExp_meanmeanDiffdirectionFWERRel_FWER
chr21466992594669932116623.88-143.18-8.760.070.730.16-0.58hypo00.00
chr214377921343779862964972.1142.305.350.060.440.750.31hyper140.78
chr214548233245482675934338.1129.903.500.060.520.780.26hyper181.00
chr21348907833489082810454.5028.182.850.060.130.290.16hyper181.00
chr214398599943986281828235.2526.544.150.060.160.350.20hyper181.00

DMR plot

Example DMR plot

Publications Using DMRfinder:

Mordaunt CE, Kieffer DA, Shibata NM, Czlonkowska A, Litwin T, Weiss KH, Zhu Y, Bowlus CL, Sarkar S, Cooper S, Wan YJY, Ali M,LaSalle JM, and Medici V. (2019) Epigenomic signatures in liver and blood of Wilson disease patients includehypermethylation of liver-specific enhancers.Epigenetics & Chromatin 12(10):1-16.

Zhu Y, Mordaunt CE, Yasui DH, Marathe R, Coulson RL, Dunaway KW, Walker CK, Ozonoff S, Hertz-Picciotto I, Schmidt RJ, andLaSalle JM. (2019) Placental DNA methylation levels atCYP2E1 andIRS2 are associated with child outcome andgenetic or nutrient modifiers in a prospective autism study.Human Molecular Genetics 28(16):2659-2674.

Mordaunt CE, Shibata NM, Kieffer DA, Czlonkowska A, Litwin T, Weiss KH, Gotthardt DN, Olson K, Wei D, Cooper S, Wan YJ, Ali M,LaSalle JM, and Medici V. (2018) Epigenetic changes of the thioredoxin system in the tx-j mouse model and in patientswith Wilson disease.Human Molecular Genetics 27(22):3854-3869.

Vogel Ciernia A, Laufer BI, Dunaway KW, Mordaunt CE, Coulson RL, Totah TS, Stolzenberg DS, Frahm JC, Singh-Taylor A, Baram TZ,LaSalle JM, and Yasui DH. (2018) Experience-dependent neuroplasticity of the developing hypothalamus: integrativeepigenomic approaches.Epigenetics 13(3):318-330.

Coulson RL, Yasui DH, Dunaway KW, Laufer BI, Vogel-Ciernia A, Zhu Y, Mordaunt CE, Totah TS, and LaSalle JM. (2018)Snord116-dependent diurnal rhythm of DNA methylation in mouse cortex.Nature Communications 9:1616.

Lopez SJ, Dunaway KW, Islam MS, Mordaunt CE, Vogel Ciernia A, Meguro-Horike M, Horike S, Segal DJ, and LaSalle JM. (2017)UBE3A-mediated regulation of imprinted genes and epigenome-wide marks in human neurons.Epigenetics 12(11):982-990.

Dunaway K, Goorha S, Matelski L, Urraca N, Lein PJ, Korf I, Reiter LT, and LaSalle JM. (2017) Dental pulp stem cells modelearly life and imprinted DNA methylation patterns.Stem Cells 35(4):981-988.

Dunaway KW*, Islam MS*, Coulson RL, Lopez SJ, Vogel Ciernia A, Chu R, Yasui DH, Pessah I, Lott P, Mordaunt CE, Meguro-HorikeM, Horike S, Korf I, and LaSalle JM. (2016) Cumulative impact of large chromosomal duplications and polychlorinatedbiphenyl exposure on DNA methylation, chromatin, and expression of autism candidate genes.Cell Reports 17(11):3035-3048.

Session Info:

    R version 3.5.0 (2018-04-23)    Platform: x86_64-pc-linux-gnu (64-bit)    Running under: Ubuntu 16.04.5 LTS    Matrix products: default    BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.18.so    locale:    [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US           [4] LC_COLLATE=en_US     LC_MONETARY=en_US    LC_MESSAGES=en_US       [7] LC_PAPER=en_US       LC_NAME=C            LC_ADDRESS=C            [10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C     attached base packages:    [1] splines   stats4    parallel  stats     graphics  grDevices utils        [8] datasets  methods   base         other attached packages:    [1] scales_1.0.0                permute_0.9-4                  [3] DSS_2.30.0                  bsseq_1.18.0                   [5] SummarizedExperiment_1.12.0 GenomicRanges_1.34.0           [7] GenomeInfoDb_1.18.1         DelayedArray_0.8.0             [9] BiocParallel_1.16.2         IRanges_2.16.0                 [11] matrixStats_0.54.0          S4Vectors_0.20.1               [13] Biobase_2.42.0              BiocGenerics_0.28.0            [15] magrittr_1.5                remotes_2.0.2                  [17] BiocManager_1.30.4             loaded via a namespace (and not attached):    [1] Rcpp_1.0.0               compiler_3.5.0           XVector_0.22.0              [4] R.methodsS3_1.7.1        R.utils_2.7.0            bitops_1.0-6                [7] tools_3.5.0              DelayedMatrixStats_1.4.0 zlibbioc_1.28.0             [10] rhdf5_2.26.1             lattice_0.20-35          BSgenome_1.50.0             [13] Matrix_1.2-14            GenomeInfoDbData_1.2.0   rtracklayer_1.42.1          [16] Biostrings_2.50.1        gtools_3.8.1             locfit_1.5-9.1              [19] grid_3.5.0               data.table_1.11.8        HDF5Array_1.10.1            [22] XML_3.98-1.16            limma_3.38.3             Rhdf5lib_1.4.2              [25] GenomicAlignments_1.18.0 Rsamtools_1.34.0         colorspace_1.3-2            [28] RCurl_1.95-4.11          munsell_0.5.0            R.oo_1.22.0

About

A pipeline to identify differentially-methylated regions from whole-genome bisulfite sequencing data

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

[8]ページ先頭

©2009-2025 Movatter.jp