Module: Bioconductor R analyses
This is a collection of Bioconductor tutorials, ranging from an introduction to R and Bioconductor to more complex analyses of biological data using Bioconductor R packages.
During Smorgasbord week, May 22-26 2023:
- you will be able to run these tutorials in a personal RStudio instance launched in Galaxy that has all necessary packages pre-installed. Go to workshop.bioconductor.org
- the instructors of these tutorials have kindly volunteered to answer questions in GTN Training Slack
- you can submit your Rhistory for a tutorial for your Smorgasbord certificate in this form.
Organisers: Maria Doyle, Alex Mahmoud, Bioconductor Teaching Committee
Register: See registration information here
Setup
Demo Video
| Description: | This video will help you get set up on the Bioconductor Galaxy to run the tutorials in this module. |
| Length: | 5 minutes |
| Captions: | Bioconductor |
| Created: | 5 May 2023 |
| Materials: | |
| Support: |
Speaker
Bioconductor Carpentries lessons
Introduction to data analysis with R and Bioconductor
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | Introduction to data analysis with R and Bioconductor |
| Description: | The Data science lesson is based on the Carpentries Ecology Curriculum. There are no pre-requisites for this module, and the materials assume no prior knowledge about R and Bioconductor. It introduces R and RStudio, teaches data cleaning, management, analysis, and visualisation, and introduces some Bioconductor concepts. |
| Materials: | |
| Support: |
Instructor
The Bioconductor project
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | The Bioconductor project |
| Description: | The Bioconductor project lesson provides an introduction to the Bioconductor project, covering the Bioconductor home page, packages, package landing pages and package vignettes, where to find help, Bioconductor workflows, the Bioconductor release schedule and versions, and some core infrastructure. It is meant to be used in combination with other modules as part of a wider workshop. |
| Materials: | |
| Support: |
Bioconductor workshops
Tutorial Video
| Description: | This workshop demonstrates the use of the iSEE package to create and configure interactive applications for the exploration of various types of genomics data sets (e.g., bulk and single-cell RNA-seq, CyTOF, gene expression microarray). This workshop is presented as a lab session that combines an instructor-led live demo with hands-on experimentation guided by completely worked examples and stand-alone notes that participants may continue to use after the workshop. The instructor-led live demo comprises three parts:
The hands-on lab comprises three parts:
|
| Length: | 55 minutes |
| Captions: | No captions available for this video. We aim to have captions for all our videos, and are working to add captions to this video as soon as possible. |
| Created: | 28 July 2020 |
| Materials: | |
| Support: |
Tutorial Video
| Description: | This tutorial will showcase analysis of single-cell RNA sequencing data following the tidy data paradigm. The tidy data paradigm provides a standard way to organise data values within a dataset, where each variable is a column, each observation is a row, and data is manipulated using an easy-to-understand vocabulary. Most importantly, the data structure remains consistent across manipulation and analysis functions. This can be achieved with the integration of packages present in the R CRAN and Bioconductor ecosystem, including tidySingleCellExperiment and tidyverse. These packages are part of the tidytranscriptomics suite that introduces a tidy approach to RNA sequencing data representation and analysis. For more information see the tidy transcriptomics blog. |
| Length: | 1 hour 27 minutes |
| Captions: | No captions available for this video. We aim to have captions for all our videos, and are working to add captions to this video as soon as possible. |
| Created: | 28 July 2022 |
| Materials: | |
| Support: |
Speaker
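The tidy layout described above (each variable a column, each observation a row, and the same structure before and after every operation) can be sketched in a few lines. This is a language-agnostic illustration written in plain Python rather than R; it does not use tidySingleCellExperiment or tidyverse, and the gene names, cell names, and counts are hypothetical toy values.

```python
# Hypothetical toy data: counts for two genes across three cells (wide layout).
wide_counts = {
    "GeneA": {"cell1": 5, "cell2": 0, "cell3": 2},
    "GeneB": {"cell1": 1, "cell2": 3, "cell3": 0},
}


def to_tidy(wide):
    """Return one record per (cell, gene) observation, one key per variable."""
    return [
        {"cell": cell, "gene": gene, "count": count}
        for gene, per_cell in wide.items()
        for cell, count in per_cell.items()
    ]


def total_per_cell(tidy_rows):
    """Sum counts per cell from tidy rows."""
    totals = {}
    for row in tidy_rows:
        totals[row["cell"]] = totals.get(row["cell"], 0) + row["count"]
    return totals


tidy = to_tidy(wide_counts)

# Filtering returns rows with exactly the same keys, so any later step
# (such as total_per_cell) works on the filtered data unchanged.
expressed = [row for row in tidy if row["count"] > 0]

print(total_per_cell(tidy))
print(total_per_cell(expressed))
```

Because every step consumes and produces the same row shape, operations can be chained in any order, which is the property the tidy paradigm is built around.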
Tutorial Video
| Description: | This workshop consists of a demonstration of using DECIPHER and SynExtend for common analyses in comparative genomics. The immediate goal of this session is to use sequence data to uncover networks of functionally associated genes. These networks consist of genetic regions under shared evolutionary pressure, which have previously been shown to imply some degree of conserved function. |
| Length: | 1 hour 5 minutes |
| Captions: | No captions available for this video. We aim to have captions for all our videos, and are working to add captions to this video as soon as possible. |
| Created: | 29 July 2022 |
| Materials: | |
| Support: |
Tutorial Video
| Description: | Concepts of causal inference in epidemiology have important ramifications for studies across bioinformatics and other fields of health research. In this workshop, we introduce basic concepts of epidemiology, study design, and causal inference for bioinformaticians. Emphasis is placed on addressing bias and confounding as common threats to assessing a causal pathway in a variety of study design types and when using common forms of analyses such as GWAS and survival analysis. Workshop participants will have the opportunity to create their own structural causal models (DAGs) using dagitty and ggdag and then use this model to determine how to assess an estimated causal effect. Examples using DESeq2, edgeR, and limma will be used to show how multivariable models can be fitted depending on the hypothesized causal relationship. Presented successfully at BioC2021 to a large audience of more than 100, this workshop updates that material by revising current examples based on participant feedback as well as content updates. |
| Length: | 1 hour 23 minutes |
| Captions: | No captions available for this video. We aim to have captions for all our videos, and are working to add captions to this video as soon as possible. |
| Created: | 27 July 2022 |
| Materials: | |
| Support: |
SpatialOmicsOverlay Workshop
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | SpatialOmicsOverlay Workshop |
| Description: | This workshop will introduce users to the NanoString R package SpatialOmicsOverlay. This package is designed for use in interpretation and presentation of the multi-form data generated by NanoString's GeoMx® Digital Spatial Profiler spatial biology platform. The GeoMx DSP produces both rich imaging and genomics data. Integrating these data types facilitates a deep understanding of the profiled tissue. The SpatialOmicsOverlay package integrates both of these data types, thereby maintaining the relationship between underlying tissue morphology and resultant gene expression. Specifically, SpatialOmicsOverlay was developed to visualize and analyze the free-handed nature of Region of Interest (ROI) selection in a GeoMx experiment, as well as the immunofluorescence-guided segmentation process. The overlay from the instrument is recreated in the R environment, which allows for plotting overlays with data like ROI type or gene expression. The package provides a convenient workflow for users to generate customized, sharable visualizations. The Introduction to SpatialOmicsOverlay vignette demonstrates how to use OME-TIFF files, which are exported from the GeoMx platform. Participants will learn how to interact with this file type and generate informative plots over images. This vignette utilizes data from our Spatial Organ Atlas. The Spatial Organ Atlas is a freely accessible resource of whole transcriptome spatial profiles of functional components of tissues from human and mouse generated using our Whole Transcriptome Atlas RNA assay. In particular, vignette users will be analyzing data from the mouse brain. This content is similar to the vignette available with the package upon installation. |
| Materials: | |
| Support: |
Bioconductor workflows
RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR |
| Description: | The ability to easily and efficiently analyse RNA-sequencing data is a key strength of the Bioconductor project. Starting with counts summarised at the gene-level, a typical analysis involves pre-processing, exploratory data analysis, differential expression testing and pathway analysis with the results obtained informing future experiments and validation studies. In this workflow article, we analyse RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing. This pipeline is further enhanced by the Glimma package which enables interactive exploration of the results so that individual samples and genes can be examined by the user. The complete analysis offered by these three packages highlights the ease with which researchers can turn the raw counts from an RNA-sequencing experiment into biological insights using Bioconductor. |
| Materials: | |
| Support: |
Instructor
From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline |
| Description: | RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification are conducted using the Rsubread package and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR. |
| Materials: | |
| Support: |
Instructor
Using singscore to predict mutations in AML from transcriptomic signatures
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | Using singscore to predict mutations in AML from transcriptomic signatures |
| Description: | Advances in RNA sequencing (RNA-seq) technologies that measure the transcriptome of biological samples have revolutionised our ability to understand transcriptional regulatory programs that underpin diseases such as cancer. We recently published singscore - a single-sample, rank-based gene set scoring method which quantifies how concordant the transcriptional profiles of individual samples are relative to specific gene sets of interest. Here we demonstrate the application of singscore to investigate transcriptional profiles associated with specific mutations or genetic lesions in acute myeloid leukemia. Using matched genomic and transcriptomic data available through The Cancer Genome Atlas we show that scoring of appropriate signatures can distinguish samples with corresponding mutations, reflecting the ability of these mutations to drive aberrant transcriptional programs involved in leukemogenesis. We believe the singscore method is particularly useful for studying heterogeneity within specific subsets of cancers, and as demonstrated, singscore has the ability to identify samples where alternative mutations/genetic lesions appear to drive transcriptional programs. |
| Materials: | |
| Support: |
Instructor
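The idea of single-sample, rank-based gene set scoring mentioned above can be illustrated with a small sketch. This is a simplified illustration in plain Python, not the singscore package or its exact formula: it ranks genes within one sample, takes the mean rank of the gene set, and rescales it between the lowest and highest mean rank that a set of that size could obtain. The function name `rank_score`, the tie-breaking rule, and the toy expression values are all assumptions made for this example.

```python
def rank_score(expression, gene_set):
    """Scaled mean rank of gene_set within a single sample.

    Returns 0.0 when the set occupies the lowest ranks and 1.0 when it
    occupies the highest ranks. Only this one sample is needed.
    """
    # Rank genes by increasing expression (lowest gets rank 1).
    # Ties are broken by gene name, which is an assumption of this sketch.
    ordered = sorted(expression, key=lambda gene: (expression[gene], gene))
    ranks = {gene: position + 1 for position, gene in enumerate(ordered)}

    members = [gene for gene in gene_set if gene in ranks]
    if not members:
        raise ValueError("no gene from the set is present in the sample")

    n_total, n_set = len(ranks), len(members)
    mean_rank = sum(ranks[gene] for gene in members) / n_set

    min_mean = (n_set + 1) / 2                # set holds the n_set lowest ranks
    max_mean = (2 * n_total - n_set + 1) / 2  # set holds the n_set highest ranks
    if max_mean == min_mean:                  # the set covers every gene
        return 0.5
    return (mean_rank - min_mean) / (max_mean - min_mean)


# Hypothetical expression values for one sample.
sample = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 4.0, "g5": 5.0}
print(rank_score(sample, ["g4", "g5"]))
print(rank_score(sample, ["g1", "g2"]))
```

Because the score depends only on ranks within one sample, it can be computed without reference to any other sample in the cohort, which is what makes the approach "single-sample".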
fluentGenomics: A plyranges and tximeta workflow
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | fluentGenomics: A plyranges and tximeta workflow |
| Description: | An extended workflow using the plyranges and tximeta packages for fluent genomic data analysis. Use tximeta to correctly import RNA-seq transcript quantifications and summarize them to gene counts for downstream analysis. Use plyranges for clearly expressing operations over genomic coordinates and to combine results from differential expression and differential accessibility analyses. |
| Materials: | |
| Support: |
MungeSumstats: Standardise the format of GWAS summary statistics
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | MungeSumstats: Standardise the format of GWAS summary statistics |
| Description: | The package is designed to handle the lack of standardisation of output files by the GWAS community. The MRC IEU Open GWAS team have provided full summary statistics for >10k GWAS, which are API-accessible via the ieugwasr and gwasvcf packages. But these GWAS are only standardised in the sense that they are in VCF format, and can be fully standardised with MungeSumstats. MungeSumstats provides a framework to standardise the format for any GWAS summary statistics, including those in VCF format, enabling downstream integration and analysis. It addresses the most common discrepancies across summary statistic files, and offers a range of adjustable Quality Control (QC) steps. |
| Materials: | |
| Support: |
Instructor
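To make the standardisation problem described above concrete, the sketch below maps differently named summary-statistic columns onto one set of headers and applies a single QC filter. It is written in plain Python for illustration only and does not reproduce MungeSumstats: the alias table, the function names `standardise_header` and `drop_invalid_p`, and the example rows are all hypothetical.

```python
# Hypothetical alias table: each standard header with spellings that
# different GWAS output files might use for the same column.
HEADER_ALIASES = {
    "SNP": {"SNP", "RSID", "MARKERNAME"},
    "CHR": {"CHR", "CHROM", "CHROMOSOME"},
    "BP": {"BP", "POS", "POSITION"},
    "P": {"P", "PVAL", "PVALUE", "P_VALUE"},
}

# Invert the table once so each alias points at its standard header.
_LOOKUP = {
    alias: standard
    for standard, aliases in HEADER_ALIASES.items()
    for alias in aliases
}


def standardise_header(columns):
    """Rename known aliases to standard headers; upper-case everything else."""
    renamed = []
    for column in columns:
        key = column.strip().upper().replace("-", "_")
        renamed.append(_LOOKUP.get(key, key))
    return renamed


def drop_invalid_p(rows):
    """One example QC step: keep rows whose P is a number in [0, 1]."""
    return [
        row for row in rows
        if isinstance(row.get("P"), (int, float)) and 0 <= row["P"] <= 1
    ]


print(standardise_header(["rsid", "chrom", "pos", "p-value", "beta"]))
```

Once every input file has passed through the same header mapping, downstream checks such as `drop_invalid_p` can be written once against the standard names instead of once per file format.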
R for Mass Spectrometry
This is a self-study session. Please work through the materials on your own, and ask the instructors for help if you get stuck or have any questions!
| Tutorial: | R for Mass Spectrometry |
| Description: | This material introduces participants to the analysis and exploration of mass spectrometry (MS) based proteomics data using R and Bioconductor. The course will cover all levels of MS data, from raw data to identification and quantitation data, up to the statistical interpretation of a typical shotgun MS experiment and will focus on hands-on tutorials. At the end of this course, the participants will be able to manipulate MS data in R and use existing packages for their exploratory and statistical proteomics data analysis. |
| Materials: | |
| Support: |
