Lishijiagg/MicrobTiSDAPublic

NotificationsYou must be signed in to change notification settings
Fork0
Star4

Microbiome Time-Series Data Analysis

License

MIT license

4 stars 0 forks Branches Tags Activity

Star

Notifications

You must be signed in to change notification settings

Branches Tags

Folders and files

Name		Name	Last commit message	Last commit date
Latest commit History 35 Commits
R		R
data		data
man		man
tests		tests
vignettes		vignettes
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
CRAN-SUBMISSION		CRAN-SUBMISSION
DESCRIPTION		DESCRIPTION
LICENSE.md		LICENSE.md
MicrobTiSDA.Rproj		MicrobTiSDA.Rproj
NAMESPACE		NAMESPACE
README.Rmd		README.Rmd
README.md		README.md
cran-comments.md		cran-comments.md

Repository files navigation

MicrobTiSDA

MicrobTiSDA focuses on analyzing microbiome time-series data. Its corefunctionalities include inferring species interaction relationshipswithin time-series data, constructing abundance time regression modelsfor microbial features, and assessing the similarity of temporalabundance patterns among different microbial features. It integrates the“Learning Interactions from Microbial Time Series” algorithm based onthe discrete-time Lotka-Volterra model, and supports the construction ofnatural spline regression models to characterize changes in microbialfeature abundance over time.

Workflow

Installation

You can install MicrobTiSDA fromCRAN with:

# install.packages("MicrobTiSDA")

Or the newest virsion of MicrobTiSDA fromGitHubwith:

# install.packages("devtools")# library("devtools)# install_github("Lishijiagg/MicrobTiSDA")

Tutorial

Here is an example of applying MicrobTiSDA to an in vitro culturedaquatic microbiome dataset. The dataset was obtained from the study byFujita etal.

First, we should load MicrobTiSDA and other necessary packages.

library(tidyr)library(dplyr)#>#> Attaching package: 'dplyr'#> The following objects are masked from 'package:stats':#>#>     filter, lag#> The following objects are masked from 'package:base':#>#>     intersect, setdiff, setequal, unionlibrary(ggplot2)library(MicrobTiSDA)

In this dataset, the in vitro cultivation of aquatic microbiomes wasconducted in eight replicate experiments. Each replicate involvedcontinuous cultivation for 110 days, with daily sampling performedthroughout the entire period. As a result, a total of 880 aquaticmicrobiome community samples were obtained. In total, 28 ASVs weredetected across all samples.

data("fujita.data")data("fujita.meta")data("fujita.taxa")

Please note that MicrobTiSDA has specific format requirements for theuser-provided microbiome count table, metadata, and taxonomic annotationfiles:

The count table (OTU/ASV table) must have OTU/ASV IDs as row names andsample IDs as column names.

head(fujita.data[,c(1:6)])#>        S_00073 S_00169 S_00265 S_00361 S_00457 S_00553#> X_0002   11932   10453    8974    4452    6881    6651#> X_0004       0       0       0       0       0       0#> X_0007       0       0       0       0       0       0#> X_0008       0       0       0       0       0       0#> X_0010       0       0       0       0       0       0#> X_0015       0       0       0       0       0       0

The metadata file must use sample IDs as row names, with additionalsample information as column names. The columns must include thesampling time (Time) and either group (Group) or individual (SampleID)information for each sample. MicrobTiSDA uses these details to performspecies inference, construct microbial abundance time regressionmodels, interpolate missing data, construct random forestclassification models, and other analyses.

head(fujita.meta[,c(1:6)])#>         time resource inoculum replicate.id         treat1 treat2#> S_00073    1 Medium-C    Water        Rep.1 Water/Medium-C     WC#> S_00169    2 Medium-C    Water        Rep.1 Water/Medium-C     WC#> S_00265    3 Medium-C    Water        Rep.1 Water/Medium-C     WC#> S_00361    4 Medium-C    Water        Rep.1 Water/Medium-C     WC#> S_00457    5 Medium-C    Water        Rep.1 Water/Medium-C     WC#> S_00553    6 Medium-C    Water        Rep.1 Water/Medium-C     WC

The taxonomic annotation table should have microbial features (OTU/ASVIDs) as row names, and the columns should represent taxonomicannotations for each feature. These annotations must be split intodifferent taxonomic levels (e.g., arranged sequentially from kingdomto species)

head(fujita.taxa)#>            ID  Kingdom         Phylum               Class#> X_0001 X_0001 Bacteria Proteobacteria Gammaproteobacteria#> X_0002 X_0002 Bacteria Proteobacteria Gammaproteobacteria#> X_0003 X_0003 Bacteria   Bacteroidota         Bacteroidia#> X_0004 X_0004 Bacteria     Firmicutes          Clostridia#> X_0005 X_0005 Bacteria Proteobacteria Gammaproteobacteria#> X_0006 X_0006 Bacteria Proteobacteria Gammaproteobacteria#>                                      Order                Family#> X_0001                    Enterobacterales          Yersiniaceae#> X_0002                       Aeromonadales        Aeromonadaceae#> X_0003                     Chitinophagales      Chitinophagaceae#> X_0004 Peptostreptococcales-Tissierellales Peptostreptococcaceae#> X_0005                     Xanthomonadales      Xanthomonadaceae#> X_0006                     Pseudomonadales      Pseudomonadaceae#>                   Genus      Species       identified#> X_0001         Yersinia unidentified         Yersinia#> X_0002        Aeromonas unidentified        Aeromonas#> X_0003     unidentified unidentified Chitinophagaceae#> X_0004   Clostridioides   mangenotii       mangenotii#> X_0005 Stenotrophomonas unidentified Stenotrophomonas#> X_0006      Pseudomonas unidentified      Pseudomonas

Next, we need to perform filtering on the feature table in this dataset.However, given the relatively small number of microbial features (ASVs)in this dataset, we chose not to filter out ASVs with low abundance orlow prevalence.

fujita_filt<- Data.filter(Data=fujita.data,metadata=fujita.meta,OTU_counts_filter_value=0,OTU_filter_value=0,Group_var='replicate.id')# The output object of function Data.filter is an S3 class object, use summary() to check the output result.#summary(fujita_filt)

Typically, the filtered data need to be transformed prior to analysis.MicrobTiSDA provides a modified centered log-ratio (MCLR) transformationfor this purpose. This transformation is performed separately for eachsampled individual or experimental group; therefore, users are requiredto specify the grouping variable from the metadata. In this dataset, thevariable representing the eight replicate experiments is “replicate.id”.

fujita_trans<- Data.trans(Data=fujita_filt,metadata=fujita.meta,Group_var='replicate.id')

After filtering and transforming the dataset, MicrobTiSDA can be used toinfer species interactions within each individual subject or replicatemicrobiome. By integrating the “Learning Interactions from MicrobialTime-Series” (LIMITS) framework, MicrobTiSDA is able to infer sparseinteraction networks from microbiome time-series data. This step maytake a considerable amount of time to complete, depending on the size ofthe dataset.

fujita_interact<- Spec.interact(Data=fujita_trans,metadata=fujita.meta,Group_var='replicate.id',num_iterations=10)

Next, we can visualize the results of the species interaction inference.The function generates network plots of species interactions for eachreplicate aquatic microbiome experiment.

fujita_interact_vis<- Interact.dyvis(Interact_data=fujita_interact,threshold=1e-6,core_arrow_num=4,Taxa=fujita.taxa)

We can use the function to visualize species interaction networks. Theparameter allows users to specify which group or individual to display.For instance, to visualize the species interaction network only for thefirst replicate experiment (Rep.1) in the dataset, specify . If no groupis specified, the network for each group will be plotted by default.

plot(fujita_interact_vis,groups="Rep.1")#> Plot: Rep.1

Please note that the output is an interactive HTML chart supportingzooming and dragging for detailed exploration. However, sinceinteractive htmlwidgets cannot be rendered here, users can do it foryour own. In the chart, arrow colors indicate interaction types: orangerepresents positive interactions, and blue represents negativeinteractions. Nodes represent species, with keystone species (i.e.,those involved in multiple pairwise interactions) highlighted in yellow.

After completing species interaction inference for the aquaticmicrobiome, we can proceed to analyze the temporal abundance patterns ofmicrobial features using MicrobTiSDA. To support this analysis,MicrobTiSDA provides a method for fitting regression models to theabundance trajectories of microbial features. To begin, we first need toconstruct a design matrix for modeling temporal trends for eachmicrobial feature. This can be accomplished using the function, whereusers must specify the variables in the metadata that represent thegroup ID for each subject or replicate, the sample ID, and the timepoint of each sample.

fujita_design<- Design(metadata=fujita.meta,Group_var='replicate.id',Sample_ID='timeChar',Pre_processed_Data=fujita_trans,Sample_Time='time')

When fitting regression models for each microbial feature, MicrobTiSDAemploys natural spline regression. Users can manually specify thepositions of spline knots by providing a vector—for example, sets the10th, 20th, and 30th days in the time series as knot positions.Alternatively, MicrobTiSDA can automatically determine the optimalnumber and placement of knots using generalized cross-validation, basedon a user-defined maximum number of knots .

fujita_model<- Reg.SPLR(Data_for_Reg=fujita_design,pre_processed_data=fujita_trans,max_Knots=5,unique_values=5)

here, it is important to account for the sparsity and zero-inflatednature of microbiome data. The function includes the parameter, whichspecifies the minimum number of non-zero abundance values requiredacross time-series samples for a microbial feature to be included inregression modeling. By default, this threshold is set to at least 5non-zero values.

Based on the fitted regression models, we can predict the temporalabundance patterns of each microbial feature. These features can then beclustered according to the similarity of their abundance temporalpatterns, using correlation distance as the clustering metric.

fujita_model_pred<- Pred.data(Fitted_models=fujita_model,metadata=fujita.meta,Group='replicate.id',Sample_Time='time',time_step=1)fujita_model_clust<- Data.cluster(predicted_data=fujita_model_pred,clust_method='average',dend_title_size=12,font_size=3)# Visualize the feature clustering of the first replicate.plot(fujita_model_clust,groups="Rep.1")

We can also identify the optimal clusters of microbial features based onthe clustering results. Here we select features with correlationdistance less than 0.15 for clustering.

fujita_clust_results<- Data.cluster.cut(cluster_outputs=fujita_model_clust,cut_height=0.15,font_size=4)# Visualize the feature clustering of the first replicate.plot(fujita_clust_results,groups="Rep.1")

Finally, we can visualize the abundance patterns of microbial featureswithin each selected cluster.

fujita_model_vis<- Data.visual(cluster_results=fujita_clust_results,cutree_by='height',cluster_height= rep(0.15,8),predicted_data=fujita_model_pred,Design_data=fujita_design,pre_processed_data=fujita_trans,plot_dots=TRUE,figure_x_scale=20,Taxa=fujita.taxa,plot_lm=FALSE,legend_title_size=10,legend_text_size=10,axis_x_size=10,axis_title_size=10,axis_y_size=10)# Use plot() to visualize temporal profiles of clustered microbial features.# For example, we can visualize the fourth feature cluster temporal profiles of the first replicate.plot(fujita_model_vis,groups="Rep.1",clusters=4)#> In Rep.1, number of feature clusters -- 5#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

MicrobTiSDA offers users with multiple visualization functions. Inaddition to the interactive HTML widget for species interactionvisualization generated by the function, all other visualization outputsfrom MicrobTiSDA are structured as ggplot2 objects. Users can extractthe underlying ggplot2 object of each visualization and furthercustomize the visual appearance using functions such as and etc. Formore information, please refer to .

For example:

# Extract the ggplot2 object of the fourth clustered features from the Rep.1 experimental aquatic microbiomefujita_vis_rep1_cluster4<-fujita_model_vis$plots$Rep.1[[4]]print(fujita_vis_rep1_cluster4)#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

The output objects can be further modified using built-in ggplot2functions such as and .

# Assume we want to change the title and increase the legend title font sizefujita_vis_rep1_cluster4<-fujita_vis_rep1_cluster4+  theme(legend.title= element_text(size=15))+  labs(title="The Fourth ASV cluster of Rep.1")print(fujita_vis_rep1_cluster4)#> `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

About

Microbiome Time-Series Data Analysis

Languages

R100.0%

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

License

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

MicrobTiSDA

Workflow

Installation

Tutorial

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases1

Packages

Contributors2

Uh oh!

Languages

Movatterモバイル変換

License

Lishijiagg/MicrobTiSDA

Folders and files

Latest commit

History

Repository files navigation

MicrobTiSDA

Workflow

Installation

Tutorial

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases1

Packages0

Contributors2

Uh oh!

Languages

Packages