A comprehensive R package for label-free proteomics data analysis and modeling


caranathunge/promor


Proteomics Data Analysis and Modeling Tools

License: LGPL v2.1

  • promor is a user-friendly, comprehensive R package that combines proteomics data analysis with machine learning-based modeling.

  • promor streamlines differential expression analysis of label-free quantification (LFQ) proteomics data and the building of predictive models with top protein candidates.

  • promor provides a range of quality control and visualization tools for analyzing label-free proteomics data at the protein level.

  • promor takes two input files: either a proteinGroups.txt file produced by MaxQuant or a standard input file containing a quantitative matrix of protein intensities, together with an expDesign.txt file containing the experimental design of your proteomics data.

  • The standard input file should be a tab-delimited text file, with proteins or protein groups in rows and samples in columns. Protein names should be listed in the first column, which may have a column name of your choice. The remaining (sample) column names should match the sample names given in the mq_label column of the expDesign.txt file.
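As a rough illustration of the standard input layout described above, the following base R sketch writes a small tab-delimited intensity file and checks that its sample column names match the mq_label column of an experimental design table. The protein names, sample names, intensity values, file names, and the `condition` column are invented for illustration; only mq_label is named in the description above, so please check the package documentation for the full set of columns that expDesign.txt requires.

```r
# Hypothetical toy data: three proteins (rows) by four samples (columns).
# The first column holds protein names; its column name is arbitrary.
intensities <- data.frame(
  Protein = c("P1", "P2", "P3"),
  S1 = c(1.2e6, 3.4e5, NA),
  S2 = c(1.1e6, 3.1e5, 2.2e4),
  S3 = c(9.8e5, NA, 2.5e4),
  S4 = c(1.0e6, 2.9e5, 2.4e4)
)

# Hypothetical experimental design. Only mq_label comes from the text above;
# the condition column is an assumption made for this sketch.
exp_design <- data.frame(
  mq_label = c("S1", "S2", "S3", "S4"),
  condition = c("A", "A", "B", "B")
)

# Write both tables as tab-delimited text files in a temporary directory.
std_file <- file.path(tempdir(), "standard_input.txt")
ed_file <- file.path(tempdir(), "expDesign.txt")
write.table(intensities, std_file, sep = "\t", quote = FALSE, row.names = FALSE)
write.table(exp_design, ed_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the files back and compare sample column names with mq_label.
std <- read.delim(std_file, check.names = FALSE)
ed <- read.delim(ed_file)
sample_cols <- colnames(std)[-1] # drop the protein-name column
labels_match <- setequal(sample_cols, ed$mq_label)
```

If `labels_match` is `TRUE`, the sample columns and the experimental design agree on sample names. See the `create_df()` help page for how to select the standard input type when loading such files.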

🚨 Check out our R Shiny app: PROMORApp


Installation

Install the released version from CRAN

install.packages("promor")

Install the development version from GitHub

# Install devtools, if you haven't already:
install.packages("devtools")
# Install promor from GitHub:
devtools::install_github("caranathunge/promor")

Proteomics data analysis with promor

Figure 1. A schematic diagram of suggested workflows for proteomics data analysis with promor.

Example

Here is a minimal working example showing how to identify differentially expressed proteins between two conditions using promor in five simple steps. We use a previously published data set from Cox et al. (2014) (PRIDE ID: PXD000279).

# Load promor
library(promor)
# Create a raw_df object with the files provided in this GitHub account.
raw <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt"
)
# Filter out proteins with high levels of missing data in either condition or group
raw_filtered <- filterbygroup_na(raw)
# Impute missing data and create an imp_df object
imp_df <- impute_na(raw_filtered)
# Normalize data and create a norm_df object
norm_df <- normalize_data(imp_df)
# Perform differential expression analysis and create a fit_df object
fit_df <- find_dep(norm_df)

Let's take a look at the results using a volcano plot.

volcano_plot(fit_df,text_size=5)


Modeling with promor

Figure 2. A schematic diagram of suggested workflows for building predictive models with promor.

Example

The following minimal working example shows how to use your results from differential expression analysis to build machine learning-based predictive models using promor.

We use a previously published data set from Suvarna et al. (2021) that used differentially expressed proteins between severe and non-severe COVID patients to build models to predict COVID severity.

# First, let's make a model_df object of top differentially expressed proteins.
# We will be using example fit_df and norm_df objects provided with the package.
covid_model_df <- pre_process(
  fit_df = covid_fit_df,
  norm_df = covid_norm_df
)
# Next, we split the data into training and test data sets
covid_split_df <- split_data(model_df = covid_model_df)
# Let's train our models using the default list of machine learning algorithms
covid_model_list <- train_models(split_df = covid_split_df)
# We can now use our models to predict the test data
covid_prob_list <- test_models(
  model_list = covid_model_list,
  split_df = covid_split_df
)

Let’s make ROC plots to check how the different models performed.

roc_plot(probability_list=covid_prob_list,split_df=covid_split_df)


Tutorials

You can choose a tutorial from the list below that best fits your experiment and the structure of your proteomics data.

  1. This README file can be accessed from RStudio as follows:
vignette("intro_to_promor", package = "promor")

  2. If your data do NOT contain technical replicates: promor: No technical replicates

  3. If your data contain technical replicates: promor: Technical replicates

  4. If you would like to use your proteomics data to build predictive models: promor: Modeling
