Type:

Package

Title:

Proteomics Data Analysis and Modeling Tools

Version:

0.2.2

Description:

A comprehensive, user-friendly package for label-free proteomics data analysis and machine learning-based modeling. Data generated from 'MaxQuant' can be easily used to conduct differential expression analysis, build predictive models with top protein candidates, and assess model performance. promor includes a suite of tools for quality control, visualization, missing data imputation (Lazar et al. (2016) <doi:10.1021/acs.jproteome.5b00981>), differential expression analysis (Ritchie et al. (2015) <doi:10.1093/nar/gkv007>), and machine learning-based modeling (Kuhn (2008) <doi:10.18637/jss.v028.i05>).

License:

LGPL-2.1 | LGPL-3 [expanded from: LGPL (≥ 2.1)]

Encoding:

UTF-8

Language:

en-US

RoxygenNote:

7.3.3

VignetteBuilder:

knitr

Suggests:

covr, knitr, rmarkdown, testthat (≥ 3.0.0)

Depends:

R (≥ 3.5.0)

URL:

https://github.com/caranathunge/promor, https://caranathunge.github.io/promor/

Imports:

reshape2, ggplot2, ggrepel, gridExtra, limma, statmod, pcaMethods, VIM, missForest, caret, kernlab, xgboost, naivebayes, viridis, pROC

LazyData:

true

Config/testthat/edition:

BugReports:

https://github.com/caranathunge/promor/issues

NeedsCompilation:

Packaged:

2025-11-11 16:52:29 UTC; caran

Author:

Chathurani Ranathunge [aut, cre, cph]

Maintainer:

Chathurani Ranathunge <caranathunge86@gmail.com>

Repository:

CRAN

Date/Publication:

2025-11-11 22:20:02 UTC

Compute average intensity

Description

This function computes average intensities across technical replicates for each sample.

Usage

aver_techreps(raw_df)

Arguments

raw_df

A raw_df object containing technical replicates.

Details

aver_techreps assumes that column names in the data frame follow the "Group_UniqueSampleID_TechnicalReplicate" notation. (Use head(raw_df) to see the structure of the raw_df object.)
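The naming convention above is what makes averaging possible: dropping the trailing replicate label leaves the "Group_UniqueSampleID" part, which identifies the sample. The following base R sketch illustrates the idea on a toy matrix. It is not the package's implementation; the toy data, the object names, and the choice to ignore NAs when averaging are all assumptions made for illustration.

```r
# Toy log-intensities; column names follow "Group_UniqueSampleID_TechnicalReplicate".
# (Hypothetical data, not taken from the package.)
toy <- data.frame(
  H_S1_1 = c(20, 22, NA),
  H_S1_2 = c(22, NA, NA),
  L_S2_1 = c(18, 19, 21),
  L_S2_2 = c(20, 21, 23),
  row.names = c("P1", "P2", "P3")
)

# Drop the trailing "_<replicate>" to recover the "Group_UniqueSampleID" label.
sample_id <- sub("_[^_]+$", "", colnames(toy))

# Average the replicate columns of each sample, ignoring NAs (an assumption).
toy_ave <- sapply(unique(sample_id), function(s) {
  rowMeans(toy[, sample_id == s, drop = FALSE], na.rm = TRUE)
})

# rowMeans() returns NaN when every replicate is missing; store those as NA.
toy_ave[is.nan(toy_ave)] <- NA
toy_ave
```

The result has one column per sample ("H_S1", "L_S2") instead of one column per technical replicate.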

Value

A raw_df object of averaged intensities.

Author(s)

Chathurani Ranathunge

Examples

## Use a data set containing technical replicates to create a raw_df object
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg2.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed2.txt",
  tech_reps = TRUE
)

# Compute average intensities across technical replicates.
rawdf_ave <- aver_techreps(raw_df)

Correlation between technical replicates

Description

This function generates scatter plots to visualize the correlation between a given pair of technical replicates (e.g., 1 vs. 2) for each sample.

Usage

corr_plot(
  raw_df,
  rep_1,
  rep_2,
  save = FALSE,
  file_type = "pdf",
  palette = "viridis",
  text_size = 5,
  n_row = 4,
  n_col = 4,
  dpi = 80,
  file_path = NULL
)

Arguments

raw_df

A raw_df object (output of create_df) containing technical replicates.

rep_1

Numerical. Technical replicate number.

rep_2

Numerical. Number of the second technical replicate to compare to rep_1.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_type

File type to save the scatter plots. Default is "pdf".

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

text_size

Text size for plot labels, axis labels etc. Default is 5.

n_row

Numerical. Number of plots to print in a row in a single page. Default is 4.

n_col

Numerical. Number of plots to print in a column in a single page. Default is 4.

dpi

Plot resolution. Default is 80.

file_path

A string containing the directory path to save the file.

Details

Given a data frame of log-transformed intensities (a raw_df object) and a pair of numbers referring to the technical replicates, corr_plot produces a list of scatter plots showing correlation between the given pair of technical replicates for all the samples provided in the data frame.
Note: n_row * n_col should be equal to the number of samples to display in a single page.

Value

A list of ggplot2 plot objects.

Author(s)

Chathurani Ranathunge

Examples

## Use a data set containing technical replicates to create a raw_df object
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg2.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed2.txt",
  tech_reps = TRUE
)

## Compare technical replicates 1 vs. 2 for all samples
corr_plot(raw_df, rep_1 = 1, rep_2 = 2)

Suvarna et al 2021 LFQ data (fit object)

Description

An object of class "MArrayLM" from running find_dep on covid_norm_df

Usage

data(covid_fit_df)

Format

An object of class "MArrayLM"

References

https://www.frontiersin.org/articles/10.3389/fphys.2021.652799/full#h3

Suvarna et al 2021 LFQ data (normalized)

Description

A data frame containing normalized LFQ protein intensity data for 230 proteins in 35 samples (a subset of the original data set).

Usage

data(covid_norm_df)

Format

A data frame with 230 rows (proteins) and 35 columns (samples)

References

https://www.frontiersin.org/articles/10.3389/fphys.2021.652799/full#h3

Create a data frame of protein intensities

Description

This function creates a data frame of protein intensities

Usage

create_df(
  prot_groups,
  exp_design,
  input_type = "MaxQuant",
  data_type = "LFQ",
  filter_na = TRUE,
  filter_prot = TRUE,
  uniq_pep = 2,
  tech_reps = FALSE,
  zero_na = TRUE,
  log_tr = TRUE,
  base = 2
)

Arguments

prot_groups

File path to a proteinGroups.txt file produced by MaxQuant or a standard input file containing a quantitative matrix where the proteins or protein groups are indicated by rows and the samples by columns.

exp_design

File path to a text file containing the experimental design.

input_type

Type of input file indicated by prot_groups. Available options are "MaxQuant", if a proteinGroups.txt file is used, or "standard", if a standard input file is used. Default is "MaxQuant".

data_type

Type of sample protein intensity data columns to use from the proteinGroups.txt file. Some available options are "LFQ", "iBAQ", and "Intensity". Default is "LFQ". User-defined prefixes in the proteinGroups.txt file are also allowed. The data_type argument is case-sensitive, and only applies when input_type = "MaxQuant".

filter_na

Logical. If TRUE (default), filters out empty rows and columns from the data frame.

filter_prot

Logical. If TRUE (default), filters out reverse proteins, proteins only identified by site, potential contaminants, and proteins identified with fewer than the minimum number of unique peptides indicated by uniq_pep. Only applies when input_type = "MaxQuant".

uniq_pep

Numerical. Proteins that are identified by this number or fewer unique peptides are filtered out (default is 2). Only applies when input_type = "MaxQuant".

tech_reps

Logical. Indicate as TRUE if technical replicates are present in the data. Default is FALSE.

zero_na

Logical. If TRUE (default), zeros are considered missing values and replaced with NAs.

log_tr

Logical. If TRUE (default), intensity values are log transformed to the base indicated by base.

base

Numerical. Logarithm base. Default is 2.

Details

This function first reads in the proteinGroups.txt file produced by MaxQuant or a standard input file containing a quantitative matrix where the proteins or protein groups are indicated by rows and the samples by columns.
It then reads in the expDesign.txt file provided as exp_design and extracts relevant information from it to add to the data frame. An example of the expDesign.txt file is provided here: https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt.
First, empty rows and columns are removed from the data frame.
Next, if a proteinGroups.txt file is used, it filters out reverse proteins, proteins that were only identified by site, and potential contaminants. Then it removes proteins identified with fewer than the number of unique peptides indicated by uniq_pep from the data frame.
Next, it extracts the intensity columns indicated by data_type and the selected protein rows from the data frame.
It then converts zeros, which are treated as missing values, to NAs.
Finally, the function log transforms the intensity values.
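The last two steps described above (zeros to NAs, then a log transformation) can be sketched in base R as follows. This is only an illustration of what zero_na = TRUE, log_tr = TRUE, and base = 2 describe; the toy matrix and object names are hypothetical, and the package's own code may differ.

```r
# Toy raw intensities (hypothetical); zeros stand for values that were not measured.
intens <- matrix(
  c(1024, 0, 4096,
    0,    0, 2048),
  nrow = 3,
  dimnames = list(c("P1", "P2", "P3"), c("H_S1", "L_S2"))
)

# Treat zeros as missing values, as described for zero_na = TRUE.
intens[intens == 0] <- NA

# Log-transform to the chosen base, as described for log_tr = TRUE and base = 2.
# NAs stay NA; converting zeros first avoids log(0) = -Inf.
log_intens <- log(intens, base = 2)
log_intens
```

Converting zeros before taking logarithms matters: a zero left in place would become -Inf and distort downstream statistics.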

Value

A raw_df object, which is a data frame containing protein intensities. Proteins or protein groups are indicated by rows and samples by columns.

Author(s)

Chathurani Ranathunge

Examples

### Using a proteinGroups.txt file produced by MaxQuant as input.
## Generate a raw_df object with default settings. No technical replicates.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt",
  input_type = "MaxQuant"
)

## Data containing technical replicates
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg2.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed2.txt",
  input_type = "MaxQuant",
  tech_reps = TRUE
)

## Alter the number of unique peptides needed to retain a protein
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt",
  input_type = "MaxQuant",
  uniq_pep = 1
)

## Use "iBAQ" values instead of "LFQ" values
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt",
  input_type = "MaxQuant",
  data_type = "iBAQ"
)

### Using a universal standard input file instead of MaxQuant output.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/st.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt",
  input_type = "standard"
)

Cox et al 2014 LFQ data (fit object)

Description

An object of class "MArrayLM" from running find_dep on ecoli_norm_df

Usage

data(ecoli_fit_df)

Format

An object of class "MArrayLM"

References

https://europepmc.org/article/MED/24942700#id609082

Cox et al 2014 LFQ data (normalized)

Description

A data frame containing normalized LFQ protein intensity data for 4360 proteins in 6 samples.

Usage

data(ecoli_norm_df)

Format

A data frame with 4360 rows (proteins) and 6 columns (samples)

References

https://europepmc.org/article/MED/24942700#id609082

Visualize feature (protein) variation among conditions

Description

This function visualizes protein intensity differences among conditions (classes) using box plots or density distribution plots.

Usage

feature_plot(
  model_df,
  type = "box",
  text_size = 10,
  palette = "viridis",
  n_row,
  n_col,
  save = FALSE,
  file_path = NULL,
  file_name = "Feature_plot",
  file_type = "pdf",
  dpi = 80,
  plot_width = 7,
  plot_height = 7
)

Arguments

model_df

A model_df object from performing pre_process.

type

Type of plot to generate. Choices are "box" or "density". Default is "box".

text_size

Text size for plot labels, axis labels etc. Default is 10.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

n_row

Number of rows to print the plots.

n_col

Number of columns to print the plots.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_path

A string containing the directory path to save the file.

file_name

File name to save the plot. Default is "Feature_plot".

file_type

File type to save the plot. Default is "pdf".

dpi

Plot resolution. Default is 80.

plot_width

Width of the plot. Default is 7.

plot_height

Height of the plot. Default is 7.

Details

This function visualizes condition-wise differences in protein intensity using boxplots and/or density plots.

Value

A ggplot2 object.

Author(s)

Chathurani Ranathunge

Examples

## Create a model_df object with default settings.
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Feature variation - box plots
feature_plot(covid_model_df, type = "box", n_row = 4, n_col = 2)

## Density plots
feature_plot(covid_model_df, type = "density")

## Change color palette
feature_plot(covid_model_df, type = "density", n_row = 4, n_col = 2, palette = "rocket")

Filter proteins by group level missing data

Description

This function filters out proteins based on missing data at the group level.

Usage

filterbygroup_na(raw_df, set_na = 0.34, filter_condition = "either")

Arguments

raw_df

A raw_df object (output of create_df).

set_na

The proportion of missing data allowed. Default is 0.34 (one third of the samples in the group).

filter_condition

If set to "each", proteins that exceed the missing value proportion threshold set by set_na in each group will be removed (lenient). If set to "either" (default), proteins that exceed the missing value proportion threshold set by set_na in at least one group will be removed (stringent).

Details

This function first extracts group or condition information from the raw_df object and assigns samples to their groups.
If filter_condition = "each", it then removes proteins (rows) from the data frame if the proportion of NAs in each group exceeds the threshold indicated by set_na (default is 0.34). This option is more lenient in comparison to filter_condition = "either", where proteins that exceed the missing data threshold in either group are removed from the data frame.
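The difference between the two options is easiest to see on a small example. The base R sketch below reproduces the logic as described above. It is not the package's code: the toy matrix is hypothetical, and it assumes that the group label is the part of each column name before the first underscore.

```r
# Toy log-intensities with two groups (H and L), three samples each (hypothetical).
toy <- matrix(
  c(20, NA, NA, 21, 22, 23,
    NA, NA, NA, NA, NA, 19,
    18, 19, 20, 21, 22, 23),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("P1", "P2", "P3"),
    c("H_1", "H_2", "H_3", "L_1", "L_2", "L_3")
  )
)
set_na <- 0.34

# Assumption: the group is the text before the first underscore.
group <- sub("_.*$", "", colnames(toy))

# Proportion of NAs per protein within each group (rows: proteins, columns: groups).
na_prop <- sapply(unique(group), function(g) {
  rowMeans(is.na(toy[, group == g, drop = FALSE]))
})
exceeds <- na_prop > set_na

# "either" (stringent): drop a protein if it exceeds the threshold in at least one group.
keep_either <- rowSums(exceeds) == 0
# "each" (lenient): drop a protein only if it exceeds the threshold in every group.
keep_each <- rowSums(exceeds) < ncol(exceeds)

rownames(toy)[keep_either]
rownames(toy)[keep_each]
```

Here P1 is missing in two of three H samples but complete in L, so it survives under "each" and is removed under "either". P2 exceeds the threshold in both groups and is removed either way.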

Value

A raw_df object.

Author(s)

Chathurani Ranathunge

Examples

# Generate a raw_df object with default settings. No technical replicates.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt"
)

## Remove proteins that exceed 34% NAs in either group (default)
rawdf_filt1 <- filterbygroup_na(raw_df)

## Remove proteins that exceed 34% NAs in each group
rawdf_filt2 <- filterbygroup_na(raw_df, filter_condition = "each")

## Proportion of samples with NAs allowed in each group = 0.5
rawdf_filt3 <- filterbygroup_na(raw_df, set_na = 0.5, filter_condition = "each")

Identify differentially expressed proteins between groups

Description

This function performs differential expression analysis on protein intensity data with limma.

Usage

find_dep(
  df,
  save_output = FALSE,
  save_tophits = FALSE,
  file_path = NULL,
  adj_method = "BH",
  cutoff = 0.05,
  lfc = 1,
  n_top = 20
)

Arguments

df

A norm_df object or an imp_df object.

save_output

Logical. If TRUE, saves results from the differential expression analysis in a text file labeled "limma_output.txt" in the directory specified by file_path.

save_tophits

Logical. If TRUE, saves n_top number of top hits from the differential expression analysis in a text file labeled "TopHits.txt" in the directory specified by file_path.

file_path

A string containing the directory path to save the file.

adj_method

Method used for adjusting the p-values for multiple testing. Default is "BH" for the Benjamini-Hochberg method.

cutoff

Cutoff value for p-values and adjusted p-values. Default is 0.05.

lfc

Minimum absolute log2-fold change to use as threshold for differential expression. Default is 1.

n_top

The number of top differentially expressed proteins to save in the "TopHits.txt" file. Default is 20.

Details

It is important that the data is first log-transformed and, ideally, imputed and normalized before performing differential expression analysis.
save_output saves the complete results table from the differential expression analysis.
save_tophits first subsets the results to those with absolute log fold change of more than 1, performs multiple testing correction with the method specified in adj_method, and outputs the top n_top results based on lowest p-value and adjusted p-value.
If the number of hits with absolute log fold change of more than 1 is less than n_top, find_dep prints only those with absolute log fold change > 1 to "TopHits.txt".
If the file_path is not specified, text files will be saved in a temporary directory.
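The top-hit selection described above can be sketched with base R alone. The snippet follows the order given in the text (subset by fold change, adjust p-values, then rank); it does not call limma and is not the package's code. The results table, its column names, and the values are hypothetical.

```r
# Hypothetical results table with log2 fold changes and raw p-values.
res <- data.frame(
  protein = c("P1", "P2", "P3", "P4", "P5"),
  logFC   = c(2.5, -1.8, 0.4, 1.2, -3.0),
  P.Value = c(0.001, 0.02, 0.0005, 0.04, 0.2)
)
lfc <- 1
n_top <- 2

# Step 1: keep proteins whose absolute log fold change is above the threshold.
hits <- res[abs(res$logFC) > lfc, ]

# Step 2: adjust the p-values of the remaining proteins (Benjamini-Hochberg).
hits$adj.P.Val <- p.adjust(hits$P.Value, method = "BH")

# Step 3: rank by p-value and keep at most n_top rows.
hits <- hits[order(hits$P.Value), ]
top_hits <- head(hits, n_top)
top_hits
```

P3 has the smallest p-value but is dropped in step 1 because its fold change is too small, which is why the fold-change threshold and the p-value cutoff should be read together.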

Value

A fit_df object, which is similar to a limma fit object.

Author(s)

Chathurani Ranathunge

References

Ritchie, Matthew E., et al. "limma powers differential expression analyses for RNA-sequencing and microarray studies." Nucleic acids research 43.7 (2015): e47-e47.

Examples

## Perform differential expression analysis using default settings
fit_df1 <- find_dep(ecoli_norm_df)

## Change p-value and adjusted p-value cutoff
fit_df2 <- find_dep(ecoli_norm_df, cutoff = 0.1)

Heatmap of differentially expressed proteins

Description

This function generates a heatmap to visualize differentially expressed proteins between groups.

Usage

heatmap_de(
  fit_df,
  df,
  adj_method = "BH",
  cutoff = 0.05,
  lfc = 1,
  sig = "adjP",
  n_top = 20,
  palette = "viridis",
  text_size = 10,
  save = FALSE,
  file_path = NULL,
  file_name = "HeatmapDE",
  file_type = "pdf",
  dpi = 80,
  plot_height = 7,
  plot_width = 7
)

Arguments

fit_df

A fit_df object from performing find_dep.

df

The norm_df object or the imp_df object from which the fit_df object was obtained.

adj_method

Method used for adjusting the p-values for multiple testing. Default is "BH".

cutoff

Cutoff value for p-values and adjusted p-values. Default is 0.05.

lfc

Minimum absolute log2-fold change to use as threshold for differential expression. Default is 1.

sig

Criteria to denote significance. Choices are "adjP" (default) for adjusted p-value or "P" for p-value.

n_top

Number of top hits to include in the heat map.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

text_size

Text size for axis text, labels etc.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_path

A string containing the directory path to save the file.

file_name

File name to save the plot. Default is "HeatmapDE".

file_type

File type to save the plot. Default is "pdf".

dpi

Plot resolution. Default is 80.

plot_height

Height of the plot. Default is 7.

plot_width

Width of the plot. Default is 7.

Details

By default the tiles in the heatmap are reordered by intensity valuesalong both axes (x axis = samples, y axis = proteins).

Value

A ggplot2 plot object.

Author(s)

Chathurani Ranathunge

Examples

## Build a heatmap of differentially expressed proteins using the provided
## example fit_df and norm_df data objects
heatmap_de(covid_fit_df, covid_norm_df)

## Create a heatmap with P-value of 0.05 and log fold change of 1 as
## significance criteria.
heatmap_de(covid_fit_df, covid_norm_df, cutoff = 0.05, sig = "P")

## Visualize the top 30 differentially expressed proteins in the heatmap and
## change the color palette
heatmap_de(covid_fit_df, covid_norm_df,
  cutoff = 0.05, sig = "P", n_top = 30,
  palette = "magma"
)

Visualize missing data

Description

This function visualizes the patterns of missing value occurrence using a heatmap.

Usage

heatmap_na(
  raw_df,
  protein_range,
  sample_range,
  reorder_x = FALSE,
  reorder_y = FALSE,
  x_fun = mean,
  y_fun = mean,
  palette = "viridis",
  label_proteins = FALSE,
  text_size = 10,
  save = FALSE,
  file_type = "pdf",
  file_path = NULL,
  file_name = "Missing_data_heatmap",
  plot_width = 15,
  plot_height = 15,
  dpi = 80
)

Arguments

raw_df

A raw_df object (output from create_df).

protein_range

The range or subset of proteins (rows) to plot. If not provided, all the proteins (rows) in the data frame will be used.

sample_range

The range of samples to plot. If not provided, all the samples (columns) in the data frame will be used.

reorder_x

Logical. If TRUE, samples on the x axis are reordered using the function given in x_fun. Default is FALSE.

reorder_y

Logical. If TRUE, proteins on the y axis are reordered using the function given in y_fun. Default is FALSE.

x_fun

Function to reorder samples along the x axis. Possible options are mean and sum. Default is mean.

y_fun

Function to reorder proteins along the y axis. Possible options are mean and sum. Default is mean.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

label_proteins

If TRUE, proteins on the y axis will be labeled with their Majority Protein IDs. Default is FALSE.

text_size

Text size for axis labels. Default is 10.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_type

File type to save the heatmap. Default is "pdf".

file_path

A string containing the directory path to save the file.

file_name

File name to save the heatmap. Default is "Missing_data_heatmap".

plot_width

Width of the plot. Default is 15.

plot_height

Height of the plot. Default is 15.

dpi

Plot resolution. Default is 80.

Details

This function visualizes patterns of missing value occurrence using a heatmap. The user can choose to reorder the axes using the available functions (x_fun, y_fun) to better understand the underlying cause of missing data.
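Reordering helps because missing values in label-free data often cluster among low-intensity proteins, and sorting by intensity makes such a pattern visible. The base R sketch below builds a missingness matrix and reorders it by mean intensity. It illustrates the idea only: the toy data are hypothetical, and ignoring NAs when computing the means is an assumption rather than documented package behavior.

```r
# Toy log-intensities (hypothetical).
toy <- matrix(
  c(20, NA, 22,
    NA, NA, 19,
    18, 17, 21),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3"), c("H_1", "H_2", "L_1"))
)
x_fun <- mean  # applied to samples (columns)
y_fun <- mean  # applied to proteins (rows)

# TRUE marks a missing value; this is what the heatmap tiles would encode.
is_missing <- is.na(toy)

# Order samples and proteins by their summary intensity, ignoring NAs (an assumption).
col_order <- order(apply(toy, 2, x_fun, na.rm = TRUE))
row_order <- order(apply(toy, 1, y_fun, na.rm = TRUE))

reordered <- is_missing[row_order, col_order]
reordered
```

Swapping mean for sum changes the ordering criterion in the same way that x_fun = sum and y_fun = sum would.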

Value

A ggplot2 plot object.

Author(s)

Chathurani Ranathunge

Examples

## Generate a raw_df object with default settings. No technical replicates.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt"
)

## Missing data heatmap with default settings.
heatmap_na(raw_df)

## Missing data heatmap with x and y axes reordered by the mean (default) of
## protein intensity.
heatmap_na(raw_df,
  reorder_x = TRUE, reorder_y = TRUE
)

## Missing data heatmap with x and y axes reordered by the sum of
## protein intensity.
heatmap_na(raw_df,
  reorder_x = TRUE, reorder_y = TRUE, x_fun = sum,
  y_fun = sum
)

## Missing data heatmap for a subset of the proteins with x and y axes
## reordered by the mean (default) of protein intensity and the y axis
## labeled with protein IDs.
heatmap_na(raw_df,
  protein_range = 1:30,
  reorder_x = TRUE, reorder_y = TRUE,
  label_proteins = TRUE
)

Impute missing values

Description

This function imputes missing values using a user-specified imputation method.

Usage

impute_na(
  df,
  method = "minProb",
  tune_sigma = 1,
  q = 0.01,
  maxiter = 10,
  ntree = 20,
  n_pcs = 2,
  seed = NULL
)

Arguments

df

A raw_df object (output of create_df) containing missing values or a norm_df object after performing normalization.

method

Imputation method to use. Default is "minProb". Other available methods are "minDet", "RF", "kNN", and "SVD".

tune_sigma

A scalar used in the "minProb" method for controlling the standard deviation of the Gaussian distribution from which random values are drawn for imputation. Default is 1.

q

A scalar used in "minProb" and "minDet" methods to obtain a low intensity value for imputation. q should be set to a very low value. Default is 0.01.

maxiter

Maximum number of iterations to be performed when using the "RF" method. Default is 10.

ntree

Number of trees to grow in each forest when using the "RF" method. Default is 20.

n_pcs

Number of principal components to calculate when using the "SVD" method. Default is 2.

seed

Numerical. Random number seed. Default is NULL.

Details

Ideally, you should first remove proteins with high levels of missing data using the filterbygroup_na function before running impute_na on the raw_df object or the norm_df object.
The impute_na function imputes missing values using a user-specified imputation method from the available options: minProb, minDet, kNN, RF, and SVD.
Note: Some imputation methods may require that the data be normalized prior to imputation.
Make sure to fix the random number seed with seed for reproducibility.
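To give a feel for what q controls, here is a base R sketch of a deterministic minimum-style imputation. It assumes that such a method replaces each sample's missing values with a low quantile (q) of that sample's observed values. That reading is an assumption made for illustration; the sketch is not the code impute_na runs, and the toy data are hypothetical.

```r
# Toy log-intensities with missing values (hypothetical).
toy <- matrix(
  c(20, 18,
    NA, 19,
    24, NA,
    22, 21),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("P", 1:4), c("H_1", "L_1"))
)
q <- 0.01

# For each sample (column), fill NAs with the q-th quantile of its observed values.
imputed <- apply(toy, 2, function(x) {
  x[is.na(x)] <- quantile(x, probs = q, na.rm = TRUE)
  x
})
imputed
```

Because q is small, the filled-in values sit just above each sample's lowest observed intensity, which encodes the assumption that values are missing because they fell below the detection limit.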

Value

An imp_df object, which is a data frame of protein intensities with no missing values.

Author(s)

Chathurani Ranathunge

References

Lazar, Cosmin, et al. "Accounting for the multiple natures of missing values in label-free quantitative proteomics data sets to compare imputation strategies." Journal of proteome research 15.4 (2016): 1116-1125.

Examples

## Generate a raw_df object with default settings. No technical replicates.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt"
)

## Impute missing values in the data frame using the default minProb
## method.
imp_df1 <- impute_na(raw_df, seed = 3312)

## Using the kNN method.
imp_df2 <- impute_na(raw_df, method = "kNN", seed = 3312)

## Using the SVD method with n_pcs set to 3.
imp_df3 <- impute_na(raw_df, method = "SVD", n_pcs = 3, seed = 3312)

## Using the minDet method with q set at 0.001.
imp_df4 <- impute_na(raw_df, method = "minDet", q = 0.001, seed = 3312)

## Impute a normalized data set using the kNN method
imp_df5 <- impute_na(ecoli_norm_df, method = "kNN")

Visualize the impact of imputation

Description

This function generates density plots to visualize the impact of missing data imputation on the data.

Usage

impute_plot(
  original,
  imputed,
  global = TRUE,
  text_size = 10,
  palette = "viridis",
  n_row,
  n_col,
  save = FALSE,
  file_path = NULL,
  file_name = "Impute_plot",
  file_type = "pdf",
  plot_width = 7,
  plot_height = 7,
  dpi = 80
)

Arguments

original

A raw_df object (output of create_df) containing missing values or a norm_df object containing normalized protein intensity data.

imputed

An imp_df object obtained from running impute_na on the same data frame provided as original.

global

Logical. If TRUE (default), a global density plot is produced. If FALSE, sample-wise density plots are produced.

text_size

Text size for plot labels, axis labels etc. Default is 10.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

n_row

Used if global = FALSE to indicate the number of rows to print the plots.

n_col

Used if global = FALSE to indicate the number of columns to print the plots.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_path

A string containing the directory path to save the file.

file_name

File name to save the density plot/s. Default is "Impute_plot".

file_type

File type to save the density plot/s. Default is "pdf".

plot_width

Width of the plot. Default is 7.

plot_height

Height of the plot. Default is 7.

dpi

Plot resolution. Default is 80.

Details

Given two data frames, one with missing values and the other an imputed data frame (imp_df object) of the same data set, impute_plot generates global or sample-wise density plots to visualize the impact of imputation on the data set.
Note: when the sample-wise option is selected (global = FALSE), n_col and n_row can be used to specify the number of columns and rows to print the plots.
If you choose to specify n_row and n_col, make sure that n_row * n_col matches the total number of samples in the data frame.

Value

A ggplot2 plot object.

Author(s)

Chathurani Ranathunge

Examples

## Generate a raw_df object with default settings. No technical replicates.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt"
)

## Impute missing values in the data frame using the default minProb
## method.
imp_df <- impute_na(raw_df)

## Visualize the impact of missing data imputation with a global density
## plot.
impute_plot(original = raw_df, imputed = imp_df)

## Make sample-wise density plots
impute_plot(raw_df, imp_df, global = FALSE)

## Print plots in user-specified numbers of rows and columns
impute_plot(raw_df, imp_df, global = FALSE, n_col = 2, n_row = 3)

Visualize the effect of normalization

Description

This function visualizes the impact of normalization on the data.

Usage

norm_plot(
  original,
  normalized,
  type = "box",
  text_size = 10,
  palette = "viridis",
  save = FALSE,
  file_path = NULL,
  file_name = "Norm_plot",
  file_type = "pdf",
  dpi = 80,
  plot_width = 10,
  plot_height = 7
)

Arguments

original

A raw_df object (output of create_df) containing missing values, or an imp_df object after imputing the missing values with impute_na.

normalized

A norm_df object after normalizing the data frame provided as original using normalize_data.

type

Type of plot to generate. Choices are "box" or "density". Default is "box".

text_size

Text size for plot labels, axis labels etc. Default is 10.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_path

A string containing the directory path to save the file.

file_name

File name to save the plot. Default is "Norm_plot".

file_type

File type to save the plot. Default is "pdf".

dpi

Plot resolution. Default is 80.

plot_width

Width of the plot. Default is 10.

plot_height

Height of the plot. Default is 7.

Details

Given two data frames, one with data prior to normalization (original), and the other, after normalization (normalized), norm_plot generates side-by-side plots to visualize the effect of normalization on the protein intensity data.

Value

A ggplot2 plot object.

Author(s)

Chathurani Ranathunge

Examples

## Generate a raw_df object with default settings. No technical replicates.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt"
)

## Impute missing values in the data frame using the default minProb
## method.
imp_df <- impute_na(raw_df)

## Normalize the imp_df object using the default quantile method
norm_df <- normalize_data(imp_df)

## Visualize normalization using box plots
norm_plot(original = imp_df, normalized = norm_df)

## Visualize normalization using density plots
norm_plot(imp_df, norm_df, type = "density")

Normalize intensity data

Description

This function normalizes data using a user-specified normalization method.

Usage

normalize_data(df, method = "quantile")

Arguments

df

An imp_df object with missing values imputed using impute_na or a raw_df object containing missing values.

method

Name of the normalization method to use. Choices are "none", "scale", "quantile", or "cyclicloess". Default is "quantile".

Details

normalize_data is a wrapper function around the normalizeBetweenArrays function from the limma package.
This function normalizes intensity values to achieve consistency among samples.
It assumes that the intensities in the data frame have been log-transformed. It is therefore important to make sure that create_df was run with log_tr = TRUE (default) when creating the raw_df object.
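For readers unfamiliar with quantile normalization, the base R sketch below shows the core idea on a complete toy matrix: every sample is forced onto the same distribution, namely the mean of the sorted columns. This is a simplified illustration, not limma's implementation; it assumes no missing values and no ties, and the toy data are hypothetical.

```r
# Toy log-intensities without NAs or ties (hypothetical).
toy <- matrix(
  c(5, 4, 3,
    2, 1, 4,
    3, 3, 6,
    4, 2, 8),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("P", 1:4), c("S1", "S2", "S3"))
)

# Target distribution: the mean of the k-th smallest value across samples.
sorted <- apply(toy, 2, sort)
target <- unname(rowMeans(sorted))

# Replace each value with the target value of the same rank within its sample.
normalized <- apply(toy, 2, function(x) target[rank(x)])
rownames(normalized) <- rownames(toy)
normalized
```

After this step every column contains exactly the same set of values, only arranged according to each sample's original ranking.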

Value

A norm_df object, which is a data frame of normalized protein intensities.

Author(s)

Chathurani Ranathunge

Examples

## Generate a raw_df object with default settings. No technical replicates.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt"
)

## Impute missing values in the data frame using the default minProb
## method prior to normalization.
imp_df <- impute_na(raw_df)

## Normalize the imp_df object using the default quantile method
norm_df1 <- normalize_data(imp_df)

## Use the cyclicloess method
norm_df2 <- normalize_data(imp_df, method = "cyclicloess")

## Normalize data in the raw_df object prior to imputation.
norm_df3 <- normalize_data(raw_df)

Proteins that are only expressed in a given group

Description

This function outputs a list of proteins that are only expressed (present) in one user-specified group while not expressed (completely absent) in another user-specified group.

Usage

onegroup_only(
  raw_df,
  abs_group,
  pres_group,
  set_na = 0.34,
  save = FALSE,
  file_path = NULL
)

Arguments

raw_df

A raw_df object (output of create_df).

abs_group

Name of the group in which proteins are not expressed.

pres_group

Name of the group in which proteins are expressed.

set_na

The proportion of missing data allowed in pres_group. Default is 0.34 (about one third of the samples in the group).

save

Logical. If TRUE, saves the output in a text file named "Group_pres_group_only.txt". Default is FALSE.

file_path

A string containing the directory path to save the file.

Details

Note: The onegroup_only function assumes that column names in the raw_df object follow the "Group_UniqueSampleID" notation. (Use head(raw_df) to check the structure of your raw_df object.)

Given a pair of groups, the onegroup_only function finds proteins that are only expressed in pres_group while completely absent or not expressed in abs_group.
When save = TRUE, a text file containing majority protein IDs will be saved in a temporary directory if file_path is not specified.
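The selection rule described above can be sketched in base R on a toy intensity matrix. This is an illustration rather than promor's own code: find_onegroup_only and the toy values are hypothetical, and the sketch assumes that the set_na threshold is inclusive (missing proportion <= set_na), which the text above does not state.

```r
# Toy matrix with "Group_UniqueSampleID" column names (made-up values)
toy <- matrix(
  c(20.1,   NA, 19.8,   NA,   NA,   NA,
      NA,   NA,   NA, 21.0, 20.5,   NA,
    18.2, 18.9, 19.1, 18.5,   NA, 18.8),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("protA", "protB", "protC"),
    c("L_1", "L_2", "L_3", "H_1", "H_2", "H_3")
  )
)

# Hypothetical helper: proteins absent in abs_group but present in pres_group
find_onegroup_only <- function(mat, abs_group, pres_group, set_na = 0.34) {
  # Group label = text before the first underscore of each column name
  group <- sub("_.*$", "", colnames(mat))
  # Completely absent: no observed value in any abs_group sample
  absent <- rowSums(!is.na(mat[, group == abs_group, drop = FALSE])) == 0
  # Present: proportion of missing values in pres_group within the limit
  miss_prop <- rowMeans(is.na(mat[, group == pres_group, drop = FALSE]))
  rownames(mat)[absent & miss_prop <= set_na]
}

only_in_L <- find_onegroup_only(toy, abs_group = "H", pres_group = "L")
only_in_H <- find_onegroup_only(toy, abs_group = "L", pres_group = "H")
```

Here protA is reported for group L and protB for group H, while protC is observed in both groups and is reported for neither.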

Value

A list of majority protein IDs.

Author(s)

Chathurani Ranathunge

Examples

## Generate a raw_df object with default settings. No technical replicates.
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg1.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed1.txt"
)

## Find the proteins only expressed in group L, but absent in group H.
onegroup_only(raw_df, abs_group = "H", pres_group = "L")

Model performance plot

Description

This function generates plots to visualize model performance.

Usage

performance_plot(
  model_list,
  type = "box",
  text_size = 10,
  palette = "viridis",
  save = FALSE,
  file_path = NULL,
  file_name = "Performance_plot",
  file_type = "pdf",
  plot_width = 7,
  plot_height = 7,
  dpi = 80
)

Arguments

model_list

A model_list object from performing train_models.

type

Type of plot to generate. Choices are "box" or "dot". Default is "box" for boxplots.

text_size

Text size for plot labels, axis labels etc. Default is 10.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_path

A string containing the directory path to save the file.

file_name

File name to save the plot. Default is "Performance_plot".

file_type

File type to save the plot. Default is "pdf".

plot_width

Width of the plot. Default is 7.

plot_height

Height of the plot. Default is 7.

dpi

Plot resolution. Default is 80.

Details

performance_plot uses resampling results from models included in the model_list to generate plots showing model performance.
The default metrics used for classification-based models are "Accuracy" and "Kappa".
These metric types can be changed by providing additional arguments to the train_models function. See train and trainControl for more information.
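For reference, the two default metrics can be computed by hand from observed and predicted classes. The sketch below uses base R only. accuracy_kappa and the toy vectors are hypothetical and are not part of promor or caret; Kappa is computed as Cohen's kappa, (p_observed - p_expected) / (1 - p_expected).

```r
# Hypothetical helper: accuracy and Cohen's kappa from two class vectors
accuracy_kappa <- function(observed, predicted) {
  lv <- sort(union(unique(as.character(observed)),
                   unique(as.character(predicted))))
  tab <- table(factor(observed, levels = lv), factor(predicted, levels = lv))
  n <- sum(tab)
  p_obs <- sum(diag(tab)) / n                       # observed agreement
  p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
  c(Accuracy = p_obs, Kappa = (p_obs - p_exp) / (1 - p_exp))
}

# Toy held-out fold: 6 samples, one misclassified (made-up labels)
observed  <- c("S", "S", "S", "N", "N", "N")
predicted <- c("S", "S", "N", "N", "N", "N")
metrics <- accuracy_kappa(observed, predicted)
metrics
```

Kappa is lower than accuracy here because part of the agreement would be expected by chance given the class frequencies.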

Value

A ggplot2 object.

Author(s)

Chathurani Ranathunge

Examples

## Create a model_df object
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Split the data frame into training and test data sets
covid_split_df <- split_data(covid_model_df)

## Fit models based on the default list of machine learning (ML) algorithms
covid_model_list <- train_models(covid_split_df)

## Generate box plots to visualize performance of different ML algorithms
performance_plot(covid_model_list)

## Generate dot plots
performance_plot(covid_model_list, type = "dot")

## Change color palette
performance_plot(covid_model_list, type = "dot", palette = "inferno")

Pre-process protein intensity data for modeling

Description

This function pre-processes protein intensity data from the top differentially expressed proteins identified with find_dep for modeling.

Usage

pre_process(
  fit_df,
  norm_df,
  sig = "adjP",
  sig_cutoff = 0.05,
  fc = 1,
  n_top = 20,
  find_highcorr = TRUE,
  corr_cutoff = 0.9,
  save_corrmatrix = FALSE,
  file_path = NULL,
  rem_highcorr = TRUE
)

Arguments

fit_df

A fit_df object from performing find_dep.

norm_df

The norm_df object from which the fit_df object was obtained.

sig

Criteria to denote significance in differential expression. Choices are "adjP" (default) for adjusted p-value or "P" for p-value.

sig_cutoff

Cutoff value for p-values and adjusted p-values in differential expression. Default is 0.05.

fc

Minimum absolute log-fold change to use as threshold for differential expression. Default is 1.

n_top

The number of top hits from find_dep to be used in modeling. Default is 20.

find_highcorr

Logical. If TRUE (default), finds highly correlated proteins.

corr_cutoff

A numeric value specifying the correlation cutoff. Default is 0.90.

save_corrmatrix

Logical. If TRUE, saves a copy of the protein correlation matrix in a tab-delimited text file labeled "Protein_correlation.txt" in the directory specified by file_path.

file_path

A string containing the directory path to save the file.

rem_highcorr

Logical. If TRUE (default), removes highly correlated proteins (predictors or features).

Details

This function creates a data frame that contains protein intensitiesfor a user-specified number of top differentially expressed proteins.

Using find_highcorr = TRUE, highly correlated proteins can be identified, and they can be removed with rem_highcorr = TRUE.
Note: Most models will benefit from reducing correlation between proteins (predictors or features). We therefore recommend removing those proteins at this stage to reduce pairwise correlation.
If no or few proteins meet the significance threshold for differential expression, you may adjust sig, fc, and/or sig_cutoff accordingly to obtain more proteins for modeling.
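What "highly correlated" means in practice can be shown with a short base R sketch that lists protein pairs whose absolute Pearson correlation exceeds a cutoff. This only illustrates the concept: the toy data frame is made up, and the way pre_process itself selects which member of a pair to drop is not described here.

```r
# Toy intensities: protB tracks protA closely, protC does not (made-up values)
toy <- data.frame(
  protA = 1:10,
  protB = (1:10) + rep(c(0.1, -0.1), times = 5),
  protC = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
)

corr_cutoff <- 0.9
cm <- cor(toy)  # pairwise Pearson correlation between proteins

# Keep each pair once (upper triangle) and apply the cutoff
high <- which(abs(cm) > corr_cutoff & upper.tri(cm), arr.ind = TRUE)
high_pairs <- data.frame(
  protein_1 = rownames(cm)[high[, "row"]],
  protein_2 = colnames(cm)[high[, "col"]],
  r = cm[high]
)
high_pairs
```

Only the protA/protB pair is flagged in this toy example, so removing one of the two would be enough to bring all pairwise correlations under the cutoff.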

Value

A model_df object, which is a data frame of protein intensities with proteins indicated by columns.

Author(s)

Chathurani Ranathunge

Examples

## Create a model_df object with default settings.
covid_model_df1 <- pre_process(fit_df = covid_fit_df, norm_df = covid_norm_df)

## Change the correlation cutoff.
covid_model_df2 <- pre_process(covid_fit_df, covid_norm_df, corr_cutoff = 0.95)

## Change the significance criteria to include more proteins
covid_model_df3 <- pre_process(covid_fit_df, covid_norm_df, sig = "P")

## Change the number of top differentially expressed proteins to include
covid_model_df4 <- pre_process(covid_fit_df, covid_norm_df, sig = "P", n_top = 24)

Remove user-specified proteins (features) from a data frame

Description

This function removes user-specified proteins from a model_df object.

Usage

rem_feature(model_df, rem_protein)

Arguments

model_df

A model_df object.

rem_protein

Name of the protein to remove.

Details

After visualizing protein intensity variation among conditions with feature_plot or after assessing the importance of each protein in models using varimp_plot, you can choose to remove specific proteins (features) from the data frame.
For example, you can choose to remove a protein from the model_df object if the protein does not show distinct patterns of variation among conditions. This protein may show mostly overlapping distributions in the feature plots.
Another instance would be removing a protein that has very low variable importance in the models built using train_models. You can visualize variable importance using varimp_plot.

Value

A model_df object.

Author(s)

Chathurani Ranathunge

Examples

covid_model_df <- pre_process(fit_df = covid_fit_df, norm_df = covid_norm_df)

## Remove sp|P22352|GPX3_HUMAN protein from the model_df object
covid_model_df1 <- rem_feature(covid_model_df, rem_protein = "sp|P22352|GPX3_HUMAN")

Remove user-specified samples

Description

This function removes user-specified samples from the data frame.

Usage

rem_sample(raw_df, rem)

Arguments

raw_df

A raw_df object.

rem

Name of the sample to remove.

Details

rem_sample assumes that sample names follow the "Group_UniqueSampleID_TechnicalReplicate" notation. (Use head(raw_df) to see the structure of the raw_df object.)
If all the technical replicates representing a sample need to be removed, provide "Group_UniqueSampleID" as rem.
If a specific technical replicate needs to be removed, for example because it shows weak correlation with other technical replicates, you can remove that particular technical replicate by providing "Group_UniqueSampleID_TechnicalReplicate" as rem.
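The two ways of specifying rem can be mimicked with simple column-name matching in base R. This is a sketch under assumptions, not promor's matching logic: drop_samples is a hypothetical helper, and it treats rem either as a full column name or as a prefix followed by an underscore, so that "WT_4" does not accidentally match "WT_41".

```r
# Toy data frame with "Group_UniqueSampleID_TechnicalReplicate" columns
toy <- data.frame(
  WT_4_1  = c(20.1, 18.3),
  WT_4_2  = c(20.4, 18.1),
  WT_41_1 = c(19.9, 18.6),
  KO_1_1  = c(22.0, 17.5),
  row.names = c("protA", "protB")
)

# Hypothetical helper: drop one replicate or all replicates of a sample
drop_samples <- function(df, rem) {
  nm <- colnames(df)
  hit <- nm == rem | startsWith(nm, paste0(rem, "_"))
  df[, !hit, drop = FALSE]
}

no_wt4      <- drop_samples(toy, "WT_4")    # removes WT_4_1 and WT_4_2
no_wt4_rep2 <- drop_samples(toy, "WT_4_2")  # removes only WT_4_2
```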

Value

A raw_df object.

Author(s)

Chathurani Ranathunge

Examples

## Use a data set containing technical replicates to create a raw_df object
raw_df <- create_df(
  prot_groups = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/pg2.txt",
  exp_design = "https://raw.githubusercontent.com/caranathunge/promor_example_data/main/ed2.txt",
  tech_reps = TRUE
)

## Check the first few rows of the raw_df object
head(raw_df)

## Remove all technical replicates of "WT_4"
raw_df1 <- rem_sample(raw_df, "WT_4")

## Remove only technical replicate number 2 of "WT_4"
raw_df2 <- rem_sample(raw_df, "WT_4_2")

ROC plot

Description

This function generates Receiver Operating Characteristic (ROC) curves to evaluate models.

Usage

roc_plot(
  probability_list,
  split_df,
  ...,
  multiple_plots = TRUE,
  text_size = 10,
  palette = "viridis",
  save = FALSE,
  file_path = NULL,
  file_name = "ROC_plot",
  file_type = "pdf",
  plot_width = 7,
  plot_height = 7,
  dpi = 80
)

Arguments

probability_list

A probability_list object from performing test_models with type = "prob".

split_df

A split_df object from performing split_data.

...

Additional arguments to be passed on to roc.

multiple_plots

Logical. If FALSE, plots all ROC curves representing algorithms included in the probability_list in a single plot.

text_size

Text size for plot labels, axis labels etc. Default is 10.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_path

A string containing the directory path to save the file.

file_name

File name to save the plot. Default is "ROC_plot".

file_type

File type to save the plot. Default is "pdf".

plot_width

Width of the plot. Default is 7.

plot_height

Height of the plot. Default is 7.

dpi

Plot resolution. Default is 80.

Details

roc_plot first uses probabilities generated during test_models to build a ROC object.
Next, relevant information is extracted from the ROC object to plot the ROC curves.
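The information carried by a ROC object can be illustrated with a base R sketch that turns class probabilities into curve coordinates and an area under the curve. This does not reproduce what roc_plot or the pROC package do internally: roc_points, auc_trapezoid, and the toy probabilities are all hypothetical.

```r
# Hypothetical helper: ROC coordinates from predicted probabilities
roc_points <- function(prob, truth, positive) {
  is_pos <- truth == positive
  # Start above every score so the curve begins at (0, 0)
  thresholds <- c(Inf, sort(unique(prob), decreasing = TRUE))
  tpr <- sapply(thresholds, function(t) sum(prob >= t & is_pos) / sum(is_pos))
  fpr <- sapply(thresholds, function(t) sum(prob >= t & !is_pos) / sum(!is_pos))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

# Hypothetical helper: area under the curve by the trapezoidal rule
auc_trapezoid <- function(roc) {
  sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
}

# Toy test set: probability of class "S" and the true class (made-up values)
prob  <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.2)
truth <- c("S", "S", "N", "S", "N", "N")

roc <- roc_points(prob, truth, positive = "S")
auc <- auc_trapezoid(roc)
```

Plotting roc$tpr against roc$fpr gives the curve. The AUC equals the fraction of positive/negative sample pairs in which the positive sample received the higher probability (8 of 9 pairs in this toy example).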

Value

A ggplot2 object.

Author(s)

Chathurani Ranathunge

Examples

## Create a model_df object
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Split the data frame into training and test data sets
covid_split_df <- split_data(covid_model_df)

## Fit models using the default list of machine learning (ML) algorithms
covid_model_list <- train_models(covid_split_df)

## Test a list of models on a test data set and output class probabilities.
covid_prob_list <- test_models(covid_model_list, covid_split_df, type = "prob")

## Plot ROC curves separately for each ML algorithm
roc_plot(covid_prob_list, covid_split_df)

## Plot all ROC curves in one plot
roc_plot(covid_prob_list, covid_split_df, multiple_plots = FALSE)

## Change color palette
roc_plot(covid_prob_list, covid_split_df, palette = "plasma")

Split the data frame to create training and test data

Description

This function can be used to create balanced splits of the protein intensity data in a model_df object to create training and test data.

Usage

split_data(model_df, train_size = 0.8, seed = NULL)

Arguments

model_df

A model_df object from performing pre_process.

train_size

The size of the training data set as a proportion of the complete data set. Default is 0.8.

seed

Numerical. Random number seed. Default is NULL.

Details

This function splits the model_df object into training and test data sets using random sampling while preserving the original class distribution of the data. Make sure to fix the random number seed with seed for reproducibility.
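A class-preserving (stratified) split can be sketched in base R by sampling within each class separately. This shows the general technique only and is not how split_data is implemented: stratified_split, the Condition column name, and the toy data are hypothetical.

```r
# Hypothetical helper: sample train_size of the rows within each class
stratified_split <- function(df, class_col, train_size = 0.8, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  idx_by_class <- split(seq_len(nrow(df)), df[[class_col]])
  train_idx <- unlist(lapply(idx_by_class, function(idx) {
    idx[sample.int(length(idx), size = round(train_size * length(idx)))]
  }), use.names = FALSE)
  test_idx <- setdiff(seq_len(nrow(df)), train_idx)
  list(
    training = df[train_idx, , drop = FALSE],
    test = df[test_idx, , drop = FALSE]
  )
}

# Toy model data: 10 samples per class (made-up values)
toy <- data.frame(
  Condition = rep(c("Severe", "Non.Severe"), each = 10),
  protA = seq_len(20)
)

parts <- stratified_split(toy, "Condition", train_size = 0.8, seed = 8314)
table(parts$training$Condition)
table(parts$test$Condition)
```

With the seed fixed, repeated calls return the same rows, and both classes keep their 1:1 ratio in the training and test sets.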

Value

A list of data frames.

Author(s)

Chathurani Ranathunge

Examples

## Create a model_df object
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Split the data frame into training and test data sets using default settings
covid_split_df1 <- split_data(covid_model_df, seed = 8314)

## Split the data frame into training and test data sets with 70% of the
## data in training and 30% in test data sets
covid_split_df2 <- split_data(covid_model_df, train_size = 0.7, seed = 8314)

## Access training data set
covid_split_df1$training

## Access test data set
covid_split_df1$test

Test machine learning models on test data

Description

This function can be used to predict test data using models generated by different machine learning algorithms.

Usage

test_models(
  model_list,
  split_df,
  type = "prob",
  save_confusionmatrix = FALSE,
  file_path = NULL,
  ...
)

Arguments

model_list

A model_list object from performing train_models.

split_df

A split_df object from performing split_data.

type

Type of output. Set type as "prob" (default) to output class probabilities, and "raw" to output class predictions.

save_confusionmatrix

Logical. If TRUE, a tab-delimited text file ("Confusion_matrices.txt") with confusion matrices in the long-form data format will be saved in the directory specified by file_path. See below for more details.

file_path

A string containing the directory path to save the file.

...

Additional arguments to be passed on to predict.

Details

The test_models function uses models obtained from train_models to predict a given test data set.
Setting type = "raw" is required to obtain confusion matrices.
Setting type = "prob" (default) will output a list of probabilities that can be used to generate ROC curves using roc_plot.
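The "long-form" confusion matrix mentioned for save_confusionmatrix can be pictured with a base R sketch that tabulates raw class predictions against the reference classes and reshapes the table to one row per cell. The column names and toy labels below are assumptions for illustration; the exact layout of "Confusion_matrices.txt" is not specified above.

```r
# Toy raw predictions and reference classes (made-up labels)
lv <- c("Non.Severe", "Severe")
reference  <- factor(c("Severe", "Severe", "Severe",
                       "Non.Severe", "Non.Severe", "Non.Severe"), levels = lv)
prediction <- factor(c("Severe", "Severe", "Non.Severe",
                       "Non.Severe", "Non.Severe", "Non.Severe"), levels = lv)

# Wide confusion matrix: rows = predicted class, columns = reference class
cm_wide <- table(Prediction = prediction, Reference = reference)

# Long form: one row per Prediction/Reference combination with its count
cm_long <- as.data.frame(cm_wide)
cm_long
```

Class probabilities (type = "prob") cannot be tabulated this way without first choosing a threshold, which is consistent with type = "raw" being required for confusion matrices.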

Value

probability_list: If type = "prob", a list of data frames containing class probabilities for each method in the model_list will be returned.
prediction_list: If type = "raw", a list of factors containing class predictions for each method will be returned.

Author(s)

Chathurani Ranathunge

Examples

## Create a model_df object
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Split the data frame into training and test data sets
covid_split_df <- split_data(covid_model_df)

## Fit models using the default list of machine learning (ML) algorithms
covid_model_list <- train_models(covid_split_df)

## Test a list of models on a test data set and output class probabilities.
covid_prob_list <- test_models(model_list = covid_model_list, split_df = covid_split_df)

## Not run:
## Save confusion matrices in the working directory and output class predictions
covid_pred_list <- test_models(
  model_list = covid_model_list,
  split_df = covid_split_df,
  type = "raw",
  save_confusionmatrix = TRUE,
  file_path = "."
)
## End(Not run)

Train machine learning models on training data

Description

This function can be used to train models on protein intensity data using different machine learning algorithms.

Usage

train_models(
  split_df,
  resample_method = "repeatedcv",
  resample_iterations = 10,
  num_repeats = 3,
  algorithm_list,
  seed = NULL,
  ...
)

Arguments

split_df

A split_df object from performing split_data.

resample_method

The resampling method to use. Default is "repeatedcv" for repeated cross validation. See trainControl for details on other available methods.

resample_iterations

Number of resampling iterations. Default is 10.

num_repeats

The number of complete sets of folds to compute (for resample_method = "repeatedcv" only). Default is 3.

algorithm_list

A list of classification or regression algorithms to use. A full list of machine learning algorithms available through the caret package can be found here: http://topepo.github.io/caret/train-models-by-tag.html. See below for default options.

seed

Numerical. Random number seed. Default is NULL.

...

Additional arguments to be passed on to the train function in the caret package.

Details

The train_models function can be used to first define the control parameters to be used in training models, then calculate resampling-based performance measures for models based on a given set of machine-learning algorithms, and finally output the best model for each algorithm.
In the event that algorithm_list is not provided, a default list of five classification-based machine-learning algorithms will be used for building and training models. Default algorithm_list: "svmRadial", "rf", "glm", "xgbLinear", and "naive_bayes".
Note: Models that fail to build are removed from the output.
Make sure to fix the random number seed with seed for reproducibility.
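To give a sense of what repeated cross validation involves, the base R sketch below builds the training-row indices for every resample. It assumes that resample_iterations plays the role of the number of folds when resample_method = "repeatedcv", so the defaults would give 10 x 3 = 30 resamples; that mapping is an assumption, and make_repeated_folds is a hypothetical helper. Unlike caret, the sketch does not stratify folds by class.

```r
# Hypothetical helper: training indices for k-fold CV repeated several times
make_repeated_folds <- function(n, k = 10, repeats = 3, seed = NULL) {
  stopifnot(k >= 2, n >= k)
  if (!is.null(seed)) set.seed(seed)
  folds <- list()
  for (r in seq_len(repeats)) {
    # Randomly assign each of the n rows to one of k folds
    assignment <- sample(rep(seq_len(k), length.out = n))
    for (f in seq_len(k)) {
      # Train on every row that is not in the held-out fold f
      folds[[sprintf("Fold%02d.Rep%d", f, r)]] <- which(assignment != f)
    }
  }
  folds
}

folds <- make_repeated_folds(n = 50, k = 10, repeats = 3, seed = 351)
length(folds)        # number of resamples
lengths(folds)[1:3]  # training-set size of the first few resamples
```

Each model is fitted once per resample and scored on the held-out rows, which is why fixing the seed matters for reproducing the reported performance measures.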

Value

A list of class train for each machine-learning algorithm. See train for more information on accessing different elements of this list.

Author(s)

Chathurani Ranathunge

References

Kuhn, Max. "Building predictive models in R using the caret package." Journal of Statistical Software 28 (2008): 1-26.

Examples

## Create a model_df object
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Split the data frame into training and test data sets
covid_split_df <- split_data(covid_model_df, seed = 8314)

## Fit models based on the default list of machine learning (ML) algorithms
covid_model_list1 <- train_models(split_df = covid_split_df, seed = 351)

## Fit models using a user-specified list of ML algorithms.
covid_model_list2 <- train_models(
  covid_split_df,
  algorithm_list = c("svmRadial", "glmboost"),
  seed = 351
)

## Change resampling method and resampling iterations.
covid_model_list3 <- train_models(
  covid_split_df,
  resample_method = "cv",
  resample_iterations = 50,
  seed = 351
)

Variable importance plot

Description

This function visualizes variable importance in models.

Usage

varimp_plot(
  model_list,
  ...,
  type = "lollipop",
  text_size = 10,
  palette = "viridis",
  n_row,
  n_col,
  save = FALSE,
  file_path = NULL,
  file_name = "VarImp_plot",
  file_type = "pdf",
  dpi = 80,
  plot_width = 7,
  plot_height = 7
)

Arguments

model_list

A model_list object from performing train_models.

...

Additional arguments to be passed on to varImp.

type

Type of plot to generate. Choices are "bar" or "lollipop". Default is "lollipop".

text_size

Text size for plot labels, axis labels etc. Default is 10.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

n_row

Number of rows to print the plots.

n_col

Number of columns to print the plots.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_path

A string containing the directory path to save the file.

file_name

File name to save the plot. Default is "VarImp_plot".

file_type

File type to save the plot. Default is "pdf".

dpi

Plot resolution. Default is 80.

plot_width

Width of the plot. Default is 7.

plot_height

Height of the plot. Default is 7.

Details

varimp_plot produces a list of plots showing variable importance measures calculated from models generated with different machine-learning algorithms.
Note: Variables are ordered by variable importance in descending order, and by default, importance values are scaled to range between 0 and 100. This can be changed by specifying scale = FALSE. See varImp for more information.
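The effect of scaling and ordering can be seen on a toy vector of importance scores. The base R sketch below assumes plain min-max scaling to the 0 to 100 range; scale_importance and the scores are hypothetical, and the raw importance measure itself differs between algorithms.

```r
# Hypothetical helper: min-max scale to 0-100, then sort in descending order
scale_importance <- function(imp) {
  rng <- range(imp)
  scaled <- (imp - rng[1]) / (rng[2] - rng[1]) * 100
  sort(scaled, decreasing = TRUE)
}

# Toy unscaled importance scores for three proteins (made-up values)
raw_imp <- c(protA = 2, protB = 6, protC = 4)
scaled_imp <- scale_importance(raw_imp)
scaled_imp
```

Scaled values are only comparable within one model: the least important protein is always 0 and the most important is always 100, regardless of how large the raw differences were.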

Value

A list of ggplot2 objects.

Author(s)

Chathurani Ranathunge

Examples

## Create a model_df object
covid_model_df <- pre_process(covid_fit_df, covid_norm_df)

## Split the data frame into training and test data sets
covid_split_df <- split_data(covid_model_df)

## Fit models based on the default list of machine learning (ML) algorithms
covid_model_list <- train_models(covid_split_df)

## Variable importance - lollipop plots
varimp_plot(covid_model_list)

## Bar plots
varimp_plot(covid_model_list, type = "bar")

## Do not scale variable importance values
varimp_plot(covid_model_list, scale = FALSE)

## Change color palette
varimp_plot(covid_model_list, palette = "magma")

Volcano plot

Description

This function generates volcano plots to visualize differentially expressed proteins between groups.

Usage

volcano_plot(
  fit_df,
  adj_method = "BH",
  sig = "adjP",
  cutoff = 0.05,
  lfc = 1,
  line_fc = TRUE,
  line_p = TRUE,
  palette = "viridis",
  text_size = 10,
  label_top = FALSE,
  n_top = 10,
  save = FALSE,
  file_path = NULL,
  file_name = "Volcano_plot",
  file_type = "pdf",
  plot_height = 7,
  plot_width = 7,
  dpi = 80
)

Arguments

fit_df

A fit_df object from performing find_dep.

adj_method

Method used for adjusting the p-values for multiple testing. Default is "BH".

sig

Criteria to denote significance. Choices are "adjP" (default) for adjusted p-value or "P" for p-value.

cutoff

Cutoff value for p-values and adjusted p-values. Default is 0.05.

lfc

Minimum absolute log2-fold change to use as threshold for differential expression. Default is 1.

line_fc

Logical. If TRUE (default), a dotted line will be shown to indicate the lfc threshold in the plot.

line_p

Logical. If TRUE (default), a dotted line will be shown to indicate the p-value or adjusted p-value cutoff.

palette

Viridis color palette option for plots. Default is "viridis". See viridis for available options.

text_size

Text size for axis text, labels etc. Default is 10.

label_top

Logical. If TRUE, labels are added to the dots to indicate protein names. Default is FALSE.

n_top

The number of top hits to label with protein name when label_top = TRUE. Default is 10.

save

Logical. If TRUE, saves a copy of the plot in the directory provided in file_path.

file_path

A string containing the directory path to save the file.

file_name

File name to save the plot. Default is "Volcano_plot".

file_type

File type to save the plot. Default is "pdf".

plot_height

Height of the plot. Default is 7.

plot_width

Width of the plot. Default is 7.

dpi

Plot resolution. Default is 80.

Details

Volcano plots show log2-fold change on the x-axis, and based on the significance criteria chosen, either -log10(p-value) or -log10(adjusted p-value) on the y-axis.
volcano_plot requires a fit_df object from performing differential expression analysis with find_dep.
The user has the option to choose the criteria that denote significance.
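The coordinates and significance calls behind a volcano plot can be worked through on a toy table in base R. The column names (logFC, P.Value, adj.P.Val) and values below are made up for illustration, and the comparison operators (adjusted p-value < cutoff, absolute log2-fold change >= lfc) are assumptions, since the text above does not say whether the thresholds are inclusive.

```r
# Toy differential expression results (made-up values)
toy <- data.frame(
  protein = c("protA", "protB", "protC", "protD"),
  logFC   = c(2.1, -1.5, 0.3, 1.2),
  P.Value = c(0.0001, 0.004, 0.6, 0.03)
)

cutoff <- 0.05
lfc <- 1

# Benjamini-Hochberg adjustment, as selected by adj_method = "BH"
toy$adj.P.Val <- p.adjust(toy$P.Value, method = "BH")

# y-axis value when sig = "adjP"; x-axis is logFC itself
toy$neg_log10_adjP <- -log10(toy$adj.P.Val)

# Significant = passes both the p-value and the fold-change thresholds
toy$significant <- toy$adj.P.Val < cutoff & abs(toy$logFC) >= lfc
toy
```

With sig = "P" the same steps would use P.Value in place of adj.P.Val, which typically marks more proteins as significant.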

Value

A ggplot2 plot object.

Author(s)

Chathurani Ranathunge

Examples

## Create a volcano plot with default settings.
volcano_plot(ecoli_fit_df)

## Change significance criteria and cutoff
volcano_plot(ecoli_fit_df, cutoff = 0.1, sig = "P")

## Label top 30 differentially expressed proteins and
## change the color palette of the plot
volcano_plot(ecoli_fit_df, label_top = TRUE, n_top = 30, palette = "mako")
