🔎 R package for detecting damaged cells in single-cell RNA sequencing data


AlicenJoyHenning/DamageDetective


R-CMD-check | Build Status | Project Status: Active (the project has reached a stable, usable state and is being actively developed) | CRAN status

Content

Description | Installation | Quick start | Contribute | Authors | License | References

Description

Jump to the DamageDetective website

Damaged cells are an artifact of single-cell RNA sequencing (scRNA-seq) formed when cells succumb to stress before being sequenced. As a result, the gene expression data captured does not reflect biologically viable cells and introduces technical variability that is indistinguishable from functionally relevant variability. Filtering these cells is a standard task in scRNA-seq quality control (QC), though it lacks standardisation in practice.

The majority of approaches filter damaged cells according to deviations in cell-level QC metrics. This outlier-based detection implicitly assumes viable cells follow unimodal distributions across QC metrics, where deviation is synonymous with damage. This assumption falters in the context of heterogeneous data and risks introducing filtering bias related to cell type abundance. Recent methods address this by defining damage independently within each distinct distribution, where each distribution represents a cell population. This, however, assumes all distinct distributions are associated with viable cell populations and risks leaving abundant damage undetected and ultimately misclassified.

DamageDetective takes a different approach. Rather than detecting damage by measuring the extent to which cells deviate from one another, it measures the extent to which cells deviate from artificially damaged profiles of themselves, created by simulating cytoplasmic RNA escape, a characteristic of damage resulting from the loss of plasma membrane integrity. This is inspired by the approach of DoubletFinder, a high-performing tool for detecting doublets, another prominent scRNA-seq artifact.
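To make the idea of an artificially damaged profile concrete, here is a minimal base R sketch. It is not the DamageDetective simulation: the gene counts are made up, the `loss` proportion is hypothetical, and it assumes that mitochondrial transcripts are retained while cytoplasmic transcripts are lost, which is a simplification for illustration only.

```r
# Toy illustration only; NOT the DamageDetective implementation.
set.seed(42)

# Made-up counts for one cell: three mitochondrial genes and seven
# placeholder cytoplasmic genes (GENE1..GENE7 are hypothetical names).
genes <- c(paste0("MT-", c("ND1", "CO1", "ATP6")), paste0("GENE", 1:7))
counts <- c(30, 40, 30, rpois(7, lambda = 100))
names(counts) <- genes

is_mito <- grepl("^MT-", genes)
loss <- 0.6  # hypothetical proportion of cytoplasmic RNA that escapes

# Binomial thinning: each cytoplasmic transcript is kept with
# probability 1 - loss; mitochondrial counts are left untouched
# (an assumption of this sketch).
damaged <- counts
damaged[!is_mito] <- rbinom(sum(!is_mito),
                            size = counts[!is_mito],
                            prob = 1 - loss)

# The mitochondrial fraction of the profile before and after thinning.
mito_fraction <- function(x) sum(x[is_mito]) / sum(x)
c(original = mito_fraction(counts), damaged = mito_fraction(damaged))
```

In this sketch the damaged profile keeps its mitochondrial counts but loses cytoplasmic counts, so its mitochondrial fraction cannot decrease.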

Like DoubletFinder, DamageDetective uses principal component analysis to compute the proximity of true cells to artificial cells. This is calculated as the proportion (pANN) of a cell's nearest neighbours that are of artificial origin, reflecting the likelihood that the cell has experienced the same cytoplasmic RNA loss as its artificial neighbours, i.e., is damaged. This score, ranging from 0 to 1, provides an intuitive scale for filtering that is standardised across cell types, sample origin, and experimental design.
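The pANN calculation described above can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the expression values are simulated, and the number of principal components (5) and of neighbours (`k = 10`) are arbitrary choices made for this example.

```r
# Toy illustration of a pANN-style score; NOT the DamageDetective implementation.
set.seed(1)

n_damaged <- 10  # true cells that resemble the artificial profiles
n_viable  <- 50  # true cells that do not
n_art     <- 20  # artificial cells
n_genes   <- 50

# Simulated expression values (rows are cells). Artificial cells and the
# "damaged" true cells share a shifted mean, so they sit close together.
damaged_cells <- matrix(rnorm(n_damaged * n_genes, mean = 2), nrow = n_damaged)
viable_cells  <- matrix(rnorm(n_viable * n_genes), nrow = n_viable)
art_cells     <- matrix(rnorm(n_art * n_genes, mean = 2), nrow = n_art)

combined <- rbind(damaged_cells, viable_cells, art_cells)
is_artificial <- c(rep(FALSE, n_damaged + n_viable), rep(TRUE, n_art))

# Principal component analysis on true and artificial cells together.
pcs <- prcomp(combined, center = TRUE, scale. = FALSE)$x[, 1:5]

# Pairwise Euclidean distances in PC space; a cell is not its own neighbour.
d <- as.matrix(dist(pcs))
diag(d) <- Inf

# pANN: proportion of each cell's k nearest neighbours that are artificial.
k <- 10
pann <- apply(d, 1, function(row) mean(is_artificial[order(row)[1:k]]))

# Scores for the true cells only, each between 0 and 1.
true_scores <- pann[!is_artificial]
c(damaged = mean(true_scores[1:n_damaged]),
  viable  = mean(true_scores[(n_damaged + 1):(n_damaged + n_viable)]))
```

True cells that lie among the artificial cells receive a higher proportion than those that lie among other true cells, which is what makes the score usable for filtering.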


Installation

Install DamageDetective from CRAN (R >= 4.4.0),

install.packages('DamageDetective')

Or the latest development version on GitHub (R >= 3.5.0),

library(devtools)
devtools::install_github("AlicenJoyHenning/DamageDetective", build_vignettes = TRUE)

To verify installation, run the following to see if you can view the package vignette and the function help pages,

library(DamageDetective)
help(package = "DamageDetective")



Quick start

This demonstration can be followed immediately after loading the package using the internal dummy dataset. For examples with true data and more detailed explanations, please refer to the articles on the package website.

Prepare input

Damage detection is carried out by the detect_damage function, which accepts count matrices, Seurat or SingleCellExperiment objects, or alignment files (package tutorials) as input. We will demonstrate using a dummy count matrix, test_counts, a subset of the (kotliarov-pbmc-2020) PBMC dataset provided in the scRNAseq package.

library(DamageDetective)
library(Matrix)
data("test_counts", package = "DamageDetective")
dim(test_counts)

Expected outcome,

[1] 32738   500

Select parameters for damage detection

ribosome_penalty

While detect_damage requires only a count matrix as input, additional parameters control aspects of the function's computations. Of these, we recommend that ribosome_penalty be adjusted for each dataset using the select_penalty function as shown below,

penalty <- select_penalty(count_matrix = test_counts)
penalty

Expected outcome,

Testing penalty of 0.1...
Testing penalty of 0.15...
Testing penalty of 0.2...
Testing penalty of 0.25...
Stopping early: dTNN is no longer improving.
0.1

filter_threshold

DamageDetective filters cells by comparing their proximity scores to a threshold. By default, DamageDetective uses a threshold of 0.5, where values greater than 0.5 reflect more permissive filtering and values closer to 0 reflect more stringent filtering. We recommend the default, but suggest that any adjustments be informed by the plots that detect_damage produces when generate_plot = TRUE.
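As a rough illustration of how the threshold changes the stringency of filtering, the sketch below counts how many cells are retained at different thresholds. The scores and cell names are made up for this example, `n_retained` is a hypothetical helper rather than a package function, and the column names simply mirror the data frame layout shown later in this README.

```r
# Hypothetical damage scores for ten cells (made-up values for illustration).
scores <- data.frame(
  Cells = paste0("cell_", 1:10),
  DamageDetective = c(0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
)

# Hypothetical helper: number of cells kept when only cells scoring
# strictly below the threshold are retained.
n_retained <- function(scores, threshold) {
  sum(scores$DamageDetective < threshold)
}

thresholds <- c(0.3, 0.5, 0.7)
retained <- vapply(thresholds, function(t) n_retained(scores, t), integer(1))
data.frame(threshold = thresholds, retained = retained)
```

Lower thresholds retain fewer cells (more stringent), and higher thresholds retain more (more permissive).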


Run damage detection

Damage detection is run as shown below, using the count matrix and ribosomal penalty as inputs. Here, we have additionally set the filter_counts parameter to TRUE. This uses the default filter_threshold to detect damaged cells for removal and returns the filtered count matrix, which can be used immediately for the remainder of pre-processing. Though implemented in R, DamageDetective provides output that is platform-agnostic and can be integrated into any existing single-cell analysis workflow.

# Perform damage detection
detection_results <- detect_damage(
  count_matrix = test_counts,
  ribosome_penalty = penalty,
  filter_counts = TRUE
)

# View the resulting count matrix
dim(detection_results$output)

Expected outcome,

Clustering cells...
Simulating damage...
Computing pANN...
32738   461

Alternatively, if filter_counts is set to FALSE, the output is a data frame containing the damage score for each cell. This lets users work with the DamageDetective results directly. From here, a user can filter their data manually, as filter_counts = TRUE does automatically.

# Perform damage detection
detection_results <- detect_damage(
  count_matrix = test_counts,
  ribosome_penalty = penalty,
  filter_counts = FALSE,
  seed = 7
)

# View output
print(head(detection_results$output), row.names = FALSE)

# Filter matrix
undamaged_cells <- subset(detection_results$output, DamageDetective < 0.7)
filtered_matrix <- test_counts[, undamaged_cells$Cells]
dim(filtered_matrix)

Expected outcome,

Clustering cells...
Simulating damage...
Computing pANN...
                   Cells DamageDetective
TCTGGAAAGCCCAACC_H1B2ln6               0
CCGTTCATCGTGGGAA_H1B2ln2               0
CTTCTCTTCAGCCTAA_H1B2ln1               0
GGATTACAGGGATGGG_H1B2ln1               0
TCTATTGTCTGGTATG_H1B2ln2               0
ACGGGTCAGACAAGCC_H1B2ln6               0
32738   461

Contribute

We are committed to the improvement of DamageDetective and encourage users to report any bugs or difficulties they encounter. Contributions that refine or challenge the assumptions and heuristics used to detect damaged cells are also welcome. Please reach out via the maintainer's email listed in the DESCRIPTION file or start a public discussion by opening an issue.

License

DamageDetective is made available for public use through the GNU AGPL-3.0 license.

Authors

Alicen Henning
Stellenbosch University, Cape Town, South Africa
Bioinformatics and Computational Biology

This work was done under the supervision of Prof Marlo Möller, Prof Gian van der Spuy, and Prof André Loxton.

References
