
SignacX 2.2.3

Get the most out of your single cell data.
Explore the docs »

View Demo · Report Bug · Request Feature


What is SignacX?

SignacX is software developed by the Savova lab at Sanofi with a focus on single cell genomics for clinical applications. SignacX classifies the cellular phenotype for each individual cell in single cell RNA-sequencing data using neural networks trained with sorted bulk gene expression data from the Human Primary Cell Atlas. In this R implementation, we provide functions and vignettes that demonstrate how to: integrate single cell data (mapping cells from one data set to another), classify non-human data, identify novel cell types, and classify single cell data across many tissues, diseases and technologies. To learn more, check out the pre-print here.

Data portal

Here, we provide interactive access to data from the pre-print with SPRING Viewer. Just click the “Explore” links below, and search for your favorite gene:

| Links | Tissue | Disease | Number of cells | Number of samples | Source | Signac version |
|---|---|---|---|---|---|---|
| Explore | Kidney | Cancer | 48,037 | 47 | Stewart et al. 2019 | v2.0.7 |
| Explore | Kidney and urine | Lupus nephritis and healthy | 5,886 | 39 | Arazi et al. 2019 | v2.0.7 |
| Explore | Lung | Cancer | 42,844 | 18 | Zilionis et al. 2020 | v2.0.7 |
| Explore | Lung | Fibrosis | 96,461 | 31 | Habermann et al. 2020 | v2.0.7 |
| Explore | Lung | Fibrosis | 109,421 | 16 | Reyfman et al. 2019 | v2.0.7 |
| Explore | Monkey PBMCs | Healthy | 5,491 | 1 | Chamberlain et al. 2021 | v2.0.7 |
| Explore | Monkey PBMCs | Healthy | 5,220 | 1 | Chamberlain et al. 2021 | v2.0.7 |
| Explore | Monkey T cells | Healthy | 5,496 | 1 | Chamberlain et al. 2021 | v2.0.7 |
| Explore | PBMCs | Cancer | 14,048 | 8 | Zilionis et al. 2020 | v2.0.7 |
| Explore | PBMCs | Healthy | 7,902 | 1 | 10X Genomics | v2.0.7 |
| Explore | PBMCs | Healthy | 4,784 | 1 | 10X Genomics | v2.0.7 |
| Explore | Skin | Atopic dermatitis | 36,690 | 17 | He et al. 2020 | v2.0.7 |
| Explore | Synovium | Rheumatoid arthritis and osteoarthritis | 8,920 | 26 | Zhang et al. 2019 | v2.0.7 |

Note:
* Cell type annotations are provided at four levels (immune, celltypes, cellstates and novel celltypes).
* When available, we also provided information about sample covariates (e.g., disease, age, gender, FACS).
* Cell type annotations for all 13 data sets were generated with the Signac function using the default settings.

Special thanks to Allon Klein’s lab (particularly Caleb Weinreb and Sam Wolock) for hosting the data.

Getting Started

To install SignacX in R, simply do:

Installation

install.packages("SignacX")

Quick start

The main functions in Signac are:

    # load the library
    library(SignacX)
    # Generate initial labels
    labels = Signac(E = your_data_here)
    # Get cell type labels
    celltypes = GenerateLabels(labels, E = your_data_here)
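The placeholder `your_data_here` is not described above. As a rough sketch, and assuming (this is not stated in this README) that the input is a gene-by-cell count matrix with human gene symbols as row names, a toy input could be assembled and checked as follows. The names `E_toy` and `E_real` are hypothetical, and the toy matrix is far too small to classify, so the Signac calls are left commented out.

```r
# Toy gene-by-cell count matrix; real data would have thousands of genes and cells.
library(Matrix)  # ships with standard R installations

set.seed(1)
genes <- c("CD3E", "CD19", "CD14", "NKG7", "MS4A1")  # human gene symbols
cells <- paste0("cell_", 1:6)

counts <- matrix(rpois(length(genes) * length(cells), lambda = 2),
                 nrow = length(genes),
                 dimnames = list(genes, cells))
E_toy <- Matrix(counts, sparse = TRUE)

# Sanity checks before classification: genes in rows, cells in columns,
# unique gene symbols, non-negative counts.
stopifnot(nrow(E_toy) == length(genes),
          ncol(E_toy) == length(cells),
          !anyDuplicated(rownames(E_toy)),
          all(counts >= 0))

# With a real data set (assumed here to be called E_real), the calls from
# the text would be:
# labels    <- Signac(E = E_real)
# celltypes <- GenerateLabels(labels, E = E_real)
```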

Sometimes we don’t have time to run Signac, and need a quick solution. Although Signac scales fine with large data sets (>300,000 cells), we developed SignacFast to quickly classify single cell data:

    # load the library
    library(SignacX)
    # generate labels with pre-trained model
    labels_fast <- SignacFast(E = your_data_here, num.cores = 4)
    celltypes_fast = GenerateLabels(labels_fast, E = your_data_here)

Usage

To make life easier, SignacX was integrated with Seurat (versions 3 and 4), and with SPRING. We provide a few vignettes:

SPRING

In the pre-print, we often used Signac integrated with SPRING. To reproduce our findings and to generate new results with SPRING, please visit the SPRING repository, which has example notebooks and installation instructions, particularly for processing CITE-seq and scRNA-seq data from 10X Genomics. Briefly, Signac is integrated seamlessly with the output files of SPRING in R, requiring only a few functions:

    # load the Signac library
    library(SignacX)
    # dir points to the "FullDataset_v1" directory generated by the SPRING Jupyter notebook
    dir = "./FullDataset_v1"
    # load the expression data
    E = CID.LoadData(dir)
    # generate cellular phenotype labels
    labels = Signac(E, spring.dir = dir)
    celltypes = GenerateLabels(labels, E = E, spring.dir = dir)
    # write cell types and Louvain clusters to SPRING
    dat <- CID.writeJSON(celltypes, spring.dir = dir)

After running the above functions, cellular phenotypes and Louvain clusters are ready to be visualized with SPRING Viewer, which can be set up locally as described here.

Seurat

Another way to use Signac is with Seurat. In this vignette, we performed multi-modal analysis of CITE-seq PBMCs from 10X Genomics using Signac integrated with Seurat.

Note:
* This same data set was also processed using SPRING in this notebook and subsequently classified with Signac. The result was used to generate the SPRING layouts for these data in the pre-print (Figures 2-4), which are available for interactive exploration here.

MASC

Sometimes, we have single cell genomics data with disease information, and we want to know which cellular phenotypes are enriched for disease. In this vignette, we applied Signac to classify cellular phenotypes in healthy and lupus nephritis kidney cells, and then we used MASC to identify which cellular phenotypes were disease-enriched.

Note:
* MASC typically requires equal numbers of cells and samples between case and control: an unequal number might skew the clustering of cells towards one sample (i.e., a “batch effect”), which could cause spurious disease enrichment in the mixed effect model. Since Signac classifies each cell independently (without using clusters), Signac annotations can be used with MASC without a priori balancing samples or cells, unlike cluster-based annotation methods.
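Because each cell carries its own label, the labels can be tabulated directly against sample covariates before any enrichment testing. The sketch below is only a descriptive cross-tabulation in base R, not MASC itself (which fits a mixed effect model); the data frame and its columns `celltype`, `sample` and `status` are made-up stand-ins for real annotations.

```r
# Hypothetical per-cell annotations: 'celltype' stands in for Signac labels,
# 'sample' and 'status' for sample covariates.
cell_info <- data.frame(
  celltype = c("B", "B", "TNK", "TNK", "TNK", "MPh", "MPh", "B", "TNK", "MPh"),
  sample   = c("s1", "s1", "s1", "s2", "s2", "s2", "s3", "s3", "s3", "s3"),
  status   = c("case", "case", "case", "case", "case", "case",
               "control", "control", "control", "control")
)

# Number of cells per cell type and disease status
celltype_counts <- table(cell_info$celltype, cell_info$status)

# Fraction of each status's cells assigned to each cell type (columns sum to 1)
celltype_fractions <- prop.table(celltype_counts, margin = 2)
print(round(celltype_fractions, 2))
```

Note that the case and control groups here have different numbers of cells and samples; the per-cell labels do not depend on that balance.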

Non-human data

In Supplemental Figure 8 of the pre-print, we classified single cell data for a model organism (cynomolgus monkey), for which flow-sorted datasets were generally lacking, without any additional species-specific training. Instead, we mapped homologous genes from the Macaca fascicularis genome to the human genome in the single cell data, and then performed cell type classification with Signac. We demonstrate how we mapped the gene symbols here.

Note:
* This code can be used to identify homologous genes between any two species.
* Monkey data used in Supplemental Figure 8 are available for interactive exploration in the table listed above.
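To illustrate the general idea of relabeling genes by homology, here is a minimal base R sketch. It is not the code linked above: the homolog table, the monkey gene identifiers (`mfa_g1` and so on) and the rule of keeping the first row per human symbol are all assumptions made for the example.

```r
# Hypothetical homolog table: monkey gene IDs and their human symbols.
# These rows are invented for illustration only.
homologs <- data.frame(
  monkey = c("mfa_g1", "mfa_g2", "mfa_g3", "mfa_g4"),
  human  = c("CD3E",   "CD19",   "CD19",   NA),
  stringsAsFactors = FALSE
)

# Toy monkey expression matrix (genes in rows, cells in columns)
E_monkey <- matrix(1:15, nrow = 5,
                   dimnames = list(c("mfa_g1", "mfa_g2", "mfa_g3", "mfa_g4", "mfa_g5"),
                                   c("c1", "c2", "c3")))

# Look up the human symbol for every row; genes absent from the table give NA
human_symbol <- homologs$human[match(rownames(E_monkey), homologs$monkey)]

# Keep rows that have a human homolog, and only the first row per human symbol
keep <- !is.na(human_symbol) & !duplicated(human_symbol)
E_human <- E_monkey[keep, , drop = FALSE]
rownames(E_human) <- human_symbol[keep]

print(E_human)
```

Dropping later duplicates is the simplest choice; summing the rows that map to the same human symbol is another option and may be preferable for count data.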

Genes of interest

In Figure 6 of the pre-print, we compiled data from three sources (CellPhoneDB, GWAS catalog and Fang et al. 2020) to find genes of immunological / pharmacological interest. These genes and their annotations can be accessed from within Signac:

    # load the library
    library(SignacX)
    # See ?Genes_Of_Interest
    data("Genes_Of_Interest")

Learning from single cell data

In Figure 4 of the pre-print, we demonstrated that Signac mapped cell type labels from one single cell data set to another, learning CD56bright NK cells from CITE-seq data. Here, we provide a vignette for reproducing this analysis, which can be used to map cell populations (or clusters of cells) from one data set to another. We also provide interactive access to the single cell data that were annotated with the CD56bright NK cell-model (Note: the CD56bright NK cells appear in the “CellStates” annotation layer as red cells).

| Links | Tissue | Disease | Number of cells | Number of samples | Source | Signac version |
|---|---|---|---|---|---|---|
| Explore | Kidney | Cancer | 48,037 | 47 | Stewart et al. 2019 | v2.0.7 + CD56bright NK |
| Explore | Kidney and urine | Lupus nephritis and healthy | 5,886 | 39 | Arazi et al. 2019 | v2.0.7 + CD56bright NK |
| Explore | Lung | Cancer | 42,844 | 18 | Zilionis et al. 2020 | v2.0.7 + CD56bright NK |
| Explore | Lung | Fibrosis | 96,461 | 31 | Habermann et al. 2020 | v2.0.7 + CD56bright NK |
| Explore | Lung | Fibrosis | 109,421 | 16 | Reyfman et al. 2019 | v2.0.7 + CD56bright NK |
| Explore | Monkey PBMCs | Healthy | 5,491 | 1 | Chamberlain et al. 2021 | v2.0.7 + CD56bright NK |
| Explore | Monkey PBMCs | Healthy | 5,220 | 1 | Chamberlain et al. 2021 | v2.0.7 + CD56bright NK |
| Explore | Monkey T cells | Healthy | 5,496 | 1 | Chamberlain et al. 2021 | v2.0.7 + CD56bright NK |
| Explore | PBMCs | Cancer | 14,048 | 8 | Zilionis et al. 2020 | v2.0.7 + CD56bright NK |
| Explore | PBMCs | Healthy | 4,784 | 1 | 10X Genomics | v2.0.7 + CD56bright NK |
| Explore | Skin | Atopic dermatitis | 36,690 | 17 | He et al. 2020 | v2.0.7 + CD56bright NK |
| Explore | Synovium | Rheumatoid arthritis and osteoarthritis | 8,920 | 26 | Zhang et al. 2019 | v2.0.7 + CD56bright NK |

Fast Signac

Sometimes we don’t have time to run Signac and need a faster solution. Although Signac scales well to large data sets (>300,000 cells), typically taking less than an hour even for large data, we developed SignacFast to quickly classify single cell data:

    # load the library
    library(SignacX)
    # generate labels with pre-trained model
    labels_fast <- SignacFast(E = your_data_here, num.cores = 4)
    celltypes_fast = GenerateLabels(labels_fast, E = your_data_here)

Unlike Signac, SignacFast uses a pre-trained ensemble of neural network models generated from the HPCA reference data, speeding up classification roughly 5-10 fold. These models were generated from the HPCA training data like so:

    # load the library
    library(SignacX)
    # load the HPCA training data
    training_HPCA = GetTrainingData_HPCA()
    # generate models
    Models_HPCA = ModelGenerator(R = training_HPCA, N = 100, num.cores = 4)

The “Models_HPCA” are accessed from within the R package:

    # load the library
    library(SignacX)
    # load pre-trained neural network ensemble model
    Models = GetModels_HPCA()

We demonstrate how to use SignacFast in this vignette, which shows that the results are broadly consistent with running Signac.

Note:
* If the concern is only major cell types (i.e., TNK and MPh cells), then SignacFast is a fine alternative to Signac.
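One simple way to check how consistent the two approaches are on a given data set is to compare their per-cell labels directly. The sketch below uses made-up label vectors (`labels_full`, `labels_quick`) as stand-ins for the labels from a Signac run and a SignacFast run on the same cells; how those vectors are extracted from the returned objects is not shown here.

```r
# Stand-in label vectors; in practice these would be the per-cell labels
# from the Signac and SignacFast runs on the same cells, in the same order.
labels_full  <- c("TNK", "TNK", "MPh", "B", "MPh", "TNK", "B", "MPh")
labels_quick <- c("TNK", "TNK", "MPh", "B", "TNK", "TNK", "B", "MPh")

stopifnot(length(labels_full) == length(labels_quick))

# Fraction of cells that receive the same label from both runs
agreement <- mean(labels_full == labels_quick)

# Confusion table: rows are Signac labels, columns are SignacFast labels
confusion <- table(Signac = labels_full, SignacFast = labels_quick)
print(confusion)
```

Off-diagonal entries in the confusion table show which cell types the two runs disagree on, which is more informative than the overall agreement alone.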

Benchmarking

CITE-seq

In Figures 2-3 of the pre-print, we validated Signac with CITE-seq PBMCs. Here, we reproduced that analysis with SPRING (in this vignette; as was performed in the pre-print) and additionally with Seurat (in this vignette), and provide interactive access to the data here.

Flow-sorted synovial cells

In Figure 3 of the pre-print, we validated Signac with flow cytometry and compared Signac to SingleR. We reproduced that analysis using Seurat in this vignette, and provide interactive access to the data here.

PBMCs

In Table 1 of the pre-print, we benchmarked Signac across seven different technologies: CEL-seq, Drop-Seq, inDrop, 10X (v2), 10X (v3), Seq-Well and Smart-Seq2; this analysis was reproduced here.

Roadmap

See the open issues for a list of proposed features (and known issues).

Contributing

Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

You can also open a pull request to commit to the master branch.

License

Distributed under the GPL v3.0 License. See LICENSE for more information.

Contact

Mathew Chamberlain - chamberlainphd@gmail.com

Project Link: https://github.com/mathewchamberlain/SignacX

