CN120813702A


Info

Publication number: CN120813702A
Application number: CN202480015277.3A
Authority: CN
Inventors: 乔迪·马丁; 加文·莫科姆; 辛西娅·萨科夫斯基; 杰弗里·奥斯本
Original assignee: Becton Dickinson and Co
Current assignee: Becton Dickinson and Co
Priority date: 2023-02-28
Filing date: 2024-02-21
Publication date: 2025-10-17

Abstract

Methods and compositions are provided for preparing a population of labeled cells, e.g., for use in protocols for obtaining correlated single-cell flow cytometry data and sequencing (e.g., multi-omics) data. Aspects of the methods include combining a cell sample composed of a plurality of cells with a labeling composition comprising a plurality of different double-indexed beads, each bead having a different fluorescent barcode and oligonucleotide barcode, under conditions sufficient to stably bind cells in the cell sample to one or more double-indexed beads to produce a labeled population of cells. In embodiments, the labeled cell population is then subjected to a flow cytometry and sequencing workflow, wherein the obtained flow cytometry data and sequencing data can be correlated. Compositions for carrying out the methods are also provided.

Description

Double-indexed particle labeling for obtaining correlated single-cell cytometry data and sequencing data

Cross Reference to Related Applications

The present application claims priority to U.S. Provisional Patent Application Ser. No. 63/448,930, filed on February 28, 2023, the disclosure of which is incorporated herein by reference in its entirety.

Introduction

Current technology allows measurement of gene expression of single cells in a massively parallel manner (e.g., >10,000 cells) by attaching cell-specific oligonucleotide barcodes to poly(A) mRNA molecules from single cells, as each cell is co-located with a barcoded reagent bead in a compartment. The BD Rhapsody™ Single-Cell Analysis System is one platform that supports such massively parallel measurement of single-cell gene expression. It captures nucleic acids from single cells in high throughput using a simple cartridge workflow and a multi-tiered barcoding system. The resulting captured material can be used to generate various types of next-generation sequencing (NGS) libraries, including libraries for whole transcriptome analysis for discovery biology and libraries for targeted RNA analysis for high-sensitivity transcript detection. See Shum et al., "Quantitation of mRNA Transcripts and Proteins Using the BD Rhapsody™ Single-Cell Analysis System," Adv Exp Med Biol. 2019;1129:63-79.

Gene expression can affect protein expression. Protein-protein interactions can affect gene expression and protein expression. Accordingly, systems and methods have recently been developed that can quantitatively analyze protein expression in cells and simultaneously measure protein expression and gene expression in cells. The BD Abseq platform is one such platform. Abseq is a method for analyzing proteins in and on single cells. In Abseq, the fluorescent labels of conventional antibodies are replaced with nucleic acid sequence tags that can be read at the single-cell level, for example, by barcoding and NGS. "Abseq is aimed at enabling sensitive, precise and comprehensive characterization of proteins and mRNA transcripts in a large number of single cells. As in conventional immunostaining, cells are bound with antibodies directed against different target epitopes, but the antibodies are labeled with unique sequence tags. When an antibody binds to its target, the DNA tag is carried along with it, so that the presence of the target can be inferred from the presence of the tag. In this way, counting the tags provides an estimate of the different epitopes present in the cell, as detected by antibody binding." Shahi et al., "Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding," Sci Rep 7, 44447 (2017).

Flow cytometry is a technique for detecting and measuring physical and/or chemical characteristics of a cell sample. In flow cytometry, a cell sample is suspended in a fluid and injected into a flow cytometer. The sample is focused so that cells flow through a laser beam, ideally one cell at a time, where the scattered light is characteristic of the cells and their components. Cells are typically labeled with a fluorescent label (e.g., an antibody-fluorophore conjugate) so that light is absorbed and then emitted in a band of wavelengths. Tens of thousands of cells can be rapidly examined and data collected from them. Flow cytometry is routinely used in basic research, clinical practice, and clinical trials. Applications for flow cytometry include, but are not limited to, cell counting, cell sorting, determining cell characteristics and function, microorganism detection, biomarker detection, protein engineering detection, diagnosis of health conditions, genome size measurement, and the like. A flow cytometry analyzer is an instrument that provides quantifiable data from a sample. Other instruments that employ flow cytometry include cell sorters, which physically separate and thereby purify target cells based on their optical properties.

Disclosure of Invention

The inventors have recognized that it is desirable to be able to correlate flow cytometry or microscopy data with downstream sequencing data, such as multi-omics data, so that flow cytometry data, including imaging cytometry data, and multi-omics sequencing data for the same cell can be readily obtained. Described herein are new methods for indexing single cells in order to correlate fluorescence data (immunophenotype data, functional data, or other data) or brightfield imaging data with downstream single-cell sequencing data, e.g., multi-omics data. Embodiments of the present invention provide the ability to screen living or preserved cells of interest by imaging and/or fluorescence analysis (e.g., fluorescence-activated cell sorting), and subsequently obtain sequencing data for these selected cells.

Methods and compositions are provided for preparing labeled cell populations, e.g., for use in protocols for obtaining correlated single-cell flow cytometry data and sequencing (e.g., multi-omics) data. Aspects of the methods include mixing a cell sample composed of a plurality of cells with a labeling composition comprising a plurality of different double-indexed beads, each bead having a unique fluorescent barcode and oligonucleotide barcode, under conditions sufficient to stably bind cells in the cell sample to one or more double-indexed beads to produce a labeled population of cells. The index of an individual cell results from the unique combination of labels bound to that cell. In embodiments, a flow cytometry and sequencing workflow is performed on the population of labeled cells, wherein the obtained flow cytometry data and sequencing data may be correlated. Compositions for carrying out the methods are also provided.

Aspects of the invention utilize double indexed beads with associated oligonucleotide barcodes and fluorescent address codes (i.e., fluorescent barcodes). Aspects of the methods employ a large and diverse pool of double indexed beads. In the bead pool, each unique fluorescent barcode is directly linked to a unique oligonucleotide barcode. In embodiments, the double indexed beads bind cells randomly, e.g., in solution or on a two-dimensional surface, by affinity binding or another stable association mechanism (e.g., co-encapsulation within an emulsion partition), as desired. The random combination of double indexed beads with individual cells provides unique combinations of barcodes bound to the cells in a sample, such that most, if not all, of the cells have a unique fluorescence signature composed of the barcodes of the one or more double indexed beads bound to the cell. The fluorescent barcodes of the cell-bound beads can then be analyzed using the same modality used to collect fluorescence data from the cells (e.g., flow cytometry). Subsequently, in a single-cell multi-omics workflow, the oligonucleotide barcodes of the double indexed beads can be released and captured by complementary oligonucleotides, at the same time that endogenous nucleic acids and other cell-bound barcodes are captured upon cell lysis. The oligonucleotide barcodes from the double indexed beads can be used analogously to sample multiplexing oligonucleotides in a hashing experiment. Standard single-cell multi-omics library preparation, sequencing protocols, and data analysis are then used to identify the oligonucleotide barcode combination specific to each cell in the experiment. NGS single-cell data can be correlated one-to-one with flow cytometry single-cell data through the direct link between each oligonucleotide barcode in a combination and its corresponding fluorescent barcode.
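
The correlation step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the disclosed implementation: it assumes that each flow cytometry event has already been decoded into a set of fluorescent barcode identifiers, that each sequenced cell has already been assigned a set of bead oligonucleotide barcode identifiers, and that a lookup table of the bead pool is available. The function name `correlate_cells` and all identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable


def correlate_cells(
    bead_pool: Dict[str, str],
    flow_events: Dict[Hashable, Iterable[str]],
    seq_cells: Dict[Hashable, Iterable[str]],
) -> Dict[Hashable, Hashable]:
    """Match flow cytometry events to sequenced cells by barcode combination.

    bead_pool   : fluorescent barcode ID -> oligonucleotide barcode ID
                  (hypothetical one-to-one table for the bead pool)
    flow_events : flow event ID -> fluorescent barcode IDs seen on that cell
    seq_cells   : sequenced cell ID -> bead oligo barcode IDs seen in that cell
    Returns a dict of flow event ID -> sequenced cell ID for unambiguous matches.
    """
    # Translate each flow event's fluorescent barcodes into the oligo
    # barcodes expected on the same beads, and group events by combination.
    flow_by_combo = defaultdict(list)
    for event_id, fluors in flow_events.items():
        combo = frozenset(bead_pool[f] for f in fluors)
        flow_by_combo[combo].append(event_id)

    # Group sequenced cells by their observed oligo barcode combination.
    seq_by_combo = defaultdict(list)
    for cell_id, oligos in seq_cells.items():
        seq_by_combo[frozenset(oligos)].append(cell_id)

    # Keep only combinations seen exactly once in each data set, so that
    # every reported pairing is one-to-one.
    matches = {}
    for combo, events in flow_by_combo.items():
        cells = seq_by_combo.get(combo, [])
        if len(events) == 1 and len(cells) == 1:
            matches[events[0]] = cells[0]
    return matches
```

In this sketch, a combination that appears more than once in either data set, or that is missing from one of them, is left unmatched rather than guessed. A real workflow would also need to tolerate decoding errors and barcode dropout, which are not modeled here.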

Drawings

The invention is best understood from the following detailed description when read in connection with the accompanying drawing figures. Included in the drawings are the following figures:

FIG. 1 shows a double indexed bead according to an embodiment of the present invention.

FIG. 2 shows cells labeled with double indexed beads according to an embodiment of the invention.

FIG. 3 shows cells labeled with double indexed beads according to an embodiment of the invention, wherein the cell binding member is a gel nanovial.

FIG. 4 illustrates acquiring flow cytometry imaging data of labeled cells according to an embodiment of the present invention.

Definitions

Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. See, e.g., Singleton et al., Dictionary of Microbiology and Molecular Biology, 2nd ed., J. Wiley & Sons (New York, NY 1994); Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (Cold Spring Harbor, NY 1989). For purposes of this disclosure, the following terms are defined as follows.

As used herein, an antibody may be a full-length immunoglobulin molecule (e.g., naturally occurring or formed by normal immunoglobulin gene fragment recombination processes), such as an IgG antibody, or an immunologically active (i.e., specifically binding) portion of an immunoglobulin molecule, such as an antibody fragment. In some embodiments, the antibody is a functional antibody fragment. For example, an antibody fragment may be a portion of an antibody, such as F(ab')2, Fab', Fab, Fv, sFv, and the like. The antibody fragment may bind to the same antigen as is recognized by the full-length antibody. Antibody fragments may include isolated fragments consisting of the variable regions of antibodies, such as "Fv" fragments consisting of the variable regions of the heavy and light chains, as well as recombinant single-chain polypeptide molecules ("scFv proteins") in which the light chain variable region and the heavy chain variable region are connected by a peptide linker. Exemplary antibodies can include, but are not limited to, antibodies against cancer cells, antibodies against viruses, antibodies that bind to cell surface receptors (e.g., CD8, CD34, and CD45), and therapeutic antibodies.

As used herein, the term "bind" or "bind with" may mean that two or more species are identifiable as being co-located at a point in time. Binding may mean that two or more species are or were within a similar container. The binding may be an informational association. For example, digital information regarding two or more species may be stored and used to determine that one or more of the species were co-located at a point in time. The binding may also be a physical association. In some embodiments, two or more bound species are "tethered," "attached," or "immobilized" to one another, or to a common solid or semi-solid surface. Binding may refer to covalent or non-covalent means for attaching a label to a solid or semi-solid support such as a bead. Binding may be a covalent bond between a target and a label. Binding may include hybridization between two molecules (e.g., a target molecule and a label).

As used herein, the term "complementary" may refer to the capacity for precise pairing between two nucleotides. For example, if a nucleotide at a given position of a nucleic acid is capable of hydrogen bonding with a nucleotide of another nucleic acid, then the two nucleic acids are considered to be complementary to one another at that position. Complementarity between two single-stranded nucleic acid molecules may be "partial," in which only some of the nucleotides pair, or it may be complete, when total complementarity exists between the single-stranded molecules. A first nucleotide sequence may be referred to as the "complement" of a second sequence if the first nucleotide sequence is complementary to the second nucleotide sequence. A first nucleotide sequence may be referred to as the "reverse complement" of a second sequence if the first nucleotide sequence is complementary to the reverse (i.e., the order of the nucleotides is reversed) of the second sequence. As used herein, the terms "complement," "complementary," and "reverse complement" are used interchangeably. It is understood from the disclosure that if a molecule can hybridize to another molecule, it may be the complement of the molecule to which it hybridizes.
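
The distinction between a "complement" and a "reverse complement" defined above can be illustrated with a short Python sketch. It assumes an unmodified DNA alphabet (A, C, G, T) with Watson-Crick pairing only; the function names are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: assumes an unmodified DNA alphabet (A, C, G, T).
_PAIRS = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base Watson-Crick complement, in the same order."""
    return seq.upper().translate(_PAIRS)


def reverse_complement(seq: str) -> str:
    """Return the complement of the reversed sequence."""
    return complement(seq[::-1])


def is_fully_complementary(a: str, b: str) -> bool:
    """True if every base of `a` pairs with `b` read in the opposite direction."""
    return len(a) == len(b) and a.upper() == reverse_complement(b)
```

For example, `complement("ATGC")` gives `"TACG"`, while `reverse_complement("ATGC")` gives `"GCAT"`, which is the sequence that would hybridize to `ATGC` in antiparallel orientation. Partial complementarity, modified bases, and RNA are outside the scope of this sketch.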

As used herein, the term "nucleic acid" refers to a polynucleotide sequence or fragment thereof. The nucleic acid may comprise nucleotides. The nucleic acid may be exogenous or endogenous to a cell. The nucleic acid may exist in a cell-free environment. The nucleic acid may be a gene or a fragment thereof. The nucleic acid may be DNA. The nucleic acid may be RNA. The nucleic acid may comprise one or more analogs (e.g., an altered backbone, sugar, or nucleobase). Some non-limiting examples of analogs include 5-bromouracil, peptide nucleic acids, xeno nucleic acids, morpholinos, locked nucleic acids, glycol nucleic acids, threose nucleic acids, dideoxynucleotides, cordycepin, 7-deaza-GTP, fluorophores (e.g., rhodamine or fluorescein linked to the sugar), thiol-containing nucleotides, biotin-linked nucleotides, fluorescent base analogs, CpG islands, methyl-7-guanosine, methylated nucleotides, inosine, thiouridine, pseudouridine, dihydrouridine, queuosine, and wyosine. "Nucleic acid," "polynucleotide," "target polynucleotide," and "target nucleic acid" are used interchangeably.

The nucleic acid may comprise one or more modifications (e.g., a base modification, a backbone modification) to provide the nucleic acid with a new or enhanced feature (e.g., improved stability). The nucleic acid may comprise a nucleic acid affinity tag. A nucleoside may be a base-sugar combination. The base portion of a nucleoside may be a heterocyclic base. The two most common classes of such heterocyclic bases are purines and pyrimidines. A nucleotide may be a nucleoside further comprising a phosphate group covalently linked to the sugar portion of the nucleoside. For those nucleosides that include a pentofuranosyl sugar, the phosphate group can be linked to the 2', 3', or 5' hydroxyl moiety of the sugar. In forming nucleic acids, phosphate groups can covalently link adjacent nucleosides to one another to form a linear polymeric compound. In turn, the respective ends of this linear polymeric compound may be further joined to form a circular compound; however, linear compounds are generally suitable. Furthermore, linear compounds may have internal nucleotide base complementarity and may therefore fold in a manner that produces a fully or partially double-stranded compound. Within a nucleic acid, the phosphate groups are commonly referred to as forming the internucleoside backbone of the nucleic acid. The linkage or backbone may be a 3' to 5' phosphodiester linkage.

The nucleic acid may comprise a modified backbone and/or modified internucleoside linkages. Modified backbones may include those that retain a phosphorus atom in the backbone and those that do not have a phosphorus atom in the backbone. Suitable modified nucleic acid backbones containing a phosphorus atom may include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkyl phosphotriesters, methyl and other alkyl phosphonates such as 3'-alkylene phosphonates, 5'-alkylene phosphonates, and chiral phosphonates, phosphinates, phosphoramidates including 3'-amino phosphoramidates and aminoalkyl phosphoramidates, phosphorodiamidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates, and the like having normal 3' to 5' linkages, 2' to 5' linked analogs, and those having inverted polarity in which one or more internucleoside linkages is a 3' to 3', 5' to 5', or 2' to 2' linkage.

The nucleic acid may comprise a polynucleotide backbone formed by short-chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short-chain heteroatomic or heterocyclic internucleoside linkages. These may include those having morpholino linkages (formed in part from the sugar portion of a nucleoside), siloxane backbones, sulfide, sulfoxide, and sulfone backbones, formacetyl and thioformacetyl backbones, methylene formacetyl and thioformacetyl backbones, riboacetyl backbones, alkene-containing backbones, sulfamate backbones, methyleneimino and methylenehydrazino backbones, sulfonate and sulfonamide backbones, amide backbones, and others having mixed N, O, S, and CH₂ component parts.

The nucleic acid may comprise a nucleic acid mimetic. The term "mimetic" is intended to include polynucleotides in which only the furanose ring, or both the furanose ring and the internucleotide linkage, are replaced with non-furanose groups; replacement of only the furanose ring may also be referred to as a sugar surrogate. The heterocyclic base moiety or a modified heterocyclic base moiety can be maintained for hybridization with an appropriate target nucleic acid. One such nucleic acid may be a peptide nucleic acid (PNA). In a PNA, the sugar backbone of a polynucleotide may be replaced with an amide-containing backbone, in particular an aminoethylglycine backbone. The nucleobases may be retained and bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone. The backbone in a PNA compound may comprise two or more linked aminoethylglycine units, which gives the PNA an amide-containing backbone. The heterocyclic base moieties may be bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone.

The nucleic acid may comprise a morpholino backbone structure. For example, the nucleic acid may comprise a 6-membered morpholino ring in place of a ribose ring. In some embodiments, a phosphorodiamidate or other non-phosphodiester internucleoside linkage may replace the phosphodiester linkage.

The nucleic acid can comprise linked morpholino units (e.g., morpholino nucleic acids) having heterocyclic bases attached to the morpholino ring. Linking groups can link the morpholino monomeric units in a morpholino nucleic acid. Nonionic morpholino-based oligomeric compounds can have fewer undesired interactions with cellular proteins. Morpholino-based polynucleotides may be nonionic mimics of nucleic acids. A variety of compounds within the morpholino class may be joined using different linking groups. A further class of polynucleotide mimetic may be referred to as cyclohexenyl nucleic acids (CeNA). The furanose ring normally present in a nucleic acid molecule may be replaced with a cyclohexenyl ring. DMT-protected CeNA phosphoramidite monomers can be prepared and used for oligomeric compound synthesis using phosphoramidite chemistry. Incorporation of CeNA monomers into a nucleic acid strand can increase the stability of a DNA/RNA hybrid. CeNA oligoadenylates can form complexes with nucleic acid complements with stability similar to that of the native complexes. Additional modifications may include locked nucleic acids (LNAs), in which the 2'-hydroxyl group is linked to the 4' carbon atom of the sugar ring, thereby forming a 2'-C,4'-C-oxymethylene linkage and thus a bicyclic sugar moiety. The linkage may be a methylene group, (-CH₂-)n, bridging the 2' oxygen atom and the 4' carbon atom, where n is 1 or 2. LNAs and LNA analogs can display very high duplex thermal stability with complementary nucleic acids (Tm = +3 °C to +10 °C), stability toward 3'-exonucleolytic degradation, and good solubility properties.

Nucleic acids may also include nucleobase (often referred to simply as "base") modifications or substitutions. As used herein, "unmodified" or "natural" nucleobases can include the purine bases (e.g., adenine (A) and guanine (G)) and the pyrimidine bases (e.g., thymine (T), cytosine (C), and uracil (U)). Modified nucleobases may include other synthetic and natural nucleobases such as 5-methylcytosine (5-me-C), 5-hydroxymethylcytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and 5-halocytosine, 5-propynyl (-C≡C-CH₃) uracil and cytosine and other alkynyl derivatives of pyrimidine bases, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo (in particular 5-bromo), 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine and 7-methyladenine, 2-F-adenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine and 7-deazaadenine, and 3-deazaguanine and 3-deazaadenine. Modified nucleobases may include tricyclic pyrimidines, such as phenoxazine cytidine (1H-pyrimido(5,4-b)(1,4)benzoxazin-2(3H)-one), phenothiazine cytidine (1H-pyrimido(5,4-b)(1,4)benzothiazin-2(3H)-one), G-clamps such as a substituted phenoxazine cytidine (e.g., 9-(2-aminoethoxy)-H-pyrimido(5,4-(b)(1,4)benzoxazin-2(3H)-one), carbazole cytidine (2H-pyrimido(4,5-b)indol-2-one), and pyridoindole cytidine (H-pyrido(3',2':4,5)pyrrolo(2,3-d)pyrimidin-2-one).

As used herein, the term "sample" may refer to a composition comprising a target. Suitable samples for analysis by the disclosed methods, devices and systems include cells, tissues, organs or organisms. A cell sample is a composition composed of a plurality of cells, such as a composition comprising a plurality of different cells, such as an aqueous composition of single cells, wherein the number of cells may vary.

As used herein, the term "sampling device" or "device" may refer to a device that can take a portion of a sample and/or place the portion on a substrate. A sampling device may refer to, for example, a fluorescence-activated cell sorting (FACS) instrument, a cell sorter, a biopsy needle, a biopsy device, a tissue sectioning device, a microfluidic device, a blade grid, and/or a microtome.

As used herein, the term "solid support" may refer to a discrete solid or semi-solid surface to which nucleic acids may be attached. The solid support may comprise any type of solid, porous, or hollow sphere, carrier, cylinder, or other similar structure, composed of a plastic, ceramic, metal, or polymeric material (e.g., hydrogel), onto which a nucleic acid may be immobilized (e.g., covalently or non-covalently). The solid support may comprise discrete particles that are spherical (e.g., microspheres) or that have non-spherical or irregular shapes, such as cubes, cuboids, pyramids, cylinders, cones, ovals, discs, and the like. The beads may be non-spherical. A plurality of solid supports spaced in an array may not comprise a substrate. The term "solid support" may be used interchangeably with the term "bead".

As used herein, the term "target" may refer to a component that may be analyzed according to embodiments of the present invention. Exemplary suitable targets for analysis by the disclosed methods, devices, and systems include oligonucleotides, DNA, RNA, mRNA, microRNA, tRNA, and the like. The target may be single-stranded or double-stranded. In some embodiments, the target may be a protein, peptide, or polypeptide. In some embodiments, the target is a lipid. As used herein, "target" is used interchangeably with "species".

Detailed Description

Methods and compositions are provided for preparing labeled cell populations, e.g., for use in protocols for obtaining correlated single-cell flow cytometry data and sequencing (e.g., multi-omics) data. Aspects of the methods include mixing a cell sample composed of a plurality of cells with a labeling composition comprising a plurality of different double-indexed beads, each bead having a unique fluorescent barcode and oligonucleotide barcode, under conditions sufficient to stably bind the cells in the cell sample to one or more double-indexed beads to produce a labeled population of cells. In embodiments, a flow cytometry and sequencing workflow is performed on the population of labeled cells, and the obtained flow cytometry data and sequencing data may be correlated. Compositions for carrying out the methods are also provided.

Before the present invention is described in more detail, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.

Where a range of values is provided, it is understood that each intervening value between the upper and lower limits of that range, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.

Certain ranges are presented herein with numerical values preceded by the term "about." The term "about" is used herein to provide literal support for the exact number that it precedes, as well as for a number that is near to or approximately the number that the term precedes. In determining whether a number is near to or approximately a specifically recited number, the near or approximating unrecited number may be a number that, in the context in which it is presented, provides the substantial equivalent of the specifically recited number.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, representative illustrative methods and materials are described below.

All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and were set forth herein by reference to disclose and describe the methods and/or materials in connection with which the publications were cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Furthermore, the dates of publication provided may be different from the actual publication dates, which may need to be independently confirmed.

It should be noted that, as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. It should also be noted that the claims may be drafted to exclude any optional element. Accordingly, this statement is intended to serve as antecedent basis for the use of exclusive terminology such as "solely," "only," and the like in connection with the recitation of claim elements, or the use of a "negative" limitation.

It will be apparent to those skilled in the art upon reading this disclosure that each of the individual embodiments described and illustrated herein has discrete components and features that can be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the invention. Any recited method may be carried out in the order of events recited or in any other order that is logically possible.

Although the systems and methods have been or will be described for the sake of grammatical fluidity with functional explanations, it is to be expressly understood that the claims, unless expressly formulated under 35 U.S.C. § 112, are not to be construed as necessarily limited in any way by the construction of "means" or "steps" limitations, but are to be accorded the full scope of the meaning and equivalents of the definition provided by the claims under the judicial doctrine of equivalents, and, where the claims are expressly formulated under 35 U.S.C. § 112, are to be accorded full statutory equivalents under 35 U.S.C. § 112.

Methods

As outlined above, methods of preparing a population of labeled cells are provided. In some cases, the population of labeled cells is composed of a plurality of distinguishably labeled cells, where the distinguishably labeled cells have different or unique fluorescence signatures associated with them. A "fluorescence signature" (i.e., fluorescent identifier) refers to a composite or aggregate spectrum composed of the fluorescence emission signals obtained from the one or more fluorophores stably associated with a labeled cell, such as those provided by one or more double indexed beads (described in more detail below) bound to the cell. In the population of labeled cells produced by methods of embodiments of the invention, different cells of the plurality have distinguishable sets of fluorophores bound to them, and therefore present different fluorescence signatures, for example, when assayed by a flow cytometry protocol. The different labeled cells in the population are thus distinguishable from one another by their unique fluorescence signatures.
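
How often a cell's fluorescence signature is unique within a labeled population can be estimated with a simple back-of-the-envelope model. The Python sketch below is illustrative only and rests on assumptions that are not stated in the disclosure: every cell binds exactly `beads_per_cell` distinct bead types, the bead types are chosen uniformly at random and independently for each cell, and every combination is perfectly resolvable by the instrument. The function name and parameter values are hypothetical.

```python
import math


def expected_unique_fraction(n_bead_types: int, beads_per_cell: int, n_cells: int) -> float:
    """Expected fraction of cells whose bead combination is shared with no other cell.

    Simplified model (an assumption, not the disclosed method): each cell binds
    exactly `beads_per_cell` distinct bead types drawn uniformly at random.
    """
    if n_cells < 1:
        raise ValueError("n_cells must be at least 1")
    # Number of distinct signatures available: unordered combinations of bead types.
    n_signatures = math.comb(n_bead_types, beads_per_cell)
    if n_signatures == 0:
        raise ValueError("beads_per_cell cannot exceed n_bead_types")
    # A given cell is unique if none of the other (n_cells - 1) cells
    # independently draws the same combination.
    return (1.0 - 1.0 / n_signatures) ** (n_cells - 1)
```

Under this model, a pool of 100 bead types with three beads per cell offers `math.comb(100, 3)` = 161,700 possible signatures. Real labeling would produce a variable number of beads per cell and imperfect decoding, so the function should be read as a rough planning aid rather than a prediction.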

Double indexed beads

As described above, the fluorescence signature of a given labeled cell produced by an embodiment of the invention is provided by the one or more double indexed beads that are stably bound to the cell. Double indexed beads are particulate compositions comprising one or more fluorophores that make up the fluorescent barcode of the bead, as well as an oligonucleotide barcode. The double indexed beads that may be employed in embodiments of the invention may have any convenient shape and size. The beads are solid supports and may comprise any type of solid, porous, or hollow sphere, carrier, cylinder, or other similar configuration composed of a plastic, ceramic, metal, or polymeric material (e.g., hydrogel), onto which nucleic acids may be immobilized (e.g., covalently or non-covalently) and into which one or more fluorophores may be incorporated. The solid support may comprise discrete particles that are spherical (e.g., microspheres) or that have non-spherical or irregular shapes, such as cubes, cuboids, pyramids, cylinders, cones, ovals, discs, and the like. The beads may be non-spherical. The bead size can vary as desired; the double indexed beads are typically smaller than the cells, in some cases ranging in size from 0.5 μm to 20 μm, such as from 1 μm to 10 μm.

Fluorescent barcode

The fluorescent barcode comprises one or more fluorophores, as desired. Thus, a given double indexed bead may comprise a single type of fluorophore, or two or more different types of fluorophores. Examples of fluorophores include, but are not limited to: acridine and its derivatives such as acridine, acridine orange, acridine yellow, acridine red, and acridine isothiocyanate; 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS); 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5-disulfonic acid (Lucifer Yellow VS); N-(4-amino-1-naphthyl)maleimide; anthranilamide; Brilliant Yellow; coumarin and its derivatives such as coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), and 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanine and its derivatives such as cyanosine (Acid Red 92), Cy3, Cy5, Cy5.5, and Cy7; 4',6-diamidino-2-phenylindole (DAPI); 5',5''-dibromopyrogallol-sulfonephthalein (Bromopyrogallol Red); 7-diethylamino-3-(4'-isothiocyanatophenyl)-4-methylcoumarin; diethylaminocoumarin; diethylenetriamine pentaacetic acid; 4,4'-diisothiocyanatodihydrostilbene-2,2'-disulfonic acid; 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride); 4-(4'-dimethylaminophenylazo)benzoic acid (DABCYL); 4-dimethylaminophenylazophenyl-4'-isothiocyanate (DABITC); eosin and its derivatives such as eosin; erythrosin and its derivatives such as erythrosin B and erythrosin; ethidium bromide; fluorescein and its derivatives such as 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), fluorescein isothiocyanate (FITC), chlorotriazinyl fluorescein, naphthofluorescein, and QFITC (XRITC); fluorescamine; IR144; IR1446; Lissamine^TM; Lissamine rhodamine; Lucifer yellow; malachite green; 4-methylumbelliferone; o-cresolphthalein; nitrotyrosine; pararosaniline; Nile Red; Oregon Green; Phenol Red; B-phycoerythrin; o-phthaldialdehyde; pyrene and its derivatives such as pyrene, pyrene butyrate, and succinimidyl 1-pyrene butyrate; Reactive Red 4 (Cibacron^TM Brilliant Red 3B-A); rhodamine and its derivatives such as 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), 4,7-dichlororhodamine lissamine, rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101, the sulfonyl chloride derivative of sulforhodamine 101 (Texas Red), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), tetramethyl rhodamine, and tetramethyl rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid and terbium chelate derivatives; xanthenes; Alexa Fluor dyes (e.g., Alexa Fluor 350, Alexa Fluor 430, Alexa Fluor 488, Alexa Fluor 546, Alexa Fluor 555, Alexa Fluor 568, Alexa Fluor 594, Alexa Fluor 633, Alexa Fluor 647, Alexa Fluor 660, Alexa Fluor 680, Alexa Fluor 700, Alexa Fluor 750); Pacific Blue; Pacific Orange; Cascade Blue; Cascade Yellow; quantum dot dyes (Quantum Dot Corporation); DyLight dyes from Pierce (Rockford, Ill.), including DyLight 800, DyLight 680, DyLight 649, DyLight 633, DyLight 549, DyLight 488, and DyLight 405; or combinations thereof. Other fluorophores known to those skilled in the art, or combinations thereof, may also be used, such as those available from Molecular Probes (Eugene, Oreg.) and Exciton (Dayton, Ohio).

In some cases, the fluorophore is a polymeric dye (e.g., a fluorescent polymeric dye). The fluorescent polymeric dyes useful in the subject methods are varied. In some cases of the method, the polymeric dye comprises a conjugated polymer. Conjugated polymers (CPs) are characterized by a delocalized electronic structure that includes a backbone of alternating unsaturated bonds (e.g., double and/or triple bonds) and saturated bonds (e.g., single bonds), in which pi electrons can move from one bond to another. As such, the conjugated backbone may impart an extended linear structure on the polymeric dye, with limited bond angles between the repeat units of the polymer. By contrast, proteins and nucleic acids, although also polymeric, in some cases do not form extended rod-like structures but instead fold into higher-order three-dimensional shapes. In addition, CPs may form a "rigid rod" polymer backbone and experience limited twist (e.g., torsion) angles between repeat units along the polymer backbone. In some cases, the polymeric dye comprises a CP having a rigid rod structure. The structural characteristics of the polymeric dye can affect the fluorescent properties of the molecule.

Any convenient polymeric dye may be used in the subject devices and methods. In some cases, the polymeric dye is a multichromophore having a structure capable of harvesting light so as to amplify the fluorescence output of a fluorophore. In some cases, the polymeric dye can harvest light and efficiently convert it to emitted light of a longer wavelength. In some cases, the polymeric dye has a light harvesting multichromophore system that can efficiently transfer energy to a nearby luminescent species (e.g., a "signaling chromophore"). Energy transfer mechanisms include, for example, resonance energy transfer (e.g., Förster (or fluorescence) resonance energy transfer, FRET), quantum charge exchange (Dexter energy transfer), and the like. In some cases, these energy transfer mechanisms are relatively short-range, i.e., close proximity of the light harvesting multichromophore system to the signaling chromophore provides efficient energy transfer. Under conditions of efficient energy transfer, the emission of the signaling chromophore is amplified when the number of individual chromophores in the light harvesting multichromophore system is large; that is, the emission of the signaling chromophore is more intense when the incident light ("excitation light") is at a wavelength absorbed by the light harvesting multichromophore system than when the signaling chromophore is directly excited by the pump light.

The multichromophore may be a conjugated polymer. Conjugated polymers (CPs) are characterized by a delocalized electronic structure and can be used as highly responsive optical reporters for chemical and biological targets. Because the effective conjugation length is substantially shorter than the length of the polymer chain, the backbone contains a large number of conjugated segments in close proximity. Thus, conjugated polymers are efficient at harvesting light and can provide optical amplification by Förster energy transfer.

Polymeric dyes of interest include, but are not limited to, those described in U.S. Pat. Nos. 7,270,956, 7,629,448, 8,158,444, 8,227,187, 8,455,613, 8,575,303, 8,802,450, 8,969,509, 9,139,869, 9,371,559, 9,547,008, 10,094,838, 10,302,648, 10,458,989, 10,641,775 and 10,962,546, the disclosures of which are incorporated herein by reference in their entirety, and in Gaylord et al., J. Am. Chem. Soc., 2001, 123(26), pp. 6417-6418; Feng et al., Chem. Soc. Rev., 2010, 39, 2411-2419; and Traina et al., J. Am. Chem. Soc., 2011, 133(32), pp. 12600-12607, the disclosures of which are incorporated herein by reference in their entirety. Specific polymeric dyes that may be employed include, but are not limited to, BD Horizon Brilliant^TM dyes, such as BD Horizon Brilliant^TM Violet dyes (e.g., BV421, BV510, BV605, BV650, BV711, BV786), BD Horizon Brilliant^TM Ultraviolet dyes (e.g., BUV395, BUV496, BUV737, BUV805), and BD Horizon Brilliant^TM Blue dyes (e.g., BB515, BB550, BB790) (BD Biosciences, San Jose, CA). Any fluorochromes known to those skilled in the art, including but not limited to those described above, or those yet to be discovered, may be used in the subject methods.

In some cases, each fluorophore making up a given barcode may be excited by a common light source, such as a common laser. In such cases, the fluorophores making up a given barcode may have a common excitation wavelength range (e.g., their excitation wavelength ranges differ from each other by 50 nm or less, such as 25 nm or less, including 10 nm or less, e.g., 5 nm or less) but differ from each other in emission maximum. The fluorophores making up a given barcode may likewise have a common excitation maximum but differ from each other in emission maximum.
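The single-light-source criterion can be sketched as a simple check. The snippet below is illustrative only: it assumes each fluorophore is summarized by one excitation maximum and one emission maximum (a simplification of the excitation ranges discussed above), and the dye names and wavelength values are hypothetical placeholders rather than measured spectra.

```python
def common_excitation(dyes, max_spread_nm=50):
    """True if the excitation maxima of all dyes lie within max_spread_nm of each other."""
    ex = [d["ex_max_nm"] for d in dyes]
    return max(ex) - min(ex) <= max_spread_nm


def distinct_emission(dyes):
    """True if no two dyes share the same emission maximum."""
    em = [d["em_max_nm"] for d in dyes]
    return len(set(em)) == len(em)


# Hypothetical placeholder values for three dyes; not measured spectra.
barcode_dyes = [
    {"name": "dye_a", "ex_max_nm": 405, "em_max_nm": 420},
    {"name": "dye_b", "ex_max_nm": 407, "em_max_nm": 510},
    {"name": "dye_c", "ex_max_nm": 410, "em_max_nm": 605},
]

single_laser_ok = common_excitation(barcode_dyes) and distinct_emission(barcode_dyes)
print(single_laser_ok)  # True
```

Tightening `max_spread_nm` to 25, 10, or 5 corresponds to the narrower windows mentioned above.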

As described above, any two distinguishable fluorescent barcodes may be distinguished from each other based on the types of fluorophores making up the barcodes and/or the intensity of the signal they provide. Thus, any two different barcodes may be distinguished based on the fluorescent signal obtained from each barcode and/or its intensity. For example, two distinguishable fluorescent barcodes can differ from each other in that they are made up of different combinations of fluorophore types, e.g., one contains fluorophore a, fluorophore b, and fluorophore c, and the other contains fluorophore b, fluorophore c, and fluorophore d. Two distinguishable fluorescent barcodes may also differ from each other in that they are made up of different amounts of fluorescent dye, e.g., one is made up of fluorophore a, fluorophore b, and fluorophore c present in a first amount in a given double indexed bead, and the other is made up of the same fluorophores present in a second amount that differs from the first amount, where the difference between the two amounts is detectable, e.g., as a difference in signal brightness. Different brightnesses can be readily provided by binding different amounts of fluorophore to the double indexed beads. Combinations of types and amounts of fluorophores can be employed to provide any desired number of unique fluorescent barcodes.
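To make the combinatorics concrete, the following sketch enumerates the fluorescent barcodes available from a set of fluorophore types when each type can be loaded at one of several discrete amounts. The function name and the use of integer "levels" to stand for distinguishable amounts are assumptions made for illustration; the text does not prescribe any particular encoding.

```python
from itertools import product


def enumerate_barcodes(fluorophores, levels):
    """Return every barcode as a dict mapping fluorophore -> amount level.

    Level 0 means the fluorophore is absent. The all-zero combination is
    dropped because a bead with no fluorophore carries no fluorescent barcode.
    """
    barcodes = []
    for combo in product(range(levels + 1), repeat=len(fluorophores)):
        if any(combo):
            barcodes.append(dict(zip(fluorophores, combo)))
    return barcodes


# Three fluorophore types, each absent or present at one of two amounts.
barcodes = enumerate_barcodes(["a", "b", "c"], levels=2)
print(len(barcodes))  # (2 + 1) ** 3 - 1 = 26
```

Under these assumptions the barcode count grows as (levels + 1) raised to the number of fluorophore types, minus one, so adding either a fluorophore type or a distinguishable amount quickly enlarges the barcode space.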

Oligonucleotide barcode

In addition to fluorescent barcodes, the double indexed beads employed in embodiments of the invention comprise oligonucleotide barcodes. Oligonucleotide barcodes may vary in length, in some cases ranging from 10 nt to 500 nt, such as 15 nt to 100 nt. The oligonucleotide barcode may be made up of ribonucleic acid or deoxyribonucleic acid, as desired. The oligonucleotide barcodes of embodiments of the invention may comprise a double indexed bead barcode domain, as well as other domains useful in embodiments of the invention, such as capture sequences, primer binding sites, and the like.

The oligonucleotide barcode may comprise one or more of a double indexed bead barcode domain, a capture sequence, a primer binding site, and the like. The double indexed bead barcode domain is a unique identifier, i.e., a domain or region that can be used, for example by its sequence, to identify the double indexed bead to which it is bound. The unique identifier may be, for example, a nucleotide sequence of any suitable length, such as from about 4 nucleotides to about 200 nucleotides. In some embodiments, the unique identifier is a nucleotide sequence of 25 nucleotides to about 45 nucleotides in length. In some embodiments, the length of the unique identifier may be about, less than, or greater than 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, or 200 nucleotides, or a range between any two of these values.

As described above, the oligonucleotide barcode may comprise a capture sequence, e.g., a domain or region that serves as a binding site for a target binding region, such as the target binding region of a bead-bound barcoded nucleic acid. The capture sequence may vary, and may be specific, random, or semi-random, as desired. In some cases, the capture sequence is a sequence that hybridizes to a target binding region of the bead-bound nucleic acid, e.g., as described in more detail below. In some cases, the capture sequence is a poly(A) sequence configured to hybridize to an oligo(dT) target binding region, as described in more detail below. In such cases, the length of the poly(A) capture sequence can vary, in some cases ranging from 3 nt to 50 nt, e.g., 5 nt to 25 nt. When present, the capture sequence may be located 5' to the oligonucleotide assembly.

The oligonucleotide barcode of the double indexed beads may comprise a primer binding site. When present, the primer binding site may be configured to bind a primer used, for example, in preparing a sequencing-ready nucleic acid. For example, the oligonucleotide barcode may comprise a universal primer. A universal primer may refer to a universal or common nucleotide sequence present in all of the specific binding members/oligonucleotide sub-barcodes employed in a given workflow. In some cases, the length of the primer binding site can be, or can be about, any whole number of nucleotides from 1 to 30 (i.e., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides), or a number or range between any two of these values. The length of the primer binding site can vary, and can be at least, or at most, any of these same values from 1 to 30 nucleotides. The length of the universal primer may vary, and in some cases may be 5 nucleotides to 30 nucleotides. The primer binding site may be located at the 5' end of the oligonucleotide barcode.

As described in more detail below, the cells of a cell sample are labeled by mixing the cell sample with a plurality of different double indexed beads. Among the plurality of different double indexed beads, the oligonucleotide barcodes may share common domains. For example, the oligonucleotide barcodes of different double indexed beads may have a common capture domain, a common primer binding site, and the like. In such cases, the capture domain, primer binding site, and other common domains have the same sequence across the plurality of beads, e.g., the same primer binding site, the same capture domain, etc.
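The relationship between the bead-specific barcode domain and the shared domains can be sketched as follows. This is a minimal illustration and not a design from the disclosure: the primer binding site sequence is an arbitrary placeholder, the 30 nt barcode and 25 nt poly(A) lengths are simply values inside the ranges given above, the barcodes are random strings, and the order in which the domains are concatenated is an arbitrary choice made for illustration.

```python
import random

# Hypothetical shared domains, identical on every bead type in the set.
PRIMER_SITE = "GTCAGTCAGTCAGTCAGTCA"  # arbitrary 20 nt placeholder sequence
CAPTURE_SEQ = "A" * 25                # poly(A) capture sequence


def make_bead_barcodes(n_beads, barcode_len=30, seed=0):
    """Generate n_beads distinct random barcode-domain sequences."""
    rng = random.Random(seed)
    seen = set()
    while len(seen) < n_beads:
        seen.add("".join(rng.choice("ACGT") for _ in range(barcode_len)))
    return sorted(seen)


def assemble_oligo(bead_barcode):
    """Concatenate shared primer site, unique bead barcode, and shared capture sequence."""
    return PRIMER_SITE + bead_barcode + CAPTURE_SEQ


def extract_bead_barcode(oligo, barcode_len=30):
    """Recover the bead barcode domain from an assembled oligonucleotide string."""
    if not (oligo.startswith(PRIMER_SITE) and oligo.endswith(CAPTURE_SEQ)):
        raise ValueError("oligo lacks the expected shared domains")
    start = len(PRIMER_SITE)
    return oligo[start:start + barcode_len]


bead_barcodes = make_bead_barcodes(4)
oligos = [assemble_oligo(bc) for bc in bead_barcodes]
print(len(oligos[0]))  # 20 + 30 + 25 = 75
```

Because only the barcode domain differs between bead types, a single primer and a single capture chemistry can in principle serve every bead in the set, while the barcode domain alone identifies the bead.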

FIG. 1 provides a schematic representation of a double indexed bead according to an embodiment of the invention. As shown, the double indexed bead 100 is a polymer bead having a diameter of 0.5 μm to 20 μm, for example 1 μm to 10 μm. The polymer bead includes three different fluorophores 102, 104, and 106, which together make up the fluorescent barcode of the bead. An oligonucleotide barcode 108 is also shown. As illustrated, a large number of different fluorescent barcodes can be obtained by varying the number and/or brightness of the fluorophores of the barcode.

Cell binding members

In some embodiments, the cell binding member is a polymer, such as a hydrogel nanovial, having a cavity sized to accommodate a single cell. The nanovial may include one or more double indexed beads stably bound thereto, as well as a binding member for cells, such as a specific binding member for a universal cell surface marker, as desired. Nanovials that may be modified as desired herein to include one or more double indexed beads are described in pending U.S. patent application publication Nos. 2019321593 and 20210268465, and PCT application publication No. WO/2020/037214, the disclosures of which are incorporated herein by reference.

In some cases, the cell binding member may provide covalent binding to a cellular component. In such cases, the double indexed beads may comprise a functional group that provides covalent binding to the cellular component. Such functionalization may include chemically reactive groups such as sulfhydryl groups, amino groups, carboxyl groups, and the like. In certain embodiments, the reactive groups are covalently attached to the double indexed beads, either directly or through a linker (e.g., a polymer linker, such as polyethylene glycol or PEG). Examples of such functionalized double indexed beads include, but are not limited to, double indexed beads coupled to isothiocyanate groups, amino groups, haloacetyl groups, maleimides, succinimidyl esters, mercapto groups, aldehyde groups, hydrazides, and sulfonyl halides, all of which can be used to covalently attach the double indexed beads to a second molecule (e.g., a molecule within or on a cell as described herein).

Manufacture of double indexed beads

The double indexed beads can be manufactured using any convenient protocol. In some cases, the beads can be produced in separate batches, where the conjugation chemistry allows a unique fluorescent barcode and a unique oligonucleotide barcode to be added to each batch. Alternatively, individual bead batches may be produced by sequential conjugation reactions using the same type of chemistry or orthogonal conjugation chemistries. In embodiments, fluorescent barcoding and oligonucleotide barcoding can each be performed with a single molecule or with a mixture, e.g., a mixture of fluorophores to produce the fluorescent barcode and/or a mixture of oligonucleotides to produce the oligonucleotide barcode.

Combining a cell sample with a plurality of double indexed beads

As outlined above, the methods of embodiments of the present invention provide a plurality of distinguishable labeled cells, each having a different fluorescent characteristic associated therewith, wherein a given fluorescent characteristic is made up of the fluorescent barcodes provided by the one or more double indexed beads associated with the cell. Although the number of different double indexed beads bound to a given labeled cell may vary, in some cases the number is from 1 to 10, such as from 1 to 5, including from 1 to 4, such as from 1 to 3, for example from 1 to 2, including 1, 2, or 3. Each different double indexed bead has its own unique fluorescent barcode, and the collection of different fluorescent barcodes of the beads bound to a cell together provides the fluorescent signature associated with that cell.
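As a rough illustration of how the set of bound beads determines the number of possible cell signatures, the sketch below counts signatures under two simplifying assumptions that are not stated in the text: each cell binds between one and a fixed maximum number of distinct bead types, and a signature is defined only by which bead types are present (not by how many copies of each are bound).

```python
from math import comb


def count_signatures(n_bead_types, max_beads_per_cell):
    """Number of non-empty sets of at most max_beads_per_cell distinct bead types."""
    return sum(comb(n_bead_types, k) for k in range(1, max_beads_per_cell + 1))


def composite_signature(bead_ids):
    """Represent a cell's signature as the unordered set of bead types bound to it."""
    return frozenset(bead_ids)


# 20 bead types, 1 to 3 beads per cell: C(20,1) + C(20,2) + C(20,3)
print(count_signatures(20, 3))  # 20 + 190 + 1140 = 1350

# Binding order does not matter for the signature.
print(composite_signature(["bead_7", "bead_2"]) == composite_signature(["bead_2", "bead_7"]))  # True
```

Under these assumptions a modest bead library yields far more cell signatures than bead types, which is the practical appeal of labeling each cell with a small combination of beads.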

In practicing the methods of the invention, a cell sample comprising a plurality of cells is provided. Although the number of cells in a given cell sample may vary, in some cases the number of cells is from 50 to 50,000,000, such as from 100 to 1,000,000, including from 500 to 100,000. The cells present in a given cell sample may be any type of cell, including prokaryotic cells and eukaryotic cells. Suitable prokaryotic cells include, but are not limited to, bacteria such as E. coli, various Bacillus species, and extremophiles such as thermophiles. Suitable eukaryotic cells include, but are not limited to, fungi such as yeasts and filamentous fungi, including species of Aspergillus, Trichoderma, and Neurospora; plant cells, including those of corn, sorghum, tobacco, canola, soybean, cotton, tomato, potato, alfalfa, sunflower, etc.; and animal cells, including those of fish, birds, and mammals. Suitable fish cells include, but are not limited to, cells from species of salmon, trout, tilapia, tuna, carp, flounder, halibut, swordfish, cod, and zebrafish. Suitable avian cells include, but are not limited to, cells of chickens, ducks, quail, pheasants, and turkeys, and of other jungle fowl or game birds. Suitable mammalian cells include, but are not limited to, cells from horses, cattle, buffalo, deer, sheep, rabbits, rodents such as mice, rats, hamsters, and guinea pigs, goats, pigs, primates, and marine mammals including dolphins and whales, as well as cell lines, such as human cell lines of any tissue or stem cell type, stem cells, including pluripotent and non-pluripotent stem cells, and non-human zygotes. Suitable cells also include cell types implicated in a variety of disease states, even while in a non-diseased state.
Thus, suitable eukaryotic cell types include, but are not limited to, tumor cells of all types (e.g., melanoma, myeloid leukemia, lung cancer, breast cancer, ovarian cancer, colon cancer, kidney cancer, prostate cancer, pancreatic cancer, and testicular cancer), cardiomyocytes, dendritic cells, endothelial cells, epithelial cells, lymphocytes (T cells and B cells), mast cells, eosinophils, vascular intimal cells, macrophages, natural killer cells, erythrocytes, hepatocytes, leukocytes including mononuclear leukocytes, stem cells such as hematopoietic stem cells, neural stem cells, skin stem cells, lung stem cells, kidney stem cells, liver stem cells, and muscle stem cells (for use in screening for differentiation and dedifferentiation factors), osteoclasts, chondrocytes and other connective tissue cells, keratinocytes, melanocytes, liver cells, kidney cells, and adipocytes. In certain embodiments, the cell is a primary disease state cell, such as a primary tumor cell. Suitable cells also include known research cells, including, but not limited to, Jurkat T cells, NIH3T3 cells, CHO, COS, and the like. See the ATCC cell line catalog, which is expressly incorporated herein by reference in its entirety.

In certain embodiments, the cells used in the present invention are obtained from a subject. As used herein, "subject" refers to humans and other animals and organisms, such as laboratory animals. Thus, the methods and compositions described herein are suitable for human and veterinary applications. In certain embodiments, the subject is a mammal, including embodiments in which the subject is a human patient suffering from (or suspected of suffering from) a disease or pathological condition.

In certain embodiments, the cells to be analyzed are enriched prior to fluorescent barcoding, e.g., as described in more detail below. For example, if the target cells are white blood cells derived from a human subject, whole blood from the subject may be subjected to density gradient centrifugation to enrich for peripheral blood mononuclear cells (PBMCs or white blood cells). Cells may be enriched using any convenient method known in the art, including Fluorescence Activated Cell Sorting (FACS), Magnetic Activated Cell Sorting (MACS), density gradient centrifugation, and the like. Parameters for enriching a particular cell from a mixed population include, but are not limited to, physical parameters (e.g., size, shape, density, etc.), in vitro growth characteristics (e.g., response to a particular nutrient in cell culture), and molecular expression (e.g., expression of cell surface proteins or carbohydrates, reporter molecules such as green fluorescent protein, etc.).

In certain embodiments, the cells are living cells that remain viable during the assay. By "remain viable" is meant that a specified percentage of the cells, including from about 20% to about 100%, is viable at the end of the assay. In certain other embodiments, the methods of the invention are performed in a manner that renders the cells non-viable during the assay, e.g., the cells may be fixed, permeabilized, or maintained in a buffer or under conditions in which the cells do not survive. Such parameters are typically determined by the nature of the assay being performed and the reagents employed.

In some cases, the cells may be treated, e.g., with a stimulus. The stimulus used to treat the cells may vary, and may include culture conditions, exposure to a change in temperature such as heat or cold, exposure to electromagnetic radiation such as light, exposure to an active agent, exposure to a mechanical change, and the like. Different cell samples of a plurality of cell samples may be treated with the same or different stimuli, as desired. Thus, in some cases, the method includes differentially treating two or more of the plurality of cell samples, e.g., contacting two or more different samples with different active agents, or with different concentrations of the same active agent, etc.

In practicing embodiments of the invention, the methods comprise mixing a cell sample comprising a plurality of cells with a labeling composition comprising a plurality of different double indexed beads, e.g., as described above, under conditions sufficient to stably bind cells in the cell sample to one or more double indexed beads, thereby producing a labeled population of cells. In embodiments, the cell sample and the double indexed beads are combined in a liquid (e.g., aqueous) composition. The combination may be carried out under any suitable conditions that provide stable binding of the double indexed beads to the cells of the cell sample. The double indexed beads can be contacted with the cells of the cell sample, for example, by introducing the double indexed beads into a container holding the cell sample, such as by manual dispensing or automated fluid dispensing. In embodiments, an excess of the different beads relative to the number of cells in the cell sample is combined with the cell sample, where the extent of the excess may vary, in some cases ranging from 0.5% to 100% or more, such as from 1% to 50% or more, including from 5% to 25% or more. The sample and beads are combined in such a way that the double indexed beads become stably bound to the cells of the cell sample, resulting in labeled cells. If desired, the cells and beads may be combined by mixing and then incubating for a period of time at a temperature suitable to provide stable binding of the beads to the cells. In some cases, the cells and beads are incubated together for 15 to 120 minutes, e.g., 30 to 90 minutes (e.g., 60 minutes), at a temperature of 20 to 25 °C, e.g., 20 to 22 °C.
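The bead excess can be turned into a bead count with simple arithmetic. The sketch below assumes one particular reading of "excess", namely that an excess of p% means (1 + p/100) beads per cell in total; the function name is hypothetical, and exact fractions are used only to avoid floating-point rounding at the boundary.

```python
import math
from fractions import Fraction


def beads_needed(n_cells, excess_percent):
    """Total beads to add so that beads exceed cells by excess_percent (rounded up)."""
    if n_cells < 0 or excess_percent < 0:
        raise ValueError("n_cells and excess_percent must be non-negative")
    factor = 1 + Fraction(str(excess_percent)) / 100
    return math.ceil(n_cells * factor)


print(beads_needed(100_000, 25))  # 125000
print(beads_needed(1_000, 0.5))   # 1005
```

If a different convention is intended (for example, an excess of each individual bead type rather than of beads overall), the factor would need to be adjusted accordingly.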

Phenotypic markers

In certain embodiments of the invention, the method may comprise detecting one or more phenotypic characteristics of the cells. Detectable phenotypic characteristics include, but are not limited to, the presence of an analyte such as a cell surface or intracellular marker, physical characteristics (e.g., size, shape, granularity, etc.), cell number (or frequency), and the like. Virtually any detectable characteristic of interest can be measured as a phenotypic characteristic. In certain embodiments, the methods of the invention are directed to qualitatively or quantitatively detecting the presence of an analyte, such as a marker, associated with (e.g., within, on, or attached to) a cell being assayed. In some cases, the marker employed in phenotypic labeling is not the marker to which the cell binding member of a double indexed bead, e.g., as described above, binds.

In certain such embodiments, the method comprises contacting the labeled cell sample with a detectable analyte-specific binding agent. An "analyte-specific binding agent" refers to any molecule, such as a nucleic acid, small organic molecule, protein, or nucleic acid binding dye (e.g., ethidium bromide), that is capable of binding to a particular analyte (or a particular isoform of an analyte) in a cell, but not to other substances. Analytes of interest include any molecule that is bound to or present within the cells analyzed in the subject methods. Thus, analytes of interest include, but are not limited to, proteins, carbohydrates, organelles, nucleic acids, infectious particles (e.g., viruses, bacteria, parasites), metabolites, and the like. In certain embodiments, the analyte-specific binding agent is a protein. In certain such embodiments, the analyte-specific binding agent is an antibody or binding fragment thereof, e.g., as described above. Thus, the methods and compositions of the present invention can be used to detect any particular isoform of an element in a sample that is antigenically detectable and antigenically distinguishable from other isoforms of the activatable element present in the sample.

In certain embodiments, a plurality of detectable analyte-specific binding agents are employed in a method according to the invention. By "plurality of analyte-specific binding agents" is meant that 2 or more analyte-specific binding agents are used, including 3 or more, 4 or more, 5 or more, etc. In certain embodiments, each different analyte-specific binding agent is labeled (directly or indirectly) with a distinguishable label (e.g., a fluorophore having an emission wavelength detectable in a different channel of a flow cytometer, with or without compensation). The plurality of analyte-specific binding agents may bind to the same analyte within or on the cell (e.g., two antibodies that bind different epitopes of the same protein), to different analytes within or on the cell, or in any combination (e.g., two agents that bind the same analyte and a third agent that binds a different analyte). The upper limit on the number of analyte-specific binding agents will depend primarily on the assay parameters and the detection capabilities of the detection system used.

FIG. 2 provides a graphical representation of a labeled cell produced using double indexed beads as shown in FIG. 1. As shown in FIG. 2, the labeled cell 200 includes three double indexed beads 202, 204, and 206, which are stably bound to the cell via specific binding members (not shown). Also shown are distinguishable fluorescently labeled antibodies 210, 212, 214, and 216, which specifically bind to different cell surface phenotypic markers of interest.

FIG. 3 provides a graphical representation of a labeled cell produced using double indexed beads that are bound to a single cell using a hydrogel nanovial as the cell binding member. As shown in FIG. 3, the labeled cell 300 is associated with three double indexed beads and is labeled with distinguishable fluorescently labeled antibodies 310, 312, 314, and 316, which specifically bind to different cell surface phenotypic markers of interest. The labeled cell is present in a nanovial 308, to which double indexed beads 302, 304, and 306 are stably bound.

Acquisition of flow cytometry data and/or microscopy data

After the labeled cell population has been produced (e.g., as described above), the method may include flow cytometrically assaying the labeled population. "Flow cytometrically assaying" means analyzing a composition, such as the assay composition described above, by flow cytometry. A flow cytometric assay may include characterizing a sample, such as a sample comprising the assay composition, with a flow cytometer system, and may include introducing the assay composition into a flow cytometer. Flow cytometers typically include a sample reservoir for receiving a fluid sample, e.g., one that includes the assay composition, and a sheath reservoir containing a sheath fluid. The flow cytometer transports the particles in the fluid sample (including, for example, cells from the assay composition) as a cell stream to a flow cell, while also directing the sheath fluid to the flow cell. To characterize the components of the flow stream, the flow stream is irradiated with light. Variations in the materials in the flow stream, such as differing morphologies or the presence of fluorescent labels, can cause variations in the observed light, and these variations can be used for characterization and separation. For example, particles such as molecules, analyte-bound beads, or single cells in a fluid suspension are passed through a detection zone in which the particles are exposed to excitation light, typically from one or more lasers, and the light scattering and fluorescence properties of the particles are measured. The particles or components thereof are typically labeled with fluorescent dyes to facilitate detection. By labeling different particles or components with spectrally distinct fluorescent dyes, multiple different particles or components can be detected simultaneously. In some embodiments, the analyzer includes a plurality of detectors, one for each scattering parameter to be measured, and one or more for each different dye to be detected.
For example, some embodiments include a spectral configuration using more than one sensor or detector per dye. The data obtained includes the signal measured by each light scatter detector and the fluorescent emission. In certain embodiments, the flow cytometry assay can detect a signal indicative of the presence of the labeled secondary antibody in the sample. When a signal is detected, the sample may include one or more antibodies to an epitope of a coronavirus antigen.

As outlined above, the sample (e.g., in the flow stream of a flow cytometer) may be irradiated with light from a light source. In some embodiments, the light source is a broadband light source that emits light having a broad range of wavelengths, e.g., spanning 50 nm or more, such as 100 nm or more, such as 150 nm or more, such as 200 nm or more, such as 250 nm or more, such as 300 nm or more, such as 350 nm or more, such as 400 nm or more, and including spanning 500 nm or more. For example, one suitable broadband light source emits light having wavelengths from 200 nm to 1500 nm. Another example of a suitable broadband light source emits light having wavelengths from 400 nm to 1000 nm. When the method includes irradiating with a broadband light source, broadband light sources of interest may include, but are not limited to, halogen lamps, deuterium arc lamps, xenon arc lamps, stabilized fiber-coupled broadband light sources, broadband LEDs with continuous spectrum, superluminescent light emitting diodes, semiconductor light emitting diodes, broad-spectrum LED white light sources, multi-LED integrated white light sources, and other broadband light sources, or any combination thereof.

In other embodiments, the method comprises irradiating with a narrow-band light source that emits light of a specific wavelength or of a narrow range of wavelengths, e.g., a range of 50 nm or less, such as 40 nm or less, such as 30 nm or less, such as 25 nm or less, such as 20 nm or less, such as 15 nm or less, such as 10 nm or less, such as 5 nm or less, such as 2 nm or less, and including light sources that emit light of a single specific wavelength (i.e., monochromatic light). When the method includes irradiating with a narrow-band light source, narrow-band light sources of interest may include, but are not limited to, a narrow-wavelength LED, a laser diode, or a broadband light source coupled to one or more optical bandpass filters, diffraction gratings, monochromators, or any combination thereof.

In certain embodiments, the method comprises irradiating the sample with one or more lasers. As discussed above, the type and number of lasers will vary depending on the sample and on the light to be collected, and the laser may be a gas laser, such as a helium-neon laser, an argon laser, a krypton laser, a xenon laser, a nitrogen laser, a CO2 laser, a CO laser, an argon-fluorine (ArF) excimer laser, a krypton-fluorine (KrF) excimer laser, a xenon-chlorine (XeCl) excimer laser, or a xenon-fluorine (XeF) excimer laser, or a combination thereof. In other cases, the method includes irradiating the flow stream with a dye laser, such as a stilbene laser, a coumarin laser, or a rhodamine laser. In other cases, the method includes irradiating the flow stream with a metal vapor laser, such as a helium-cadmium (HeCd) laser, a helium-mercury (HeHg) laser, a helium-selenium (HeSe) laser, a helium-silver (HeAg) laser, a strontium laser, a neon-copper (NeCu) laser, a copper laser, or a gold laser, and combinations thereof. In other cases, the method includes irradiating the flow stream with a solid-state laser, such as a ruby laser, an Nd:YAG laser, an NdCrYAG laser, an Er:YAG laser, an Nd:YLF laser, an Nd:YVO4 laser, an Nd:YCa4O(BO3)3 laser, an Nd:YCOB laser, a titanium sapphire laser, a thulium YAG laser, an ytterbium YAG laser, a Yb2O3 laser, or a cerium-doped laser, and combinations thereof.

The sample may be irradiated with one or more of the above-described light sources, e.g., 2 or more light sources, such as 3 or more, such as 4 or more, such as 5 or more, and including 10 or more light sources. The light sources may comprise any combination of types of light sources. For example, in some embodiments, the method includes irradiating the sample in the flow stream with an array of lasers, such as an array having one or more gas lasers, one or more dye lasers, and one or more solid-state lasers. Where desired, at least one laser is used to excite the fluorescent barcodes, and the other lasers are used to excite other fluorophores bound to the cells.

In some cases, the flow stream is irradiated with a plurality of frequency-shifted beams of light and cells in the flow stream are imaged by fluorescence imaging using radiofrequency-tagged emission (FIRE) to produce frequency-encoded images, such as those described in Diebold et al., Nature Photonics Vol. 7(10), 806-810 (2013), and in U.S. patent nos. 9,423,353, 9,784,661, 9,983,132, 10,006,852, 10,078,045, 10,036,699, 10,222,316, 10,288,546, 10,324,019, 10,408,758, 10,451,538, 10,620,111, and U.S. patent publication nos. 2017/0133857, 2017/038826, 2017/0350803, 2018/0275042, 2019/0376895, and 2019/0376894, the disclosures of which are incorporated herein by reference. In such cases, the flow cytometry data may comprise image data of particles, e.g., cells, present in the sample (see, e.g., Schraivogel et al., Science Vol. 375(6578), 315-320 (2022)).

In certain embodiments, the method comprises spectrally resolving light from each fluorophore of the fluorophore-biomolecule reagent pairs in the sample. In some embodiments, the overlap between each pair of different fluorophores is determined and the contribution of each fluorophore to the overlapping fluorescence is calculated. In some embodiments, spectrally resolving the light from each fluorophore comprises calculating a spectral unmixing matrix for the fluorescence spectra of each of the plurality of fluorophores having overlapping fluorescence in the sample, as detected by the light detection system. In some cases, spectrally resolving the light from each fluorophore and calculating the spectral unmixing matrix can be used to estimate the abundance of each fluorophore, for example to resolve the abundance of target cells in a sample.

In certain embodiments, the method includes spectrally resolving light detected by a plurality of photodetectors, as described, for example, in U.S. patent No. 11,009,400 and U.S. patent application publication nos. 20210247293 and 20210325292, the disclosures of which are incorporated herein by reference in their entirety. For example, spectrally resolving light detected by a plurality of photodetectors of the second set of photodetectors may include solving a spectral unmixing matrix using one or more of: 1) a weighted least squares algorithm; 2) a Sherman-Morrison iterative inverse updater; 3) an LU matrix decomposition, in which the matrix is decomposed into the product of a lower-triangular (L) matrix and an upper-triangular (U) matrix; 4) a modified Cholesky decomposition; 5) a QR factorization; and 6) calculation of the weighted least squares algorithm by singular value decomposition. In certain embodiments, the method further comprises characterizing the spillover spreading of light detected by the plurality of photodetectors, as described, for example, in U.S. patent application publication No. 20210349004, the disclosure of which is incorporated herein by reference.
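By way of illustration only, the weighted least squares option listed above can be sketched as follows. The sketch assumes a linear mixing model in which the signal in each detector is the sum of each fluorophore's abundance multiplied by its reference emission in that detector, and it solves the normal equations by Gaussian elimination. The reference spectra, weights, and function names are hypothetical values chosen for the example and are not taken from the cited patents.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: spectra are not independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def unmix_wls(spectra, signal, weights):
    """Weighted least squares estimate of fluorophore abundances.

    spectra[d][f]: reference emission of fluorophore f in detector d
    signal[d]:     measured signal in detector d
    weights[d]:    per-detector weight (for example, inverse variance)
    """
    n_det = len(signal)
    n_fl = len(spectra[0])
    # Normal equations: (M^T W M) a = M^T W y
    mtwm = [[sum(weights[d] * spectra[d][i] * spectra[d][j]
                 for d in range(n_det))
             for j in range(n_fl)] for i in range(n_fl)]
    mtwy = [sum(weights[d] * spectra[d][i] * signal[d] for d in range(n_det))
            for i in range(n_fl)]
    return solve_linear(mtwm, mtwy)


# Hypothetical reference spectra: 4 detectors x 2 overlapping fluorophores.
SPECTRA = [[0.7, 0.1], [0.2, 0.5], [0.1, 0.3], [0.0, 0.1]]
TRUE_ABUNDANCE = [100.0, 40.0]
# Noise-free synthetic signal built from the assumed mixing model.
SIGNAL = [sum(m * a for m, a in zip(row, TRUE_ABUNDANCE)) for row in SPECTRA]
WEIGHTS = [1.0, 1.0, 0.5, 0.25]

estimate = unmix_wls(SPECTRA, SIGNAL, WEIGHTS)
print(estimate)
```

Because the synthetic signal is noise-free, the estimate recovers the abundances used to build it up to floating-point error. With real detector data the weights would typically reflect per-detector noise, and a decomposition-based solver (LU, QR, or SVD, as listed above) may be preferred for numerical stability.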

In some cases, the abundance of fluorophores bound (e.g., chemically bound (i.e., covalently or ionically bound) or physically bound) to a target particle is calculated from the spectrally resolved light from each fluorophore bound to the particle. For example, in one instance, the relative abundance of each fluorophore bound to the target particle is calculated from the spectrally resolved light from each fluorophore. In another instance, the absolute abundance of each fluorophore bound to the target particle is calculated from the spectrally resolved light from each fluorophore. In certain embodiments, particles may be identified or classified based on the relative abundance of each fluorophore determined to be bound to the particle. In these embodiments, particles may be identified or classified by any convenient protocol, such as by comparing the relative or absolute abundance of each fluorophore bound to the particle with a control sample of particles having known properties, or by performing a spectroscopic or other assay on a population of particles (e.g., a population of cells) having the calculated relative or absolute abundances of bound fluorophores.
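By way of illustration only, relative abundances and a comparison against control particles of known identity might be computed as sketched below. The nearest-profile rule (Euclidean distance between relative-abundance profiles), the control labels, and the numeric values are assumptions made for the example; the fluorophore names are used only as dictionary keys.

```python
import math


def relative_abundance(absolute):
    """Convert absolute fluorophore abundances to fractions of the total."""
    total = sum(absolute.values())
    if total <= 0:
        raise ValueError("no fluorophore signal to normalise")
    return {name: value / total for name, value in absolute.items()}


def classify(particle, controls):
    """Return the control label with the nearest relative-abundance profile."""
    profile = relative_abundance(particle)

    def distance(label):
        ref = relative_abundance(controls[label])
        keys = set(profile) | set(ref)
        return math.sqrt(sum((profile.get(k, 0.0) - ref.get(k, 0.0)) ** 2
                             for k in keys))

    return min(controls, key=distance)


# Hypothetical control particles of known identity (absolute abundances).
CONTROLS = {
    "type_A": {"BB515": 900.0, "BB790": 100.0},
    "type_B": {"BB515": 200.0, "BB790": 800.0},
}
# Hypothetical unmixed abundances for one particle of unknown identity.
PARTICLE = {"BB515": 450.0, "BB790": 50.0}

print(relative_abundance(PARTICLE))
print(classify(PARTICLE, CONTROLS))
```

Normalising to relative abundance makes the comparison insensitive to overall brightness, which is why the dimmer example particle still matches the brighter control with the same fluorophore ratio.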

In certain embodiments, the method may include sorting one or more particles (e.g., cells) of the sample that are identified based on the estimated abundance of fluorophores bound to the particles. The term "sorting" is used herein in its conventional sense to refer to separating components of a sample (e.g., droplets containing cells, or droplets containing non-cellular particles such as biological macromolecules) and, in some cases, delivering the separated components to one or more sample collection containers. For example, the method may comprise sorting 2 or more components of the sample, such as 3 or more components, such as 4 or more, such as 5 or more, such as 10 or more, such as 15 or more, and including sorting 25 or more components of the sample. In sorting particles identified based on the abundance of fluorophores bound to the particles, the method includes data acquisition, analysis, and recording, such as with a computer, wherein multiple data channels record data from each detector used to acquire the overlapping spectra of the plurality of fluorophore-biomolecule reagent pairs bound to the particles. In these embodiments, the analysis includes spectrally resolving the light (e.g., by calculating a spectral unmixing matrix) from the fluorophores of the fluorophore-biomolecule reagent pairs that have overlapping spectra and are bound to the particle, and identifying the particle based on the estimated abundance of each fluorophore bound to the particle. The analysis may be communicated to a sorting system configured to generate a set of digitized parameters based on the particle classification. In some embodiments, methods for sorting sample components include sorting particles (e.g., cells in a biological sample), as described in U.S. patent nos. 3,960,449, 4,347,935, 4,667,830, 5,245,318, 5,464,581, 5,483,469, 5,602,039, 5,643,796, 5,700,692, 6,372,506, and 6,809,804, the disclosures of which are incorporated herein by reference. In some embodiments, the method includes sorting components of the sample with a particle sorting module, such as those described in U.S. patent nos. 9,551,643 and 10,324,019, U.S. patent publication No. 2017/0299493, and international patent publication No. WO/2017/040151, the disclosures of which are incorporated herein by reference. In certain embodiments, cells of a sample are sorted using a sort decision module having a plurality of sort decision units, such as those described in U.S. patent No. 11,085,868, the disclosure of which is incorporated herein by reference.

Flow cytometry assays are well known in the art. See, e.g., Ormerod (ed.), Flow Cytometry: A Practical Approach, Oxford Univ. Press (1997); Jaroszeski et al. (eds.), Flow Cytometry Protocols, Methods in Molecular Biology No. 91, Humana Press (1997); Practical Flow Cytometry, 3rd ed., Wiley-Liss (1995); Virgo et al. (2012) Ann Clin Biochem. Jan;49(pt 1):17-28; Linden et al., Semin Throm Hemost. 2004 Oct;30(5):502-11; Alison et al., J Pathol. 2010 Dec;222(4):335-344; and Herbig et al. (2007) Crit Rev Ther Drug Carrier Syst. 24(3):203-255; the disclosures of which are incorporated herein by reference. In certain aspects, flow cytometrically assaying the composition includes the use of a flow cytometer capable of simultaneously exciting and detecting multiple fluorophores, e.g., a BD Biosciences FACSCanto^TM flow cytometer, used substantially in accordance with the manufacturer's instructions.

In some embodiments, the subject systems are flow cytometry systems, such as those described in U.S. patent nos. 10,663,476, 10,620,111, 10,613,017, 10,605,713, 10,585,031, 10,578,542, 10,578,469, 10,481,074, 10,302,545, 10,145,793, 10,113,967, 10,006,852, 9,952,076, 9,933,341, 9,726,527, 9,453,789, 9,200,334, 9,097,640, 9,095,494, 9,092,034, 8,975,595, 8,753,573, 8,233,146, 8,140,300, 7,544,326, 7,201,875, 7,129,505, 6,821,740, 6,813,017, 6,809,804, 6,372,506, 5,700,692, 5,643,796, 5,627,040, 5,620,842, 5,602,039, 4,987,086, 4,498,766, the disclosures of which are incorporated herein by reference in their entirety.

In some embodiments, the subject systems are particle sorting systems configured to sort particles with an enclosed particle sorting module, such as those described in U.S. patent publication No. 2017/0299493, the disclosure of which is incorporated herein by reference. In certain embodiments, particles (e.g., cells) of a sample are sorted using a sort decision module having a plurality of sort decision units, such as those described in U.S. patent publication No. 2020/0256781, the disclosure of which is incorporated herein by reference. In some embodiments, the system includes a particle sorting module having deflector plates, such as those described in U.S. patent publication No. 2017/0299493, filed on March 28, 2017, the disclosure of which is incorporated herein by reference.

In certain instances, the flow cytometry systems of the present invention are configured to image particles in a flow stream by fluorescence imaging using radiofrequency-tagged emission (FIRE), such as those described in Diebold et al., Nature Photonics Vol. 7(10), 806-810 (2013), and in U.S. patent nos. 9,423,353, 9,784,661, 9,983,132, 10,006,852, 10,078,045, 10,036,699, 10,222,316, 10,288,546, 10,324,019, 10,408,758, 10,451,538, 10,620,111, and U.S. patent publication nos. 2017/0136857, 2017/038826, 2017/0350803, 2018/0275042, 2019/0376895, and 2019/0376894, the disclosures of which are incorporated herein by reference. FIG. 4 provides a schematic representation of obtaining flow cytometry data that include images of labeled cells via a FIRE protocol according to embodiments of the present invention, for example using a FACSDiscover flow cytometer such as described in Schraivogel et al., Science Vol. 375(6578), 315-320 (2022). As shown, image data may be obtained from fluorescent barcodes provided by fluorophores that have little effect on the other detectors, such as barcodes provided by the Horizon^TM conjugated polymer dyes BB515, BB550, and BB790 (BD Biosciences).

As described above, the method includes a cytometric analysis that may include sorting. Target cells identified in the sample may be sorted and subsequently analyzed by any convenient analytical technique. Subsequent analysis techniques of interest include, but are not limited to, sequencing; detection by CellSearch, as described in Food and Drug Administration (2004) Final rule. Fed Regist 69:26036-26038; detection by CTC Chip, as described in Nagrath et al. (2007) Nature 450:1235-1239; detection by MagSweeper, as described in Talasaz et al. (2009) Proc Natl Acad Sci U S A 106:3970-3975; and detection by nanostructured substrates, as described in Wang S et al. (2011) Angew Chem Int Ed Engl 50:3084-3088, the disclosures of which are incorporated herein by reference. When desired, the sorting protocol may include distinguishing between living and dead cells, and any convenient staining protocol for identifying such cells may be incorporated into the method. Of interest are cytometric data obtained with the BD FACSDiscover^TM S8 cell sorter and BD CellView^TM imaging technology (BD Biosciences).

Analysis of the data obtained for the labeled samples of the invention involves analyzing the cells for detectable features of interest (e.g., as described in more detail above). Analysis of the detectable features may be performed at any convenient step in the data analysis stage, including before, during, or after deconvolution. Indeed, no limitation on the order of deconvolution and detectable feature analysis is intended, since the acquired data can be analyzed in any order and repeatedly. For cells of interest, the obtained data may include fluorescence characteristics of the cells (e.g., as provided by the double-indexed beads bound to the cells, as described above), as well as other characteristics of the cells, including markers bound to the cells, cell images, and the like. The data may be provided in any convenient format, such as the Flow Cytometry Standard (FCS) file format.

Obtaining sequencing data of labeled single cells

After obtaining cytometric data for the labeled population of cells, e.g., as described above, the methods of embodiments of the invention can include obtaining sequencing data for the labeled cells of the sample. The sequencing data can be obtained from all or a portion of the cells of the labeled sample, e.g., target cells of the labeled sample, such as sorted cells obtained by the cytometry step. Sequencing data may be obtained using any convenient protocol. In some cases, sequencing data may be obtained by a protocol that includes partitioning the labeled cells, then generating a sequenceable library from nucleic acids obtained from the partitioned cells, and reading the sequenceable library.
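By way of illustration only, one way the cytometric data and the sequencing data for a given cell might be correlated through the double-indexed beads is sketched below. The sketch assumes that a lookup table links each bead's fluorescent barcode to its oligonucleotide barcode, that the set of beads bound to a cell can be decoded in both data sets, and that only bead combinations observed exactly once in each data set are paired. The barcode identifiers, sequences, and field names are hypothetical.

```python
# Hypothetical lookup: bead fluorescent barcode -> bead oligonucleotide barcode.
BEAD_TABLE = {
    "F01": "ACGTAC",
    "F02": "TTGCAA",
    "F03": "GGATCC",
}
OLIGO_TO_FLUOR = {oligo: fluor for fluor, oligo in BEAD_TABLE.items()}


def bead_key_from_oligos(oligo_barcodes):
    """Translate sequenced bead oligo barcodes into a set of bead identities."""
    return frozenset(OLIGO_TO_FLUOR[o] for o in oligo_barcodes
                     if o in OLIGO_TO_FLUOR)


def correlate(cytometry_events, sequenced_cells):
    """Pair cytometry events with sequenced cells carrying the same bead set.

    Only unambiguous bead combinations (seen once in each data set) are paired.
    """
    events_by_key = {}
    for event in cytometry_events:
        events_by_key.setdefault(frozenset(event["beads"]), []).append(event)
    cells_by_key = {}
    for cell in sequenced_cells:
        key = bead_key_from_oligos(cell["bead_oligos"])
        cells_by_key.setdefault(key, []).append(cell)
    pairs = []
    for key, events in events_by_key.items():
        cells = cells_by_key.get(key, [])
        if key and len(events) == 1 and len(cells) == 1:
            pairs.append((events[0]["event_id"], cells[0]["cell_label"]))
    return sorted(pairs)


# Hypothetical decoded data for three cytometry events and two sequenced cells.
EVENTS = [
    {"event_id": 1, "beads": {"F01", "F02"}},
    {"event_id": 2, "beads": {"F03"}},
    {"event_id": 3, "beads": {"F02"}},
]
CELLS = [
    {"cell_label": "CL_A", "bead_oligos": ["GGATCC"]},
    {"cell_label": "CL_B", "bead_oligos": ["TTGCAA", "ACGTAC"]},
]

print(correlate(EVENTS, CELLS))
```

Event 3 is left unpaired in this example because no sequenced cell carries that bead combination, which mirrors the case where only a portion of the cytometrically analyzed cells is sequenced.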

Partitioning the labeled cells

After producing labeled cells (i.e., cells labeled with one or more double-indexed beads, such as described above), embodiments of the method include partitioning the labeled cells to produce partitioned labeled single cells, wherein each cell has one or more double-indexed beads bound thereto. In some cases, partitioning includes distributing the labeled cells into partitions or compartments such that a compartment contains a single labeled cell. "Partitioning" refers to placing a labeled cell in a small reaction chamber, which may be a fluid-partitioning structure defined by a solid material, such as a microwell configured to contain a labeled cell. In some embodiments of the disclosed methods, devices, and systems, a plurality of microwells randomly distributed on a substrate is used. In some embodiments, the plurality of microwells is distributed on the substrate in an ordered pattern, such as an ordered array. In some embodiments, the plurality of microwells is distributed on the substrate in a random pattern, e.g., a random array. The microwells may be formed in various shapes and sizes. Suitable well geometries include, but are not limited to, cylindrical, elliptical, cubic, conical, hemispherical, rectangular, or polyhedral geometries, for example, three-dimensional geometries composed of several planar faces, such as a rectangular cuboid, hexagonal column, octagonal column, inverted triangular pyramid, inverted rectangular pyramid, inverted pentagonal pyramid, inverted hexagonal pyramid, or inverted truncated pyramid. In some embodiments, non-cylindrical microwells, such as wells with elliptical or square footprints, may provide advantages in being able to accommodate larger cells. In some embodiments, the upper and/or lower edges of the well walls may be rounded to avoid sharp corners, thereby reducing the electrostatic forces that arise at sharp edges or points due to concentration of the electrostatic field.
Thus, the use of rounded corners may enhance the ability to recover beads from the microwells. The well size can be characterized by absolute dimensions. In some cases, the average diameter of the microwells may be from about 5 μm to about 100 μm. In other embodiments, the average microwell diameter is at least 5 μm, at least 10 μm, at least 15 μm, at least 20 μm, at least 25 μm, at least 30 μm, at least 35 μm, at least 40 μm, at least 45 μm, at least 50 μm, at least 60 μm, at least 70 μm, at least 80 μm, at least 90 μm, or at least 100 μm. In yet other embodiments, the average microwell diameter is at most 100 μm, at most 90 μm, at most 80 μm, at most 70 μm, at most 60 μm, at most 50 μm, at most 45 μm, at most 40 μm, at most 35 μm, at most 30 μm, at most 25 μm, at most 20 μm, at most 15 μm, at most 10 μm, or at most 5 μm. The volume of the microwells used in the methods of the present invention may vary, in some cases from about 200 μm³ to about 800000 μm³. In some embodiments, the microwell volume is at least 200 μm³, at least 500 μm³, at least 1000 μm³, at least 10000 μm³, at least 25000 μm³, at least 50000 μm³, at least 100000 μm³, at least 200000 μm³, at least 300000 μm³, at least 400000 μm³, at least 500000 μm³, at least 600000 μm³, at least 700000 μm³, or at least 800000 μm³. In other embodiments, the microwell volume is at most 800000 μm³, at most 700000 μm³, at most 600000 μm³, at most 500000 μm³, at most 400000 μm³, at most 300000 μm³, at most 200000 μm³, at most 100000 μm³, at most 50000 μm³, at most 25000 μm³, at most 10000 μm³, at most 1000 μm³, at most 500 μm³, or at most 200 μm³. The number of microwells in a given device employed in embodiments of the invention may vary; in some cases the number is 100 or more, such as 250 or more, such as 500 or more, including 1000 or more, such as 5000 or more, such as 10000 or more, and in some cases the number is 15000 or fewer, such as 12500 or fewer.
Microwells suitable for use in embodiments of the present invention are also described in PCT application Ser. No. PCT/US2016/014612, published as WO/2016/118915, the disclosure of which is incorporated herein by reference. As used herein, a substrate may refer to a solid support. For example, the substrate may include a plurality of microwells. For example, the substrate may be a well array comprising two or more wells. In some embodiments, a microwell may comprise a small reaction chamber of defined volume. In some embodiments, a microwell may entrap one or more cells. In some embodiments, a microwell may entrap only one cell. In some embodiments, a microwell may entrap one or more solid supports. In some embodiments, a microwell may entrap only one solid support. In some embodiments, a microwell entraps a single cell and a single solid support (e.g., a bead). The number of wells, e.g., microwells, of a well plate, e.g., a microwell array, that are employed in a given partitioning step may vary, in some cases ranging from 5 to 500, e.g., from 5 to 100.
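By way of illustration only, the diameter and volume ranges given above can be related for the cylindrical geometry, for which the volume is π × (diameter / 2)² × depth. The well depths used below are assumptions made for the example, since no depth is stated here; with those depths, the smallest and largest stated diameters give volumes close to the stated lower and upper volume bounds.

```python
import math


def cylinder_volume_um3(diameter_um, depth_um):
    """Volume of a cylindrical microwell in cubic micrometers."""
    return math.pi * (diameter_um / 2.0) ** 2 * depth_um


# Assumed depths (not stated in the text): 10 um and 100 um.
small_well = cylinder_volume_um3(diameter_um=5.0, depth_um=10.0)
large_well = cylinder_volume_um3(diameter_um=100.0, depth_um=100.0)

print(f"5 um x 10 um well:    {small_well:.0f} um^3")
print(f"100 um x 100 um well: {large_well:.0f} um^3")
```

Under these assumed depths the two wells come out at roughly 196 μm³ and 785000 μm³, that is, near the 200 μm³ and 800000 μm³ ends of the stated range.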

In partitioning the labeled cells, the labeled cells may be placed in the compartments (e.g., microwells of a microwell array) using any convenient protocol. For example, a collection of labeled cells may be introduced into structures such as microwells to partition the labeled cells. The labeled cells may be brought into contact with the partitioning structure, for example, by gravity flow, wherein the labeled cells settle into the partitioning structure. In some cases, an aqueous composition of labeled cells is contacted with the microwell array, e.g., by flowing it across the microwell array, such that the labeled cells are deposited in the microwells. An aqueous composition comprising labeled cells may flow through a flow cell in fluid communication with the microwells. A protocol and system suitable for partitioning captured particles into microwells is described in PCT application serial No. PCT/US2016/014612, published as WO/2016/118915, the disclosure of which is incorporated herein by reference. To partition cells of a cell sample, any convenient protocol may be used, such as dispensing, e.g., pipetting, an aliquot of the cell sample into a compartment, flowing the sample across the surface of a well plate, etc.

In some embodiments, partitioning the plurality of labeled cells further comprises providing particles, e.g., beads, that comprise particle-bound, e.g., bead-bound, nucleic acids into the compartments comprising single cells, wherein the bound nucleic acids are used to prepare a nucleic acid sequencing-ready composition, e.g., a sequencing-ready library, from the labeled cells. In some cases, the particle-bound, e.g., bead-bound, nucleic acid comprises a target binding region, e.g., a region that binds to a complementary sequence of a target nucleic acid species in the cell as well as to the capture sequence of a double-indexed bead. For example, when the target nucleic acid species is cellular mRNA and the double-indexed bead oligonucleotide barcode comprises a poly(A) capture sequence, the bead-bound nucleic acid may comprise a poly(T) domain as the target binding region. In addition to the target binding region, the bound nucleic acid may comprise one or more additional domains, such as, but not limited to, a cell marker domain, a barcode domain, a molecular index domain (e.g., a Unique Molecular Identifier (UMI) domain), a universal primer binding domain, and the like. For further details on particles with bound nucleic acids that may be provided in the compartments, see U.S. patent application publication No. US2018/0088112, U.S. patent application publication No. US2018/0200710, U.S. patent application publication No. US2018/0346970, U.S. patent application publication No. US2019/0056415, U.S. patent application publication No. US2020/0248563, U.S. patent application publication No. US2020/0299672, and U.S. patent application publication No. US2021/0171940, the disclosures of which are incorporated herein by reference. Beads with bound nucleic acids may be provided in the compartments using any convenient protocol, including but not limited to those described above for cell partitioning, and further described in PCT application serial No. PCT/US2016/014612, published as WO/2016/118915, the disclosure of which is incorporated herein by reference. The particles, e.g., beads, may be partitioned into the compartments before, after, or in some cases together with the labeled cells, as desired.

Generation of a sequenceable library

Partitioning of the labeled cells, e.g., as described above, results in partitioned labeled cells that are spatially adjacent to particles, e.g., beads, having bound cell marker domain nucleic acids comprising the target binding region as described above. When a cell marker domain nucleic acid is in close proximity to a target of the labeled single cell and/or to a double-indexed bead oligonucleotide barcode, the target or oligonucleotide barcode may hybridize to the cell marker domain nucleic acid. The cell marker domain nucleic acids may be provided at a non-depletable ratio, such that each distinct target can bind to a distinct cell marker domain nucleic acid having its own unique UMI (if desired).
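By way of illustration only, the role of a unique UMI per captured target can be sketched as follows: reads that share the same cell marker, target, and UMI are treated as copies of one captured molecule, so counting distinct UMIs estimates the number of molecules. The read tuples, labels, and function name are hypothetical, and the sketch assumes error-free UMI sequences.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell label, target).

    reads: iterable of (cell_label, umi, target) tuples.
    """
    umis = defaultdict(set)
    for cell_label, umi, target in reads:
        umis[(cell_label, target)].add(umi)
    return {key: len(values) for key, values in umis.items()}


# Hypothetical reads; the first two are amplification copies of one molecule.
READS = [
    ("CL_A", "AACGT", "GENE1"),
    ("CL_A", "AACGT", "GENE1"),
    ("CL_A", "GGTCA", "GENE1"),
    ("CL_A", "AACGT", "GENE2"),
    ("CL_B", "TTTAC", "GENE1"),
]

counts = count_molecules(READS)
print(counts)
```

The same counting could be applied to double-indexed bead oligonucleotide barcodes captured from a cell, with the bead barcode taking the place of the target name.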

After lysing the labeled cells and releasing the nucleic acid molecules therefrom, the nucleic acid molecules can randomly bind to the cell marker domain nucleic acids of a co-located solid support, such as a bead. Binding may involve hybridization of the target recognition region of the cell marker domain nucleic acid to a complementary portion of the target nucleic acid molecule (e.g., the oligo(dT) of the barcode may interact with the poly(A) tail of the target). The assay conditions for hybridization (e.g., buffer pH, ionic strength, temperature, etc.) can be selected to promote the formation of specific, stable hybrids. In some embodiments, the nucleic acid molecules released from the lysed cells can bind to (e.g., hybridize to) a plurality of probes on a substrate. When the probes comprise oligo(dT), mRNA molecules can hybridize to the probes and be reverse transcribed. The oligo(dT) portion of the oligonucleotide may serve as a primer for first strand synthesis of the cDNA molecule, for example, when subjected to DNA synthesis reaction conditions, to produce a capture nucleic acid comprising a first strand cDNA domain. The cell marker domain nucleic acid can also hybridize to a complementary capture sequence, such as a poly(A) sequence, of a double-indexed bead oligonucleotide barcode bound to the labeled cell. Thus, the cell marker domain nucleic acid can serve as a primer for reverse transcription using the double-indexed bead oligonucleotide barcode as a template, for example, as described in more detail below.
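By way of illustration only, the oligo(dT) and poly(A) interaction described above can be sketched as a simple sequence check. The sketch assumes a capture oligonucleotide laid out 5' to 3' as primer site, cell marker, UMI, and oligo(dT); it writes all sequences in the DNA alphabet and requires an exact complementary match of a fixed length. The sequences, domain order, lengths, and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_capture_oligo(primer_site, cell_marker, umi, dt_length=18):
    """Assemble a capture oligo ending in an oligo(dT) target binding region."""
    return primer_site + cell_marker + umi + "T" * dt_length


def can_prime(capture_oligo, target, min_match=12):
    """True if the 3' end of the capture oligo is complementary to the target."""
    tail = capture_oligo[-min_match:]
    return reverse_complement(tail) in target


# Hypothetical sequences for illustration.
CAPTURE = build_capture_oligo("GTACGTCA", "ACGTACGT", "TTAGGCAT")
POLYADENYLATED = "ATGGCCATTGTA" + "A" * 20   # transcript with a poly(A) tail
NO_TAIL = "ATGGCCATTGTACCGG"                 # sequence lacking a poly(A) tract

print(can_prime(CAPTURE, POLYADENYLATED))
print(can_prime(CAPTURE, NO_TAIL))
```

Because the check depends only on a poly(A) tract, a double-indexed bead oligonucleotide barcode carrying a poly(A) capture sequence would pass it in the same way as a polyadenylated mRNA.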

Where desired, a given workflow may include a pooling step in which a product composition, e.g., one made up of captured nucleic acid, synthesized first strand cDNA, or synthesized double-stranded cDNA, is mixed or pooled with product compositions obtained from one or more additional samples, e.g., labeled cells. In some cases, the pooling step is performed immediately after the step of hybridizing the cell marker domain nucleic acid to the target nucleic acid, e.g., as described above. The number of different product compositions, produced from different samples, e.g., cells, that are mixed or pooled in such embodiments may vary, ranging from 2 to 1000000, e.g., from 3 to 200000, including from 4 to 100000, such as from 5 to 50000, in some cases from 100 to 10000, e.g., from 1000 to 5000. Either before or after mixing, one or more of the product compositions may be amplified, for example by Polymerase Chain Reaction (PCR), as described in more detail below. Once the target-cell marker domain molecules are pooled, all subsequent processing can be performed in a single reaction vessel. Further processing may include, for example, reverse transcription reactions, amplification reactions, cleavage reactions, dissociation reactions, and/or nucleic acid extension reactions. Alternatively, further processing reactions can be performed within the microwells, i.e., without first pooling the labeled target nucleic acid molecules from multiple cells.

The present disclosure provides methods of producing target-cell marker domain conjugates using any convenient protocol, such as reverse transcription or nucleotide extension. The target-cell marker domain conjugate may comprise the cell marker domain and a complementary sequence of all or part of the target nucleic acid. Reverse transcription of the bound RNA molecule can be performed by adding a reverse transcription primer and a reverse transcriptase. The reverse transcription primer may be an oligo(dT) primer, a random hexanucleotide primer, or a target-specific oligonucleotide primer. The oligo(dT) primer may be, or may be about, 12 to 18 nucleotides in length and binds to the endogenous poly(A) tail at the 3' end of mammalian mRNA. Random hexanucleotide primers can bind to mRNA at multiple complementary sites. Target-specific oligonucleotide primers typically selectively prime the target mRNA. Reverse transcription can be repeated to produce multiple cDNA molecules. The methods disclosed herein can comprise performing at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 reverse transcription reactions. The method may comprise performing at least about 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 reverse transcription reactions.

One or more nucleic acid amplification reactions can be performed to produce multiple copies of a target nucleic acid molecule. Amplification may be performed in a multiplex manner, wherein a plurality of target nucleic acid sequences are amplified simultaneously. The amplification reaction may be used to add sequencing adaptors to the nucleic acid molecules. The amplification reaction may include amplifying at least a portion of a sample label, if present. The amplification reaction may include amplifying at least a portion of a cell label and/or a barcode sequence (e.g., a molecular label). The amplification reaction can include amplifying at least a portion of a sample label, a cell label, a spatial label, a barcode sequence (e.g., a molecular label), a target nucleic acid, or a combination thereof. The amplification reaction may comprise amplifying 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, or 100% of the plurality of nucleic acids, or a range or number between any two of these values. The method can further comprise performing one or more cDNA synthesis reactions to produce one or more cDNA copies of a target-barcode molecule comprising a sample label, a cell label, a spatial label, and/or a barcode sequence (e.g., a molecular label).

In some embodiments, amplification may be performed using the Polymerase Chain Reaction (PCR). As used herein, PCR may refer to a reaction that amplifies a particular DNA sequence in vitro by primer extension of complementary strands of DNA that occur simultaneously. As used herein, PCR may encompass derivative forms of the reaction including, but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplex PCR, digital PCR, and assembly PCR.

Amplification of nucleic acids may include non-PCR based methods. Examples of non-PCR based methods include, but are not limited to, multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, or circle-to-circle amplification. Other non-PCR based amplification methods include multiple cycles of DNA-dependent RNA polymerase-driven RNA transcription amplification or RNA-directed DNA synthesis and transcription for amplifying a DNA target or RNA target, the ligase chain reaction (LCR), the Qβ replicase (Qβ) method, amplification using palindromic probes, strand displacement amplification, oligonucleotide-driven amplification using restriction endonucleases, amplification methods in which a primer hybridizes to a nucleic acid sequence and the resulting duplex is cleaved prior to the extension reaction and amplification, strand displacement amplification using a nucleic acid polymerase lacking 5' exonuclease activity, rolling circle amplification, and ramification extension amplification (RAM). In some embodiments, the amplification does not produce a circularized transcript.

In some embodiments, the methods disclosed herein further comprise performing a polymerase chain reaction on the nucleic acid (e.g., RNA, DNA, cDNA) to produce labeled amplicons (e.g., randomly labeled amplicons). The labeled amplicon may be a double stranded molecule. The double-stranded molecule may comprise a double-stranded RNA molecule, a double-stranded DNA molecule, or an RNA molecule that hybridizes to a DNA molecule. One or both strands of the double-stranded molecule may comprise a sample label, a spatial label, a cellular label, and/or a barcode sequence (e.g., a molecular label). The labeled amplicon may be a single stranded molecule. The single stranded molecule may comprise DNA, RNA, or a combination thereof. The nucleic acids of the present disclosure may include synthetic or altered nucleic acids. Thus, the method can include generating an amplicon composition from a first strand cDNA domain comprising a capture nucleic acid.

Amplification may include the use of one or more non-natural nucleotides. The non-natural nucleotides may include photolabile nucleotides or triggerable nucleotides. Examples of non-natural nucleotides may include, but are not limited to, peptide nucleic acids (PNAs), morpholino nucleic acids, locked nucleic acids (LNAs), glycol nucleic acids (GNAs), and threose nucleic acids (TNAs). The non-natural nucleotides may be added to one or more cycles of the amplification reaction. The addition of non-natural nucleotides can be used to identify products at specific cycles or time points in the amplification reaction.

One or more than one primer may comprise a universal primer. The universal primer can anneal to the universal primer binding site. One or more than one custom primer may anneal to a first sample label, a second sample label, a spatial label, a cellular label, a barcode sequence (e.g., a molecular label), a target, or any combination thereof. One or more primers may include universal primers and custom primers. Custom primers can be designed to amplify one or more targets. The target may comprise a subset of the total nucleic acids in one or more samples. The targets may comprise a subset of all labeled targets in one or more samples. One or more than one primer may comprise at least 96 or more than 96 custom primers. One or more than one primer may comprise at least 960 or more than 960 custom primers. One or more than one primer may comprise at least 9600 or more than 9600 custom primers. One or more than one custom primer may anneal to two or more than two different labeled nucleic acids. Two or more different labeled nucleic acids may correspond to one or more genes.

Any amplification protocol may be used in the methods of the present disclosure. For example, in one embodiment, the first round of PCR can amplify molecules attached to beads using gene-specific primers and primers directed to the universal Illumina sequencing primer 1 sequence. The second round of PCR can amplify the first-round PCR products using nested gene-specific primers flanked by the Illumina sequencing primer 2 sequence and primers directed to the universal Illumina sequencing primer 1 sequence. The third round of PCR adds P5 and P7 and a sample index, converting the PCR products into an Illumina sequencing library. Sequencing using 150 bp × 2 sequencing can reveal the cell label and barcode sequence (e.g., molecular label) on read 1, the gene on read 2, and the sample index on the index 1 read.
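As a rough illustration of the read layout just described, the following Python sketch splits a read 1 sequence into a cell label and a molecular label and keeps read 2 as the gene read. The field lengths, the function name `parse_read_pair`, and the example sequences are hypothetical and are not taken from the disclosure; the actual positions of the labels depend on the bead design and library preparation used.

```python
from typing import NamedTuple

# Hypothetical field lengths, chosen for illustration only.
CELL_LABEL_LEN = 9
MOLECULAR_LABEL_LEN = 8


class ParsedPair(NamedTuple):
    cell_label: str
    molecular_label: str
    gene_read: str
    sample_index: str


def parse_read_pair(read1: str, read2: str, index1: str) -> ParsedPair:
    """Split read 1 into cell label and molecular label; keep read 2 as the gene read."""
    needed = CELL_LABEL_LEN + MOLECULAR_LABEL_LEN
    if len(read1) < needed:
        raise ValueError("read 1 is shorter than the assumed label layout")
    cell_label = read1[:CELL_LABEL_LEN]
    molecular_label = read1[CELL_LABEL_LEN:needed]
    return ParsedPair(cell_label, molecular_label, read2, index1)


# Made-up sequences: 9-nt cell label, 8-nt molecular label, then trailing bases.
pair = parse_read_pair("ACGTACGTA" + "TTGGCCAA" + "TTTT", "GATTACAGATTACA", "CGATGT")
```

Keeping the assumed lengths as named constants makes it plain that they are parameters of the sketch rather than properties of the method.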

In some embodiments, chemical cleavage may be used to remove nucleic acids from a substrate. For example, chemical groups or modified bases present in the nucleic acid can be used to facilitate its removal from the solid support. As another example, enzymes can be used to remove nucleic acids from a substrate. For example, nucleic acids may be excised from the substrate by restriction endonuclease digestion. Treatment of nucleic acids containing dUTP or ddUTP with uracil-DNA glycosylase (UDG) can also be used to remove nucleic acids from a substrate. As a further example, nucleic acids can be excised from a substrate using an enzyme that performs nucleotide excision, such as a base excision repair enzyme, e.g., an apurinic/apyrimidinic (AP) endonuclease. In some embodiments, the nucleic acid may be removed from the substrate using a photocleavable group and light. In some embodiments, a cleavable linker may be used to remove nucleic acid from a substrate. For example, the cleavable linker may comprise at least one of biotin/avidin, biotin/streptavidin, biotin/neutravidin, Ig-protein A, a photolabile linker, an acid-labile or base-labile linker group, or an aptamer.

In some embodiments, amplification may be performed on a substrate, for example, using bridge amplification. The cDNA may be homopolymer tailed to generate compatible ends for bridge amplification using oligo (dT) probes on a substrate. In bridge amplification, the primer complementary to the 3' end of the template nucleic acid may be the first primer in each pair that is covalently attached to the solid particle. When a sample containing a template nucleic acid is contacted with the particle and subjected to a single thermal cycle, the template molecule may be annealed to the first primer, and by adding nucleotides, the first primer is extended in the forward direction to form a duplex molecule consisting of the template molecule and a newly formed DNA strand complementary to the template. In the next heating step of the cycle, the double-stranded molecule may be denatured, releasing the template molecule from the particle, and leaving the complementary DNA strand attached to the particle by the first primer. In the annealing stage of the subsequent annealing and extension steps, the complementary strand may hybridize with a second primer that is complementary to a fragment of the complementary strand at a location remote from the first primer. Such hybridization may result in the complementary strand forming a bridge between the first primer and the second primer, the bridge being immobilized to the first primer by a covalent bond and to the second primer by hybridization. In the extension phase, the second primer may be extended in the opposite direction by adding nucleotides in the same reaction mixture, thereby converting the bridge into a double-stranded bridge. The next cycle is then started, the double-stranded bridge can be denatured to produce two single-stranded nucleic acid molecules, one end of each molecule being attached to the particle surface by a first primer and a second primer, respectively, and the other end of each molecule being unattached. 
In the second cycle of annealing and extension steps, each strand may hybridize to a further, previously unused complementary primer on the same particle to form a new single-strand bridge. Extension of the two previously unused primers, now hybridized, converts the two new bridges into double-stranded bridges. The amplification reaction may comprise amplifying at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, or 100% of the plurality of nucleic acids.

Amplification of the labeled nucleic acid may include PCR-based methods or non-PCR-based methods. Amplification of the labeled nucleic acid may include exponential amplification of the labeled nucleic acid. Amplification of the labeled nucleic acid may include linear amplification of the labeled nucleic acid. Amplification may be performed by polymerase chain reaction (PCR). PCR may refer to a reaction that amplifies a particular DNA sequence in vitro by simultaneous primer extension of complementary strands of DNA. PCR may encompass derivative forms of the reaction including, but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplex PCR, digital PCR, suppression PCR, semi-suppressive PCR, and assembly PCR.

In some embodiments, the amplification of the labeled nucleic acid comprises a non-PCR based method. Examples of non-PCR based methods include, but are not limited to, multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, or circle-to-circle amplification. Other non-PCR based amplification methods include multiple cycles of DNA-dependent RNA polymerase-driven RNA transcription amplification or RNA-directed DNA synthesis and transcription for amplifying DNA or RNA targets, the ligase chain reaction (LCR), the Qβ replicase (Qβ) method, use of palindromic probes, strand displacement amplification, oligonucleotide-driven amplification using restriction endonucleases, amplification methods in which a primer hybridizes to a nucleic acid sequence and the resulting duplex is cleaved prior to the extension reaction and amplification, strand displacement amplification using a nucleic acid polymerase lacking 5' exonuclease activity, rolling circle amplification, and/or ramification extension amplification (RAM).

In some embodiments, the methods disclosed herein further comprise performing a nested polymerase chain reaction on the amplified amplicon (e.g., target). The amplicon may be a double stranded molecule. The double-stranded molecule may comprise a double-stranded RNA molecule, a double-stranded DNA molecule, or an RNA molecule that hybridizes to a DNA molecule. One or both strands of the double-stranded molecule may comprise a sample tag or molecular identifier tag. Alternatively, the amplicon may be a single stranded molecule. The single stranded molecule may comprise DNA, RNA, or a combination thereof. The nucleic acids of the invention may comprise synthetic or altered nucleic acids.

In some embodiments, the method comprises repeatedly amplifying the labeled nucleic acids to produce a plurality of amplicons. The methods disclosed herein can comprise performing at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amplification reactions. Alternatively, the method may comprise performing at least about 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amplification reactions.

Amplification may further comprise adding one or more control nucleic acids to one or more samples comprising a plurality of nucleic acids. Amplification may further comprise adding one or more control nucleic acids to the plurality of nucleic acids. The control nucleic acid may comprise a control label.

Amplification may include the use of one or more non-natural nucleotides. The non-natural nucleotides may include photolabile nucleotides and/or triggerable nucleotides. Examples of non-natural nucleotides include, but are not limited to, peptide nucleic acids (PNAs), morpholino nucleic acids, locked nucleic acids (LNAs), glycol nucleic acids (GNAs), and threose nucleic acids (TNAs). The non-natural nucleotides may be added to one or more cycles of the amplification reaction. The addition of non-natural nucleotides can be used to identify products at specific cycles or time points in the amplification reaction.

Performing one or more amplification reactions may include using one or more primers. The one or more primers may comprise one or more oligonucleotides. The one or more oligonucleotides may comprise at least about 7 to 9 nucleotides. The one or more oligonucleotides may comprise fewer than 12 to 15 nucleotides. The one or more primers may anneal to at least a portion of the plurality of labeled nucleic acids. The one or more primers may anneal to the 3' and/or 5' ends of the plurality of labeled nucleic acids. The one or more primers may anneal to an interior region of the plurality of labeled nucleic acids. The interior region can be at least about 50, 100, 150, 200, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 650, 700, 750, 800, 850, 900, or 1000 nucleotides from the 3' end of the plurality of labeled nucleic acids. The one or more primers may comprise an immobilized primer set. The one or more primers may comprise at least one custom primer. The one or more primers may comprise at least one control primer. The one or more primers may comprise at least one housekeeping gene primer. The one or more primers may comprise a universal primer. The universal primer can anneal to a universal primer binding site. The one or more custom primers may anneal to a first sample tag, a second sample tag, a molecular identifier tag, a nucleic acid, or a product thereof. The one or more primers may include universal primers and custom primers. Custom primers can be designed to amplify one or more target nucleic acids. The target nucleic acids may comprise a subset of the total nucleic acids in one or more samples. In some embodiments, the primer is a probe attached to an array of the invention.

In some embodiments, barcoding (e.g., stochastically barcoding) multiple targets in the sample further comprises generating an indexed library of barcoded targets (e.g., stochastically barcoded targets) or fragments of barcoded targets. The barcode sequences of different barcodes (e.g., the molecular labels of different stochastic barcodes) may be different from each other. Generating an indexed library of barcoded targets includes generating a plurality of indexed polynucleotides from a plurality of targets in a sample. For example, for an indexed library of barcoded targets comprising a first indexed target and a second indexed target, the tag region of the first indexed polynucleotide may differ from the tag region of the second indexed polynucleotide by, by about, by at least, or by at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, or 50 nucleotides, or a number or range of nucleotides between any two of these values. In some embodiments, generating an indexed library of barcoded targets comprises contacting a plurality of targets, e.g., mRNA molecules, with a plurality of oligonucleotides comprising a poly(T) region and a tag region, and performing a first strand synthesis using a reverse transcriptase to produce single-stranded tagged cDNA molecules, each cDNA molecule comprising a cDNA region and a tag region, wherein the plurality of targets comprises at least two mRNA molecules of different sequences, and the plurality of oligonucleotides comprises at least two oligonucleotides of different sequences. Generating an indexed library of barcoded targets may also include amplifying the single-stranded tagged cDNA molecules to produce double-stranded tagged cDNA molecules, and performing nested PCR on the double-stranded tagged cDNA molecules to produce tagged amplicons. In some embodiments, the method may include generating an adaptor-tagged amplicon.

Barcoding (e.g., stochastic barcoding) can include labeling individual nucleic acid (e.g., DNA or RNA) molecules with a nucleic acid barcode or tag. In some embodiments, it involves adding a DNA barcode or tag to the cDNA molecule when generating cDNA from mRNA. Nested PCR can be performed to minimize PCR amplification bias. Adaptors may be added for sequencing using, for example, second generation sequencing (NGS). The sequencing results can be used to determine the cell labels, the molecular labels, and the nucleotide fragment sequences of the one or more copies of the targets.
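One common way to use cell labels and molecular labels recovered from sequencing is to count distinct molecular labels per cell and gene, so that PCR duplicates of the same original molecule are counted once. The Python sketch below illustrates that idea only; the function name `count_molecules`, the tuple layout, and the example identifiers are assumptions made for illustration and are not specified in the disclosure.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct molecular labels for each (cell label, gene) pair.

    reads: iterable of (cell_label, molecular_label, gene) tuples.
    """
    seen = defaultdict(set)
    for cell_label, molecular_label, gene in reads:
        seen[(cell_label, gene)].add(molecular_label)
    return {key: len(labels) for key, labels in seen.items()}


# Hypothetical reads; the second entry is a duplicate of the first.
reads = [
    ("CELL_A", "UMI1", "GAPDH"),
    ("CELL_A", "UMI1", "GAPDH"),
    ("CELL_A", "UMI2", "GAPDH"),
    ("CELL_B", "UMI1", "ACTB"),
]
counts = count_molecules(reads)
```

The sketch uses exact string matching; a practical pipeline would typically also tolerate sequencing errors in the labels.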

Sequencing

In certain embodiments, the provided methods further comprise subjecting the prepared expression library, e.g., the amplicon compositions produced as described above, to a sequencing protocol, e.g., a second generation sequencing (NGS) protocol. The protocol may be performed on any suitable NGS sequencing platform. NGS sequencing platforms of interest include, but are not limited to, the sequencing platforms provided by Illumina (e.g., HiSeq^TM, MiSeq^TM and/or NextSeq^TM sequencing systems), Ion Torrent^TM (e.g., Ion PGM^TM and/or Ion Proton^TM sequencing systems), Pacific Biosciences (e.g., PACBIO RS II Sequel sequencing systems), Life Technologies^TM (e.g., SOLiD sequencing systems), Oxford Nanopore (e.g., MinION), Roche (e.g., 454 GS FLX+ and/or GS Junior sequencing systems), or any other sequencing platform of interest. NGS protocols will vary depending on the particular NGS sequencing system used. Detailed protocols for sequencing, which may include, for example, further amplification (e.g., solid phase amplification), sequencing of the amplicons, and analysis of the sequencing data, are available from the manufacturer of the NGS sequencing system used.

Further details regarding the method of obtaining sequence data from single cells are provided, for example, as described above, in U.S. patent application publication No. US2018/0088112, U.S. patent application publication No. 2018/0200710, U.S. patent application publication No. US2018/0346970, U.S. patent application publication No. 2019/0056415, U.S. patent application publication No. US 2020/0248563, U.S. patent application publication No. 2020/0299672, and U.S. patent application publication No. 2021/0171940, the disclosures of which are incorporated herein by reference.

Correlating cell count data with sequencing data

The sequencing protocol generates sequencing data for the labeled cells. The sequencing data can be easily correlated to the cell count data of the labeled cells such that the cell count data obtained from the same cells can be paired with the sequencing data. In other words, a given cell count data, e.g., an imaging dataset, and a given sequencing dataset may be correlated to originate from the same cell, e.g., as described in more detail below.

After obtaining cell count data and sequencing data, e.g., as described above, the cell count data obtained from a given cell is correlated with the sequencing data. Correlation refers to pairing cell count data with sequencing data as data from the same cell. Thus, cell count data obtained from the same labeled cells can be paired with sequencing data. In other words, a given cell count dataset and a given sequencing dataset may be identified as being obtained from the same cell and subsequently paired or otherwise associated with each other. Thus, correlated cell count data and sequencing data for single cells of a cell sample can be obtained.

Cell count data is correlated with sequencing data by means of the fluorescent features and oligonucleotide barcodes provided by the double indexed beads bound to the labeled cells from which the flow cytometry data and sequencing data were obtained. In the sequencing data obtained, sequence reads of the cell targets of the labeled cells and of the double indexed bead oligonucleotide barcodes are obtained, for example, as described above. In other words, for each labeled cell analyzed in a given workflow, the sequence of each double indexed bead oligonucleotide barcode bound to the cell and the sequences of the cell target nucleic acids, e.g., mRNA from the cell, are obtained. For each labeled cell, these sequences are acquired, for example, by a protocol as described above (which may be a second generation sequencing protocol) in which libraries are generated from the original sequences, with each member of a given library generated from the same partition sharing a common cell label. Thus, the sequence reads from the cell target nucleic acids and from the double indexed bead oligonucleotide barcodes obtained from a cell all share the same cell label, i.e., they all share a common cell label. In correlating cells with imaging data, all sequence reads from the target nucleic acids and the double indexed bead oligonucleotide barcodes that have the same cell marker domain, i.e., share a common cell label, can be paired or correlated. This pairing or association produces a collection of sequence reads comprising the target nucleic acids and the double indexed bead oligonucleotide barcode nucleic acids, and these sequence reads can be identified as being derived from the same cell.
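The grouping step described above can be sketched in Python as follows, assuming each read has already been reduced to a cell label plus an insert sequence and that the set of bead oligonucleotide barcode sequences is known. The function name `group_by_cell_label`, the data layout, and the example identifiers are hypothetical, and exact matching is a simplification.

```python
from collections import defaultdict


def group_by_cell_label(reads, bead_barcodes):
    """Collect bead barcode reads and target reads that share a cell label.

    reads: iterable of (cell_label, insert_sequence) tuples.
    bead_barcodes: set of known double indexed bead barcode sequences.
    """
    cells = defaultdict(lambda: {"bead_barcodes": set(), "target_reads": []})
    for cell_label, insert in reads:
        if insert in bead_barcodes:
            cells[cell_label]["bead_barcodes"].add(insert)
        else:
            cells[cell_label]["target_reads"].append(insert)
    return dict(cells)


# Hypothetical identifiers standing in for real sequences.
known_bead_barcodes = {"BEAD01", "BEAD02", "BEAD03"}
reads = [
    ("CL1", "BEAD01"),
    ("CL1", "mRNA_x"),
    ("CL1", "BEAD03"),
    ("CL2", "BEAD02"),
    ("CL2", "mRNA_y"),
]
grouped = group_by_cell_label(reads, known_bead_barcodes)
```

Each entry of `grouped` then holds, for one cell label, the bead barcodes and the target reads that can be attributed to the same cell.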

The resulting sequencing data comprising the target nucleic acid and the sequenced fragment of the double-indexed bead oligonucleotide barcode nucleic acid can then be matched, i.e., paired or correlated, with the cell count data. As previously described, the cell count data for labeled cells comprises the fluorescent characteristics of those cells, wherein the characteristics are provided by one or more double indexed beads that bind to those cells. When cell count data is obtained by flow cytometry analysis, a series of fluorescent signals, or a collection thereof, of one or more double indexed beads bound to the cells is obtained, wherein the series may be referred to as cell-specific fluorescent features. Different cells in a given workflow have individually unique cell-specific fluorescent characteristics. The given fluorescent signal provided by the double indexed beads constituting such a cell-specific fluorescent label can be assigned to a specific portion of the sequenced fragment, since the sequence of the double indexed bead oligonucleotide barcode that generated the fluorescent signal is known. Thus, each cell-specific fluorescent characteristic obtained from a given labeled cell can be used to determine a different double-indexed bead oligonucleotide barcode sequence bound to that cell. Since the sequence of the double indexed bead oligonucleotide barcode is present in the sequencing fragment of the nucleotide barcode, a given cell-specific fluorescent characteristic can be determined to be associated with a given sequencing dataset. Once a cell-specific fluorescence signature is associated with a given sequencing dataset, it can be determined that the sequencing data originated from labeled cells within the same partition from which the fluorescence signature was acquired. In other words, the fluorescence characteristics of a given cell can be obtained from a series of fluorescence signals obtained from that cell in a cytometry analysis. 
Because a given fluorescent feature can be matched to the sequence reads of the oligonucleotide barcodes of the double indexed beads that produced that feature, the matched sequence reads from the double indexed beads can then be used to identify all sequencing data acquired from a given partition and from the cell within that partition. After the sequencing data is assigned to a given partition, the sequencing data can be readily correlated with the cell count data of the cell within that partition. Thus, correlated cell count data and sequencing data for single cells of a cell sample can be obtained.
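A minimal Python sketch of this matching step is given below. It assumes that each cytometry event has already been decoded into a set of fluorescent barcode identifiers, that each partition (cell label) has a set of bead oligonucleotide barcode identifiers recovered from sequencing, and that a lookup table from fluorescent barcode to oligonucleotide barcode is known for the bead population. All names (`correlate`, `fluor_to_oligo`, the identifiers) are hypothetical. Events or partitions whose barcode sets are not unique are left unpaired in this sketch.

```python
from collections import defaultdict


def correlate(events, partitions, fluor_to_oligo):
    """Pair cytometry events with sequencing partitions via bead barcode sets.

    events: dict of event id -> set of fluorescent barcode ids.
    partitions: dict of cell label -> set of oligonucleotide barcode ids.
    fluor_to_oligo: dict of fluorescent barcode id -> oligonucleotide barcode id.
    """
    events_by_key = defaultdict(list)
    for event_id, fluor_ids in events.items():
        key = frozenset(fluor_to_oligo[f] for f in fluor_ids)
        events_by_key[key].append(event_id)

    partitions_by_key = defaultdict(list)
    for cell_label, oligo_ids in partitions.items():
        partitions_by_key[frozenset(oligo_ids)].append(cell_label)

    pairs = {}
    for key, event_ids in events_by_key.items():
        cell_labels = partitions_by_key.get(key, [])
        # Pair only when the barcode set is unique on both sides.
        if len(event_ids) == 1 and len(cell_labels) == 1:
            pairs[event_ids[0]] = cell_labels[0]
    return pairs


fluor_to_oligo = {"F1": "O1", "F2": "O2", "F3": "O3"}
events = {"event_1": {"F1", "F3"}, "event_2": {"F2"}, "event_3": {"F2", "F3"}}
partitions = {"CL_A": {"O2"}, "CL_B": {"O1", "O3"}}
pairs = correlate(events, partitions, fluor_to_oligo)
```

In this example `event_3` remains unpaired because no partition carries the corresponding pair of bead barcodes.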

Kits

Aspects of the invention also include kits and compositions useful in practicing various embodiments of the methods of the invention. Kits of the invention can include a population of double indexed beads, beads comprising bead-bound nucleic acids, wherein the nucleic acids comprise, for example, a cell marker domain and a target binding region as described above, as desired, and/or other desired reagents. The double indexed bead population may comprise a variable number of unique fluorescent barcodes and oligonucleotide barcodes that differ from one another. Although the number of different double index beads for a given population may vary, in some cases the number is from 5 to 1000, for example from 10 to 500.
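To give a sense of how many distinct fluorescent barcodes a small set of fluorophores and signal levels can provide, the following Python sketch enumerates every combination in which each fluorophore is either absent (level 0) or present at one of several signal levels, excluding the all-absent case. Treating absence as an additional level, the function name `enumerate_fluorescent_barcodes`, and the dye names are assumptions made for illustration only.

```python
from itertools import product


def enumerate_fluorescent_barcodes(fluorophores, n_levels):
    """List every non-empty assignment of signal levels to the fluorophores.

    Level 0 means the fluorophore is absent; levels 1..n_levels are signal levels.
    """
    codes = []
    for levels in product(range(n_levels + 1), repeat=len(fluorophores)):
        if any(levels):  # skip the code in which no fluorophore is present
            codes.append(dict(zip(fluorophores, levels)))
    return codes


# Three hypothetical dyes at three signal levels: (3 + 1) ** 3 - 1 combinations.
codes = enumerate_fluorescent_barcodes(["dye_A", "dye_B", "dye_C"], 3)
```

Under these assumptions the count grows as (levels + 1) raised to the number of fluorophores, minus one, so a few fluorophores and levels are enough to cover populations of tens to hundreds of distinct beads.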

The kit may also include one or more additional components useful in practicing embodiments of the method. For example, the kit may comprise components for producing labeled cells, such as a large pore (macro-cell) plate, a liquid container such as a tube, and the like. In addition, the kit may include one or more components for obtaining sequence data, such as one or more primers, a polymerase (e.g., thermostable polymerase and reverse transcriptase, all having hot start properties, etc.), a double-strand specific DNase (dsDNAse), an exonuclease, dNTPs, a metal cofactor, one or more nuclease inhibitors (e.g., RNase inhibitor and/or DNase inhibitor), one or more molecular crowding agents (molecular crowding agent) (e.g., polyethylene glycol, etc.), one or more enzyme stabilizing components (e.g., DTT), a stimulus-responsive polymer, or any other desired kit component, such as, for example, a device, a solid support, a container, a cartridge, such as a tube, a bead, a plate, a microfluidic chip, etc., as described above. The components of the kit may be present in separate containers, or the components may be present in a single container.

In addition to the components described above, the subject kits may also include (in certain embodiments) instructions for practicing the subject methods. These instructions may be present in the subject kits in a variety of forms, one or more of which may be present in the kit. One form in which these instructions may be present is as printed information on a suitable medium or substrate, e.g., one or more sheets of paper on which the information is printed, in the packaging of the kit, in a package insert, and the like. Another form of such instructions is a computer-readable medium, such as a magnetic disk, compact disk (CD), portable flash drive, etc., having the information recorded thereon. Another form in which these instructions may be present is a website address, which may be used to access the information at a remote site over the internet.

Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to those of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.

Thus, the foregoing merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Furthermore, such equivalents are intended to include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. Furthermore, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims.

Thus, the scope of the invention is not intended to be limited to the exemplary embodiments shown and described herein. Rather, the scope and spirit of the invention are embodied by the appended claims. In the claims, 35 U.S.C. § 112(f) or 35 U.S.C. § 112(6) is expressly defined as being invoked for a limitation in a claim only when the exact phrase "means for" or the exact phrase "step for" is recited at the beginning of such limitation in the claim; if such exact phrase is not used in a limitation in the claim, then 35 U.S.C. § 112(f) or 35 U.S.C. § 112(6) is not invoked.

Claims

1. A method of preparing a population of labeled cells, the method comprising:

Mixing a cell sample comprising a plurality of cells with a labeling composition comprising a plurality of different double-indexed beads, each bead having a different fluorescent barcode and oligonucleotide barcode, under conditions sufficient to stably bind cells in the cell sample to one or more double-indexed beads to produce a labeled population of cells.

2. The method of claim 1, wherein the different fluorescent barcodes of the double indexed beads comprise unique combinations of one or more fluorophores at one or more signal levels.

3. The method of claim 2, wherein the one or more fluorophores are one to four fluorophores.

4. A method according to claim 2 or 3, wherein the one or more signal levels comprise one to five signal levels.

5. The method of any one of claims 2 to 4, wherein the one or more fluorophores comprise conjugated polymer dyes.

6. The method of any one of the preceding claims, wherein the different oligonucleotide barcodes of the double indexed beads are 10nt to 500nt in length.

7. The method of any one of the preceding claims, wherein the different oligonucleotide barcodes comprise in a 5 'to 3' direction:

A primer binding site;

A double indexed bead barcode domain; and

A capture domain.

8. The method of claim 7, wherein the different oligonucleotide barcodes have a common primer binding site and capture domain.

9. The method of any one of claims 7 to 8, wherein the capture domain comprises a polyA sequence.

10. The method of any one of the preceding claims, wherein the double indexed beads further comprise a cell binding member configured to provide stable binding to cells.

11. The method of claim 10, wherein the cell binding member comprises a specific binding member.

12. The method of claim 11, wherein the specific binding member comprises an antibody or binding fragment thereof.

13. The method of claim 11 or 12, wherein the specific binding member specifically binds a universal cellular marker.

14. The method of claim 13, wherein the universal cellular marker is a non-phenotypic marker.

15. The method of claim 14, wherein the universal cellular marker is selected from the group consisting of CD44, CD45, CD47 and beta-2 microglobulin.

16. The method of claim 10, wherein the cell binding member is covalently bound to a cellular component.

17. The method of any one of the preceding claims, wherein the double indexed beads comprise a polymeric material.

18. The method of any one of the preceding claims, wherein the double indexed beads have a diameter of 1 μm to 10 μm.

19. The method of any one of the preceding claims, wherein the labeled cells of the labeled population of cells comprise one to four stably-bound double indexed beads.

20. The method of any one of the preceding claims, wherein the labeling composition comprises an excess of different double indexed beads relative to the number of cells in the cell sample.

21. The method of any one of the preceding claims, wherein the cell sample comprises 50 to 50000000 cells.

22. The method of any one of the preceding claims, wherein the method further comprises labeling the cells with a phenotypic biomarker label.

23. The method of claim 22, wherein the phenotypic biomarker label comprises a fluorescently labeled specific binding member.

24. The method of any one of the preceding claims, further comprising assaying the labeled population of cells by flow cytometry to obtain flow cytometry data for the labeled population of cells.

25. The method of claim 24, wherein the flow cytometry data comprises imaging data.

26. The method of any one of the preceding claims, wherein the method further comprises obtaining sequencing data for the labeled population of cells.

27. The method of claim 26, wherein the sequencing data is obtained using a next generation sequencing (NGS) protocol.

28. The method of claim 27, wherein the next generation sequencing protocol comprises generating a sequencing-ready library from the labeled population of cells.

29. The method of claim 28, wherein the sequencing-ready library is generated using a barcoded bead/partition scheme.

30. The method of any one of claims 26 to 29, wherein the method further comprises correlating sequencing data of one or more assayed cells with flow cytometry data.

31. A population of different double indexed beads, wherein each double indexed bead has a different fluorescent barcode and oligonucleotide barcode.

32. The population of claim 31, wherein the different fluorescent barcodes of the double indexed beads comprise unique combinations of one or more fluorophores at one or more signal levels.

33. The population of claim 32, wherein the one or more fluorophores are one to four fluorophores.

34. The population of claim 32 or 33, wherein the one or more signal levels comprise one to five signal levels.

35. The population of any one of claims 32 to 34, wherein the one or more fluorophores comprise conjugated polymer dyes.

36. The population of any one of claims 31 to 35, wherein the different oligonucleotide barcodes of the double indexed beads are 10nt to 500nt in length.

37. The population of any one of claims 31 to 36, wherein the different oligonucleotide barcodes comprise, in a 5' to 3' direction:

a primer binding site;

a double indexed bead barcode domain; and

a capture domain.

38. The population of claim 37, wherein the different oligonucleotide barcodes have a common primer binding site and capture domain.

39. The population of claim 37 or 38, wherein the capture domain comprises a polyA sequence.

40. The population of any one of claims 31-39, wherein the double indexed beads further comprise a cell binding member configured to provide stable binding to cells.

41. The population of claim 40, wherein the cell binding members comprise specific binding members.

42. The population of claim 41, wherein the specific binding members comprise antibodies or binding fragments thereof.

43. The population of claim 41 or 42, wherein the specific binding member specifically binds a universal cellular marker.

44. The population of claim 43, wherein the universal cellular markers are non-phenotypic markers.

45. The population of claim 44, wherein the universal cellular markers are selected from the group consisting of CD44, CD45, CD47 and beta-2 microglobulin.

46. The population of claim 40, wherein the cell binding member is covalently bound to a cellular component.

47. The population of any of the preceding claims, wherein the double indexed beads comprise polymeric material.

48. The population of any one of the preceding claims, wherein the double indexed beads have a diameter of 1 μm to 10 μm.

49. A double indexed bead of the population of any of claims 31-48.

50. A kit comprising a population of double indexed beads according to any of claims 31 to 48.
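
Illustrative note (not part of the claims). The pairing of fluorescent barcodes with oligonucleotide barcodes recited above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the fluorophore names, the primer binding site sequence, the 8 nt barcode domain length, the 25 nt polyA capture domain, and the base-4 encoding are all hypothetical choices made for illustration. Signal levels are modeled as discrete non-zero intensity levels, so a fluorophore is either absent or present at one of the listed levels.

```python
from itertools import combinations, product

# Hypothetical sequences chosen only for illustration.
PRIMER_SITE = "ACGTACGTACGTACGTACGT"   # assumed common primer binding site
CAPTURE_DOMAIN = "A" * 25              # assumed polyA capture domain


def fluorescent_barcodes(fluorophores, levels):
    """Yield every combination of one or more fluorophores, each at one
    signal level, as a tuple of (fluorophore, level) pairs."""
    for k in range(1, len(fluorophores) + 1):
        for subset in combinations(fluorophores, k):
            for assigned in product(levels, repeat=k):
                yield tuple(zip(subset, assigned))


def index_to_domain(index, length=8):
    """Encode an integer as a fixed-length base-4 DNA string (assumed scheme)."""
    if not 0 <= index < 4 ** length:
        raise ValueError("index does not fit in the barcode domain")
    bases = "ACGT"
    digits = []
    for _ in range(length):
        index, remainder = divmod(index, 4)
        digits.append(bases[remainder])
    return "".join(reversed(digits))


def build_bead_table(fluorophores, levels):
    """Map each fluorescent barcode to one oligonucleotide barcode laid out
    5' to 3' as primer binding site, bead barcode domain, capture domain."""
    table = {}
    for i, fluor_code in enumerate(fluorescent_barcodes(fluorophores, levels)):
        table[fluor_code] = PRIMER_SITE + index_to_domain(i) + CAPTURE_DOMAIN
    return table


# Four hypothetical fluorophores at five signal levels each.
beads = build_bead_table(["F1", "F2", "F3", "F4"], [1, 2, 3, 4, 5])
print(len(beads))
```

Under these assumptions, four fluorophores at five levels give (1 + 5)^4 - 1 = 1295 distinct fluorescent barcodes, each paired with a distinct 53 nt oligonucleotide barcode that shares the primer binding site and capture domain with all the others.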