The present application claims priority from U.S. provisional patent application Ser. No. 63/448,930, filed on February 28, 2023, the disclosure of which is incorporated herein by reference in its entirety.
Current technology allows measurement of gene expression of single cells in a massively parallel manner (e.g., >10,000 cells) by attaching cell-specific oligonucleotide barcodes to poly(A) mRNA molecules from individual cells as each cell is co-localized with a barcoded reagent bead in a compartment. The BD Rhapsody™ Single-Cell Analysis System is one platform that supports such massively parallel measurement of single cell gene expression. It captures nucleic acids from single cells in high throughput using a simple cartridge workflow and a multitier barcoding system. The resulting captured material can be used to generate various types of next-generation sequencing (NGS) libraries, including whole transcriptome analysis libraries for discovery biology and targeted RNA analysis libraries for high-sensitivity transcript detection. See Shum et al., "Quantitation of mRNA Transcripts and Proteins Using the BD Rhapsody™ Single-Cell Analysis System," Adv Exp Med Biol. 2019;1129:63-79.
Gene expression can affect protein expression. Protein-protein interactions can affect gene expression and protein expression. Accordingly, systems and methods have recently been developed that can quantitatively analyze the protein expression of cells and simultaneously measure the protein expression and gene expression of cells. The BD AbSeq platform is one such platform. Abseq is a method for profiling proteins in and on single cells. In Abseq, the fluorescent labels of conventional antibodies are replaced with nucleic acid sequence tags that can be read out at the single cell level, for example, by barcoding and NGS. "Abseq is aimed at enabling sensitive, precise and comprehensive characterization of protein and mRNA transcripts in a large number of single cells. Like conventional immunostaining, cells are bound with antibodies directed against different target epitopes, but the antibodies are labeled with unique sequence tags. When an antibody binds to its target, the DNA tag is carried along with it, so that the presence of the target can be deduced from the presence of the tag. In this way, counting the tags provides an assessment of the different epitopes present in the cell, as detected by antibody binding." Shahi et al., "Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding," Sci Rep 7, 44447 (2017).
Flow cytometry is a technique used to detect and measure physical and/or chemical characteristics of a population of cells. In flow cytometry, a cell sample is suspended in a fluid and injected into a flow cytometer instrument. The sample is focused so that it flows, ideally one cell at a time, through a laser beam, where the light scattered is characteristic of the cells and their components. Cells are typically labeled with a fluorescent label (e.g., an antibody-fluorophore conjugate) so that light is absorbed and then emitted in a band of wavelengths. Tens of thousands of cells can be rapidly examined and the resulting data collected. Flow cytometry is routinely used in basic research, clinical practice, and clinical trials. Applications for flow cytometry include, but are not limited to, cell counting, cell sorting, determining cell characteristics and function, microorganism detection, biomarker detection, protein engineering detection, diagnosis of health disorders, genome size measurement, and the like. A flow cytometer is an instrument that provides quantifiable data from a sample. Other instruments that employ flow cytometry include cell sorters, which physically separate and thereby purify target cells based on their optical properties.
Detailed Description
Methods and compositions are provided for preparing labeled cell populations, e.g., which can be used in protocols for obtaining correlated single cell flow cytometry data and sequencing (e.g., multiomic) data. Aspects of the methods include mixing a cell sample composed of a plurality of cells with a labeling composition comprising a plurality of different double indexed beads, each bead having a unique fluorescent barcode and an oligonucleotide barcode, under conditions sufficient to stably bind the cells in the cell sample to one or more double indexed beads to produce a labeled population of cells. In embodiments, flow cytometry and sequencing workflows are performed on the population of labeled cells, and the obtained flow cytometry data and sequencing data may be correlated. Compositions for carrying out the methods are also provided.
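For illustration only, the correlation of flow cytometry data with sequencing data through a double indexed bead can be sketched as a simple table lookup. The sketch below assumes, as a simplification, that each cell carries a single bead and that both the fluorescent barcode and the oligonucleotide barcode have already been decoded; the bead table, record fields, and all values are hypothetical and are not taken from the disclosure.

```python
# Hypothetical bead manifest: each double indexed bead pairs one fluorescent
# barcode (dye, brightness level) with one oligonucleotide barcode sequence.
BEADS = {
    "bead_1": {"fluor": (("A", "high"), ("B", "low")), "oligo": "GATTACAG"},
    "bead_2": {"fluor": (("B", "high"), ("C", "high")), "oligo": "CCTGAATC"},
}

# Hypothetical decoded flow cytometry events and sequencing cell records.
flow_events = [
    {"event_id": 1, "fluor": (("A", "high"), ("B", "low")), "fsc": 52000},
    {"event_id": 2, "fluor": (("B", "high"), ("C", "high")), "fsc": 48000},
    {"event_id": 3, "fluor": (("C", "low"),), "fsc": 30000},  # no known bead
]
seq_cells = [
    {"cell_id": "c1", "oligo": "CCTGAATC", "n_genes": 1450},
    {"cell_id": "c2", "oligo": "GATTACAG", "n_genes": 1210},
]


def correlate(flow_events, seq_cells, beads):
    """Pair flow events with sequenced cells that carry the same bead."""
    fluor_to_oligo = {b["fluor"]: b["oligo"] for b in beads.values()}
    cell_by_oligo = {c["oligo"]: c for c in seq_cells}
    pairs = []
    for event in flow_events:
        oligo = fluor_to_oligo.get(event["fluor"])  # None if barcode unknown
        cell = cell_by_oligo.get(oligo)
        if cell is not None:
            pairs.append((event["event_id"], cell["cell_id"]))
    return pairs


pairs = correlate(flow_events, seq_cells, BEADS)
print(pairs)  # [(1, 'c2'), (2, 'c1')]
```

A cell bound to more than one bead would need a composite key (for example, a sorted tuple of bead identifiers) in place of the single-bead lookup used here.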
Before the present invention is described in more detail, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limits of that range, and any other stated or intervening value in that stated range, is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
Certain ranges are presented herein with numerical values being preceded by the term "about." The term "about" is used herein to provide literal support for the exact number that it precedes, as well as a number that is near to or approximately the number that the term precedes. In determining whether a number is near to or approximately a specifically recited number, the near or approximating unrecited number may be a number which, in the context in which it is presented, provides the substantial equivalent of the specifically recited number.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention, representative illustrative methods and materials are described below.
All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and were set forth herein by reference to disclose and describe the methods and/or materials in connection with which the publications were cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Furthermore, the dates of publication provided may be different from the actual publication dates, which may need to be independently confirmed.
It should be noted that, as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. It should also be noted that the claims may be drafted to exclude any optional element. Accordingly, this statement is intended to serve as antecedent basis for use of such exclusive terminology as "solely," "only," and the like in connection with the recitation of claim elements, or use of a "negative" limitation.
It will be apparent to those skilled in the art upon reading this disclosure that each of the individual embodiments described and illustrated herein has discrete components and features that can be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the invention. Any recited method may be carried out in the order of events recited or in any other order that is logically possible.
Although the system and method has been or will be described for the sake of grammatical fluidity with functional explanations, it is to be expressly understood that the claims, unless expressly formulated under 35 U.S.C. §112, are not to be construed as necessarily limited in any way by the construction of "means" or "steps" limitations, but are to be accorded the full scope of the meaning and equivalents of the definition provided by the claims under the judicial doctrine of equivalents, and in the case where the claims are expressly formulated under 35 U.S.C. §112 are to be accorded full statutory equivalents under 35 U.S.C. §112.
Methods
As outlined above, methods of preparing a population of labeled cells are provided. In some cases, the population of labeled cells is made up of a plurality of distinguishably labeled cells, wherein the distinguishably labeled cells have different or unique fluorescent signatures associated therewith. A "fluorescent signature" (i.e., fluorescent identifier) refers to a composite or aggregate spectrum made up of the fluorescence emission signals obtained from one or more fluorophores stably associated with a labeled cell, e.g., as provided by one or more double indexed beads (described in more detail below) associated with the cell. In the population of labeled cells produced by methods of embodiments of the invention, different cells of the plurality have distinguishable fluorophores, which make up their fluorescent identifiers, bound thereto, and thus provide different fluorescent signatures, for example, when assayed by a flow cytometry protocol. Thus, the different labeled cells in the cell population are distinguishable from each other by their unique fluorescent signatures.
Double indexed beads
As described above, the fluorescent signature of a given labeled cell produced by an embodiment of the invention is provided by one or more double indexed beads stably bound to the cell. Double indexed beads are particulate compositions that include one or more fluorophores, which make up the fluorescent barcode of the bead, as well as an oligonucleotide barcode. The double indexed beads that may be employed in embodiments of the invention may have any convenient shape and size. The beads are solid supports and may comprise any type of solid, porous, or hollow sphere, carrier, cylinder, or other similar configuration composed of plastic, ceramic, metal, or polymeric material (e.g., hydrogel), onto which nucleic acids may be immobilized (e.g., covalently or non-covalently) and into which one or more fluorophores may be incorporated. The solid support may comprise discrete particles that are spherical (e.g., microspheres) or that have non-spherical or irregular shapes, such as cubes, cuboids, cylinders, cones, ovals, discs, etc. The beads may be non-spherical. The bead size can vary as desired, and is typically chosen such that the double indexed beads are smaller than the cells, with the double indexed beads in some cases ranging in size from 0.5 μm to 20 μm, such as from 1 μm to 10 μm.
Fluorescent barcode
The fluorescent barcode of a given double indexed bead comprises one or more fluorescent dyes. When a given fluorescent barcode contains more than one fluorescent dye, the two or more fluorescent dyes together make up the fluorescent barcode of the double indexed bead. Thus, in embodiments of the invention, a given fluorescent barcode may consist of a single fluorescent dye or of two or more fluorescent dyes, e.g., 2 to 20, such as 2 to 10, including 2 to 5, such as 2 to 3, fluorescent dyes, which together make up the fluorescent barcode of the bead. Thus, the number of different fluorophores making up a given fluorescent barcode may vary, in some cases ranging from 1 to 10, such as from 1 to 5, including from 1 to 3. Any two distinguishable fluorescent barcodes may be distinguished from each other based on the type of fluorophore(s) they contain and/or the intensity of the signal they provide, i.e., based on the fluorescent signal (e.g., emission wavelength maximum) and/or its intensity, which in turn reflect the fluorescent dyes making up the barcode and/or the amounts thereof. For example, two distinguishable fluorescent barcodes may differ in that they are made up of different combinations of fluorophore types, e.g., one comprising fluorophore A, fluorophore B, and fluorophore C, and the other comprising fluorophore B, fluorophore C, and fluorophore D. Two distinguishable fluorescent barcodes may also differ in that they are made up of different amounts of the same fluorescent dyes, e.g., one being made up of fluorophore A, fluorophore B, and fluorophore C present in a first amount in a given double indexed bead, and the other being made up of the same fluorophores present in a second amount that differs from the first amount, where the difference between the two amounts is detectable, e.g., as a difference in signal brightness. Combinations of types and amounts of fluorophores can be employed to provide any desired number of unique fluorescent barcodes.
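As a minimal sketch of how combinations of fluorophore types and amounts multiply into a set of unique fluorescent barcodes, the following Python enumerates every barcode that can be built from a small dye panel. The dye names, the two brightness levels, and the cap of three dyes per barcode are hypothetical values chosen only for illustration.

```python
from itertools import combinations, product


def enumerate_fluorescent_barcodes(dyes, levels, max_dyes):
    """List every barcode as a tuple of (dye, level) pairs.

    A barcode uses between 1 and max_dyes distinct dyes, and each dye in
    the barcode is loaded at one of the given brightness levels.
    """
    barcodes = []
    for n_dyes in range(1, max_dyes + 1):
        for dye_subset in combinations(sorted(dyes), n_dyes):
            for level_choice in product(levels, repeat=n_dyes):
                barcodes.append(tuple(zip(dye_subset, level_choice)))
    return barcodes


# Hypothetical panel: four dye types, two distinguishable amounts of each.
dyes = ["A", "B", "C", "D"]
levels = ["low", "high"]
barcodes = enumerate_fluorescent_barcodes(dyes, levels, max_dyes=3)

print(len(barcodes))  # 64  (8 one-dye + 24 two-dye + 32 three-dye barcodes)
print(barcodes[0])    # (('A', 'low'),)
```

With d dye types and m distinguishable amounts per dye, barcodes of exactly k dyes number C(d, k) x m^k, so adding either dye types or brightness levels expands the barcode space quickly.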
As described above, the fluorescent barcode comprises one or more fluorophores. Thus, a double indexed bead may comprise a single type of fluorophore, or a given double indexed bead may comprise two or more different types of fluorophores. Examples of fluorophores include, but are not limited to: acridine and its derivatives such as acridine, acridine orange, acridine yellow, acridine red, and acridine isothiocyanate; 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS); 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5-disulfonic acid (Lucifer Yellow VS); N-(4-anilino-1-naphthyl)maleimide; anthranilamide; Brilliant Yellow; coumarin and its derivatives such as coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), and 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanine and its derivatives such as cyanosine, Cy3, Cy5, Cy5.5, and Cy7; 4',6-diamidino-2-phenylindole (DAPI); 5',5''-dibromopyrogallol-sulfonephthalein (Bromopyrogallol Red); 7-diethylamino-3-(4'-isothiocyanatophenyl)-4-methylcoumarin; diethylaminocoumarin; diethylenetriamine pentaacetic acid; 4,4'-diisothiocyanatodihydro-stilbene-2,2'-disulfonic acid; 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride); 4-(4'-dimethylaminophenylazo)benzoic acid (DABCYL); 4-dimethylaminophenylazophenyl-4'-isothiocyanate (DABITC); eosin and its derivatives such as eosin and eosin isothiocyanate; erythrosin and its derivatives such as erythrosin B and erythrosin isothiocyanate; ethidium bromide; fluorescein and its derivatives such as 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), fluorescein isothiocyanate (FITC), fluorescein chlorotriazinyl, naphthofluorescein, and QFITC (XRITC); fluorescamine; IR144; IR1446; Lissamine™; Lissamine rhodamine; Lucifer yellow; malachite green; 4-methylumbelliferone; o-cresolphthalein; nitrotyrosine; pararosaniline; Nile Red; Oregon Green; Phenol Red; B-phycoerythrin; o-phthaldialdehyde; pyrene and derivatives thereof such as pyrene, pyrene butyrate, and succinimidyl 1-pyrene butyrate; Reactive Red 4 (Cibacron™ Brilliant Red 3B-A); rhodamine and its derivatives such as 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), 4,7-dichlororhodamine lissamine, rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101, the sulfonyl chloride derivative of sulforhodamine 101 (Texas Red), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), tetramethyl rhodamine, and tetramethyl rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid and terbium chelate derivatives; xanthenes; Alexa Fluor dyes (e.g., Alexa Fluor 350, Alexa Fluor 430, Alexa Fluor 488, Alexa Fluor 546, Alexa Fluor 555, Alexa Fluor 568, Alexa Fluor 594, Alexa Fluor 633, Alexa Fluor 647, Alexa Fluor 660, Alexa Fluor 680, Alexa Fluor 700, Alexa Fluor 750); Pacific Blue; Pacific Orange; Cascade Blue; Cascade Yellow; quantum dot dyes (Quantum Dot Corporation); DyLight dyes from Pierce (Rockford, Ill.), including DyLight 800, DyLight 680, DyLight 649, DyLight 633, DyLight 549, DyLight 488, and DyLight 405; or combinations thereof. Other fluorophores known to those skilled in the art, or combinations thereof, may also be used, such as those available from Molecular Probes (Eugene, Oreg.) and Exciton (Dayton, Ohio).
In some cases, the fluorophore is a polymeric dye (e.g., a fluorescent polymeric dye). Fluorescent polymeric dyes that find use in the subject methods are varied. In some cases of the method, the polymeric dye comprises a conjugated polymer. Conjugated polymers (CPs) are characterized by a delocalized electronic structure that includes a backbone of alternating unsaturated bonds (e.g., double and/or triple bonds) and saturated bonds (e.g., single bonds), where π-electrons can move from one bond to another. As such, the conjugated backbone may impart an extended linear structure on the polymeric dye, with limited bond angles between repeat units of the polymer. By contrast, proteins and nucleic acids, although also polymeric, in some cases do not form extended rod-like structures but rather fold into higher-order three-dimensional shapes. In addition, CPs may form "rigid-rod" polymer backbones and experience a limited twist (e.g., torsion) angle between monomer repeat units along the polymer backbone. In some cases, the polymeric dye comprises a CP having a rigid-rod structure. The structural characteristics of a polymeric dye can affect the fluorescent properties of the molecule.
Any convenient polymeric dye may be used in the subject devices and methods. In some cases, the polymeric dye is a multichromophore having a structure capable of harvesting light to amplify the fluorescent output of a fluorophore. In some cases, the polymeric dye is capable of harvesting light and efficiently converting it to emitted light at a longer wavelength. In some cases, the polymeric dye has a light harvesting multichromophore system that can efficiently transfer energy to a nearby luminescent species (e.g., a "signaling chromophore"). Mechanisms for energy transfer include, for example, resonant energy transfer (e.g., Förster (or fluorescence) resonance energy transfer, FRET), quantum charge exchange (Dexter energy transfer), and the like. In some cases, these energy transfer mechanisms are relatively short range, i.e., close proximity of the light harvesting multichromophore system to the signaling chromophore provides for efficient energy transfer. Under conditions for efficient energy transfer, amplification of the emission from the signaling chromophore occurs when the number of individual chromophores in the light harvesting multichromophore system is large, i.e., the emission from the signaling chromophore is more intense when the incident light (the "excitation light") is at a wavelength that is absorbed by the light harvesting multichromophore system than when the signaling chromophore is directly excited by the pump light.
The multichromophore may be a conjugated polymer. Conjugated polymers (CPs) are characterized by a delocalized electronic structure and can be used as highly responsive optical reporters for chemical and biological targets. Because the effective conjugation length is substantially shorter than the length of the polymer chain, the backbone contains a large number of conjugated segments in close proximity. Thus, conjugated polymers are efficient at harvesting light and can provide optical amplification via Förster energy transfer.
Polymeric dyes of interest include, but are not limited to, those described in U.S. Pat. Nos. 7,270,956; 7,629,448; 8,158,444; 8,227,187; 8,455,613; 8,575,303; 8,802,450; 8,969,509; 9,139,869; 9,371,559; 9,547,008; 10,094,838; 10,302,648; 10,458,989; 10,641,775; and 10,962,546, the disclosures of which are incorporated herein by reference in their entirety, as well as Gaylord et al., J. Am. Chem. Soc., 2001, 123(26), pp 6417-6418; Feng et al., Chem. Soc. Rev., 2010, 39, 2411-2419; and Traina et al., J. Am. Chem. Soc., 2011, 133(32), pp 12600-12607, the disclosures of which are incorporated herein by reference in their entirety. Specific polymeric dyes that may be employed include, but are not limited to, BD Horizon Brilliant™ dyes, such as BD Horizon Brilliant™ Violet dyes (e.g., BV421, BV510, BV605, BV650, BV711, BV786), BD Horizon Brilliant™ Ultraviolet dyes (e.g., BUV395, BUV496, BUV737, BUV805), and BD Horizon Brilliant™ Blue dyes (e.g., BB515, BB550, BB790) (BD Biosciences, San Jose, CA). Any fluorescent dyes known to those skilled in the art, including but not limited to those described above, or yet to be discovered, may be used in the subject methods.
In some cases, each fluorophore making up a given barcode may be excited by a common light source, such as a common laser. In such cases, each of the multiple fluorophores making up a given barcode may have a common excitation wavelength range (e.g., their excitation wavelength ranges differ from each other by 50 nm or less, such as 25 nm or less, including 10 nm or less, e.g., 5 nm or less) while differing from each other in emission maximum. In such cases, each of the multiple fluorophores making up a given barcode may have a common excitation maximum while differing from each other in emission maximum.
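The common-excitation criterion described above can be sketched as a simple panel check. The sketch below assumes that each fluorophore is summarized by a single excitation maximum and a single emission maximum (a simplification of full spectra), uses the 50 nm excitation spread from the text as its default, and introduces a hypothetical minimum emission gap of 20 nm that is not specified in the disclosure. The dye names and wavelengths are invented for illustration.

```python
from itertools import combinations


def common_excitation_distinct_emission(fluorophores,
                                        max_ex_spread_nm=50.0,
                                        min_em_gap_nm=20.0):
    """Check that dyes share an excitation window but differ in emission.

    max_ex_spread_nm: largest allowed difference between excitation maxima.
    min_em_gap_nm: smallest allowed difference between any two emission
        maxima (a hypothetical resolvability threshold).
    """
    ex_maxima = [f["ex_max_nm"] for f in fluorophores]
    shares_excitation = max(ex_maxima) - min(ex_maxima) <= max_ex_spread_nm
    emission_resolved = all(
        abs(a["em_max_nm"] - b["em_max_nm"]) >= min_em_gap_nm
        for a, b in combinations(fluorophores, 2)
    )
    return shares_excitation and emission_resolved


# Hypothetical dyes, all excitable near 405 nm, with separated emission.
barcode_dyes = [
    {"name": "dye_1", "ex_max_nm": 405.0, "em_max_nm": 450.0},
    {"name": "dye_2", "ex_max_nm": 407.0, "em_max_nm": 520.0},
    {"name": "dye_3", "ex_max_nm": 410.0, "em_max_nm": 610.0},
]
# Replacing dye_3 with a dye excited near 488 nm breaks the shared window.
mixed_dyes = barcode_dyes[:2] + [
    {"name": "dye_4", "ex_max_nm": 488.0, "em_max_nm": 530.0},
]

print(common_excitation_distinct_emission(barcode_dyes))  # True
print(common_excitation_distinct_emission(mixed_dyes))    # False
```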
As described above, any two distinguishable fluorescent barcodes may be distinguished from each other based on the type of fluorophore(s) making up the barcode and/or the intensity of the signal it provides. Thus, any two different barcodes may be distinguished based on the fluorescent signal obtained from the barcode and/or its intensity. For example, two distinguishable fluorescent barcodes can differ in that they are made up of different combinations of fluorophore types, e.g., one comprising fluorophore A, fluorophore B, and fluorophore C, and the other comprising fluorophore B, fluorophore C, and fluorophore D. Two distinguishable fluorescent barcodes may also differ in that they are made up of different amounts of the same fluorescent dyes, e.g., one being made up of fluorophore A, fluorophore B, and fluorophore C present in a first amount in a given double indexed bead, and the other being made up of the same fluorophores present in a second amount that differs from the first amount, where the difference between the two amounts is detectable, e.g., as a difference in signal brightness. Different brightnesses can be readily provided by associating different amounts of fluorophore with the double indexed beads. Combinations of types and amounts of fluorophores can be employed to provide any desired number of unique fluorescent barcodes.
Oligonucleotide barcode
In addition to fluorescent barcodes, the double indexed beads employed in embodiments of the invention comprise oligonucleotide barcodes. Oligonucleotide barcodes may vary in length, in some cases ranging from 10 nt to 500 nt, such as 15 nt to 100 nt. The oligonucleotide barcode may be made up of ribonucleic acid or deoxyribonucleic acid, as desired. The oligonucleotide barcodes of embodiments of the invention may comprise a double indexed bead barcode domain, as well as other domains that find use in embodiments of the invention, where such domains may include capture sequences, primer binding sites, and the like.
The oligonucleotide barcode may comprise one or more of a double indexed bead barcode domain, a capture sequence, a primer binding site, and the like. The double indexed bead barcode domain is a unique identifier, i.e., a domain or region that can be used, e.g., by its sequence, to identify the double indexed bead to which it is bound. The unique identifier may be a nucleotide sequence of any suitable length, such as from about 4 nucleotides to about 200 nucleotides. In some embodiments, the unique identifier is a nucleotide sequence of 25 nucleotides to about 45 nucleotides in length. In some embodiments, the unique identifier has a length that is, is about, is less than, or is greater than 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, or 200 nucleotides, or a range between any two of the above values.
As described above, the oligonucleotide barcode may comprise a capture sequence, i.e., a domain or region that serves as a binding site for a target binding region, e.g., the target binding region of a bead-bound barcoded nucleic acid. The capture sequence may vary and may be specific, random, or semi-random, as desired. In some cases, the capture sequence is a sequence that hybridizes to the target binding region of the bead-bound nucleic acid, e.g., as described in more detail below. In some cases, the capture sequence is a poly(A) sequence configured to hybridize to an oligo(dT) target binding region, as described in more detail below. In such cases, the length of the poly(A) capture sequence can vary, in some cases ranging from 3 nt to 50 nt, e.g., 5 nt to 25 nt. When present, the capture sequence may be located at the 5' end of the oligonucleotide barcode.
The oligonucleotide barcode of the double indexed beads may comprise a primer binding site. When present, the primer binding site may be configured to bind a primer used, for example, in the preparation of a sequencing-ready nucleic acid. For example, the oligonucleotide barcode may comprise a universal primer sequence. A universal primer sequence refers to a nucleotide sequence that is universal or common to all of the oligonucleotide barcodes employed in a given workflow. In some cases, the primer binding site is, or is about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length, or a number or a range between any two of these values. The primer binding site length can vary and can be at least, or at most, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides. The length of the universal primer sequence may vary, and in some cases may be 5 nucleotides to 30 nucleotides. The primer binding site may be located at the 5' end of the oligonucleotide barcode.
As described in more detail below, the cell sample is mixed with a plurality of different double indexed beads in order to label the cells of the cell sample. Across the plurality of different double indexed beads, the oligonucleotide barcodes may share common domains. For example, the oligonucleotide barcodes of different double indexed beads may have a common capture domain, a common primer binding site, and the like. In such cases, the capture domain, primer binding site, and other common domains have the same sequence in each bead of the plurality, so that only the double indexed bead barcode domain differs from bead to bead.
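The shared-domain arrangement described above can be illustrated with a short sketch in which every oligonucleotide barcode carries the same primer binding site and the same poly(A) capture sequence, and only the bead barcode domain varies. The domain order used here (primer binding site, then bead barcode, then capture sequence), the sequences themselves, and the helper names are assumptions made only for illustration; the disclosure permits other arrangements.

```python
# Hypothetical common domains shared by every bead in the plurality.
PRIMER_BINDING_SITE = "ACGTACGTACGTACGTACGT"  # 20 nt, illustrative only
CAPTURE_SEQUENCE = "A" * 20                   # poly(A), within 3 nt to 50 nt

# Hypothetical bead-specific barcode domains (unique identifiers).
BEAD_BARCODES = {"bead_1": "GATTACAG", "bead_2": "CCTGAATC"}


def build_oligo_barcode(bead_barcode):
    """Concatenate common and bead-specific domains (assumed order)."""
    return PRIMER_BINDING_SITE + bead_barcode + CAPTURE_SEQUENCE


def extract_bead_barcode(oligo):
    """Strip the common domains and return the bead-specific domain."""
    if not (oligo.startswith(PRIMER_BINDING_SITE)
            and oligo.endswith(CAPTURE_SEQUENCE)):
        raise ValueError("oligo lacks the expected common domains")
    return oligo[len(PRIMER_BINDING_SITE):len(oligo) - len(CAPTURE_SEQUENCE)]


oligos = {bead: build_oligo_barcode(bc) for bead, bc in BEAD_BARCODES.items()}
for bead, oligo in oligos.items():
    print(bead, len(oligo), extract_bead_barcode(oligo))
# bead_1 48 GATTACAG
# bead_2 48 CCTGAATC
```

Because the common domains have fixed lengths, the bead barcode can be recovered by position alone, even when the barcode itself happens to end in adenines.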
FIG. 1 provides a schematic representation of a double indexed bead according to an embodiment of the invention. As shown, the double indexed bead 100 is a polymer bead having a diameter of 0.5 μm to 20 μm, for example 1 μm to 10 μm. The polymer bead includes three different fluorophores 102, 104, and 106, which together make up the fluorescent barcode of the bead. Oligonucleotide barcode 108 is also shown. As illustrated, a large number of different fluorescent barcodes can be obtained by varying the number and/or brightness of the fluorophores of the barcode.
Cell binding members
The double indexed beads of embodiments of the invention may include a cell binding member configured to provide stable binding to cells. The cell binding member is an entity that stably binds the double indexed bead to a cell such that the double indexed bead remains bound to the cell during cytometric analysis. In some cases, the cell binding member may be a specific binding member. The specific binding member may vary. The term "specific binding" refers to direct association between two molecules due to, for example, covalent, electrostatic, hydrophobic, ionic, and/or hydrogen-bond interactions, including interactions such as salt bridges and water bridges. A specific binding member is a member of a pair of molecules that have binding specificity for one another. The members of a specific binding pair may be naturally derived or wholly or partially synthetically produced. One member of the pair has an area or cavity on its surface that specifically binds to, and is therefore complementary to, a particular spatial and polar organization of the other member of the pair. Thus, the members of the pair have the property of binding specifically to each other. Examples of specific binding pairs are antigen-antibody, biotin-avidin, hormone-hormone receptor, receptor-ligand, and enzyme-substrate. Specific binding members of a binding pair exhibit high affinity and binding specificity for binding with each other. Typically, the affinity between the members of a specific binding pair is characterized by a Kd (dissociation constant) of 10⁻⁶ M or less, e.g., 10⁻⁷ M or less, including 10⁻⁸ M or less, e.g., 10⁻⁹ M or less, 10⁻¹⁰ M or less, 10⁻¹¹ M or less, 10⁻¹² M or less, 10⁻¹³ M or less, 10⁻¹⁴ M or less, including 10⁻¹⁵ M or less. "Affinity" refers to the strength of binding, with increased binding affinity being correlated with a lower Kd.
In embodiments, affinity is determined by surface plasmon resonance (SPR), e.g., as used by Biacore systems. The affinity of one molecule for another is determined by measuring the binding kinetics of the interaction, e.g., at 25 ℃. Specific binding members may vary, with examples of specific binding members including, but not limited to, polypeptides, nucleic acids, carbohydrates, lipids, peptoids, and the like. In some cases, the specific binding member is proteinaceous. The term "proteinaceous" as used herein refers to a moiety composed of amino acid residues. A proteinaceous moiety may be a polypeptide. In some cases, the proteinaceous specific binding member is an antibody. In certain embodiments, the proteinaceous specific binding member is an antibody fragment, e.g., a binding fragment of an antibody that specifically binds to a polymeric dye. The terms "antibody" and "antibody molecule" are used interchangeably herein to refer to a protein consisting essentially of one or more polypeptides encoded by all or part of the recognized immunoglobulin genes. The recognized immunoglobulin genes, for example in humans, include the kappa (k), lambda (l), and heavy chain genetic loci, which together comprise the myriad variable region genes, and the constant region genes μ (u), δ (d), γ (g), ε (e), and α (a), which encode the IgM, IgD, IgG, IgE, and IgA isotypes, respectively. An immunoglobulin light or heavy chain variable region consists of a "framework" region (FR) interrupted by three hypervariable regions, also called "complementarity determining regions" or "CDRs."
The extents of the framework regions and CDRs have been precisely defined (see "Sequences of Proteins of Immunological Interest," E. Kabat et al., U.S. Department of Health and Human Services, (1991)). The amino acid sequence numbering of all antibodies discussed herein is in accordance with the Kabat system. The framework region sequences of different light or heavy chains are relatively conserved within a species. The framework region of an antibody, i.e., the combined framework regions of the constituent light and heavy chains, serves to position and align the CDRs. The CDRs are primarily responsible for binding to an epitope of an antigen. The term antibody is intended to include full length antibodies and may refer to natural antibodies, engineered antibodies from any organism, or recombinantly produced antibodies for experimental, therapeutic, or other purposes as further defined below. Antibody fragments of interest include, but are not limited to, Fab, Fab', F(ab')2, Fv, scFv, or other antigen-binding subsequences of antibodies, which are produced by modification of intact antibodies or synthesized de novo using recombinant DNA techniques. Antibodies can be monoclonal or polyclonal, and can have other specific activities on cells (e.g., antagonists, agonists, neutralizing antibodies, inhibitory antibodies, or stimulatory antibodies). It is understood that antibodies may have additional conservative amino acid substitutions that have substantially no effect on antigen binding or other antibody functions. In certain embodiments, the specific binding member is a Fab fragment, a F(ab')2 fragment, a scFv, a diabody, or a triabody. In certain embodiments, the specific binding member is an antibody. In some cases, the specific binding member is a murine antibody or binding fragment thereof. In certain instances, the specific binding member is a recombinant antibody or binding fragment thereof.
The specific binding member may specifically bind any convenient cellular marker. In some cases, the specific binding member binds to a cell surface marker, wherein the target cell surface marker includes, but is not limited to, a ubiquitous cell surface marker, i.e., a cell surface marker that is expected to be present on all cells of a given cell sample to be treated in a given workflow according to the invention. Examples of ubiquitous cell surface markers to which specific binding members/oligonucleotides can specifically bind include, but are not limited to, CD44, CD45, beta-2 microglobulin, and the like. Such cell surface markers may be considered non-phenotypic cell surface markers. FIG. 1 provides a schematic representation of a double indexed bead 100 comprising antibodies 110 that bind to ubiquitous cell surface markers (e.g., CD44, CD45, beta-2 microglobulin, etc.), and provide stable binding to cells for the double indexed bead.
In some embodiments, the cell binding member is a polymeric structure, such as a hydrogel nanovial, having a cavity sized to accommodate a single cell. The nanovials may include one or more double indexed beads stably bound thereto, as well as cell binding members, such as specific binding members for ubiquitous cell surface markers, as desired. Nanovials that may be modified as desired herein to include one or more double indexed beads are described in U.S. pending patent application publication nos. 2019321593 and 20210268465, and PCT application publication No. WO/2020/037214, the disclosures of which are incorporated herein by reference.
In some cases, the cell binding member may provide covalent binding to a cellular constituent. In such cases, the double indexed beads may comprise functionalization that provides covalent binding to cellular constituents. Such functionalization may include chemically reactive groups such as sulfhydryl groups, amino groups, carboxyl groups, and the like. In certain embodiments, the reactive groups are covalently attached to the double indexed beads, which covalent attachment may be direct attachment or attachment through a linker (e.g., a polymer linker, including polyethylene glycol or PEG). Examples of such functionalized double indexed beads include, but are not limited to, double indexed beads coupled with isothiocyanate groups, amino groups, haloacetyl groups, maleimides, succinimidyl esters, mercapto groups, aldehyde groups, hydrazides, and sulfonyl halides, all of which can be used to covalently attach the double indexed beads to a second molecule (e.g., a molecule within or on a cell as described herein).
Manufacture of double indexed beads
The double indexed beads can be manufactured using any convenient protocol. In some cases, the beads can be produced in separate batches, where the ligation (i.e., conjugation) chemistry allows for the addition of unique fluorescent barcodes and oligonucleotide barcodes. Alternatively, individual bead batches may be produced by sequential ligation reactions using the same chemistry type or orthogonal ligation chemistries. In embodiments, fluorescent barcoding can be performed with a single fluorophore or with a mixture of fluorophores to produce a fluorescent barcode, and oligonucleotide barcoding can be performed with a single oligonucleotide or with an oligonucleotide mixture to produce an oligonucleotide barcode.
Combining a cell sample with a plurality of double indexed beads
As outlined above, the method of embodiments of the present invention provides a plurality of distinguishably labeled cells, each having a different fluorescent signature associated therewith, wherein a given fluorescent signature is made up of the fluorescent barcodes provided by the one or more double indexed beads associated with the cell. Although the number of different double indexed beads bound to a given labeled cell may vary, in some cases the number is from 1 to 10, such as from 1 to 5, including from 1 to 4, such as from 1 to 3, for example from 1 to 2, including 1, 2, or 3. Each different double indexed bead has its own unique fluorescent barcode, and the collection of different fluorescent barcodes of the beads bound to the cell together provides the fluorescent signature associated with the cell.
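The combinatorial nature of such a signature can be illustrated with a short sketch. It assumes, for illustration only, that a signature is the unordered set of distinct bead barcodes bound to a cell, so that repeated binding of the same bead type does not create a new signature. The function names and the barcode labels are hypothetical and do not come from the disclosure.

```python
from itertools import combinations
from math import comb


def count_signatures(n_bead_types: int, max_beads_per_cell: int) -> int:
    """Number of distinct signatures when a cell carries 1..max distinct bead types.

    Assumption: a signature is an unordered set of bead barcodes.
    """
    return sum(comb(n_bead_types, k) for k in range(1, max_beads_per_cell + 1))


def enumerate_signatures(bead_barcodes, max_beads_per_cell):
    """Yield every possible signature as a frozenset of barcode labels."""
    ordered = sorted(bead_barcodes)
    for k in range(1, max_beads_per_cell + 1):
        for combo in combinations(ordered, k):
            yield frozenset(combo)


if __name__ == "__main__":
    # Ten bead types, up to three different beads per cell.
    print(count_signatures(10, 3))
    # Hypothetical barcode labels, up to two beads per cell.
    for signature in enumerate_signatures(["FB1", "FB2", "FB3"], 2):
        print(sorted(signature))
```

Under these assumptions, a modest number of bead types yields many more distinguishable signatures than bead types, which is the point of combining several beads per cell.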
In practicing the methods of the invention, a cell sample comprising a plurality of cells is provided. Although the number of cells in a given cell sample may vary, in some cases the number of cells is from 50 to 50,000,000, such as from 100 to 1,000,000 and including from 500 to 100,000. The cells present in a given cell sample may be any type of cell, including prokaryotic cells and eukaryotic cells. Suitable prokaryotic cells include, but are not limited to, bacteria such as Escherichia coli (E. coli), various Bacillus species, and extremophiles such as thermophiles. Suitable eukaryotic cells include, but are not limited to, fungi such as yeasts and filamentous fungi, including species of Aspergillus, Trichoderma, and Neurospora; plant cells, including cells of corn, sorghum, tobacco, canola, soybean, cotton, tomato, potato, alfalfa, sunflower, etc.; and animal cells, including cells of fish, birds, and mammals. Suitable fish cells include, but are not limited to, cells from species of salmon, trout, tilapia, tuna, carp, flatfish, halibut, swordfish, cod, and zebrafish. Suitable avian cells include, but are not limited to, cells of chickens, ducks, quail, pheasants, and turkeys, and of other jungle fowl or game birds. Suitable mammalian cells include, but are not limited to, cells from horses, cattle, buffalo, deer, sheep, rabbits, rodents such as mice, rats, hamsters and guinea pigs, goats, pigs, primates, and marine mammals including dolphins and whales, as well as cell lines such as human-derived cell lines of any tissue or stem cell type, stem cells including pluripotent and non-pluripotent stem cells, and non-human fertilized eggs. Suitable cells also include cell types implicated in a variety of disease states, even when in a non-disease state.
Thus, suitable eukaryotic cell types include, but are not limited to, tumor cells of all types (e.g., melanoma, myeloid leukemia, lung cancer, breast cancer, ovarian cancer, colon cancer, kidney cancer, prostate cancer, pancreatic cancer, and testicular cancer), cardiac muscle cells, dendritic cells, endothelial cells, epithelial cells, lymphocytes (T cells and B cells), mast cells, eosinophils, vascular intima cells, macrophages, natural killer cells, erythrocytes, hepatocytes, leukocytes including mononuclear leukocytes, stem cells such as hematopoietic stem cells, neural stem cells, skin stem cells, lung stem cells, kidney stem cells, liver stem cells, and muscle stem cells (for screening for differentiation and dedifferentiation factors), osteoclasts, chondrocytes and other connective tissue cells, keratinocytes, melanocytes, liver cells, kidney cells, and adipocytes. In certain embodiments, the cell is a primary disease state cell, such as a primary tumor cell. Suitable cells also include known research cells including, but not limited to, Jurkat T cells, NIH3T3 cells, CHO cells, COS cells, and the like. See the ATCC cell line catalog, which is expressly incorporated herein by reference in its entirety.
In certain embodiments, the cells used in the present invention are obtained from a subject. As used herein, "subject" refers to humans and other animals and organisms, such as laboratory animals. Thus, the methods and compositions described herein are suitable for human and veterinary applications. In certain embodiments, the subject is a mammal, including embodiments in which the subject is a human patient suffering from (or suspected of suffering from) a disease or pathological condition.
In certain embodiments, the cells to be analyzed are enriched prior to fluorescent barcoding. For example, if the target cells are white blood cells derived from a human subject, whole blood from the subject may be subjected to density gradient centrifugation to enrich for peripheral blood mononuclear cells (PBMCs or white blood cells). Cells may be enriched using any convenient method known in the art, including Fluorescence Activated Cell Sorting (FACS), Magnetic Activated Cell Sorting (MACS), density gradient centrifugation, and the like. Parameters for enriching a particular cell from a mixed population include, but are not limited to, physical parameters (e.g., size, shape, density, etc.), in vitro growth characteristics (e.g., response to a particular nutrient in cell culture), and molecular expression (e.g., cell surface protein or carbohydrate expression, reporter molecules such as green fluorescent protein, etc.).
In certain embodiments, the cells are living cells that remain viable during the assay. "Remaining viable" means that a specified percentage of the cells are viable at the end of the assay, including from about 20% to about 100% viability. In certain other embodiments, the methods of the invention are performed in a manner that renders the cells non-viable during the assay, e.g., the cells may be fixed, permeabilized, or maintained in a buffer or under conditions in which the cells are not viable. Such parameters are typically determined by the nature of the assay being performed and the reagents employed.
In some cases, the cells may be treated with, for example, a stimulus. The stimulus used to treat the cells may vary, and may include changes in culture conditions, exposure to changes in temperature, such as heat or cold, exposure to electromagnetic radiation, such as light, exposure to active agents, exposure to mechanical changes, and the like. Different cell samples of a plurality of cell samples may be treated with the same or different stimuli, as desired. Thus, in some cases, the method includes differentially treating two or more of the plurality of cell samples, e.g., contacting two or more of the different samples with different active agents or different concentrations of the same active agent, etc.
In practicing embodiments of the invention, the methods comprise mixing a cell sample comprising a plurality of cells with a labeling composition comprising a plurality of different double indexed beads, e.g., as described above, under conditions sufficient to stably bind cells in the cell sample to one or more double indexed beads, to produce a labeled population of cells. In embodiments, the cell sample and the double indexed beads are combined in a liquid (e.g., aqueous) composition. The combination may be achieved under any suitable conditions that provide stable binding of the double indexed beads to the cells of the cell sample. The double indexed beads can be contacted with the cells of the cell sample, for example, by introducing the double indexed beads into a container holding the cell sample, such as by manual dispensing or automated fluid dispensing. In embodiments, an excess of different beads relative to the number of cells in the cell sample is combined with the cell sample, wherein the extent of the excess may vary, in some cases ranging from 0.5% to 100% or more, such as from 1% to 50% or more, including from 5% to 25% or more. The sample and beads are combined in such a way that the double indexed beads become stably bound to the cells of the cell sample, resulting in labeled cells. If desired, the cells and beads may be combined by mixing and incubating for a period of time at a temperature suitable to provide stable binding of the beads to the cells. In some cases, the incubation time of the combined cells and beads is 15 to 120 minutes, e.g., 30 to 90 minutes (e.g., 60 minutes), and the temperature is 20 to 25 ℃, e.g., 20 to 22 ℃.
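As a worked illustration of the bead excess, the following sketch computes a bead count for a given cell count. It assumes one particular reading of the text, namely that an excess of x% means the number of beads equals the number of cells multiplied by (1 + x/100), rounded up. The function name `beads_needed` is hypothetical.

```python
import math
from fractions import Fraction


def beads_needed(n_cells: int, excess_percent) -> int:
    """Bead count for a given percent excess over the cell count.

    Assumption: beads = cells * (1 + excess/100), rounded up to a whole bead.
    Fractions are used so that values such as 0.5% are handled exactly.
    """
    if n_cells < 0 or Fraction(str(excess_percent)) < 0:
        raise ValueError("cell count and excess must be non-negative")
    total = n_cells * (1 + Fraction(str(excess_percent)) / 100)
    return math.ceil(total)


if __name__ == "__main__":
    for excess in (0.5, 5, 25, 100):
        print(excess, beads_needed(100_000, excess))
```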
Phenotypic markers
In certain embodiments of the invention, the method may comprise detecting one or more phenotypic characteristics of the cell. Detectable phenotypic characteristics include, but are not limited to, the presence of an analyte such as a cell surface or internal marker, physical characteristics (e.g., size, shape, granularity, etc.), cell number (or frequency), and the like. Virtually any detectable characteristic of interest can be measured as a phenotypic characteristic. In certain embodiments, the methods of the invention are directed to qualitatively or quantitatively detecting the presence of analytes, such as markers, that are associated with (e.g., within, on, or attached to) a cell being assayed. In some cases, the marker employed as a phenotypic marker is not a marker to which the cell binding member of a double indexed bead, e.g., as described above, binds.
In certain embodiments, the method comprises contacting the labeled cell sample with a detectable analyte-specific binding agent. An "analyte-specific binding agent" refers to any molecule, such as a nucleic acid, small organic molecule, protein, or nucleic acid binding dye (e.g., ethidium bromide), that is capable of binding to a particular analyte (or a particular isoform of an analyte) in a cell, but not to other substances. Target analytes include any molecule that is bound to or present within the cells analyzed in the subject methods. Thus, analytes of interest include, but are not limited to, proteins, carbohydrates, organelles, nucleic acids, infectious particles (e.g., viruses, bacteria, parasites), metabolites, and the like. In certain embodiments, the analyte-specific binding agent is a protein. In certain embodiments, the analyte-specific binding agent is an antibody or binding fragment thereof, e.g., as described above. Thus, the methods and compositions of the present invention can be used to detect any particular isoform of an element in a sample that is antigenically detectable and antigenically distinguishable from other isoforms of the element present in the sample.
In certain embodiments, a plurality of detectable analyte-specific binding agents are employed in a method according to the invention. By "plurality of analyte-specific binding agents" is meant that 2 or more analyte-specific binding agents are used, including 3 or more, 4 or more, 5 or more, etc. In certain embodiments, each different analyte-specific binding agent is labeled (directly or indirectly) with a distinguishable label (e.g., a fluorophore having an emission wavelength detectable in a different channel of a flow cytometer, with or without compensation). The plurality of analyte-specific binding agents may bind to the same analyte within or on the cell (e.g., two antibodies that bind different epitopes of the same protein), to different analytes within or on the cell, or in any combination (e.g., two agents that bind the same analyte and a third agent that binds a different analyte). The upper limit on the number of analyte-specific binding agents will depend primarily on the assay parameters and the detection capabilities of the detection system used.
FIG. 2 provides a graphical representation of a labeled cell generated using the double indexed beads shown in FIG. 1. As shown in FIG. 2, the labeled cell 200 includes three double indexed beads 202, 204, and 206, which are stably bound to the cell via specific binding members (not shown). Also shown are distinguishable fluorescently labeled antibodies 210, 212, 214, and 216, which specifically bind to different target cell surface phenotypic markers.
FIG. 3 provides a graphical representation of a labeled cell generated using double indexed beads that are bound to a single cell using a gel nanovial as a cell binding member. As shown in FIG. 3, the labeled cell 300 is associated with three double indexed beads and is labeled with distinguishable fluorescently labeled antibodies 310, 312, 314, and 316 that specifically bind to different target cell surface phenotypic markers. The labeled cell is present in nanovial 308, to which double indexed beads 302, 304, and 306 are stably bound.
Acquisition of flow cytometry data and/or microscopy data
After generating the labeled composition (e.g., as described above), the method may include flow cytometrically assaying the labeled composition. "Flow cytometrically assaying" refers to performing a flow cytometric assay on a composition, such as the labeled composition described above. Flow cytometric assays may include characterizing a sample, such as a sample comprising the labeled composition, with a flow cytometer system. Flow cytometrically assaying may include introducing the labeled composition into a flow cytometer. Flow cytometers typically include a sample reservoir for receiving a fluid sample, e.g., including the labeled composition, and a sheath reservoir containing a sheath fluid. The flow cytometer transports the particles in the fluid sample (including, for example, cells from the labeled composition) as a cell stream to a flow cell, while also directing the sheath fluid to the flow cell. To characterize the components of the flow stream, the flow stream is irradiated with light. Variations in the materials in the flow stream, such as morphology or the presence of fluorescent labels, can cause variations in the observed light, and these variations allow for characterization and separation. For example, particles, such as molecules, analyte-bound beads, or single cells, in a fluid suspension pass through a detection zone where the particles are exposed to excitation light, typically from one or more lasers, and the light scattering and fluorescence properties of the particles are measured. The particles or components thereof are typically labeled with fluorescent dyes to facilitate detection. By labeling different particles or components with spectrally distinct fluorescent dyes, multiple different particles or components can be detected simultaneously. In some embodiments, the analyzer includes a plurality of detectors, one for each scatter parameter to be measured, and one or more detectors for each different dye to be detected.
For example, some embodiments include a spectral configuration using more than one sensor or detector per dye. The data obtained includes the signal measured by each light scatter detector and the fluorescent emission. In certain embodiments, the flow cytometry assay can detect a signal indicative of the presence of the labeled secondary antibody in the sample. When a signal is detected, the sample may include one or more antibodies to an epitope of a coronavirus antigen.
As outlined above, the sample (e.g., in the flow stream of a flow cytometer) may be irradiated with light from a light source. In some embodiments, the light source is a broadband light source that emits light having a broad range of wavelengths, e.g., spanning 50 nm or more, such as 100 nm or more, such as 150 nm or more, such as 200 nm or more, such as 250 nm or more, such as 300 nm or more, such as 350 nm or more, such as 400 nm or more, and including spanning 500 nm or more. For example, one suitable broadband light source emits light having wavelengths of 200 nm to 1500 nm. Another example of a suitable broadband light source is a light source that emits light having wavelengths of 400 nm to 1000 nm. When the method includes irradiating with a broadband light source, broadband light sources of interest may include, but are not limited to, halogen lamps, deuterium arc lamps, xenon arc lamps, stabilized fiber-coupled broadband light sources, broadband LEDs with a continuous spectrum, superluminescent light emitting diodes, semiconductor light emitting diodes, broad spectrum LED white light sources, multi-LED integrated white light sources, and other broadband light sources or any combination thereof.
In other embodiments, the method comprises irradiating with a narrowband light source that emits light of a specific wavelength or a narrow range of wavelengths, e.g., a range of 50 nm or less, such as 40 nm or less, such as 30 nm or less, such as 25 nm or less, such as 20 nm or less, such as 15 nm or less, such as 10 nm or less, such as 5 nm or less, such as 2 nm or less, and including a light source that emits light of a specific wavelength (i.e., monochromatic light). When the method includes irradiating with a narrowband light source, narrowband light sources of interest may include, but are not limited to, a narrow wavelength LED, a laser diode, or a broadband light source coupled with one or more optical bandpass filters, a diffraction grating, a monochromator, or any combination thereof.
In certain embodiments, the method comprises irradiating the sample with one or more lasers. The type and number of lasers may vary depending on the sample and the light to be collected, and the laser may be a gas laser, such as a helium-neon laser, an argon laser, a krypton laser, a xenon laser, a nitrogen laser, a CO2 laser, a CO laser, an argon-fluorine (ArF) excimer laser, a krypton-fluorine (KrF) excimer laser, a xenon-chlorine (XeCl) excimer laser, or a xenon-fluorine (XeF) excimer laser, or a combination thereof. In other cases, the method includes irradiating the flow stream with a dye laser, such as a stilbene laser, a coumarin laser, or a rhodamine laser. In other cases, the method includes irradiating the flow stream with a metal vapor laser, such as a helium-cadmium (HeCd) laser, a helium-mercury (HeHg) laser, a helium-selenium (HeSe) laser, a helium-silver (HeAg) laser, a strontium laser, a neon-copper (NeCu) laser, a copper laser, or a gold laser, and combinations thereof. In other cases, the method includes irradiating the flow stream with a solid state laser, such as a ruby laser, a Nd:YAG laser, a NdCrYAG laser, an Er:YAG laser, a Nd:YLF laser, a Nd:YVO4 laser, a Nd:YCa4O(BO3)3 laser, a Nd:YCOB laser, a titanium sapphire laser, a thulium YAG laser, an ytterbium YAG laser, a Yb2O3 laser, or a cerium doped laser, and combinations thereof.
The sample may be irradiated with one or more of the above-described light sources, e.g., 2 or more light sources, such as 3 or more, such as 4 or more, such as 5 or more, and including 10 or more light sources. The light sources may comprise any combination of types of light sources. For example, in some embodiments, the method includes irradiating the sample in the flow stream with an array of lasers, such as an array having one or more gas lasers, one or more dye lasers, and one or more solid state lasers. Where desired, at least one laser is used for excitation of the fluorescent barcodes, and the other lasers are used for excitation of other fluorophores bound to the cells.
In some cases, the flow stream is irradiated with a plurality of beams of frequency-shifted light and cells in the flow stream are imaged by fluorescence imaging using radiofrequency-tagged emission (FIRE) to produce frequency-encoded images, such as those described in Diebold et al., Nature Photonics Vol. 7(10): 806-810 (2013), and in U.S. patent nos. 9,423,353, 9,784,661, 9,983,132, 10,006,852, 10,078,045, 10,036,699, 10,222,316, 10,288,546, 10,324,019, 10,408,758, 10,451,538, 10,620,111, and U.S. patent publications 2017/0133857, 2017/038826, 2017/0350803, 2018/0275042, 2019/0376895, and 2019/0376894, the disclosures of which are incorporated herein by reference. In this case, the flow cytometry data may comprise image data of particles, e.g., cells present in the sample (see, e.g., Schraivogel et al., Science Vol. 375(6578): 315-320 (2022)).
Aspects of the method include collecting fluorescent light with a fluorescence detector. In some cases, the fluorescence detector may be configured to detect fluorescent emissions from fluorescent molecules, such as labeled specific binding members (e.g., labeled antibodies that specifically bind a target marker) that are bound to particles in the flow cell. In certain embodiments, the method comprises detecting fluorescence from the sample with one or more fluorescence detectors, such as 2 or more, such as 3 or more, such as 4 or more, such as 5 or more, such as 6 or more, such as 7 or more, such as 8 or more, such as 9 or more, such as 10 or more, such as 15 or more, and including 25 or more fluorescence detectors. In an embodiment, each fluorescence detector is configured to generate a fluorescence data signal. Fluorescence from the sample may be detected by each fluorescence detector independently within one or more wavelength ranges from 200 nm to 1200 nm. In some cases, the method includes detecting fluorescence from the sample over a wavelength range such as 200 nm to 1200 nm, such as 300 nm to 1100 nm, such as 400 nm to 1000 nm, such as 500 nm to 900 nm, including 600 nm to 800 nm. In other cases, the method includes detecting fluorescence with each fluorescence detector at one or more specific wavelengths. For example, depending on the number of different fluorescence detectors in the light detection system of interest, fluorescence may be detected at one or more of 450 nm, 518 nm, 519 nm, 561 nm, 578 nm, 605 nm, 607 nm, 625 nm, 650 nm, 660 nm, 667 nm, 670 nm, 668 nm, 695 nm, 710 nm, 723 nm, 780 nm, 785 nm, 647 nm, 617 nm, and any combination thereof. In certain embodiments, the method comprises detecting wavelengths of light corresponding to the fluorescence peak wavelengths of certain fluorophores present in the sample.
In embodiments, fluorescence flow cytometer data is received from one or more fluorescence detectors (e.g., one or more detection channels), such as two or more, such as three or more, such as four or more, such as five or more, such as six or more, and including eight or more fluorescence detectors (e.g., eight or more than eight detection channels).
Light from the sample may be measured at one or more wavelengths, such as 5 or more than 5 different wavelengths, such as 10 or more than 10 different wavelengths, such as 25 or more than 25 different wavelengths, such as 50 or more than 50 different wavelengths, such as 100 or more than 100 different wavelengths, such as 200 or more than 200 different wavelengths, such as 300 or more than 300 different wavelengths, and including measuring collected light at 400 or more than 400 different wavelengths.
In certain embodiments, the method comprises spectrally resolving light from each fluorophore of the fluorophore-biomolecule reagent pairs in the sample. In some embodiments, the overlap between the spectra of the different fluorophores is determined and the contribution of each fluorophore to the overlapping fluorescence is calculated. In some embodiments, spectrally resolving the light from each fluorophore comprises calculating a spectral unmixing matrix for the fluorescence spectra of each of a plurality of fluorophores having overlapping fluorescence in the sample detected by the light detection system. In some cases, spectrally resolving the light from each fluorophore and calculating the spectral unmixing matrix for each fluorophore can be used to estimate the abundance of each fluorophore, for example to resolve the abundance of target cells in a sample.
In certain embodiments, the method includes spectrally resolving light detected by a plurality of photodetectors, as described, for example, in U.S. patent No. 11,009,400 and U.S. patent application publication nos. 20210247293 and 20210325292, the disclosures of which are incorporated herein by reference in their entirety. For example, spectrally resolving light detected by a plurality of photodetectors of the second set of photodetectors may include solving a spectral unmixing matrix using one or more of: 1) a weighted least squares algorithm; 2) a Sherman-Morrison iterative inverse updater; 3) an LU matrix decomposition, such as decomposing the matrix into the product of a lower triangular (L) matrix and an upper triangular (U) matrix; 4) a modified Cholesky decomposition; 5) a QR factorization; and 6) calculation of a weighted least squares solution by singular value decomposition. In certain embodiments, the method further comprises characterizing the spillover spreading of light detected by the plurality of photodetectors, as described, for example, in U.S. patent application publication No. 20210349004, the disclosure of which is incorporated herein by reference.
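One of the listed options, a weighted least squares solution, can be sketched as follows. This is a minimal illustration and not the method of the referenced patents: it assumes a linear mixing model in which the detector signal equals a matrix of reference spectra (detectors by fluorophores) multiplied by the fluorophore abundances, and it uses NumPy's `numpy.linalg.lstsq`, which solves the problem through a singular value decomposition. The function name `unmix_wls`, the reference spectra, and the weights are hypothetical.

```python
import numpy as np


def unmix_wls(reference_spectra, signal, weights=None):
    """Estimate fluorophore abundances by weighted least squares.

    reference_spectra: (n_detectors, n_fluorophores) matrix, one column per fluorophore.
    signal: (n_detectors,) measured signal for one event.
    weights: optional (n_detectors,) non-negative weights, e.g. inverse variances.
    """
    M = np.asarray(reference_spectra, dtype=float)
    y = np.asarray(signal, dtype=float)
    if weights is None:
        weights = np.ones(M.shape[0])
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))
    # Scaling rows by sqrt(w) turns the weighted problem into an ordinary one.
    abundances, *_ = np.linalg.lstsq(M * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return abundances


if __name__ == "__main__":
    # Hypothetical spectra of two overlapping fluorophores seen by four detectors.
    M = np.array([[0.80, 0.10],
                  [0.15, 0.60],
                  [0.05, 0.25],
                  [0.00, 0.05]])
    true_abundances = np.array([100.0, 40.0])
    measured = M @ true_abundances  # noiseless signal for illustration
    print(unmix_wls(M, measured, weights=[1.0, 2.0, 3.0, 4.0]))
```

With noiseless input the estimate reproduces the abundances used to build the signal; with real data the weights would typically reflect the noise in each detector.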
In some cases, the abundance of a fluorophore that is associated with (e.g., chemically bound (i.e., covalently or ionically bound) or physically bound to) a target particle is calculated from the spectrally resolved light from each fluorophore associated with the particle. For example, in one example, the relative abundance of each fluorophore bound to the target particle is calculated from the spectrally resolved light from each fluorophore. In another example, the absolute abundance of each fluorophore bound to the target particle is calculated from the spectrally resolved light from each fluorophore. In certain embodiments, particles may be identified or classified based on the relative abundance of each fluorophore determined to be bound to the particle. In these embodiments, the particle may be identified or classified by comparing the relative or absolute abundance of each fluorophore bound to the particle to a control sample of particles having known properties, or by performing a spectroscopic or other assay on a population of particles (e.g., a population of cells) having the calculated relative or absolute abundance of the bound fluorophores.
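The comparison against a control can be sketched as follows. The text does not specify how the comparison is made, so this example assumes a simple nearest-profile rule: each particle's absolute abundances are normalized to relative abundances and the particle is assigned to the control whose relative profile is closest in Euclidean distance. The function names and control labels are hypothetical.

```python
import math


def relative_abundance(abundances):
    """Convert absolute fluorophore abundances to fractions that sum to 1."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return [a / total for a in abundances]


def classify(abundances, control_profiles):
    """Return the label of the control whose relative profile is nearest.

    Assumption: nearest Euclidean distance between relative abundance vectors.
    control_profiles maps a label to abundances measured on known particles.
    """
    rel = relative_abundance(abundances)

    def distance(label):
        return math.dist(rel, relative_abundance(control_profiles[label]))

    return min(control_profiles, key=distance)


if __name__ == "__main__":
    controls = {"type_A": [90.0, 10.0], "type_B": [20.0, 80.0]}
    print(classify([45.0, 5.0], controls))
    print(classify([10.0, 30.0], controls))
```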
In certain embodiments, the method may include sorting one or more particles (e.g., cells) of the sample that are identified based on an estimated abundance of fluorophores bound to the particles. The term "sorting" is used herein in its conventional sense to refer to separating components of a sample (e.g., droplets containing cells, droplets containing non-cellular particles such as biological macromolecules) and, in some cases, delivering the separated components to one or more sample collection containers. For example, the method may comprise sorting 2 or more components of the sample, such as 3 or more components, such as 4 or more components, such as 5 or more components, such as 10 or more components, such as 15 or more components, and including sorting 25 or more components of the sample. In sorting particles identified based on the abundance of fluorophores bound to the particles, the method includes data acquisition, analysis, and recording, such as with a computer, wherein a plurality of data channels record data from each detector used to acquire the overlapping spectra of a plurality of fluorophore-biomolecule reagent pairs bound to the particles. In these embodiments, analyzing includes identifying the particle based on spectrally resolved light (e.g., by calculating a spectral unmixing matrix) from the plurality of fluorophore-biomolecule reagent pairs having overlapping spectra that are bound to the particle, and based on the estimated abundance of each fluorophore bound to the particle. The analysis may be communicated to a sorting system configured to generate a set of digitized parameters based on the particle classification. In some embodiments, methods for sorting sample components include sorting particles (e.g., cells in a biological sample), as described in U.S. patent nos.
3,960,449, 4,347,935, 4,667,830, 5,245,318, 5,464,581, 5,483,469, 5,602,039, 5,643,796, 5,700,692, 6,372,506, and 6,809,804, the disclosures of which are incorporated herein by reference. In some embodiments, the method includes sorting components of the sample with a particle sorting module, such as those described in U.S. patent nos. 9,551,643 and 10,324,019, U.S. patent publication No. 2017/0299493, and international patent publication No. WO/2017/040151, the disclosures of which are incorporated herein by reference. In certain embodiments, cells of a sample are sorted using a sort decision module having a plurality of sort decision units, such as those described in U.S. patent No. 11,085,868, the disclosure of which is incorporated herein by reference.
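By way of non-limiting illustration only, a sort decision based on the relative abundance of fluorophores bound to a particle may be sketched as follows. The `Gate` class, the `sort_decision` function, the container names, and the threshold values are hypothetical and are not taken from the sorting systems of the patents cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    container: str        # hypothetical collection container label
    fluorophore: str      # fluorophore whose relative abundance is gated
    min_fraction: float   # inclusive lower bound on relative abundance
    max_fraction: float = 1.0


def sort_decision(relative_abundance, gates):
    """Return the container of the first gate the particle satisfies.

    relative_abundance -- mapping of fluorophore name to fraction (0 to 1)
    Returns None when no gate matches (the particle is not collected).
    """
    for gate in gates:
        fraction = relative_abundance.get(gate.fluorophore, 0.0)
        if gate.min_fraction <= fraction <= gate.max_fraction:
            return gate.container
    return None
```

In this sketch the gates are evaluated in order, so a particle satisfying more than one gate is assigned to the first matching container; other tie-breaking rules could equally be used.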
Flow cytometry assays are well known in the art. See, e.g., Ormerod (ed.), Flow Cytometry: A Practical Approach, Oxford Univ. Press (1997); Jaroszeski et al. (eds.), Flow Cytometry Protocols, Methods in Molecular Biology No. 91, Humana Press (1997); Practical Flow Cytometry, 3rd ed., Wiley-Liss (1995); Virgo et al. (2012) Ann Clin Biochem. Jan;49(pt 1):17-28; Linden et al., Semin Thromb Hemost. 2004 Oct;30(5):502-11; Alison et al., J Pathol, 2010 Dec;222(4):335-344; and Herbig et al. (2007) Crit Rev Ther Drug Carrier Syst. 24(3):203-255; the disclosures of which are incorporated herein by reference. In certain aspects, flow cytometric analysis includes the use of a flow cytometer capable of simultaneously exciting and detecting multiple fluorophores, e.g., a BD Biosciences FACSCanto™ flow cytometer, used substantially in accordance with the manufacturer's instructions.
Suitable flow cytometry systems may include, but are not limited to, those described in Ormerod (ed.), Flow Cytometry: A Practical Approach, Oxford Univ. Press (1997); Jaroszeski et al. (eds.), Flow Cytometry Protocols, Methods in Molecular Biology No. 91, Humana Press (1997); Practical Flow Cytometry, 3rd ed., Wiley-Liss (1995); Virgo et al. (2012) Ann Clin Biochem. Jan;49(pt 1):17-28; Linden et al., Semin Thromb Hemost. 2004 Oct;30(5):502-11; Alison et al., J Pathol, 2010 Dec;222(4):335-344; and Herbig et al. (2007) Crit Rev Ther Drug Carrier Syst. 24(3):203-255; the disclosures of which are incorporated herein by reference. In some cases, flow cytometry systems of interest include the BD Biosciences FACSCanto™ flow cytometer, BD Biosciences FACSCanto™ II flow cytometer, BD Accuri™ flow cytometer, BD Accuri™ C6 Plus flow cytometer, BD Biosciences FACSCelesta™ flow cytometer, BD Biosciences FACSLyric™ flow cytometer, BD Biosciences FACSVerse™ flow cytometer, BD Biosciences FACSymphony™ flow cytometer, BD Biosciences LSRFortessa™ flow cytometer, BD Biosciences LSRFortessa™ X-20 flow cytometer, BD Biosciences FACSPresto™ flow cytometer, BD Biosciences FACSVia™ flow cytometer, BD Biosciences FACSCalibur™ cell sorter, BD Biosciences FACSCount™ cell sorter, BD Biosciences FACSLyric™ cell sorter, BD Biosciences Via™ cell sorter, BD Biosciences Influx™ cell sorter, BD Biosciences Jazz™ cell sorter, BD Biosciences Aria™ cell sorter, BD Biosciences FACSAria™ II cell sorter, BD Biosciences FACSAria™ III cell sorter, BD Biosciences FACSAria™ Fusion cell sorter, BD Biosciences FACSMelody™ cell sorter, BD Biosciences FACSymphony™ S6 cell sorter, and the like.
In some embodiments, the subject systems are flow cytometry systems, such as those described in U.S. patent nos. 10,663,476, 10,620,111, 10,613,017, 10,605,713, 10,585,031, 10,578,542, 10,578,469, 10,481,074, 10,302,545, 10,145,793, 10,113,967, 10,006,852, 9,952,076, 9,933,341, 9,726,527, 9,453,789, 9,200,334, 9,097,640, 9,095,494, 9,092,034, 8,975,595, 8,753,573, 8,233,146, 8,140,300, 7,544,326, 7,201,875, 7,129,505, 6,821,740, 6,813,017, 6,809,804, 6,372,506, 5,700,692, 5,643,796, 5,627,040, 5,620,842, 5,602,039, 4,987,086, 4,498,766, the disclosures of which are incorporated herein by reference in their entirety.
In some embodiments, the subject systems are particle sorting systems configured to sort particles with a closed particle sorting module, such as those described in U.S. patent publication No. 2017/0299493, the disclosure of which is incorporated herein by reference. In certain embodiments, particles (e.g., cells) of a sample are sorted using a sort decision module having a plurality of sort decision units, such as those described in U.S. patent publication No. 2020/0256781, the disclosure of which is incorporated herein by reference. In some embodiments, the system includes a particle sorting module having a deflector plate, such as those described in U.S. patent publication No. 2017/0299493, filed on March 28, 2017, the disclosure of which is incorporated herein by reference.
In certain instances, the flow cytometry systems of the present invention are configured to image particles in a flow stream by fluorescence imaging using radiofrequency tagged emission (FIRE), such as those described in Diebold et al., Nature Photonics Vol. 7(10); 806-810 (2013), and in U.S. patent nos. 9,423,353, 9,784,661, 9,983,132, 10,006,852, 10,078,045, 10,036,699, 10,222,316, 10,288,546, 10,324,019, 10,408,758, 10,451,538, 10,620,111, and U.S. patent publication nos. 2017/0136857, 2017/038826, 2017/0350803, 2018/0275042, 2019/0376895, and 2019/0376894, the disclosures of which are incorporated herein by reference. FIG. 4 provides a schematic representation of obtaining flow cytometry data comprising images of labeled cells via a FIRE protocol, for example using a FACSDiscover flow cytometer such as described in Schraivogel et al., Science Vol. 375(6578); 315-320 (2022), according to embodiments of the present invention. As shown, image data may be obtained from fluorescent barcodes provided by fluorophores that have little impact on other detectors, such as barcodes provided by the Horizon™ conjugated polymer dyes BB515, BB550, and BB790 (BD Biosciences).
As described above, the method includes a cytometric analysis that may include sorting. The target cells identified in the sample may be sorted and subsequently analyzed by any convenient analytical technique. Subsequent analysis techniques of interest include, but are not limited to, sequencing; detection by CellSearch, as described in Food and Drug Administration (2004) Final rule. Fed Regist 69:26036-26038; detection by CTC Chip, as described in Nagrath et al. (2007) Nature 450:1235-1239; detection by MagSweeper, as described in Talasaz et al. (2009) Proc Natl Acad Sci USA 106:3970-3975; and detection by nanostructured substrates, as described in Wang S et al. (2011) Angew Chem Int Ed Engl 50:3084-3088, the disclosures of which are incorporated herein by reference. When desired, the sorting protocol may include distinguishing between living and dead cells, and any convenient staining protocol for identifying such cells may be incorporated into the method. Of interest are cytometry data obtained with the BD FACSDiscover™ S8 cell sorter and BD CellView™ imaging technology (BD Biosciences).
Analysis of the data obtained for the labeled samples of the invention involves analyzing the cells for detectable features of interest (e.g., as described in more detail above). Analysis of the detectable features may be performed at any convenient point in the data analysis stage, including before, during, or after deconvolution. The order of deconvolution and detectable-feature analysis is not intended to be limited, as the acquired data can be analyzed in any order and repeatedly. For target cells, the obtained data may include cell fluorescence characteristics (e.g., provided by double indexed beads bound to the cells, as described above), as well as other characteristics of the cells, including markers bound to the cells, cell images, and the like. The data may be provided in any convenient format, such as the Flow Cytometry Standard (FCS) file format.
Obtaining sequencing data of labeled single cells
As described above, after obtaining the cytometric data for the labeled population of cells, the methods of embodiments of the invention can include obtaining sequencing data for the labeled cells of the sample. The sequencing data can be obtained from all or a portion of the cells of the labeled sample, e.g., target cells of the labeled sample, such as sorted cells obtained in a cytometry step. Sequencing data may be obtained using any convenient protocol. In some cases, sequencing data may be obtained by a protocol that includes partitioning the labeled cells, then generating a sequenceable library of nucleic acids obtained from the partitioned cells, and reading the sequenceable library.
Partitioning the labeled cells
After producing labeled cells (i.e., cells labeled with one or more double indexed beads such as described above), an embodiment of the method includes partitioning the labeled cells to produce partitioned labeled single cells, wherein each cell has one or more double indexed beads bound thereto. In some cases, partitioning includes distributing labeled cells into partitions or compartments such that a compartment contains a single labeled cell. "Partitioning" refers to placing labeled cells in small reaction chambers, which may be fluid-partitioning structures defined by a solid material, such as microwells configured to contain labeled cells. In some embodiments of the disclosed methods, devices and systems, a plurality of microwells randomly distributed on a substrate is used. In some embodiments, the plurality of microwells are distributed on the substrate in an ordered pattern, such as an ordered array. In some embodiments, the plurality of microwells are distributed on the substrate in a random pattern, e.g., a random array. The microwells may be formed in various shapes and sizes. Suitable well geometries include, but are not limited to, cylindrical, elliptical, cubic, conical, hemispherical, rectangular, or polyhedral geometries, for example, three-dimensional geometries composed of several planes, such as a rectangular cuboid, hexagonal column, octagonal column, inverted triangular pyramid, inverted rectangular pyramid, inverted pentagonal pyramid, inverted hexagonal pyramid, or inverted truncated pyramid. In some embodiments, non-cylindrical microwells, such as wells with oval or square footprints, may provide advantages in being able to accommodate larger cells. In some embodiments, the upper and/or lower edges of the well walls may be rounded to avoid sharp corners, thereby reducing electrostatic forces generated at sharp edges or points due to concentration of the electrostatic field. 
Thus, the use of rounded corners may enhance the ability to recover beads from the microwells. The well size can be characterized in terms of absolute dimensions. In some cases, the average diameter of the microwells may be from about 5 μm to about 100 μm. In other embodiments, the average microwell diameter is at least 5 μm, at least 10 μm, at least 15 μm, at least 20 μm, at least 25 μm, at least 30 μm, at least 35 μm, at least 40 μm, at least 45 μm, at least 50 μm, at least 60 μm, at least 70 μm, at least 80 μm, at least 90 μm, or at least 100 μm. In yet other embodiments, the average microwell diameter is at most 100 μm, at most 90 μm, at most 80 μm, at most 70 μm, at most 60 μm, at most 50 μm, at most 45 μm, at most 40 μm, at most 35 μm, at most 30 μm, at most 25 μm, at most 20 μm, at most 15 μm, at most 10 μm, or at most 5 μm. The volume of the microwells used in the methods of the present invention may vary, in some cases from about 200 μm³ to about 800000 μm³. In some embodiments, the microwell volume is at least 200 μm³, at least 500 μm³, at least 1000 μm³, at least 10000 μm³, at least 25000 μm³, at least 50000 μm³, at least 100000 μm³, at least 200000 μm³, at least 300000 μm³, at least 400000 μm³, at least 500000 μm³, at least 600000 μm³, at least 700000 μm³, or at least 800000 μm³. In other embodiments, the microwell volume is up to 800000 μm³, up to 700000 μm³, up to 600000 μm³, up to 500000 μm³, up to 400000 μm³, up to 300000 μm³, up to 200000 μm³, up to 100000 μm³, up to 50000 μm³, up to 25000 μm³, up to 10000 μm³, up to 1000 μm³, up to 500 μm³, or up to 200 μm³. The number of microwells in a given device employed in embodiments of the invention may vary, in some cases being 100 or more, such as 250 or more, such as 500 or more, including 1000 or more, such as 5000 or more, such as 10000 or more, and in some cases being 15000 or fewer, such as 12500 or fewer. 
Microwells suitable for use in embodiments of the present invention are also described in PCT application Ser. No. PCT/US2016/014612, published as WO/2016/118915, the disclosure of which is incorporated herein by reference. As used herein, a substrate may refer to a solid support. For example, the substrate may include a plurality of microwells. For example, the substrate may be a well array comprising two or more microwells. In some embodiments, a microwell may comprise a small reaction chamber of defined volume. In some embodiments, a microwell may entrap one or more cells. In some embodiments, a microwell may entrap only one cell. In some embodiments, a microwell may entrap one or more solid supports. In some embodiments, a microwell may entrap only one solid support. In some embodiments, a microwell entraps a single cell and a single solid support (e.g., a bead). The number of wells, e.g., microwells, of a well plate, e.g., microwell array, that are used in a given dispensing step may vary, in some cases ranging from 5 to 500, e.g., from 5 to 100.
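By way of non-limiting illustration only, the relationship between the microwell diameters and volumes recited above may be checked for a cylindrical well as follows. The cylindrical geometry, the well depth, and the function names are assumptions made for this sketch; the disclosure does not specify a depth.

```python
import math


def cylinder_volume_um3(diameter_um, depth_um):
    """Volume of a cylindrical microwell in cubic micrometers."""
    radius_um = diameter_um / 2
    return math.pi * radius_um ** 2 * depth_um


def in_recited_volume_range(volume_um3, low=200.0, high=800_000.0):
    """True when the volume lies within about 200 to about 800000 cubic micrometers."""
    return low <= volume_um3 <= high
```

For example, a well with a 50 μm diameter and an assumed 50 μm depth has a volume of about 98175 μm³, which falls within the recited range, whereas a well with a 5 μm diameter and an assumed 5 μm depth has a volume of about 98 μm³, which does not.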
In partitioning the labeled cells, the labeled cells may be placed in the compartments (e.g., microwells of a microwell array) using any convenient protocol. The present disclosure provides methods for distributing labeled cells into partitions. For example, a collection of labeled cells may be introduced into structures such as microwells to partition the labeled cells. The labeled cells may be contacted with the partitions, for example, by gravity flow, wherein the labeled cells settle into the partitioning structures. In some cases, an aqueous composition of labeled cells is contacted with the microwell array, e.g., by flowing it across the microwell array, causing the labeled cells to deposit in the microwells. An aqueous composition comprising labeled cells may flow through a flow cell in fluid communication with the microwells. A protocol and system suitable for partitioning captured particles into microwells is described in PCT application serial No. PCT/US2016/014612 published as WO/2016/118915, the disclosure of which is incorporated herein by reference. To partition cells of a cell sample, any convenient protocol may be used, such as dispensing, e.g., pipetting, an aliquot of the cell sample into a compartment, flowing the sample across the surface of a well plate, etc.
In some embodiments, partitioning the plurality of labeled cells further comprises providing particles, e.g., beads, comprising particle-bound, e.g., bead-bound, nucleic acids into the compartments comprising single cells, wherein the bound nucleic acids are used to prepare a sequencing-ready nucleic acid composition, e.g., a sequencing-ready library, from the labeled cells. In some cases, a particle-bound, e.g., bead-bound, nucleic acid comprises a target binding region, e.g., a region that binds to a complementary sequence of a target nucleic acid species in a cell, as well as to a capture sequence of a double indexed bead. For example, when the target nucleic acid species is cellular mRNA and the double indexed bead oligonucleotide barcode comprises a poly(A) capture sequence, the bead-bound nucleic acid may comprise a poly(T) domain as the target binding region. In addition to the target binding region, the bound nucleic acid may comprise one or more additional domains, such as, but not limited to, a cell marker domain, a barcode domain, a molecular index domain (e.g., a Unique Molecular Identifier (UMI) domain), a universal primer binding domain, and the like. For further details on particles with bound nucleic acids that may be provided in the compartments, see U.S. patent application publication No. US2018/0088112, U.S. patent application publication No. US2018/0200710, U.S. patent application publication No. US2018/0346970, U.S. patent application publication No. US2019/0056415, U.S. patent application publication No. US 2020/0248563, U.S. patent application publication No. US2020/0299672, and U.S. patent application publication No. US2021/0171940, the disclosures of which are incorporated herein by reference. Beads with bound nucleic acids may be provided in the compartments using any convenient protocol, including but not limited to those described above for cell partitioning, and further described in PCT application serial No. 
PCT/US2016/014612, published as WO/2016/118915, the disclosure of which is incorporated herein by reference. The particles, e.g., beads, may be distributed into the compartments before, after, or in some cases together with the labeled cells, as desired.
Generation of a sequenceable library
As described above, partitioning of the labeled cells results in partitioned labeled cells that are spatially adjacent to a particle, e.g., bead, having bound cell marker domain nucleic acids comprising a target binding region as described above. When a cell marker domain nucleic acid is in close proximity to a target of the labeled single cell and/or to a double indexed bead oligonucleotide barcode, the target/oligonucleotide barcode may hybridize to the cell marker domain nucleic acid. The cell marker domain nucleic acids may be provided at a non-depleting ratio, such that each different target can bind to a different cell marker domain nucleic acid having its own unique UMI (if desired).
As described above, after partitioning the labeled cells, the labeled cells can be lysed to release the target molecules, such that the released target molecules, e.g., nucleic acids, can bind to the target binding region of the cell marker domain nucleic acid to produce captured nucleic acids. Cell lysis may be accomplished by any of a variety of methods, for example, by chemical or biochemical means, by osmotic shock, or by thermal, mechanical, or optical lysis. The cells can be lysed by adding a cell lysis buffer comprising a detergent (e.g., SDS, lithium dodecyl sulfate, Triton X-100, Tween-20, or NP-40), an organic solvent (e.g., methanol or acetone), or a digestive enzyme (e.g., proteinase K, pepsin, or trypsin), or any combination thereof. To increase binding of the target to the barcode, the diffusion rate of the target molecules may be altered by, for example, reducing the temperature and/or increasing the viscosity of the lysate. In some embodiments, the sample may be lysed using filter paper. The filter paper may be soaked with a lysis buffer on top of the filter paper. The filter paper may be applied to the sample with pressure, which may facilitate lysis of the sample and hybridization of the sample's targets to the substrate. In some embodiments, lysis may be performed by mechanical lysis, thermal lysis, optical lysis, and/or chemical lysis. Chemical lysis may include the use of digestive enzymes such as proteinase K, pepsin, and trypsin. Lysis may be performed by adding a lysis buffer to the substrate. The lysis buffer may comprise Tris HCl. The lysis buffer may comprise at least about 0.01M, 0.05M, 0.1M, 0.5M, or 1M or greater Tris HCl. The lysis buffer may comprise up to about 0.01M, 0.05M, 0.1M, 0.5M, or 1M or greater Tris HCl. The lysis buffer may comprise about 0.1M Tris HCl. The pH of the lysis buffer may be at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or above 10. 
The pH of the lysis buffer may be up to about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or above 10. In some embodiments, the lysis buffer has a pH of about 7.5. The lysis buffer may comprise a salt (e.g., LiCl). The concentration of salt in the lysis buffer may be at least about 0.1M, 0.5M, or 1M or above 1M. The concentration of salt in the lysis buffer may be up to about 0.1M, 0.5M, or 1M or above 1M. In some embodiments, the salt concentration in the lysis buffer is about 0.5M. The lysis buffer may comprise a detergent (e.g., SDS, lithium dodecyl sulfate, Triton X, Tween, NP-40). The concentration of detergent in the lysis buffer may be at least about 0.0001%, 0.0005%, 0.001%, 0.005%, 0.01%, 0.05%, 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, or 7% or greater than 7%. The concentration of detergent in the lysis buffer may be up to about 0.0001%, 0.0005%, 0.001%, 0.005%, 0.01%, 0.05%, 0.1%, 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, or 7% or more than 7%. In some embodiments, the detergent in the lysis buffer is lithium dodecyl sulfate at a concentration of about 1%. The time required for lysis may depend on the amount of detergent used. In some embodiments, the more detergent used, the less time is required for lysis. The lysis buffer may comprise a chelating agent (e.g., EDTA, EGTA). The concentration of chelating agent in the lysis buffer may be at least about 1mM, 5mM, 10mM, 15mM, 20mM, 25mM, or 30mM or greater than 30mM. The concentration of chelating agent in the lysis buffer may be up to about 1mM, 5mM, 10mM, 15mM, 20mM, 25mM, or 30mM or greater than 30mM. In some embodiments, the concentration of chelating agent in the lysis buffer is about 10mM. The lysis buffer may contain a reducing agent (e.g., beta-mercaptoethanol, DTT). The concentration of reducing agent in the lysis buffer may be at least about 1mM, 5mM, 10mM, 15mM, or 20mM or greater than 20mM. The concentration of reducing agent in the lysis buffer may be up to about 1mM, 5mM, 10mM, 15mM, or 20mM or greater than 20mM. 
In some embodiments, the concentration of reducing agent in the lysis buffer is about 5mM. In some embodiments, the lysis buffer may comprise about 0.1M Tris HCl at a pH of about 7.5, about 0.5M LiCl, about 1% lithium dodecyl sulfate, about 10mM EDTA, and about 5mM DTT. Lysis may be performed at a temperature of about 4 °C, 10 °C, 15 °C, 20 °C, 25 °C, or 30 °C. Lysis may be performed for about 1 minute, 5 minutes, 10 minutes, 15 minutes, or 20 minutes or longer than 20 minutes. A lysed cell may comprise at least about 100000, 200000, 300000, 400000, 500000, 600000, or 700000 or more than 700000 target nucleic acid molecules. A lysed cell may comprise up to about 100000, 200000, 300000, 400000, 500000, 600000, or 700000 or more than 700000 target nucleic acid molecules.
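By way of non-limiting illustration only, the example lysis buffer recited above (about 0.1M Tris HCl at pH 7.5, about 0.5M LiCl, about 1% lithium dodecyl sulfate, about 10mM EDTA, and about 5mM DTT) may be prepared from stock solutions using the dilution relationship C1V1 = C2V2. The stock concentrations and the `stock_volumes` function below are assumptions chosen for this sketch and are not taken from the disclosure.

```python
# component: (final concentration, assumed stock concentration);
# both values in each pair use the unit named in the key.
RECIPE = {
    "Tris HCl, pH 7.5 (M)": (0.1, 1.0),
    "LiCl (M)": (0.5, 8.0),
    "lithium dodecyl sulfate (%)": (1.0, 10.0),
    "EDTA (mM)": (10.0, 500.0),
    "DTT (mM)": (5.0, 1000.0),
}


def stock_volumes(final_volume_ml, recipe=RECIPE):
    """Return the volume (mL) of each stock, plus water, for the final volume."""
    volumes = {
        name: final_volume_ml * final / stock
        for name, (final, stock) in recipe.items()
    }
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested recipe")
    volumes["water"] = water
    return volumes
```

For example, under these assumed stock concentrations, 10 mL of buffer would use 1 mL of Tris HCl stock, 0.625 mL of LiCl stock, 1 mL of lithium dodecyl sulfate stock, 0.2 mL of EDTA stock, 0.05 mL of DTT stock, and 7.125 mL of water.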
After lysing the labeled cells and releasing the nucleic acid molecules therefrom, the nucleic acid molecules can randomly bind to the cell marker domain nucleic acids of a co-located solid support, such as a bead. Binding may involve hybridization of the target recognition region of the cell marker domain nucleic acid to a complementary portion of the target nucleic acid molecule (e.g., the oligo(dT) of the barcode may interact with the poly(A) tail of the target). The assay conditions (e.g., buffer pH, ionic strength, temperature, etc.) used for hybridization can be selected to promote the formation of specific, stable hybrids. In some embodiments, the nucleic acid molecules released from the lysed cells can bind to (e.g., hybridize to) a plurality of probes on a substrate. When the probes comprise oligo(dT), mRNA molecules can hybridize to the probes and be reverse transcribed. The oligo(dT) portion of the oligonucleotide may serve as a primer for first strand synthesis of the cDNA molecule, for example, when subjected to DNA synthesis reaction conditions, to produce a first strand cDNA domain comprising the capture nucleic acid. The cell marker domain nucleic acid can also hybridize to a complementary capture sequence, such as a poly(A) sequence, of a double indexed bead oligonucleotide barcode bound to the labeled cell. Thus, the cell marker domain nucleic acid can serve as a primer for reverse transcription using the double indexed bead oligonucleotide barcode as a template, for example, as described in more detail below.
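By way of non-limiting illustration only, oligo(dT) capture of a polyadenylated template followed by first strand synthesis may be modeled on sequence strings as follows. The function names, the default oligo(dT) length of 18, and the simplification that the primer anneals to the 3'-most bases of the poly(A) tail are assumptions made for this sketch.

```python
# RNA template base -> complementary DNA base in the first strand
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def poly_a_tail_length(rna):
    """Number of consecutive A bases at the 3' end of an RNA (5'->3') string."""
    return len(rna) - len(rna.rstrip("A"))


def first_strand_cdna(rna, bead_oligo, dt_length=18):
    """Return the bead-linked first strand cDNA (5'->3'), or None if not captured.

    bead_oligo -- the 5' portion of the bead-bound nucleic acid (e.g., the
                  cell marker and UMI domains), given as a DNA string
    """
    if poly_a_tail_length(rna) < dt_length:
        return None  # tail too short for the oligo(dT) region to hybridize
    # Simplification: the primer pairs with the last dt_length bases of the tail,
    # and extension copies the template upstream of the primed region.
    template = rna[: len(rna) - dt_length]
    extension = "".join(COMPLEMENT[base] for base in reversed(template))
    return bead_oligo + "T" * dt_length + extension
```

In this model, the returned string carries the bead-derived domains at its 5' end, so each cDNA retains the cell marker and UMI of the bead that captured it.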
Where desired, a given workflow may include a pooling step in which a product composition, e.g., consisting of captured nucleic acids, synthesized first strand cDNA, or synthesized double-stranded cDNA, is mixed or pooled with product compositions obtained from one or more additional samples, e.g., labeled cells. In some cases, the pooling step is performed immediately after the step of hybridizing the cell marker domain nucleic acid to the target nucleic acid, e.g., as described above. The number of different product compositions, produced from different samples, e.g., cells, that are mixed or pooled in such embodiments may vary, with numbers ranging from 2 to 1000000, e.g., from 3 to 200000, including from 4 to 100000, such as from 5 to 50000, in some cases from 100 to 10000, e.g., from 1000 to 5000. Either before or after mixing, one or more of the product compositions may be amplified, for example by polymerase chain reaction (PCR), as described in more detail below. Once the target-cell marker domain molecules are pooled, all subsequent processing can be performed in a single reaction vessel. Further processing may include, for example, reverse transcription reactions, amplification reactions, cleavage reactions, dissociation reactions, and/or nucleic acid extension reactions. Alternatively, further processing reactions can be performed within the microwells, i.e., without first pooling labeled target nucleic acid molecules from multiple cells.
The present disclosure provides methods of producing target-cell marker domain conjugates using any convenient protocol, such as reverse transcription or nucleotide extension. The target-cell marker domain conjugate may comprise the cell marker domain and a sequence complementary to all or part of the target nucleic acid. Reverse transcription of the bound RNA molecule can be performed by adding a reverse transcription primer and a reverse transcriptase. The reverse transcription primer may be an oligo(dT) primer, a random hexanucleotide primer, or a target-specific oligonucleotide primer. The oligo(dT) primer may be, or may be about, 12 to 18 nucleotides in length and binds to the endogenous poly(A) tail at the 3' end of mammalian mRNA. Random hexanucleotide primers can bind to mRNA at multiple complementary sites. Target-specific oligonucleotide primers typically selectively prime the target mRNA. Reverse transcription can be repeated to produce multiple cDNA molecules. The methods disclosed herein can comprise performing at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 reverse transcription reactions. The method may comprise performing at least about 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 reverse transcription reactions.
One or more nucleic acid amplification reactions can be performed to produce multiple copies of a target nucleic acid molecule. Amplification may be performed in a multiplex manner, wherein a plurality of target nucleic acid sequences are amplified simultaneously. The amplification reaction may be used to add sequencing adaptors to the nucleic acid molecules. The amplification reaction may include amplifying at least a portion of the sample label, if present. The amplification reaction may include amplifying at least a portion of a cell label and/or a barcode sequence (e.g., a molecular label). The amplification reaction can include amplifying at least a portion of a sample label, a cell label, a spatial label, a barcode sequence (e.g., a molecular label), a target nucleic acid, or a combination thereof. The amplification reaction may comprise amplifying 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, or 100% of the plurality of nucleic acids, or a range or number between any two of these values. The method can further comprise performing one or more cDNA synthesis reactions to produce one or more cDNA copies of a target-barcode molecule comprising a sample label, a cell label, a spatial label, and/or a barcode sequence (e.g., a molecular label).
In some embodiments, amplification may be performed using the polymerase chain reaction (PCR). As used herein, PCR may refer to a reaction for the in vitro amplification of specific DNA sequences by the simultaneous primer extension of complementary strands of DNA. As used herein, PCR may encompass derivative forms of the reaction including, but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplex PCR, digital PCR, and assembly PCR.
Amplification of nucleic acids may include non-PCR based methods. Examples of non-PCR based methods include, but are not limited to, multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, or circle-to-circle amplification. Other non-PCR based amplification methods include multiple cycles of DNA-dependent RNA polymerase-driven RNA transcription amplification or RNA-directed DNA synthesis and transcription to amplify DNA or RNA targets, the ligase chain reaction (LCR), the Qβ replicase (Qβ) method, amplification using palindromic probes, strand displacement amplification, oligonucleotide-driven amplification using a restriction endonuclease, amplification methods in which a primer is hybridized to a nucleic acid sequence and the resulting duplex is cleaved prior to the extension reaction and amplification, strand displacement amplification using a nucleic acid polymerase lacking 5' exonuclease activity, rolling circle amplification, and ramification extension amplification (RAM). In some embodiments, the amplification does not produce circularized transcripts.
In some embodiments, the methods disclosed herein further comprise performing a polymerase chain reaction on the nucleic acid (e.g., RNA, DNA, cDNA) to produce labeled amplicons (e.g., randomly labeled amplicons). The labeled amplicon may be a double stranded molecule. The double-stranded molecule may comprise a double-stranded RNA molecule, a double-stranded DNA molecule, or an RNA molecule that hybridizes to a DNA molecule. One or both strands of the double-stranded molecule may comprise a sample label, a spatial label, a cellular label, and/or a barcode sequence (e.g., a molecular label). The labeled amplicon may be a single stranded molecule. The single stranded molecule may comprise DNA, RNA, or a combination thereof. The nucleic acids of the present disclosure may include synthetic or altered nucleic acids. Thus, the method can include generating an amplicon composition from a first strand cDNA domain comprising a capture nucleic acid.
Amplification may include the use of one or more non-natural nucleotides. The non-natural nucleotides may include photolabile nucleotides or triggerable nucleotides. Examples of non-natural nucleotides may include, but are not limited to, peptide nucleic acids (PNAs), morpholino nucleic acids and locked nucleic acids (LNAs), as well as glycol nucleic acids (GNAs) and threose nucleic acids (TNAs). The non-natural nucleotides may be added to one or more cycles of the amplification reaction. The addition of non-natural nucleotides can be used to identify products at specific cycles or time points in the amplification reaction.
Performing one or more amplification reactions may include using one or more primers. One or more than one primer may comprise, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 or more than 15 nucleotides. One or more than one primer may comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 or more than 15 nucleotides. One or more than one primer may comprise fewer than 12 to 15 nucleotides. One or more than one primer may anneal to at least a portion of a plurality of labeled targets (e.g., randomly labeled targets). One or more than one primer may anneal to the 3' or 5' ends of a plurality of labeled targets. One or more than one primer may anneal to an interior region of a plurality of labeled targets. The interior region can be at least about 50, 100, 150, 200, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 650, 700, 750, 800, 850, 900, or 1000 nucleotides from the 3' end of the plurality of labeled targets. One or more than one primer may comprise a fixed panel of primers. One or more than one primer may comprise at least one or more custom primers. One or more than one primer may comprise at least one or more control primers. One or more than one primer may comprise at least one or more gene-specific primers.
One or more than one primer may comprise a universal primer. The universal primer can anneal to the universal primer binding site. One or more than one custom primer may anneal to a first sample label, a second sample label, a spatial label, a cellular label, a barcode sequence (e.g., a molecular label), a target, or any combination thereof. One or more primers may include universal primers and custom primers. Custom primers can be designed to amplify one or more targets. The target may comprise a subset of the total nucleic acids in one or more samples. The targets may comprise a subset of all labeled targets in one or more samples. One or more than one primer may comprise at least 96 or more than 96 custom primers. One or more than one primer may comprise at least 960 or more than 960 custom primers. One or more than one primer may comprise at least 9600 or more than 9600 custom primers. One or more than one custom primer may anneal to two or more than two different labeled nucleic acids. Two or more different labeled nucleic acids may correspond to one or more genes.
Any amplification protocol may be used in the methods of the present disclosure. For example, in one embodiment, the first round of PCR can amplify molecules attached to beads using gene-specific primers and a primer directed against the universal Illumina sequencing primer 1 sequence. The second round of PCR can amplify the first-round PCR products using nested gene-specific primers flanked by the Illumina sequencing primer 2 sequence and a primer directed against the universal Illumina sequencing primer 1 sequence. The third round of PCR adds P5 and P7 and a sample index, converting the PCR products into an Illumina sequencing library. Sequencing using 150 bp × 2 sequencing can reveal the cell label and barcode sequence (e.g., molecular label) on read 1, the gene on read 2, and the sample index on the index 1 read.
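The read layout described above (cell label and molecular label on read 1, gene sequence on read 2, sample index on the index 1 read) can be illustrated with a minimal Python sketch. The label lengths, the contiguous label layout, and the names `CELL_LABEL_LEN`, `MOLECULAR_LABEL_LEN`, `ParsedRead`, and `parse_read_triplet` are assumptions made for illustration only; they are not taken from the present disclosure, and an actual bead design may use different lengths and may interleave linker sequences.

```python
from dataclasses import dataclass

# Hypothetical label lengths; the real layout depends on the bead design.
CELL_LABEL_LEN = 9
MOLECULAR_LABEL_LEN = 8


@dataclass(frozen=True)
class ParsedRead:
    cell_label: str
    molecular_label: str
    gene_sequence: str
    sample_index: str


def parse_read_triplet(read1: str, read2: str, index1: str) -> ParsedRead:
    """Split read 1 into its labels; keep read 2 and the index read unchanged."""
    needed = CELL_LABEL_LEN + MOLECULAR_LABEL_LEN
    if len(read1) < needed:
        raise ValueError("read 1 is shorter than the assumed label layout")
    return ParsedRead(
        cell_label=read1[:CELL_LABEL_LEN],
        molecular_label=read1[CELL_LABEL_LEN:needed],
        gene_sequence=read2,
        sample_index=index1,
    )


# Toy input: 9-nt cell label, 8-nt molecular label, then trailing bases.
parsed = parse_read_triplet(
    "ACGTACGTA" + "TTGGCCAA" + "TTTT", "GATTACAGATTACA", "CGATGT"
)
```

Under these assumptions, `parsed.cell_label` holds the first nine bases of read 1 and `parsed.molecular_label` the following eight; any bases beyond the assumed layout are ignored.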
In some embodiments, chemical cleavage may be used to remove nucleic acids from a substrate. For example, chemical groups or modified bases present in the nucleic acid can be used to facilitate its removal from the solid support. For example, enzymes can be used to remove nucleic acids from a substrate. For example, nucleic acids may be excised from the substrate by restriction endonuclease digestion. For example, treatment of nucleic acids containing dUTP or ddUTP with uracil-DNA glycosylase (UDG) can be used to remove nucleic acids from a substrate. For example, nucleic acids can be excised from a substrate using an enzyme that performs nucleotide excision, such as a base excision repair enzyme, e.g., an apurinic/apyrimidinic (AP) endonuclease. In some embodiments, the nucleic acid may be removed from the substrate using a photocleavable group and light. In some embodiments, a cleavable linker may be used to remove nucleic acid from a substrate. For example, the cleavable linker may comprise at least one of biotin/avidin, biotin/streptavidin, biotin/neutravidin, Ig-protein A, a photolabile linker, an acid-labile or base-labile linker group, or an aptamer.
In some embodiments, amplification may be performed on a substrate, for example, using bridge amplification. The cDNA may be homopolymer tailed to generate compatible ends for bridge amplification using oligo(dT) probes on the substrate. In bridge amplification, the primer complementary to the 3' end of the template nucleic acid may be the first primer of each pair and is covalently attached to the solid particle. When a sample containing the template nucleic acid is contacted with the particle and subjected to a single thermal cycle, the template molecule may anneal to the first primer, and the first primer is extended in the forward direction by the addition of nucleotides to form a duplex molecule consisting of the template molecule and a newly formed DNA strand complementary to the template. In the heating step of the next cycle, the duplex molecule may be denatured, releasing the template molecule from the particle and leaving the complementary DNA strand attached to the particle through the first primer. In the annealing stage of the subsequent annealing and extension step, the complementary strand may hybridize to a second primer that is complementary to a segment of the complementary strand at a location remote from the first primer. This hybridization may cause the complementary strand to form a bridge between the first primer and the second primer, the bridge being secured to the first primer by a covalent bond and to the second primer by hybridization. In the extension stage, the second primer may be extended in the reverse direction by the addition of nucleotides in the same reaction mixture, thereby converting the bridge into a double-stranded bridge. The next cycle then begins, and the double-stranded bridge can be denatured to yield two single-stranded nucleic acid molecules, each having one end attached to the particle surface through the first primer or the second primer, respectively, and the other end unattached.
In the annealing and extension step of this second cycle, each strand may hybridize to a further complementary primer, previously unused, on the same particle to form a new single-strand bridge. Extension of the two previously unused primers, now hybridized, converts the two new bridges into double-stranded bridges. The amplification reaction may comprise amplifying at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97% or 100% of the plurality of nucleic acids.
Amplification of the labeled nucleic acid may include PCR-based methods or non-PCR-based methods. Amplification of the labeled nucleic acid may include exponential amplification of the labeled nucleic acid. Amplification of the labeled nucleic acid may include linear amplification of the labeled nucleic acid. Amplification may be performed by polymerase chain reaction (PCR). PCR may refer to a reaction for the in vitro amplification of a specific DNA sequence by the simultaneous primer extension of complementary strands of DNA. PCR may encompass derivative forms of the reaction including, but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplex PCR, digital PCR, suppression PCR, semi-suppressive PCR, and assembly PCR.
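The difference between exponential and linear amplification can be made concrete with a short numerical sketch. The idealized growth models below (every template copied with a fixed per-cycle efficiency for exponential amplification, and only the original templates copied for linear amplification) are textbook simplifications assumed for illustration; the function names and parameters are hypothetical and are not part of the present disclosure.

```python
def exponential_copies(start_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized exponential model: copies = start * (1 + efficiency) ** cycles."""
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return start_copies * (1.0 + efficiency) ** cycles


def linear_copies(start_copies: float, cycles: int, copies_per_cycle: float = 1.0) -> float:
    """Idealized linear model: only the original templates are copied each cycle."""
    if cycles < 0 or copies_per_cycle < 0:
        raise ValueError("cycles and copies_per_cycle must be >= 0")
    return start_copies * (1.0 + copies_per_cycle * cycles)


# Ten starting molecules after three cycles under each model.
exp_result = exponential_copies(10, 3)  # 10 * 2**3
lin_result = linear_copies(10, 3)       # 10 * (1 + 3)
```

With perfect efficiency, three cycles turn ten molecules into eighty under the exponential model and forty under the linear model; real reactions fall short of these ideals.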
In some embodiments, the amplification of the labeled nucleic acid comprises a non-PCR based method. Examples of non-PCR based methods include, but are not limited to, multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, or circle-to-circle amplification. Other non-PCR based amplification methods include multiple cycles of DNA-dependent RNA polymerase-driven RNA transcription amplification or RNA-directed DNA synthesis and transcription to amplify DNA or RNA targets, ligase chain reaction (LCR), Qβ replicase (Qβ) methods, use of palindromic probes, strand displacement amplification, oligonucleotide-driven amplification using a restriction endonuclease, amplification methods in which a primer is hybridized to a nucleic acid sequence and the resulting duplex is cleaved prior to the extension reaction and amplification, strand displacement amplification using a nucleic acid polymerase lacking 5' exonuclease activity, rolling circle amplification, and/or ramification extension amplification (RAM).
In some embodiments, the methods disclosed herein further comprise performing a nested polymerase chain reaction on the amplified amplicon (e.g., target). The amplicon may be a double stranded molecule. The double-stranded molecule may comprise a double-stranded RNA molecule, a double-stranded DNA molecule, or an RNA molecule that hybridizes to a DNA molecule. One or both strands of the double-stranded molecule may comprise a sample tag or molecular identifier tag. Alternatively, the amplicon may be a single stranded molecule. The single stranded molecule may comprise DNA, RNA, or a combination thereof. The nucleic acids of the invention may comprise synthetic or altered nucleic acids.
In some embodiments, the method comprises repeatedly amplifying the labeled nucleic acids to produce a plurality of amplicons. The methods disclosed herein can comprise performing at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amplification reactions. Alternatively, the method comprises performing at least about 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amplification reactions.
Amplification may further comprise adding one or more control nucleic acids to one or more samples comprising a plurality of nucleic acids. Amplification may further comprise adding one or more control nucleic acids to the plurality of nucleic acids. The control nucleic acid may comprise a control label.
Amplification may include the use of one or more non-natural nucleotides. The non-natural nucleotides may include photolabile nucleotides and/or triggerable nucleotides. Examples of non-natural nucleotides include, but are not limited to, peptide nucleic acids (PNAs), morpholino nucleic acids, locked nucleic acids (LNAs), glycol nucleic acids (GNAs), and threose nucleic acids (TNAs). The non-natural nucleotides may be added to one or more cycles of the amplification reaction. The addition of non-natural nucleotides can be used to identify products at specific cycles or time points in the amplification reaction.
Performing one or more amplification reactions may include using one or more primers. The one or more primers may comprise one or more than one oligonucleotide. One or more than one oligonucleotide may comprise at least about 7 to 9 nucleotides. One or more than one oligonucleotide may comprise fewer than 12 to 15 nucleotides. One or more than one primer may anneal to at least a portion of the plurality of labeled nucleic acids. One or more primers may anneal to the 3' and/or 5' ends of a plurality of labeled nucleic acids. One or more than one primer may anneal to an interior region of a plurality of labeled nucleic acids. The interior region can be at least about 50, 100, 150, 200, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 650, 700, 750, 800, 850, 900, or 1000 nucleotides from the 3' end of the plurality of labeled nucleic acids. One or more than one primer may comprise an immobilized primer set. The one or more primers may comprise at least one or more than one custom primer. The one or more than one primer may comprise at least one or more than one control primer. The one or more primers may comprise at least one or more housekeeping gene primers. One or more than one primer may comprise a universal primer. The universal primer can anneal to the universal primer binding site. One or more than one custom primer may anneal to a first sample tag, a second sample tag, a molecular identifier tag, a nucleic acid, or a product thereof. One or more primers may include universal primers and custom primers. Custom primers can be designed to amplify one or more than one target nucleic acid. The target nucleic acid may comprise a subset of the total nucleic acid in one or more samples. In some embodiments, the primer is a probe attached to an array of the invention.
In some embodiments, barcoding (e.g., randomly barcoding) the plurality of targets in the sample further comprises generating an indexed library of the barcoded targets (e.g., randomly barcoded targets) or of barcoded fragments of the targets. The barcode sequences of different barcodes (e.g., the molecular labels of different random barcodes) may be different from one another. Generating an indexed library of barcoded targets includes generating a plurality of indexed polynucleotides from the plurality of targets in the sample. For example, for an indexed library of barcoded targets comprising a first indexed target and a second indexed target, the label region of the first indexed polynucleotide may differ from the label region of the second indexed polynucleotide by, by about, by at least, or by at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, or 50 nucleotides, or by a number or range of nucleotides between any two of these values. In some embodiments, generating an indexed library of barcoded targets comprises contacting a plurality of targets, e.g., mRNA molecules, with a plurality of oligonucleotides comprising a poly(T) region and a label region, and performing first strand synthesis using a reverse transcriptase to produce single-stranded labeled cDNA molecules, each cDNA molecule comprising a cDNA region and a label region, wherein the plurality of targets comprises at least two mRNA molecules of different sequences, and the plurality of oligonucleotides comprises at least two oligonucleotides of different sequences. Generating an indexed library of barcoded targets may also include amplifying the single-stranded labeled cDNA molecules to produce double-stranded labeled cDNA molecules, and performing nested PCR on the double-stranded labeled cDNA molecules to produce labeled amplicons. In some embodiments, the method may include generating an adaptor-labeled amplicon.
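The statement that the label regions of two indexed polynucleotides may differ by a given number of nucleotides can be checked computationally. The following sketch, offered only as an illustration, counts mismatched positions (Hamming distance) between equal-length label regions and reports the smallest pairwise difference in a set. The function names and the example sequences are hypothetical, and the sketch assumes label regions of equal length with substitutions only.

```python
from itertools import combinations
from typing import Sequence


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length label regions differ."""
    if len(a) != len(b):
        raise ValueError("label regions must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(label_regions: Sequence[str]) -> int:
    """Smallest Hamming distance over all pairs of label regions."""
    if len(label_regions) < 2:
        raise ValueError("need at least two label regions")
    return min(hamming_distance(a, b) for a, b in combinations(label_regions, 2))


# Hypothetical label regions used only to exercise the functions.
example_labels = ["ACGTAC", "ACGTTT", "TTGTAC"]
closest = min_pairwise_distance(example_labels)
```

Here the closest pair of example label regions differs at two positions, so every pair in the set differs by at least two nucleotides.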
Barcoding (e.g., random barcoding) can include labeling individual nucleic acid (e.g., DNA or RNA) molecules with a nucleic acid barcode or tag. In some embodiments, barcoding involves adding a DNA barcode or tag to a cDNA molecule as it is generated from mRNA. Nested PCR can be performed to minimize PCR amplification bias. Adaptors may be added for sequencing using, for example, second generation sequencing (NGS). The sequencing results can be used to determine the cell labels, molecular labels, and nucleotide fragment sequences of one or more copies of the targets.
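One common use of the cell labels and molecular labels recovered by sequencing is to count distinct molecules per gene per cell, treating reads that share a cell label, gene, and molecular label as copies of the same original molecule. The sketch below illustrates this idea under simplifying assumptions: reads are already assigned to genes, labels contain no sequencing errors, and the names `count_molecules` and `example_reads` are hypothetical rather than part of the present disclosure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

ReadRecord = Tuple[str, str, str]  # (cell_label, molecular_label, gene)


def count_molecules(reads: Iterable[ReadRecord]) -> Dict[Tuple[str, str], int]:
    """Count distinct molecular labels for each (cell_label, gene) pair."""
    seen: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for cell_label, molecular_label, gene in reads:
        seen[(cell_label, gene)].add(molecular_label)
    return {key: len(labels) for key, labels in seen.items()}


# Toy reads: the first two are amplification copies of one molecule.
example_reads = [
    ("CELL1", "AAAA", "CD3E"),
    ("CELL1", "AAAA", "CD3E"),
    ("CELL1", "CCCC", "CD3E"),
    ("CELL2", "AAAA", "CD3E"),
    ("CELL1", "GGGG", "ACTB"),
]
molecule_counts = count_molecules(example_reads)
```

In this toy input, the duplicated read collapses into a single molecule, so CELL1 is assigned two CD3E molecules rather than three.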
Sequencing
In certain embodiments, the provided methods further comprise subjecting the prepared expression library, e.g., the amplicon compositions produced as described above, to a sequencing protocol, e.g., a second generation sequencing (NGS) protocol. The protocol may be performed on any suitable NGS sequencing platform. NGS sequencing platforms of interest include, but are not limited to, the sequencing platforms provided by Illumina (e.g., HiSeq™, MiSeq™ and/or NextSeq™ sequencing systems), Ion Torrent™ (e.g., Ion PGM™ and/or Ion Proton™ sequencing systems), Pacific Biosciences (e.g., PACBIO RS II Sequel sequencing systems), Life Technologies™ (e.g., SOLiD sequencing systems), Oxford Nanopore (e.g., MinION), Roche (e.g., 454 GS FLX+ and/or GS Junior sequencing systems), or any other sequencing platform of interest. NGS protocols will vary depending on the particular NGS sequencing system used. Detailed protocols for sequencing, which may include, for example, further amplification (e.g., solid phase amplification), sequencing of the amplicons, and analysis of the sequencing data, are available from the manufacturer of the NGS sequencing system used.
In some cases, the methods further include using oligonucleotide-labeled cell component binding reagents, e.g., in applications where it is desired to detect, e.g., quantify, one or more cellular components, e.g., surface proteins (e.g., by the BD AbSeq protocol). The oligonucleotide-labeled cell component binding reagent used in such embodiments comprises a cell component binding reagent, such as an antibody or binding fragment thereof, coupled to a cell component binding reagent-specific oligonucleotide comprising an identifier sequence for the cell component binding reagent to which the oligonucleotide is coupled. In such cases, the magnetic capture beads may comprise nucleic acids configured to capture, e.g., specifically bind, a domain of the cell component binding reagent-specific oligonucleotides. In this way, protein expression can be analyzed together with gene expression, for example where multi-omics analysis is desired, such as combined transcriptome and proteome analysis. In such cases, the method may include preparing the captured sample with the oligonucleotide-labeled cell component binding reagents, followed by capturing the cell component binding reagent-specific oligonucleotides released from the captured, partitioned cells. Further details regarding the use of oligonucleotide-labeled cell component binding reagents can be found in U.S. published patent applications No. US20180267036 and No. US20200248263, the disclosures of which are incorporated herein by reference.
Further details regarding methods of obtaining sequence data from single cells, e.g., as described above, are provided in U.S. patent application publication No. US2018/0088112, U.S. patent application publication No. 2018/0200710, U.S. patent application publication No. US2018/0346970, U.S. patent application publication No. 2019/0056415, U.S. patent application publication No. US 2020/0248563, U.S. patent application publication No. 2020/0299672, and U.S. patent application publication No. 2021/0171940, the disclosures of which are incorporated herein by reference.
Correlating cell count data with sequencing data
The sequencing protocol generates sequencing data for the labeled cells. The sequencing data can be easily correlated to the cell count data of the labeled cells such that the cell count data obtained from the same cells can be paired with the sequencing data. In other words, a given cell count data, e.g., an imaging dataset, and a given sequencing dataset may be correlated to originate from the same cell, e.g., as described in more detail below.
After obtaining cell count data and sequencing data, e.g., as described above, the cell count data obtained for the labeled cells is correlated with the sequencing data. Correlation refers to pairing cell count data with sequencing data as data from the same cell. Thus, cell count data obtained from a given labeled cell can be paired with the sequencing data from that cell. In other words, a given cell count dataset and a given sequencing dataset may be identified as being obtained from the same cell and subsequently paired or otherwise associated with each other. Thus, correlated cell count data and sequencing data for single cells of a cell sample can be obtained.
Cell count data is correlated with sequencing data by means of the fluorescent signatures and oligonucleotide barcodes provided by the double indexed beads bound to the labeled cells from which the flow cytometry data and sequencing data were obtained. In the sequencing data obtained, sequencing reads of the cell targets of the labeled cells and of the double indexed bead oligonucleotide barcodes are obtained, for example, as described above. In other words, for each labeled cell analyzed in a given workflow, the sequence of each double indexed bead oligonucleotide barcode bound to the cell and the sequence of the cell target nucleic acid, e.g., mRNA from the cell, are obtained. For each labeled cell, these sequences are acquired, for example, by a protocol as described above (which may be a second generation sequencing protocol) in which libraries are generated from the original sequences, with each member of a given library generated from the same partition sharing a common cell label. Thus, the sequencing reads from the cell target nucleic acid and the sequencing reads from the double indexed bead oligonucleotide barcode obtained from a cell all share a common cell label. In correlating cells with imaging data, all sequencing reads from the target nucleic acid and the double indexed bead oligonucleotide barcodes that have the same cell label domain, i.e., share a common cell label, can be paired or correlated. This pairing or association produces a collection of sequencing reads comprising the target nucleic acid and the double indexed bead oligonucleotide barcode nucleic acid, and these sequencing reads can be identified as being derived from the same cell.
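The pairing of target reads and bead barcode reads that share a common cell label can be sketched as a simple grouping step. The following Python example is illustrative only: the `Read` record, its `kind` values ("target" and "bead_barcode"), and the function `group_by_cell_label` are hypothetical constructs, and the sketch assumes each read has already been parsed into a cell label and a payload.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Read(NamedTuple):
    cell_label: str
    kind: str     # "target" or "bead_barcode" (hypothetical read categories)
    payload: str  # gene name or bead oligonucleotide barcode sequence


def group_by_cell_label(reads: Iterable[Read]) -> Dict[str, Dict[str, List[str]]]:
    """Collect target reads and bead barcode reads that share a cell label."""
    groups: Dict[str, Dict[str, List[str]]] = defaultdict(
        lambda: {"target": [], "bead_barcode": []}
    )
    for read in reads:
        if read.kind not in ("target", "bead_barcode"):
            raise ValueError(f"unknown read kind: {read.kind}")
        groups[read.cell_label][read.kind].append(read.payload)
    return dict(groups)


# Toy reads from two partitions, identified by cell labels CL1 and CL2.
example_reads_by_cell = group_by_cell_label([
    Read("CL1", "target", "CD3E"),
    Read("CL1", "bead_barcode", "BC01"),
    Read("CL2", "target", "MS4A1"),
    Read("CL1", "target", "ACTB"),
    Read("CL2", "bead_barcode", "BC07"),
])
```

Each entry of the result holds the target reads and the bead barcode reads for one cell label, which is the collection of reads identified above as deriving from the same cell.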
The resulting sequencing data, comprising the sequencing reads of the target nucleic acid and of the double indexed bead oligonucleotide barcode nucleic acid, can then be matched, i.e., paired or correlated, with the cell count data. As previously described, the cell count data for labeled cells comprises the fluorescent signatures of those cells, wherein the signatures are provided by one or more double indexed beads bound to those cells. When cell count data is obtained by flow cytometry analysis, a series or collection of fluorescent signals from the one or more double indexed beads bound to a cell is obtained, and this series may be referred to as a cell-specific fluorescent signature. Different cells in a given workflow have individually unique cell-specific fluorescent signatures. A given fluorescent signal provided by a double indexed bead contributing to such a cell-specific fluorescent signature can be assigned to a specific portion of the sequencing reads, since the oligonucleotide barcode sequence of the double indexed bead that generated the fluorescent signal is known. Thus, each cell-specific fluorescent signature obtained from a given labeled cell can be used to determine the different double indexed bead oligonucleotide barcode sequences bound to that cell. Since the sequences of the double indexed bead oligonucleotide barcodes are present in the sequencing reads of the oligonucleotide barcodes, a given cell-specific fluorescent signature can be determined to be associated with a given sequencing dataset. Once a cell-specific fluorescent signature is associated with a given sequencing dataset, it can be determined that the sequencing data originated from the labeled cell within the same partition from which the fluorescent signature was acquired. In other words, the fluorescent signature of a given cell can be obtained from the series of fluorescent signals obtained from that cell in a cytometric analysis.
Because a given fluorescent signature can be matched to the sequencing reads of the oligonucleotide barcodes of the double indexed beads that produced it, the matched reads from the double indexed beads can then be used to identify all sequencing data acquired from a given partition and the cell within that partition. After the sequencing data is assigned to a given partition, the sequencing data can be readily correlated with the cell count data of the cell within that partition. Thus, correlated cell count data and sequencing data for single cells of a cell sample can be obtained.
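The matching logic described in this section can be sketched as a lookup from bead oligonucleotide barcodes to fluorescent codes, followed by a comparison of signatures. The Python example below is a simplified illustration, not a definitive implementation: it assumes that a signature is the set of distinct fluorescent codes of the beads bound to a cell, that the barcode-to-fluorescence mapping is known in advance, and that only unambiguous matches are paired. All names (`correlate`, `barcode_to_fluor`, the event and cell label identifiers) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List


def correlate(
    cytometry_events: Dict[str, Iterable[str]],
    sequencing_groups: Dict[str, Iterable[str]],
    barcode_to_fluor: Dict[str, str],
) -> Dict[str, str]:
    """Pair cytometry events with cell labels that share a fluorescent signature.

    cytometry_events: event id -> fluorescent codes observed on the cell.
    sequencing_groups: cell label -> bead barcodes read with that cell label.
    Signatures shared by several events or several cell labels are skipped.
    """
    labels_by_sig: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for cell_label, barcodes in sequencing_groups.items():
        signature = frozenset(barcode_to_fluor[b] for b in barcodes)
        labels_by_sig[signature].append(cell_label)

    events_by_sig: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for event_id, fluor_codes in cytometry_events.items():
        events_by_sig[frozenset(fluor_codes)].append(event_id)

    pairs: Dict[str, str] = {}
    for signature, event_ids in events_by_sig.items():
        cell_labels = labels_by_sig.get(signature, [])
        if len(event_ids) == 1 and len(cell_labels) == 1:
            pairs[event_ids[0]] = cell_labels[0]
    return pairs


# Hypothetical mapping and toy data.
barcode_to_fluor = {"BC01": "F1", "BC02": "F2", "BC03": "F3"}
events = {"event_a": ["F1", "F2"], "event_b": ["F3"], "event_c": ["F2", "F3"]}
groups = {"CL_X": ["BC02", "BC01"], "CL_Y": ["BC03", "BC03"]}
paired = correlate(events, groups, barcode_to_fluor)
```

In the toy data, `event_a` pairs with `CL_X` and `event_b` with `CL_Y`, while `event_c` remains unpaired because no sequencing group carries its signature. Skipping ambiguous signatures is a design choice of this sketch; it reflects the premise above that different cells in a workflow have unique signatures.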
Kit for detecting a substance in a sample
Aspects of the invention also include kits and compositions useful in practicing various embodiments of the methods of the invention. Kits of the invention can include a population of double indexed beads, beads comprising bead-bound nucleic acids, wherein the nucleic acids comprise, for example, a cell marker domain and a target binding region as described above, as desired, and/or other desired reagents. The double indexed bead population may comprise a variable number of unique fluorescent barcodes and oligonucleotide barcodes that differ from one another. Although the number of different double index beads for a given population may vary, in some cases the number is from 5 to 1000, for example from 10 to 500.
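As a rough guide to how the number of different double indexed beads in a population relates to the number of distinguishable cells, the following sketch counts the distinct bead combinations available. It rests on assumptions that are not stated in the present disclosure: a signature is taken to be the set of distinct bead types bound to a cell, each cell binds between one and a fixed maximum number of distinct bead types, and signal intensity and bead copy number are ignored. The function name is hypothetical.

```python
from math import comb


def max_distinct_signatures(bead_types: int, max_beads_per_cell: int) -> int:
    """Count non-empty sets of at most max_beads_per_cell distinct bead types."""
    if bead_types < 1 or max_beads_per_cell < 1:
        raise ValueError("both arguments must be at least 1")
    upper = min(bead_types, max_beads_per_cell)
    return sum(comb(bead_types, k) for k in range(1, upper + 1))


# Ten bead types with at most two distinct types per cell: 10 + 45 combinations.
capacity = max_distinct_signatures(10, 2)
```

Under these assumptions, ten bead types with at most two distinct types per cell give 55 possible signatures, and the count grows quickly as either the population size or the number of beads per cell increases.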
The kit may also include one or more additional components useful in practicing embodiments of the method. For example, the kit may comprise components for producing labeled cells, such as a large pore (macro-cell) plate, a liquid container such as a tube, and the like. In addition, the kit may include one or more components for obtaining sequence data, such as one or more primers, a polymerase (e.g., a thermostable polymerase, a reverse transcriptase, either of which may have hot start properties, etc.), a double-strand specific DNase (dsDNase), an exonuclease, dNTPs, a metal cofactor, one or more nuclease inhibitors (e.g., an RNase inhibitor and/or a DNase inhibitor), one or more molecular crowding agents (e.g., polyethylene glycol, etc.), one or more enzyme stabilizing components (e.g., DTT), a stimulus-responsive polymer, or any other desired kit component, such as a device, a solid support, or a container or cartridge, e.g., a tube, a bead, a plate, a microfluidic chip, etc., as described above. The components of the kit may be present in separate containers, or the components may be present in a single container.
In addition to the components described above, the subject kits may also include (in certain embodiments) instructions for practicing the subject methods. These instructions may be present in the subject kits in a variety of forms, one or more than one of which may be present in the kit. One form in which these instructions may be present is as printed information on a suitable medium or substrate, e.g., one or more sheets of paper on which the information is printed, in the packaging of the kit, in a package insert, and the like. Another form of these instructions is a computer-readable medium, such as a magnetic disk, compact disk (CD), portable flash drive, etc., having the information recorded thereon. Yet another form in which these instructions may be present is a website address, which may be used to access the information from a remote website over the internet.
Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be readily apparent to those of ordinary skill in the art in light of the teachings of this invention that certain changes and modifications may be made thereto without departing from the spirit or scope of the appended claims.
Thus, the foregoing merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Furthermore, such equivalents are intended to include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. Furthermore, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims.
Thus, the scope of the invention is not intended to be limited to the exemplary embodiments shown and described herein. Rather, the scope and spirit of the invention are embodied by the appended claims. In the claims, 35 U.S.C. §112(f) or 35 U.S.C. §112(6) is expressly defined as being invoked for a limitation in a claim only when the exact phrase "means for" or the exact phrase "step for" is recited at the beginning of that limitation in the claim; if such exact phrase is not used in a limitation in the claim, then 35 U.S.C. §112(f) or 35 U.S.C. §112(6) is not invoked.