[0557] In some aspects, the oligonucleotide comprises a nucleic acid sequence as set forth in any one of SEQ ID NOs: 31-78, or a nucleic acid sequence greater than, equal to, at least, at most, or about 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%, or any percentage derivable therein, identical thereto.
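By way of non-limiting illustration, the percent identity recited above can be sketched as follows. The sketch assumes the two sequences have already been aligned to the same length without gaps; the function names and the 60% default threshold are illustrative only, and other alignment-based definitions of identity may equally be used.

```python
def percent_identity(query: str, reference: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences (no gaps).

    Assumption: alignment has been done elsewhere; this only counts matches.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    # Case-insensitive, position-by-position comparison.
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_threshold(query: str, reference: str, threshold: float = 60.0) -> bool:
    """True if the query is at least `threshold` percent identical to the reference."""
    return percent_identity(query, reference) >= threshold
```

For example, two hypothetical 10-mers that differ at two positions are 80% identical, which satisfies a 60% threshold but not a 90% threshold.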

[0558] In some aspects, the RBP-targeting agent is labeled. In some aspects, the label comprises, consists essentially of, or consists of a radioisotope, a hapten, a fluorescent label, a fluorescent polypeptide, a phosphorescent molecule, a chemiluminescent molecule, a chromophore, a luminescent molecule, a photoaffinity molecule, a colored particle, and/or a ligand. In some aspects, the RBP-targeting agent comprises a fluorescent label. In some aspects, the fluorescent label comprises, consists essentially of, or consists of Green Fluorescent Protein (GFP), eGFP, Red Fluorescent Protein (RFP), Teal Fluorescent Protein (TFP), Blue Fluorescent Protein (BFP), Yellow Fluorescent Protein (YFP), miRFP, cerulean fluorescent protein (CFP), eCyanFP, mCherry, mVenus, mOrange, mTurquoise, tdTomato, aminocoumarin, fluorescein, Texas Red, Alexa Fluor dyes (e.g. Alexa Fluor 488, Alexa Fluor 555, Alexa Fluor 594, Alexa Fluor 647, Alexa Fluor 350, Alexa Fluor 532, and Alexa Fluor 700), Cy dyes (e.g. Cy3, Cy5), DyLight dyes, FITC, or Rhodamine, or functional variants thereof.

III. Methods

[0559] In some aspects, the current disclosure encompasses methods of determining one or more RNA interaction sites of a RNA-binding protein (RBP) in a biological sample. In some aspects, the method comprises the steps of a) contacting (e.g., incubating together) a RBP-targeting agent to the RBP, wherein the RBP-targeting agent specifically binds the RBP to form a first complex; b) contacting (e.g., incubating together) the first complex with one or more secondary binding agents that specifically bind the RBP-targeting agent, to form a second complex; c) incubating the first or the second complex with the transcriptase composition disclosed herein, to obtain cDNA; and d) sequencing the cDNA to determine the one or more RNA interaction sites of the RBP. In some aspects, the method may further comprise fixing the biological sample prior to steps (a) - (d). In some aspects, the method may further comprise permeabilizing the biological sample. Thus, in some aspects, a method comprises identifying one or more RNA interaction sites of a RNA-binding Protein (RBP) in a biological sample, comprising: a) fixing the biological sample; b) contacting (e.g., incubating together) the biological sample with an agent that permeabilizes cell membranes; c) providing an RBP-targeting agent to the sample, wherein the RBP-targeting agent interacts with the RBP of interest; d) providing a transcriptase composition comprising a polypeptide construct comprising a targeting moiety and a reverse transcriptase enzyme, wherein the targeting moiety interacts with the RBP-targeting agent; e) incubating the sample with the transcriptase composition to produce cDNA; and f) sequencing the cDNA. These methods, and variations thereof, are broadly referred to herein as ARTR-seq.

[0560] In some aspects, also provided herein are methods for determining the RNA interaction sites of more than one RNA binding protein. In some aspects, the method may be used to map the RNA binding sites for greater than, equal to, at least, or at most 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, or 1500 RBPs. In some aspects, the methods disclosed herein may be used to map the RNA binding sites of all the RBPs in a cell. These methods are broadly referred to as multiplex ARTR-seq.

[0561] In some aspects, also provided herein are modifications of the methods (ARTR-seq and multiplex ARTR-seq) for determining RNA binding sites with spatial resolution. These methods are broadly referred to herein as spatial ARTR-seq. In some aspects, any of the methods disclosed herein (for example, ARTR-seq, multiplex ARTR-seq, spatial ARTR-seq) may be modified to study RNA modification sites. The aspects provided herein are in no way limiting, and additional aspects with obvious modifications of the disclosed methods may be envisaged by a person of ordinary skill in the art. Some of these aspects are described in detail herein. Any one or more of the preceding steps of each of the methods disclosed can be excluded from certain aspects of the disclosure. A person of skill in the art is well aware of common techniques to accomplish each of the preceding steps.

A. ARTR-seq

[0562] In some aspects, the method is an ARTR-seq method. Some aspects of the method are disclosed herein.

1. Biological sample

[0563] The term biological sample, as used herein, encompasses any sample obtained from an organism or prepared in vitro to mimic a sample of biological origin. Non-limiting examples of biological samples include isolated or assembled RNA-protein complexes, biological fluid, cells, tissue samples, or biological materials derived from cells or tissue samples. In some aspects, the biological sample may be obtained from a prokaryotic or a eukaryotic organism. In some aspects, the eukaryotic organism may be from the kingdoms Animalia, Plantae, Fungi, or Protista. In an aspect, the eukaryotic organism is a mammal. In an aspect, the eukaryotic organism is a laboratory animal, for example a primate, a rodent (for example a mouse, a rat, or a gerbil), a nematode, or a fruit fly. In some aspects, the laboratory animal is a genetically engineered animal. In some aspects, the mammal is a human. In some aspects, the mammal has, or is at a risk of having, a disease.

[0564] In certain aspects, the disclosed methods comprise obtaining a sample (also a “biological sample”) from a subject wherein the subject has, or is at a risk of having a disease or disorder. In some aspects, the methods of obtaining a biological sample can include methods of biopsy such as fine needle aspiration, core needle biopsy, vacuum assisted biopsy, incisional biopsy, excisional biopsy, punch biopsy, shave biopsy or skin biopsy. In other aspects the sample can be obtained from any of the tissues provided herein that include but are not limited to non-cancerous or cancerous tissue and non-cancerous or cancerous tissue from the serum, gall bladder, mucosal, skin, heart, lung, breast, pancreas, blood, liver, muscle, kidney, smooth muscle, bladder, colon, intestine, brain, prostate, esophagus, or thyroid tissue. Alternatively, the sample can be obtained from any other source including but not limited to blood, sweat, hair follicle, buccal tissue, tears, menses, feces, or saliva. In certain aspects of the current methods, any medical professional such as a doctor, nurse or medical technician can obtain a biological sample for testing. Yet further, the biological sample can be obtained without the assistance of a medical professional.

[0565] A sample can include, but is not limited to, tissue, cells, or biological material from cells or derived from cells of a subject. The biological sample can be a heterogeneous or homogeneous population of cells or tissues. The biological sample can be obtained using any method known to the art that can provide a sample suitable for the analytical methods described herein. The sample can be obtained by non-invasive methods including but not limited to: scraping of the skin or cervix, swabbing of the cheek, saliva collection, urine collection, feces collection, collection of menses, tears, or semen.

[0566] The sample can be obtained by methods known in the art. In certain aspects the samples are obtained by biopsy. In other aspects the sample is obtained by swabbing, endoscopy, scraping, phlebotomy, or any other methods known in the art. In some cases, the sample can be obtained, stored, or transported using components of a kit of the present methods. In some cases, multiple samples, such as multiple esophageal samples can be obtained for diagnosis by the methods described herein. In other cases, multiple samples, such as one or more samples from one tissue type (for example esophagus) and one or more samples from another specimen (for example serum) can be obtained for diagnosis by the methods. In some cases, multiple samples such as one or more samples from one tissue type (e.g. esophagus) and one or more samples from another specimen (e.g. serum) can be obtained at the same or different times. Samples obtained at different times can be stored and/or analyzed by different methods. For example, a sample can be obtained and analyzed by routine staining methods or any other cytological analysis methods.

[0567] In some aspects the biological sample can be obtained by a physician, nurse, or other medical professional such as a medical technician, endocrinologist, cytologist, phlebotomist, radiologist, or a pulmonologist. The medical professional can indicate the appropriate test or assay to perform on the sample. In certain aspects a molecular profiling business can consult on which assays or tests are most appropriately indicated. In further aspects of the current methods, the patient or subject can obtain a biological sample for testing without the assistance of a medical professional, such as obtaining a whole blood sample, a urine sample, a fecal sample, a buccal sample, or a saliva sample.

[0568] In other cases, the sample is obtained by an invasive procedure including but not limited to: biopsy, needle aspiration, endoscopy, or phlebotomy. The method of needle aspiration can further include fine needle aspiration, core needle biopsy, vacuum assisted biopsy, or large core biopsy. In some aspects, multiple samples can be obtained by the methods herein to ensure a sufficient amount of biological material.

[0569] General methods for obtaining biological samples are also known in the art. Publications such as Ramzy, Ibrahim Clinical Cytopathology and Aspiration Biopsy 2001, which is herein incorporated by reference in its entirety, describe general methods for biopsy and cytological methods. In some aspects, the sample is a fine needle aspirate of an esophageal or a suspected esophageal tumor or neoplasm. In some cases, the fine needle aspirate sampling procedure can be guided by the use of an ultrasound, X-ray, or other imaging device.

[0570] In some aspects of the present methods, a molecular profiling business can obtain the biological sample from a subject directly, from a medical professional, from a third party, or from a kit provided by a molecular profiling business or a third party. In some cases, the biological sample can be obtained by the molecular profiling business after the subject, a medical professional, or a third party acquires and sends the biological sample to the molecular profiling business. In some cases, the molecular profiling business can provide suitable containers, and excipients for storage and transport of the biological sample to the molecular profiling business.

[0571] In some aspects of the methods described herein, a medical professional need not be involved in the initial diagnosis or sample acquisition. An individual can alternatively obtain a sample through the use of an over the counter (OTC) kit. An OTC kit can contain a means for obtaining said sample as described herein, a means for storing said sample for inspection, and instructions for proper use of the kit. In some cases, molecular profiling services are included in the price for purchase of the kit. In other cases, the molecular profiling services are billed separately. A sample suitable for use by the molecular profiling business can be any material containing tissues, cells, nucleic acids, genes, gene fragments, expression products, gene expression products, or gene expression product fragments of an individual to be tested. Methods for determining sample suitability and/or adequacy are provided.

[0572] In some aspects, the subject can be referred to a specialist such as an oncologist, surgeon, or endocrinologist. The specialist can likewise obtain a biological sample for testing or refer the individual to a testing center or laboratory for submission of the biological sample. In some cases the medical professional can refer the subject to a testing center or laboratory for submission of the biological sample. In other cases, the subject can provide the sample. In some cases, a molecular profiling business can obtain the sample.

2. Sample preparation

[0573] In some aspects, the current disclosure also encompasses methods of preparing the biological sample, as disclosed herein, for further processing. Methods for preparing the samples are well known in the art and can comprise use of common laboratory equipment, for example centrifuges, perfusion equipment, dissection equipment, cryostats, mounting equipment, mounting media, solid surfaces (for example slides, multi-well plates, or capillaries), microscopes, and staining equipment. In an exemplary set up, once a tissue sample is obtained, the tissue may be placed in O.C.T., frozen in liquid nitrogen, and sliced using a cryostat (for example, Leica CM1900). The tissue sections may then be mounted on a suitable solid surface, and further fixed and permeabilized. In another exemplary set up, a tissue obtained may be further dissected into cells, diluted and mounted on a solid surface. In yet another aspect, one or more cells from a cell line may be obtained, and processed. In yet another exemplary aspect, a ribosome, a polysome, or other RNA-protein complexes may be isolated and used in the disclosed methods.

[0574] In some aspects, the processed biological sample may be fixed. A person of skill in the art is familiar with common techniques to accomplish fixation of a sample. In some aspects, the fixing step can comprise, consist, or consist essentially of rapidly freezing the sample, or can comprise, consist, or consist essentially of treating the sample with formaldehyde and/or paraformaldehyde (PFA).

[0575] A cellular sample can be fixed by treatment with a fixing agent. A fixing agent can comprise, consist, or consist essentially of a crosslinking agent, including aldehydes like formalin, glutaraldehyde, formaldehyde, PFA, or a precipitating agent, including organic solvents like methanol, acetone, or picric acid, or any combination thereof. In some aspects, the fixing step is quenched, for example with glycine. A person of skill in the art is familiar with common techniques to accomplish quenching of a fixing reaction, including addition of sodium borohydride, or addition of exogenous amine-containing reagents like ammonium chloride and/or glycine. In some aspects, the fixing step comprises, consists, or consists essentially of treating the sample with formaldehyde. In some aspects, the fixing step comprises, consists, or consists essentially of treating the sample with paraformaldehyde (PFA). In some aspects, the fixing step comprises, consists, or consists essentially of treating the sample with greater than, equal to, at least, or at most 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1.0%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9%, 2.0%, 2.1%, 2.2%, 2.3%, 2.4%, or 2.5% PFA. In some aspects, the fixing step occurs for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 minutes, including any range or value derivable therein. In some aspects, the fixing step occurs at room temperature.

[0576] In some aspects, the fixing step is quenched. In some aspects, the fixing step is quenched with glycine. In some aspects, the quenching glycine is greater than, equal to, at least, at most 25, 50, 75, 100, 125, 150, 200, 225, or 250 mM, including any range or value derivable therein. In some aspects, the quenching step occurs for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 minutes, including any range or value derivable therein. In some aspects, the quenching step occurs at room temperature.

[0577] In some aspects, the sample is permeabilized. In some aspects, a cell permeabilizing agent may comprise a detergent, an enzyme, a solvent, a small molecule, a buffer or any combination thereof. In some aspects, the cell permeabilizing agent comprises a detergent. In some aspects, the agent that permeabilizes cell membranes comprises, consists, or consists essentially of greater than or equal to 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1.0%, 1.1%, 1.2%, 1.3%, 1.4%, or 1.5%, including any range or value derivable therein, Triton X-100. In some aspects, the sample is contacted with the permeabilizing agent for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 minutes, including any range or value derivable therein. In some aspects, the contacting (e.g., incubating together) step occurs on ice.

[0578] In some aspects, at least one RNase is optionally provided to the sample following the permeabilizing step or further downstream. In some aspects, the providing of the at least one RNase improves resolution during the sequencing step. In some aspects, the at least one RNase comprises, consists, or consists essentially of ribonuclease I (RNase I, via Thermo Fisher Scientific), RNase A, and/or RNase T1. In some aspects, the RNase is provided to the sample for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 minutes, including any range or value derivable therein. In some aspects, the at least one RNase is provided to the sample at 37 °C.

3. Primary and secondary complex formation

[0579] In some aspects, the current disclosure provides an ARTR-seq method for determining one or more RNA interaction sites of a RNA-binding Protein (RBP) in a biological sample, comprising: contacting a RBP-targeting agent to the RBP, wherein the RBP-targeting agent specifically binds the RBP to form a first complex; contacting the first complex with one or more secondary binding agents that specifically bind the RBP-targeting agent, to form a second complex; incubating the first or the second complex with the transcriptase composition disclosed herein, to obtain cDNA; sequencing the cDNA to determine the one or more RNA interaction sites of the RBP. A schematic of an exemplary ARTR-seq procedure is provided in FIG. 1A.

[0580] In some aspects, the method may further comprise one or more of a sample preparation step, a fixing step, a quenching step, a permeabilizing step, RNase treatment, a blocking step, or any combination thereof, as disclosed herein and/or known in the art. In some aspects, the sample is blocked before the RBP targeting agent is provided to the sample to form a first complex. A person of skill in the art is familiar with common techniques to accomplish sample blocking, which reduces background or non-specific staining of the sample. As is known to a person of skill in the art, blocking agents include agents like hydrogen peroxide, levamisole, avidin/biotin blocking reagents, and/or protein blocking solutions like BSA, gelatin, and/or non-fat dry milk. In an aspect, the blocking agent may comprise BSA. In some aspects, the sample may be blocked using greater than, equal to, at least, or at most 0.2, 0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8, 2, 2.2, 2.4, 2.6, 2.8, 3, 3.2, 3.4, 3.6, 3.8, 4, 4.2, 4.4, 4.6, 4.8, or 5.0 mg/mL blocking agent in any suitable buffer. In some aspects, the blocking may be done for greater than, equal to, at least, or at most 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, or more minutes at RT. In some aspects, the samples can be blocked for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hrs at 4 °C. In some aspects, practicing one or more of these steps, comprising a sample preparation step, a fixing step, a quenching step, a permeabilizing step, and a blocking step, provides a processed sample for use in downstream steps of the method.

[0581] In an aspect, the method comprises contacting one or more RBP-targeting agents disclosed herein, with RBPs in the processed sample. In some aspects, the RBP-targeting agent may comprise any molecule as disclosed here, that specifically binds the RBP. Non-limiting examples include antibodies, and functional variants thereof, oligonucleotides or variants thereof, peptides, ligands, small molecules, or aptamers. In some aspects, the RBP-targeting agent is an antibody. In an aspect, the contacting step may be carried out in any suitable buffer composition, for example Tris-HCl, MOPS, phosphate buffered saline (PBS), or Dulbecco’s phosphate buffered saline. In some aspects, the buffer composition may further comprise a blocking agent as disclosed herein. In some aspects, the RBP-targeting agent is incubated with the processed sample for greater than, equal to, at least, or at most 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, or more minutes at RT. In some aspects, the samples can be incubated with the RBP-targeting agent for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hrs at 4 °C. In some aspects, contacting of the RBP binding agent with the RBP forms a primary complex.

[0582] In some aspects, the primary complex may optionally be incubated with a secondary binding agent, which specifically binds the RBP-targeting agent. In some aspects, the secondary binding agent may comprise any molecule as disclosed here, that specifically binds the RBP binding agent. Non-limiting examples include antibodies, and functional variants thereof, oligonucleotides or variants thereof, peptides, ligands, small molecules, or aptamers. In some aspects, the secondary binding agent is an antibody, for example an antibody that specifically binds the primary complex. In some aspects, the secondary binding agent is incubated with the sample for greater than, equal to, at least, or at most 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, or more minutes at RT. In some aspects, the samples can be incubated with the secondary binding agent for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hrs at 4 °C. In some aspects, contacting of the secondary binding agent with the primary complex forms a secondary complex. In some aspects, the RBP-binding agent, or the secondary binding agent may be labeled as provided herein above.

[0583] In some aspects, the sample may be washed between any or after any of the steps disclosed herein. In some aspects, the sample is washed for greater than, equal to, at least, at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 times, after the at least one RNase is provided to the sample, the RBP targeting step, after the primary or secondary complex formation and/or after the blocking step. In some aspects, the washing step comprises, consists, or consists essentially of washing the sample with a suitable buffer, for example Tris-HCl, MOPS, phosphate buffered saline (PBS), or Dulbecco’s phosphate buffered saline. In some aspects, the washing buffer may further comprise a blocking agent, as disclosed herein, a RNase inhibitor, and additional ingredients, as well known in the art. In some aspects, the washing step comprises, consists, or consists essentially of shaking the sample with DPBS. In some aspects, the washing step occurs for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 minutes, including any range or value derivable therein. In some aspects, the washing step occurs at room temperature.

4. Reverse transcription and cDNA synthesis

[0584] In some aspects, the disclosed method further comprises incubating the primary or the secondary complex, or both, with a transcriptase composition as disclosed herein. As provided herein, the transcriptase composition comprises at least one polypeptide construct and a transcriptase mix. In some aspects, the polypeptide construct comprises a targeting moiety as disclosed herein; and a reverse transcriptase enzyme as disclosed herein. In an aspect, the transcriptase mix comprises one or more ingredients for initiation and synthesis of cDNA. In an aspect, the transcriptase mix comprises one or more adapter-RT primers, wherein the one or more adapter-RT primers each comprise an adapter primer sequence and an RT primer sequence. In some aspects, the RT primer comprises random RT primers as disclosed herein. In some aspects, the adapter primer comprises one or more of a barcode sequence, indexes etc. In some aspects, the transcriptase mix may further comprise components known in the art, for example labeled and/or unlabeled dNTPs as disclosed herein, RNase inhibitor, salts, reducing agents, buffers, solvents, osmotic agents etc.

[0585] A person of skill in the art is familiar with conditions capable of producing cDNA. As noted above, in some aspects, the conditions to produce cDNA can comprise, consist, or consist essentially of providing the sample with at least one primer (random, oligo(dT) or gene specific), dNTPs, and other components in order to conduct reverse transcription (RT) before halting the reaction. In an aspect, the primer is an adapter RT primer as disclosed herein. The other components can comprise, consist, or consist essentially of a non-competitive inhibitor of pancreatic-type ribonucleases, a buffer or buffers, MgCl2, a reducing reagent, and/or water. In some aspects, the transcriptase composition is provided to the sample for greater than, equal to, at least, or at most 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 minutes, including any range or value derivable therein to obtain a cDNA. In some aspects, the transcriptase mix is provided to the sample at less than, equal to, about or more than 34 °C, 35 °C, 36 °C, 37 °C, 38 °C, 39 °C, 40 °C, 41 °C, 42 °C, 43 °C, 44 °C, 45 °C, 46 °C, 47 °C, 48 °C, 49 °C, 50 °C, 51 °C, 52 °C, 53 °C, 54 °C, 55 °C, 56 °C, 57 °C, or 58 °C. In some aspects, the transcriptase mix is provided to the sample at less than, equal to, about or more than 34 °C, 35 °C, 36 °C, 37 °C, 38 °C, 39 °C, 40 °C, 41 °C, or 42 °C. In some aspects, the transcriptase mix is provided to the sample at 37 °C - 42 °C.

[0586] A person of skill in the art is aware of standard conditions and protocols with which to conduct reverse transcription. For example, primers with which to conduct reverse transcription can comprise, consist, or consist essentially of oligo(dT) primers, random primers, and/or gene-specific primers. A person of skill in the art can select random primers to improve cDNA synthesis for detection. These random primers can comprise, consist, or consist essentially of at least septamers, octamers, nonamers, decamers, undecamers, dodecamers, tridecamers, tetradecamers, pentadecamers, hexadecamers, heptadecamers, octadecamers, nonadecamers, or eicosamers. As a further example, the dNTPs with which to conduct reverse transcription can be labelled or not labelled; as known to a person in the art a dNTP label can comprise, consist, or consist essentially of biotin, biotin-16, α-³²P, fluorescein, a fluorescent dye, and/or another label that facilitates detection and/or purification. The labeled and label-free dNTPs can be mixed at different ratios, for example 2:1, 1:1, 1:2, or any range or value derivable therein. In some aspects, the dNTPs can comprise, consist, or consist essentially of a combination of labelled dUTP, labelled dCTP, labelled dGTP, labelled dATP, dTTP, dCTP, dATP, and/or dGTP.
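By way of non-limiting illustration, the mixing ratios of labeled to label-free dNTPs noted above can be converted into individual concentrations as sketched below. The function name and the example total concentration are hypothetical and are chosen only to show the arithmetic.

```python
def split_dntp(total_mM: float, labeled_parts: float, unlabeled_parts: float):
    """Split a total dNTP concentration into (labeled, unlabeled) by ratio.

    A ratio of 2:1 (labeled:unlabeled) means two thirds of the total is labeled.
    """
    parts = labeled_parts + unlabeled_parts
    if total_mM < 0 or labeled_parts < 0 or unlabeled_parts < 0 or parts == 0:
        raise ValueError("concentration and ratio parts must be non-negative, "
                         "with at least one non-zero part")
    labeled = total_mM * labeled_parts / parts
    return labeled, total_mM - labeled


# Hypothetical total of 0.6 mM for one nucleotide, at the three example ratios.
for ratio in [(2, 1), (1, 1), (1, 2)]:
    labeled, unlabeled = split_dntp(0.6, *ratio)
    print(f"{ratio[0]}:{ratio[1]} -> labeled {labeled:.2f} mM, unlabeled {unlabeled:.2f} mM")
```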

[0587] A non-competitive inhibitor of pancreatic-type ribonucleases suitable for conducting reverse transcription can comprise, consist, or consist essentially of RNase inhibitor, RNAseOUT, and/or another agent which prevents RNA degradation by RNase. Buffers suitable for conducting reverse transcription can comprise, consist, or consist essentially of a phosphate buffer solution like PBS and/or DPBS, and/or another buffer providing a favorable pH and ionic strength for the reaction. A reducing reagent suitable for conducting reverse transcription can comprise, consist, or consist essentially of dithiothreitol (DTT), and/or another agent suitable for reducing disulfide bonds in RNases. Water suitable for conducting reverse transcription can comprise, consist, or consist essentially of nuclease-free water, water treated with diethylpyrocarbonate, and/or water treated with another agent that eliminates any RNases.

[0588] In some aspects, the disclosed method does not comprise oligo(dT) primer initiated reverse transcription. In some aspects, the method does not comprise Tn5 tagmentation.

[0589] A person of skill in the art is familiar with methods for halting RT. For example, a chelating agent can be added to the sample to halt RT. As known to a person of skill in the art, chelating agents can comprise, consist, or consist essentially of EDTA and/or EGTA. In some aspects, halting RT comprises, consists, or consists essentially of providing at least one chelating agent to the sample. In some aspects, the at least one chelating agent comprises, consists, or consists essentially of EDTA and/or EGTA. In some aspects, the EDTA is at a concentration of greater than, equal to, at least, at most, or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 mM, including any range or value derivable therein. In some aspects, the EGTA is at a concentration of greater than, equal to, at least, at most, or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 mM, including any range or value derivable therein. In some aspects, the at least one chelating agent is provided to the sample for greater than, equal to, at least, or at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 minutes, including any range or value derivable therein. In some aspects, the at least one chelating agent is provided to the sample at room temperature.
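By way of non-limiting illustration, the volume of a chelating agent stock needed to reach one of the final concentrations recited above follows from a mass balance, C_final = C_stock x V_stock / (V_sample + V_stock). The sketch below solves this for V_stock; the function name, the 500 mM stock concentration, and the 48 µL sample volume are hypothetical values chosen only for the example.

```python
def stock_volume_to_add(c_stock_mM: float, c_final_mM: float, v_sample_uL: float) -> float:
    """Volume of stock (uL) to add so the mixture reaches c_final_mM.

    Derived from c_final = c_stock * v_stock / (v_sample + v_stock),
    assuming the sample initially contains none of the chelating agent.
    """
    if not 0 < c_final_mM < c_stock_mM:
        raise ValueError("final concentration must be positive and below the stock concentration")
    if v_sample_uL <= 0:
        raise ValueError("sample volume must be positive")
    return c_final_mM * v_sample_uL / (c_stock_mM - c_final_mM)


# Hypothetical example: 500 mM EDTA stock, 20 mM final, 48 uL reaction.
v = stock_volume_to_add(500.0, 20.0, 48.0)
print(f"add {v:.1f} uL of stock")
```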

[0590] As noted above, the present methods can further comprise, consist, or consist essentially of steps which permit recovery of DNA and/or cDNA from a sample. For example, an optional cell digestion step can be included after the incubating step or optional in-situ imaging step. A person of skill in the art can also use alternative DNA extraction protocols, such as treatment with chemical extractants, physical disruption, treatment with proteases, and/or treatment with other cellular lysis agents. As is known in the art, chemical extractants can comprise, consist, or consist essentially of sodium dodecyl sulfate (SDS), chloroform, phenol, Chelex 100, and/or guanidinium isothiocyanate. As is known in the art, physical disruption methods can comprise, consist, or consist essentially of bead mill homogenization and/or freeze-thaw lysis. As is known in the art, proteases or other cellular lysis agents can comprise, consist, or consist essentially of a lysozyme, a proteinase K, achromopeptidase, and/or pronase E.

5. DNA Sequencing

[0591] As noted above, in some aspects, the cDNA sequencing step produces a binding profile for the RBP of interest. As is commonly known in the art, DNA sequencing can comprise, consist, or consist essentially of amplifying the cDNA, purifying the amplified cDNA, and sequencing the purified cDNA. A person of skill in the art is familiar with common sequencing methods, which can include high-throughput sequencing.

[0592] In some aspects, the methods of the disclosure include a sequencing method. In certain aspects, methods involve sequencing the cDNA produced by the incubating step. The cDNA can be prepared for sequencing by any method known in the art, such as library preparation, hybrid capture, sample quality control, product-utilized ligation-based library preparation, or a combination thereof. The cDNA can be prepared for any sequencing technique. In some aspects, a unique genetic readout for each sample can be generated by genotyping one or more highly polymorphic SNPs. In some aspects, sequencing, such as 76 base pair, paired-end sequencing, can be performed to cover approximately 70%, 75%, 80%, 85%, 90%, 95%, 99%, or greater percentage of targets at more than 20x, 25x, 30x, 35x, 40x, 45x, 50x, or greater than 50x coverage. In certain aspects, mutations, SNPs, indels, copy number alterations (somatic and/or germline), or other genetic differences can be identified from the sequencing using at least one bioinformatics tool, including VarScan2, any R package (including CopywriteR) and/or Annovar. Exemplary sequencing methods include those described below.
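By way of non-limiting illustration, the coverage criterion recited above (a given percentage of targets covered at more than a given depth) can be checked as sketched below. The sketch assumes that a per-target depth has already been computed by an upstream tool; the function name and the example depths are hypothetical.

```python
def fraction_of_targets_covered(depths, min_depth: float = 20) -> float:
    """Fraction of targets whose depth is strictly greater than min_depth."""
    depths = list(depths)
    if not depths:
        raise ValueError("at least one target depth is required")
    return sum(d > min_depth for d in depths) / len(depths)


# Hypothetical per-target mean depths for five targets.
example_depths = [10, 25, 30, 50, 21]
covered = fraction_of_targets_covered(example_depths, min_depth=20)
print(f"{covered:.0%} of targets exceed 20x")
```

In this hypothetical example four of the five targets exceed 20x, so the 80% criterion would be met while the 85% criterion would not.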

[0593] Massively parallel signature sequencing (MPSS), the first of the next-generation sequencing technologies, was developed in the 1990s at Lynx Therapeutics. MPSS was a bead-based method that used a complex approach of adapter ligation followed by adapter decoding, reading the sequence in increments of four nucleotides. This method made it susceptible to sequence-specific bias or loss of specific sequences. Because the technology was so complex, MPSS was only performed 'in-house' by Lynx Therapeutics and no DNA sequencing machines were sold to independent laboratories. Lynx Therapeutics merged with Solexa (later acquired by Illumina) in 2004, leading to the development of sequencing-by-synthesis, a simpler approach acquired from Manteia Predictive Medicine, which rendered MPSS obsolete. However, the essential properties of the MPSS output were typical of later "next-generation" data types, including hundreds of thousands of short DNA sequences. In the case of MPSS, these were typically used for sequencing cDNA for measurements of gene expression levels. Indeed, the powerful Illumina HiSeq2000, HiSeq2500 and MiSeq systems are based on MPSS.

[0594] Polony sequencing, developed in the laboratory of George M. Church at Harvard, was among the first next-generation sequencing systems and was used to sequence a full genome in 2005. It combined an in vitro paired-tag library with emulsion PCR, an automated microscope, and ligation-based sequencing chemistry to sequence an E. coli genome at an accuracy of >99.9999% and a cost approximately 1/9 that of Sanger sequencing. The technology was licensed to Agencourt Biosciences, subsequently spun out into Agencourt Personal Genomics, and eventually incorporated into the Applied Biosystems SOLiD platform, which is now owned by Life Technologies.

[0595] 454 pyrosequencing is a parallelized version of pyrosequencing developed by 454 Life Sciences, which has since been acquired by Roche Diagnostics. The method amplifies DNA inside water droplets in an oil solution (emulsion PCR), with each droplet containing a single DNA template attached to a single primer-coated bead that then forms a clonal colony. The sequencing machine contains many picoliter-volume wells each containing a single bead and sequencing enzymes. Pyrosequencing uses luciferase to generate light for detection of the individual nucleotides added to the nascent DNA, and the combined data are used to generate sequence read-outs. This technology provides intermediate read length and price per base compared to Sanger sequencing on one end and Solexa and SOLiD on the other.

[0596] Illumina (Solexa) sequencing. Solexa, now part of Illumina, developed a sequencing method based on reversible dye-terminator technology and engineered polymerases, both of which it developed internally. The terminator chemistry was developed internally at Solexa and the concept of the Solexa system was invented by Balasubramanian and Klenerman from Cambridge University's chemistry department. In 2004, Solexa acquired the company Manteia Predictive Medicine in order to gain a massively parallel sequencing technology based on "DNA Clusters", which involves the clonal amplification of DNA on a surface. The cluster technology was co-acquired with Lynx Therapeutics of California. Solexa Ltd. later merged with Lynx to form Solexa Inc.

[0597] In this method, DNA molecules and primers are first attached on a slide and amplified with polymerase so that local clonal DNA colonies, later coined "DNA clusters", are formed. To determine the sequence, four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. A camera takes images of the fluorescently labeled nucleotides, then the dye, along with the terminal 3' blocker, is chemically removed from the DNA, allowing for the next cycle to begin. Unlike pyrosequencing, the DNA chains are extended one nucleotide at a time and image acquisition can be performed at a delayed moment, allowing for very large arrays of DNA colonies to be captured by sequential images taken from a single camera.

[0598] Decoupling the enzymatic reaction and the image capture allows for optimal throughput and theoretically unlimited sequencing capacity. With an optimal configuration, the ultimately reachable instrument throughput is thus dictated solely by the analog-to-digital conversion rate of the camera, multiplied by the number of cameras and divided by the number of pixels per DNA colony required for visualizing them optimally (approximately 10 pixels/colony). In 2012, with cameras operating at more than 10 MHz A/D conversion rates and available optics, fluidics and enzymatics, throughput can be multiples of 1 million nucleotides/second, corresponding roughly to one human genome equivalent at 1x coverage per hour per instrument, and one human genome re-sequenced (at approx. 30x) per day per instrument (equipped with a single camera).
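
The throughput relationship stated above can be worked through numerically. The following Python sketch is illustrative only: the A/D conversion rate, pixels per colony, and camera count are taken from the paragraph above, while the human genome size of approximately 3.2 billion bases is an assumption introduced here for the arithmetic.

```python
# Values stated in the text above.
AD_RATE_HZ = 10e6          # camera A/D conversion rate (10 MHz)
PIXELS_PER_COLONY = 10     # approximate pixels needed per DNA colony
NUM_CAMERAS = 1            # single-camera instrument

# Assumption for this example: approximate haploid human genome size.
HUMAN_GENOME_BASES = 3.2e9


def throughput_nt_per_second(ad_rate_hz: float, cameras: int,
                             pixels_per_colony: float) -> float:
    """Throughput = A/D rate x number of cameras / pixels per colony."""
    return ad_rate_hz * cameras / pixels_per_colony


nt_per_s = throughput_nt_per_second(AD_RATE_HZ, NUM_CAMERAS, PIXELS_PER_COLONY)
genomes_per_hour = nt_per_s * 3600 / HUMAN_GENOME_BASES
coverage_per_day = nt_per_s * 86400 / HUMAN_GENOME_BASES

print(f"{nt_per_s:.0f} nucleotides/second")
print(f"{genomes_per_hour:.3f} genome equivalents at 1x per hour")
print(f"{coverage_per_day:.1f}x coverage of one genome per day")
```

Under these assumptions the sketch gives 1 million nucleotides per second, slightly more than one genome equivalent per hour, and roughly 27x coverage per day, consistent with the approximate figures given above.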

[0599] SOLiD sequencing. Applied Biosystems' (now a Thermo Fisher Scientific brand) SOLiD technology employs sequencing by ligation. Here, a pool of all possible oligonucleotides of a fixed length is labeled according to the sequenced position. Oligonucleotides are annealed and ligated; the preferential ligation by DNA ligase for matching sequences results in a signal informative of the nucleotide at that position. Before sequencing, the DNA is amplified by emulsion PCR. The resulting beads, each containing single copies of the same DNA molecule, are deposited on a glass slide. The result is sequences of quantities and lengths comparable to Illumina sequencing. This sequencing by ligation method has been reported to have some issues sequencing palindromic sequences.

[0600] Ion Torrent semiconductor sequencing. Ion Torrent Systems Inc. (now owned by Thermo Fisher Scientific) developed a system based on using standard sequencing chemistry, but with a novel, semiconductor-based detection system. This method of sequencing is based on the detection of hydrogen ions that are released during the polymerization of DNA, as opposed to the optical methods used in other sequencing systems. A microwell containing a template DNA strand to be sequenced is flooded with a single type of nucleotide. If the introduced nucleotide is complementary to the leading template nucleotide, it is incorporated into the growing complementary strand. This causes the release of a hydrogen ion that triggers a hypersensitive ion sensor, which indicates that a reaction has occurred. If homopolymer repeats are present in the template sequence, multiple nucleotides will be incorporated in a single cycle. This leads to a corresponding number of released hydrogen ions and a proportionally higher electronic signal.

[0601] DNA nanoball sequencing is a type of high throughput sequencing technology used to determine the entire genomic sequence of an organism. The company Complete Genomics uses this technology to sequence samples submitted by independent researchers. The method uses rolling circle replication to amplify small fragments of genomic DNA into DNA nanoballs. Unchained sequencing by ligation is then used to determine the nucleotide sequence. This method of DNA sequencing allows large numbers of DNA nanoballs to be sequenced per run and at low reagent costs compared to other next generation sequencing platforms. However, only short sequences of DNA are determined from each DNA nanoball, which makes mapping the short reads to a reference genome difficult. This technology has been used for multiple genome sequencing projects.

[0602] Heliscope single molecule sequencing is a method of single-molecule sequencing developed by Helicos Biosciences. It uses DNA fragments with added poly-A tail adapters which are attached to the flow cell surface. The next steps involve extension-based sequencing with cyclic washes of the flow cell with fluorescently labeled nucleotides (one nucleotide type at a time, as with the Sanger method). The reads are performed by the Heliscope sequencer. The reads are short, up to 55 bases per run, but recent improvements allow for more accurate reads of stretches of one type of nucleotides. This sequencing method and equipment were used to sequence the genome of the M13 bacteriophage.

[0603] Single molecule real time (SMRT) sequencing is based on the sequencing by synthesis approach. The DNA is synthesized in zero-mode wave-guides (ZMWs), small well-like containers with the capturing tools located at the bottom of the well. The sequencing is performed with use of unmodified polymerase (attached to the ZMW bottom) and fluorescently labelled nucleotides flowing freely in the solution. The wells are constructed in a way that only the fluorescence occurring by the bottom of the well is detected. The fluorescent label is detached from the nucleotide at its incorporation into the DNA strand, leaving an unmodified DNA strand. According to Pacific Biosciences, the SMRT technology developer, this methodology allows detection of nucleotide modifications (such as cytosine methylation). This happens through the observation of polymerase kinetics. This approach allows reads of 20,000 nucleotides or more, with average read lengths of 5 kilobases.

6. Sample Imaging

[0604] As noted above, in some aspects, the present methods can further comprise, consist of, or consist essentially of an optional in-situ imaging step after the incubating step. As is known in the art, imaging can be performed by light microscopy, fluorescence microscopy, confocal microscopy, and/or other commonly known microscopy techniques.

[0605] A person of skill in the art can use an imaging moiety, for example a fluorophore such as aminocoumarin, fluorescein, texas red, Alexa Fluor dyes (e.g. Alexa Fluor 488, Alexa Fluor 555, Alexa Fluor 594, Alexa Fluor 647, Alexa Fluor 350, Alexa Fluor 532, and Alexa Fluor 700), Cy dyes (e.g. Cy3, Cy5), DyLight dyes, FITC, or Rhodamine, or functional variants thereof to target certain aspects of the sample of interest, for example the biotin-tagged cDNA. Suitable moieties are known to a person of skill, and can consist of, comprise, or consist essentially of a biotinylated monoclonal antibody like Alexa Fluor dye. A person of skill in the art can use a nuclear counterstain to indicate live cells with intact, nonpermeable plasma membranes in the sample. A nuclear counterstain can consist of, comprise, or consist essentially of a cell-permeant nuclear counterstain which emits fluorescence when bound to dsDNA, like Hoechst stains and/or SYTO stains.

B. Spatial ARTR-seq

[0606] In some aspects, the disclosed method may be modified to obtain spatial information with respect to RNA binding sites. By introducing single-cell and/or spatial barcodes, ARTR-seq can achieve single-cell or spatial resolution. These barcodes can be seamlessly incorporated either through the use of barcoded RT primers during the reverse transcription process or through ligation. They can subsequently be employed to assign single-cell identity or spatial localization during data analysis.

[0607] In spatial barcoding-based ARTR-seq, resolution can be fine-tuned by adjusting the density of barcode primers, allowing for cellular and/or subcellular resolution. Apart from the spatial barcoding strategy, an in-situ sequencing method may be used in spatial ARTR-seq to achieve subcellular resolution.

[0608] Spatial ARTR-seq offers compatibility with imaging techniques, such as FISH or variations on FISH, microfluidics imaging techniques, or any other single-cell profiling techniques. This compatibility provides additional information alongside sequencing data, such as subcellular structure identification and/or cell stage determination. In an aspect, the disclosed methods may be combined with advanced single cell imaging techniques to provide spatially resolved binding sites and expression data. Commonly used techniques are provided herein.

[0609] Spatial Transcriptomics: This technique combines gene expression analysis with spatial information, allowing researchers to map RNA molecules in a tissue sample. It involves capturing gene expression data while preserving the spatial context, often using barcoded slides or arrays.

[0610] MERFISH (Multiplexed Error-Robust Fluorescence In Situ Hybridization): A highly multiplexed method for visualizing the spatial distribution of thousands of RNA molecules within cells. It uses fluorescent probes to detect RNA and generate a spatially resolved map of gene expression at the single-cell level.

[0611] SeqFISH (Sequential Fluorescence In Situ Hybridization): Similar to MERFISH, SeqFISH sequentially labels and images RNA molecules within cells using different fluorescent probes, enabling the spatial resolution of hundreds to thousands of genes within 3D tissue sections.

[0612] STARmap (Spatially Resolved Transcript Amplicon Readout Mapping): A technique that preserves the 3D structure of tissues while performing RNA sequencing. It uses hydrogel-tissue chemistry to encode RNA spatial information, allowing for highly multiplexed in situ transcriptomics.

[0613] Slide-Seq: A method that uses barcoded beads on a slide to capture RNA transcripts from tissue sections. This technique maps gene expression across the tissue with single-cell resolution, while maintaining spatial context.

[0614] Visium Spatial Gene Expression: Developed by 10x Genomics, this technique captures mRNA from tissue sections using spatially barcoded microarrays. It provides a spatial map of gene expression, linking molecular data with histological information.

[0615] Laser Capture Microdissection (LCM): A technique that physically isolates specific regions or cells from a tissue sample using a laser. These cells are then analyzed for gene expression or other molecular features, allowing for spatially resolved insights, though in a more manual and targeted way.

[0616] Imaging Mass Cytometry (IMC): Combines high-resolution imaging with mass cytometry to map the spatial distribution of proteins, DNA, or RNA in tissue sections. It allows multiplexed detection of dozens of markers at a time, preserving spatial and cellular context.

[0617] DBiT-seq (Deterministic Barcoding in Tissue for Spatial Omics Sequencing) is a method for co-mapping of mRNAs and proteins in a formaldehyde-fixed tissue slide via next-generation sequencing (NGS). Parallel microfluidic channels are used to deliver DNA barcodes to the surface of a tissue slide, and crossflow of two sets of barcodes, A1-50 and B1-50, followed by ligation in situ, yields a 2D mosaic of tissue pixels, each containing a unique full barcode AB. Gene expression profiles in 10-µm pixels conformed into the clusters of single-cell transcriptomes, allowing for rapid identification of cell types and spatial distributions.
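
As a non-limiting illustration of how a full barcode AB identifies a tissue pixel, the following Python sketch maps a pair of barcode sequences to a pair of channel indices. The barcode sequences are hypothetical 4-mers generated only for this example (actual barcode designs are not specified here), and exact-match lookup without error correction is an assumption.

```python
from itertools import product

BASES = "ACGT"
N_CHANNELS = 50  # barcodes A1-50 and B1-50, as described above


def make_barcodes(n: int, length: int = 4) -> list[str]:
    """Hypothetical barcode set: the first n k-mers in lexicographic order."""
    kmers = ("".join(p) for p in product(BASES, repeat=length))
    return [next(kmers) for _ in range(n)]


barcodes_a = make_barcodes(N_CHANNELS)
barcodes_b = make_barcodes(N_CHANNELS)

# Map each barcode sequence to its 1-based channel index.
index_a = {seq: i + 1 for i, seq in enumerate(barcodes_a)}
index_b = {seq: i + 1 for i, seq in enumerate(barcodes_b)}


def pixel_for_read(a_seq: str, b_seq: str):
    """Return the (A index, B index) pixel, or None if a barcode is unknown."""
    a = index_a.get(a_seq)
    b = index_b.get(b_seq)
    if a is None or b is None:
        return None
    return (a, b)


all_pixels = {pixel_for_read(a, b) for a in barcodes_a for b in barcodes_b}
print(len(all_pixels))                 # number of distinct pixels
print(pixel_for_read("AAAA", "AAAC"))  # pixel addressed by A1 and B2
```

With 50 barcodes in each set, the crossed channels address 50 x 50 = 2,500 distinct pixels, and each sequenced cDNA carrying a recognized AB pair can be assigned to exactly one of them.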

C. Multiplex ARTR-seq

[0618] In some aspects, also provided herein are methods for determining the RNA interaction sites of more than one RNA binding protein. In an aspect, the methods comprise modification of the ARTR-seq method, such that each RBP-targeting agent is tagged with a separate barcode, and wherein the barcode may be incorporated into the cDNA using click chemistry. In some aspects, the method may be used to map the RNA binding sites for greater than, equal to, at least, or at most 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 30, 36, 40, 48, 50, 60, 70, 72, 80, 84, 90, 96, 100, 108, 120, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500 RBPs, or any range derivable therein. In some aspects, the methods disclosed herein may be used to map the RNA binding sites of all the RBPs in a cell.

[0619] In some aspects, provided herein is a method of determining one or more RNA interaction sites of a first RNA-binding Protein (RBP) in a biological sample, comprising: a) contacting a first RBP-targeting agent comprising an alkyne functionalized first DNA barcode, to the first RBP, wherein the first RBP-targeting agent specifically binds the first RBP to form a first complex; b) contacting the first complex with one or more secondary binding agents that specifically bind the first RBP-targeting agent, to form a second complex; c) incubating the first or the second complex with the transcriptase composition disclosed herein, to obtain a first barcoded cDNA library; d) amplifying and sequencing the first barcoded cDNA library; e) obtaining one or more interaction sites of the first RBP by deconvoluting the sequenced cDNA library based on the first DNA barcode. In some aspects, the transcriptase composition for use in the method may comprise RT primers comprising a reactive moiety such that it can react with a barcoded oligonucleotide containing antibody or targeting moiety, wherein the barcode oligonucleotide comprises a corresponding reactive moiety for click chemistry. A reactive moiety of a random RT primer may be selected from the non-limiting group consisting of azides, alkynes, nitrones (e.g., 1,3-nitrones), strained alkenes (e.g., trans-cycloalkenes such as cyclooctenes or oxanorbornadiene), tetrazines, tetrazoles, iodides, thioates (e.g., phosphorothioate), acids, amines, and phosphates. For example, the first reactive moiety of the RT primer may comprise an azide moiety, and a second reactive moiety of the barcode oligonucleotide may comprise an alkyne moiety. The first and second reactive moieties may react to form a linking moiety. A reaction between the first and second reactive moieties may be, for example, a cycloaddition reaction such as a strain-promoted azide-alkyne cycloaddition, a copper-catalyzed azide-alkyne cycloaddition, a strain-promoted alkyne-nitrone cycloaddition, a Diels-Alder reaction, a [3+2] cycloaddition, a [4+2] cycloaddition, or a [4+1] cycloaddition; a thiol-ene reaction; a nucleophilic substitution reaction; or another reaction. In some cases, reaction between the first and second reactive moieties may yield a triazole moiety or an isoxazoline moiety. A reaction between the first and second reactive moieties may involve subjecting the reactive moieties to suitable conditions such as a suitable temperature, pH, or pressure and providing one or more reagents or catalysts for the reaction. For example, a reaction between the first and second reactive moieties may be catalyzed by a copper catalyst, a ruthenium catalyst, or a strained species such as a difluorooctyne, dibenzylcyclooctyne, or biarylazacyclooctynone. In some aspects, the random RT primer disclosed herein may further comprise an azide functional group (NNNN-N3).

[0620] In some aspects, the method comprises RBP-targeting agents that comprise an oligonucleotide comprising a DNA-barcode. In some aspects, the oligonucleotide is linked to the RBP-targeting agent via an amino spacer. In some aspects, the amino spacer is a 7 C6 amino spacer, wherein a non-nucleoside modification adds a primary amino group to an oligo's internal position. The amino group is separated from the 5' end nucleotide base by a 6-carbon spacer arm to reduce steric interaction.

[0621] In some aspects, the DNA-barcode can be unique for each RBP being studied. In some aspects, use of multiple barcoded antibodies, wherein each barcode is specific to a RBP, allows for studying more than one RBP using the methods disclosed herein. In some aspects, the oligonucleotide may further comprise a reactive moiety that is operable in attaching the barcode to a cDNA of the disclosed method. A reactive moiety of a barcoded antibody may be selected from the non-limiting group consisting of azides, alkynes, nitrones (e.g., 1,3-nitrones), strained alkenes (e.g., trans-cycloalkenes such as cyclooctenes or oxanorbornadiene), tetrazines, tetrazoles, iodides, thioates (e.g., phosphorothioate), acids, amines, and phosphates. For example, the first reactive moiety of the RT primer may comprise an azide moiety, and a second reactive moiety of the barcode oligonucleotide may comprise an alkyne moiety. The first and second reactive moieties may react to form a linking moiety. A reaction between the first and second reactive moieties may be, for example, a cycloaddition reaction such as a strain-promoted azide-alkyne cycloaddition, a copper-catalyzed azide-alkyne cycloaddition, a strain-promoted alkyne-nitrone cycloaddition, a Diels-Alder reaction, a [3+2] cycloaddition, a [4+2] cycloaddition, or a [4+1] cycloaddition; a thiol-ene reaction; a nucleophilic substitution reaction; or another reaction. In some cases, reaction between the first and second reactive moieties may yield a triazole moiety or an isoxazoline moiety. A reaction between the first and second reactive moieties may involve subjecting the reactive moieties to suitable conditions such as a suitable temperature, pH, or pressure and providing one or more reagents or catalysts for the reaction. For example, a reaction between the first and second reactive moieties may be catalyzed by a copper catalyst, a ruthenium catalyst, or a strained species such as a difluorooctyne, dibenzylcyclooctyne, or biarylazacyclooctynone. Table A provides a list of some exemplary oligonucleotides that may be linked to the RBP binding agent via an amino spacer, and that comprise an alkyne group reactive moiety.

[0622] In some aspects, the method further comprises incorporation of the biotinylated dNTPs, and the RT primer sequence comprising an azide functional group, into the cDNA to form proximal azide labeled biotinylated cDNAs during reverse transcription. In some aspects, the method further comprises incorporating the alkyne functionalized first DNA barcode into the cDNA by reacting the alkyne functionalized first DNA barcode with the proximal azide labeled biotinylated cDNA, using in-situ copper catalyzed azide-alkyne cycloaddition (CuAAC), to obtain a first barcoded biotinylated cDNA library. In some aspects, the method further comprises purifying the barcoded biotinylated cDNA library over a streptavidin column prior to step (d). In some aspects, the method further comprises processing the CuAAC product using a Klenow Fragment DNA polymerase for second strand synthesis prior to sequencing. In some aspects, the one or more interaction sites of the first RBP are obtained by deconvoluting the sequenced data based on the first DNA barcode incorporated into the cDNA. In some aspects, the method further comprises similarly determining the one or more RNA-interaction sites of a second RNA-binding Protein (RBP) in a biological sample, comprising: a) contacting a second RBP-targeting agent comprising an alkyne functionalized second DNA barcode, to the second RBP, wherein the second RBP-targeting agent specifically binds the second RBP to form a second primary complex; b) contacting the second primary complex with one or more secondary binding agents that specifically bind the second RBP-targeting agent, to form a second secondary complex; c) incubating the second primary or the second secondary complex with the transcriptase composition, to obtain a second barcoded cDNA library; d) amplifying and sequencing the second barcoded cDNA library; and e) obtaining one or more interaction sites of the second RBP by deconvoluting the sequenced cDNA library based on the second DNA barcode.

[0623] In some aspects, this process can be simultaneously conducted for, for greater than, for equal to, for at least, or for at most 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 30, 36, 40, 48, 50, 60, 70, 72, 80, 84, 90, 96, 100, 108, 120, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500 RBPs, or any range derivable therein.
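
By way of non-limiting illustration, deconvoluting a pooled sequenced cDNA library based on the DNA barcodes can be sketched in Python as follows. The barcode sequences, the barcode length, the RBP labels, and the assumption that each barcode sits at the start of the read and is matched exactly are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical barcode-to-RBP assignments (not sequences from this disclosure).
BARCODE_TO_RBP = {
    "ACGTAC": "RBP_1",
    "TGCATG": "RBP_2",
}
BARCODE_LENGTH = 6  # assumed fixed barcode length


def deconvolute(reads):
    """Group reads by the RBP whose barcode begins the read.

    Reads with an unrecognized barcode are discarded. The barcode itself
    is trimmed so that only the cDNA-derived portion is kept.
    """
    by_rbp = defaultdict(list)
    for read in reads:
        barcode = read[:BARCODE_LENGTH]
        insert = read[BARCODE_LENGTH:]
        rbp = BARCODE_TO_RBP.get(barcode)
        if rbp is not None:
            by_rbp[rbp].append(insert)
    return dict(by_rbp)


# Hypothetical pooled reads.
pooled_reads = [
    "ACGTACGGATTTCCA",
    "TGCATGCCGTAAGGT",
    "ACGTACTTAGGCATC",
    "GGGGGGAAAAAAAAA",  # unknown barcode, discarded
]

libraries = deconvolute(pooled_reads)
for rbp, inserts in sorted(libraries.items()):
    print(rbp, inserts)
```

Each resulting per-RBP list could then be aligned and analyzed separately to obtain the interaction sites of the corresponding RBP. A practical implementation would likely also tolerate sequencing errors in the barcode, which this sketch does not attempt.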

D. Advanced ARTR-seq and spatial ARTR-seq for RNA modifications

[0624] In some aspects, any of the methods disclosed herein may be modified to determine RNA modification sites, either at sequence level or spatial level. In an aspect, suitable modification sites may comprise, consist of, or consist essentially of m⁶C, m⁵C, m¹A, m⁷G, or a pseudouridine modification. In an aspect, the method may comprise using a modification targeting agent instead of a RBP targeting agent in the ARTR-seq method. In some aspects, suitable modification targeting agents comprise, consist of, or consist essentially of an antibody or variant thereof, an oligonucleotide or variant thereof, a receptor, a ligand, a small molecule, an aptamer, or any combination thereof. In some aspects, the modification targeting agent specifically binds to a modification site. The method may be used with ARTR-seq, multiplex ARTR-seq or spatial ARTR-seq as provided herein, with suitable adjustments as will be known to a person of skill in the art with the disclosure herein.

[0625] Thus, in some aspects, the current disclosure encompasses a method of determining spatial distribution of a RNA modification site on a biological sample bound to a solid surface, comprising: a) contacting a modification-targeting agent that specifically binds the modification site on the RNA to form a primary complex; b) contacting the primary complex with a secondary binding agent that specifically binds the primary complex to form a secondary complex; c) incubating the primary complex or the secondary complex with the transcriptase composition to obtain cDNA; optionally incorporating labelled barcodes into the cDNA; and sequencing and imaging the biological sample using a single cell genomic imaging technique to determine the one or more modification sites.

[0626] In some aspects, the modification-targeting agent is an oligonucleotide, or a variant thereof, or a small molecule. In some aspects, the oligonucleotide comprises, consists essentially of, or consists of fluorescent NTPs, or a fluorescent probe. In some aspects, the modification-targeting agent is an antibody or a functional variant thereof. In some aspects, the antibody or the functional variant thereof comprises monoclonal antibodies, polyclonal antibodies, recombinant antibody, IgG, Fv, single chain antibody, single domain antibodies, nanobodies, diabodies, multispecific antibodies (e.g., bispecific antibodies), scFv, Fab, F(ab')2, Fab, or variants thereof. In some aspects, the modification targeting agent specifically binds to a modification comprising, consisting essentially of, or consisting of m6C, m5C, m1A, m7G, or a pseudouridine modification. In some aspects, the sequencing and imaging is done using a single cell genomic imaging technique as disclosed herein. In some aspects, the single cell genomic imaging technique comprises, consists essentially of, or consists of deterministic barcoding in tissue for spatial omics sequencing (DBiT-seq) comprising: ligating a first set and a second set of spatial barcodes to the cDNA of step (c), prior to step (d), wherein the first set of spatial barcodes are contacted to the cDNA horizontally using a first multi-channel microfluidic chip, and the second set of spatial barcodes are contacted to the solid surface vertically using a second multi-channel microfluidic chip.

IV. Kits

[0627] Certain aspects of the present disclosure also concern kits containing compositions of the disclosure and/or compositions to implement methods disclosed herein. In some aspects, the current disclosure encompasses a kit comprising a polypeptide construct as disclosed herein. In some aspects, the kit comprises a transcriptase composition as disclosed herein. In some aspects, the current disclosure encompasses a kit comprising in one or more suitable container(s), an RBP-targeting agent that specifically binds to an RBP, one or more secondary binding agents, a polypeptide construct as disclosed herein, and/or the transcriptase composition as disclosed herein.

[0628] In some aspects, disclosed are kits that can be used to prepare a sample for RBP-RNA binding site and/or RNA modification site identification. In some aspects, disclosed are kits that can be used to identify RBP-RNA binding sites via ARTR-seq, spatial ARTR-seq, multiplexed ARTR-seq and advanced ARTR-seq techniques for determining RNA modification sites.

[0629] The kit can optionally provide additional components that are useful in the procedure. These optional components include buffers, capture reagents, developing reagents, labels, reacting surfaces, means for detection, control samples, instructions, and interpretive information. In certain aspects, a kit contains, contains at least, or contains at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 100, 500, 1,000 or more probes, primers or primer sets, synthetic molecules or inhibitors, or any value or range and combination derivable therein. In some aspects, there are kits for evaluating RBP binding activity in cells.

[0630] Kits can comprise components, which can be individually packaged or placed in a container, such as a tube, bottle, vial, syringe, or other suitable container means.

[0631] Individual components can also be provided in a kit in concentrated amounts; in some aspects, a component is provided individually in the same concentration as it would be in a solution with other components. Concentrations of components can be provided as 1x, 2x, 5x, 10x, or 20x or more. In certain aspects, negative and/or positive control nucleic acids, probes, and inhibitors are included in some kit aspects.

[0632] Kits for using probes, synthetic nucleic acids, nonsynthetic nucleic acids, RBP targeting agents, and/or targeting moieties of the disclosure for prognostic or diagnostic applications are included as part of the disclosure. In certain aspects, negative and/or positive control nucleic acids, probes, and inhibitors are included in some kit aspects. In addition, a kit can include a sample that is a negative or positive control for RBP-RNA interactions.

[0633] Any aspect of the disclosure involving specific RBP, RNA, or other biomarker by name is contemplated also to cover aspects involving biomarkers whose sequences are at least 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99% identical to the mature sequence of the specified nucleic acid.
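
As a non-limiting illustration of a percent identity threshold such as those recited above, the following Python sketch computes identity between two sequences that have already been aligned to equal length. Several conventions for percent identity exist; the convention used here (matching positions divided by total alignment length, with gap positions counted as non-matching) is an assumption made for this example, as are the function name and the example sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment of equal-length strings.

    Gaps are written as '-' and never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned sequences differing at one of ten positions.
reference = "ACGUACGUAC"
candidate = "ACGUACGAAC"

identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identical")
print("meets 80% threshold:", identity >= 80)
```

In this example the two hypothetical sequences match at nine of ten aligned positions, giving 90% identity, which would satisfy an 80% threshold but not a 95% threshold.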

[0634] Detection Kits and Systems: One can recognize that based on the methods described herein, detection reagents, kits, and/or systems can be utilized to detect the biomarkers, including ST2, for diagnosing or prognosing an individual. The reagents can be combined into at least one of the established formats for kits and/or systems as known in the art. The kits could also contain other reagents, chemicals, buffers, enzymes, packages, containers, electronic hardware components, etc. The kits/systems could also contain packaged sets of PCR primers, oligonucleotides, arrays, beads, antibodies, or other detection reagents. Any number of probes could be implemented for a detection array. In some aspects, the detection reagents and/or the kits/systems are paired with chemiluminescent or fluorescent detection reagents. Particular aspects of kits/systems include the use of electronic hardware components, such as DNA chips or arrays, or microfluidic systems, for example. In specific aspects, the kit also comprises one or more therapeutic or prophylactic interventions in the event the individual is determined to be in need thereof.

[0635] It is contemplated that any method or composition described herein can be implemented with respect to any other method or composition described herein and that different aspects can be combined, and that these compositions may be packaged into kits and/or kits may be designed to facilitate these methods.

[0636] The claims originally filed are contemplated to cover claims that are multiply dependent on any filed claim or combination of filed claims.

V. Clinical and non-clinical applications

[0637] In some aspects, the current disclosure also encompasses methods of using the methods and/or compositions disclosed herein for use in clinical, non-clinical, and/or research use. The novel approach of the methods and products of the present disclosure utilizes now well established array and sequencing technology to yield single cell level information for RNA binding proteins, whilst retaining the positional information. It will be evident to the person of skill in the art that this represents a milestone in the life sciences. The new technology opens new avenues of research, which is likely to have profound consequences for our collective understanding of tissue development and tissue and cellular function in all multicellular organisms. It will be apparent that such techniques will be particularly useful in our understanding of the cause and progress of disease states and in developing effective treatments for such diseases, for example but not limited to, cancer. The methods of the disclosure will also find uses in the diagnosis of numerous medical conditions. These methods can be used to advance research of RBPs, translatome, transcriptome, and epitranscriptomic regulations. For example, over 150 distinct chemical modifications occur on the RNA molecules, impacting various aspects of gene expression, such as RNA decay and translation. These modifications also play critical roles in physiology and diseases. Notably, N⁶-methyladenosine (m⁶A) stands out as the most prevalent modification in mammalian mRNA and chromatin-associated RNA (caRNA), with close associations with disorders and cancers. In addition, m⁶A modification exhibits distinct tissue-specific distributions. Measuring transcriptome-wide m⁶A at single-cell and spatial resolution allows a deeper understanding of epitranscriptomic regulations in heterogeneous cell types within tissues.

[0638] In some aspects, the disclosed methods may also be used to develop diagnostic methods to detect a disease or a disorder, to study disease progression, or to study susceptibility of a subject to a disease or disorder.

Diseases or Disorders

[0639] In certain aspects, methods involve obtaining a sample from a subject with a disease or disorder. In some embodiments, compositions, methods, and/or kits described herein may be used in a method of preventing, treating, reducing the progression of, and/or reducing the risk of a disease or disorder, wherein the disease or disorder is a cancer and/or a neurodegenerative disease.

[0640] In some embodiments, the disease or disorder is a cancer. In some embodiments, the cancer is pancreatic cancer, breast cancer, kidney cancer, bladder cancer, prostate cancer, testicular cancer, urothelial cancer, endometrial cancer, ovarian cancer, cervical cancer, renal cancer, esophageal cancer, gastrointestinal stromal tumor (GIST), multiple myeloma, cancer of secretory cells, thyroid cancer, gastrointestinal carcinoma, chronic myeloid leukemia, hepatocellular carcinoma, colon cancer, melanoma, malignant glioma, glioblastoma, glioblastoma multiforme, astrocytoma, dysplastic gangliocytoma of the cerebellum, Ewing’s sarcoma, rhabdomyosarcoma, ependymoma, medulloblastoma, ductal adenocarcinoma, adenosquamous carcinoma, nephroblastoma, acinar cell carcinoma, neuroblastoma, or lung cancer. In some embodiments, the cancer of secretory cells is non-Hodgkin’s lymphoma, Burkitt’s lymphoma, chronic lymphocytic leukemia, monoclonal gammopathy of undetermined significance (MGUS), plasmacytoma, lymphoplasmacytic lymphoma or acute lymphoblastic leukemia.

[0641] In some embodiments, the disease or disorder is a neurological disorder. Neurological disorders are diseases of the body’s nervous system. Structural, biochemical or electrical abnormalities in the brain, spinal cord or other nerves can result in a range of symptoms. There are more than 600 diseases of the nervous system, such as epilepsy, dementias, Alzheimer’s disease and cerebrovascular diseases including stroke, multiple sclerosis, Parkinson’s disease, amyotrophic lateral sclerosis, migraine, neuroinfections, brain tumors and traumatic disorders of the nervous system such as brain trauma and autism.

EXAMPLES

[0642] The following examples are included to demonstrate certain aspects of the disclosure. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventors to function well in the practice of the disclosure, and thus can be considered to constitute certain modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific aspects which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the inventions described herein.

EXAMPLE 1 - Development of ARTR-seq

[0643] To overcome the limitations of existing methods, the inventors introduce an Assay of Reverse Transcription-based RBP binding sites Sequencing (ARTR-seq) to capture RBP-RNA interactions through in-situ reverse transcription (RT). The inventors demonstrated that ARTR-seq sensitively profiled the RNA targets of RBPs with good sequencing quality, using as few as 20 cells or even a single tissue section. Additionally, an imaging step was readily built into the ARTR-seq procedure, which provides direct spatial information of RBP-RNA interactions. With ARTR-seq, the inventors show distinct binding patterns of splicing factors and regulatory differences among the YTH family reader proteins of RNA N⁶-methyladenosine (m⁶A) modification. ARTR-seq unbiasedly detected RNA binding by RBPs in both cytoplasm and nucleus and measured the binding strength of RBP on different RNA substrates. Furthermore, the inventors show that ARTR-seq can be applied to monitor dynamic RNA targeting by G3BP1 during stress granule assembly on a small timescale of 10 minutes.

METHODS

Cell culture and stress treatment

[0644] In some instances, HeLa cells and HepG2 cells can be purchased from ATCC and cultured in DMEM medium (Gibco) supplemented with 10% fetal bovine serum (FBS, Gibco) and Penicillin-Streptomycin (Gibco). In some instances, K562 cells can be obtained from ATCC and cultured in RPMI 1640 Medium (Gibco) supplemented with 10% (v/v) fetal bovine serum, Penicillin-Streptomycin (Gibco), and 2 mM L-glutamine (Gibco). Cells can be grown at 37 °C with 5% CO2. For NaAsO2 treatment, HeLa cells can be grown to 90% confluence, and the medium can be replaced with pre-warmed DMEM medium containing 0.5 mM NaAsO2; the cells can then be maintained at 37 °C with 5% CO2 for the indicated times.

Expression and purification of recombinant protein A/G-reverse transcriptase

[0645] In some instances, the recombinant plasmids can be constructed by assembly of the pET28a vector, protein A/G (pAG), linkers with different lengths, and reverse transcriptase (RTase) or a modified RTase with NEBuilder® HiFi DNA Assembly Master Mix (NEB) following the manufacturer’s protocol. The protein A/G DNA segment can be amplified from pAG/MNase plasmid (Addgene, #123461). In some instances, the engineered MMLV RTase can be modified from pCMV-PE2 plasmid (Addgene, #132775). In other instances, the recombinant proteins were expressed in BL21(DE3) Competent E. coli (NEB) with IPTG induction at 16 °C for 18 h. In some instances, cells can be collected by centrifugation at 6000 rpm for 10 min, and lysed in a buffer of 50 mM Tris-HCl (pH 7.5), 300 mM NaCl and 1 mM PMSF with sonication at a 10 s on/10 s off setting for 10 min at 4 °C. In some instances, the recombinant proteins can be purified from the supernatant using a HisTrap HP column (GE Healthcare), followed by an ion exchange chromatography column (GE Healthcare) on an AKTA Purifier 10 system (GE Healthcare) according to the manufacturer’s protocol, and concentrated to about 20 mg/ml. The purified enzyme can be supplemented with 40% glycerol and stored at -80 °C for future use.

Quantitative reverse transcription-polymerase chain reaction (qRT-PCR)

[0646] In some instances, RNA can be reverse transcribed with purified protein A/G-reverse transcriptases (pAG-RTases) or commercial reverse transcriptases in reaction buffer (50 mM Tris-HCl, 150 mM NaCl, pH 7.5) at 37 °C for 15 min, and denatured at 85 °C for 5 min. Quantitative PCR can be performed with FastStart Essential DNA Green Master (Roche) on a LightCycler 96 System (Roche). The efficiency of reverse transcription (RT) can be quantified by the delta quantitation cycle (Cq) method.
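
As a rough illustration of the delta-Cq comparison, the sketch below converts the Cq difference between a test RTase and a reference RTase into a relative cDNA yield. It assumes perfect doubling per PCR cycle, which the text does not state; the function name and the Cq values are hypothetical.

```python
def relative_rt_efficiency(cq_test: float, cq_reference: float) -> float:
    """Relative cDNA yield of a test RTase versus a reference RTase.

    Assumes 100% PCR amplification efficiency (one doubling per cycle),
    so a Cq that is lower by one cycle means twice as much template.
    """
    delta_cq = cq_test - cq_reference
    return 2.0 ** (-delta_cq)


# Hypothetical Cq values, for illustration only.
print(relative_rt_efficiency(cq_test=22.0, cq_reference=24.0))  # 4.0
```

A value above 1 would indicate that the test enzyme produced more cDNA than the reference under these assumptions.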

Protein detection by Coomassie brilliant blue (CBB) stain and western blot

[0647] In some instances, the mammalian cell samples can be lysed with cold RIPA buffer (Thermo Fisher Scientific) containing 1× protease inhibitor cocktail (Roche). The cell lysate can be cleared by centrifugation at 15,000 g for 10 min at 4 °C. The supernatant or purified protein can then be mixed with LDS loading buffer (Bio-Rad) and boiled at 95 °C for 10 min. Denatured protein can be loaded into a 4-12% NuPAGE Bis-Tris gel (Thermo Fisher Scientific).

[0648] In some instances, for CBB stain, the gel can be stained with Imperial Protein Stain (Thermo Fisher Scientific) and detected by FluorChem R (ProteinSimple). For the western blot, the protein can be transferred from the gel to a PVDF membrane. The membranes can be blocked in 3% BSA (diluted in PBST) for 1 h at room temperature, incubated in a diluted primary antibody solution at 4 °C overnight, washed with PBST 4 times, and incubated in a dilution of secondary antibody conjugated to HRP for 1 h at room temperature. Protein bands can be imaged with the SuperSignal West Dura Extended Duration Substrate kit (Thermo Fisher Scientific) on the FluorChem R machine (ProteinSimple). Quantification can be performed using ImageJ software.

Transfection

[0649] In some instances, PTBP1 siRNA can be purchased from Horizon Discovery/Dharmacon. Cells can be seeded at 30% confluency one day before transfection. After 12 h, siRNA can be transfected with RNAiMAX (Thermo Fisher Scientific) following the manufacturer’s manual. The medium can then be replaced with fresh medium at 6 h after transfection. Cells can then be cultured for another 48 h, and the knockdown efficiency can be quantified by western blot.

ARTR-seq

[0650] In some instances, cells can be fixed to a chamber with 1.5% paraformaldehyde (PFA) at room temperature for 10 min. To mitigate cell loss, 1.5% PFA crosslinking can be applied instead of the commonly used 1% PFA crosslinking. Samples can then be quenched with 125 mM glycine at room temperature for 5 min, and permeabilized with 0.5% Triton X-100 on ice for 10 min. Samples can then be blocked with 1 mg/ml UltraPure BSA (Thermo Fisher Scientific) at room temperature for 30 min, stained with the primary antibody at room temperature for 1 h, and then stained with fluorophore-labeled secondary antibody at room temperature for 30 min, followed by incubation with pAG-RTase for an additional 30 min. For input samples, the primary antibody can be replaced by the DPBS buffer with 1 mg/ml UltraPure BSA. Cells can be washed with DPBS at least once, twice, thrice, or more after each staining step by shaking at room temperature for 3 min.

[0651] In some instances, a reverse transcription reaction mixture was prepared by mixing 2 µM adapter-RT primer (5'-AGACGTGTGCTCTTCCGATCTNNNNNNNNNN-3'), 0.05 mM biotin-16-dUTP (Jena Bioscience), 0.05 mM biotin-16-dCTP (Jena Bioscience), 0.05 mM dTTP (Thermo Fisher Scientific), 0.05 mM dCTP (Thermo Fisher Scientific), 0.1 mM dATP (Thermo Fisher Scientific), 0.1 mM dGTP (Thermo Fisher Scientific), and 1 U/µl RNaseOUT (Thermo Fisher Scientific) in 50 µl buffer of DPBS supplemented with 3 mM MgCl2. In-situ reverse transcription can be performed by immersing cells in the transcriptase mix and incubating at 37 °C for 30 min, and then stopped by adding 20 mM EDTA and 10 mM EGTA and incubating at room temperature for 3 min. Next, cells can be stained with biotin monoclonal antibody (BK-1/39) - Alexa Fluor 488 (Thermo Fisher Scientific) by incubation at room temperature for 1 h, followed by staining with 1 µg/mL Hoechst 33342 dye (Thermo Fisher Scientific) at room temperature for 15 min, and then imaged with a Leica SP8 laser confocal microscope. The fluorescence intensity distribution on a line can be quantified by ImageJ software. After imaging, cells can be digested with proteinase K (Thermo Fisher Scientific) at 37 °C for 2 h. The nucleic acids can be recovered by phenol-chloroform extraction and concentrated by ethanol precipitation. RNA can be digested with RNase H (NEB) and RNase A/T1 (Thermo Fisher Scientific) at 37 °C for 1 h, followed by enriching biotinylated cDNA using 10 µl pre-blocked Dynabeads MyOne Streptavidin C1 (Thermo Fisher Scientific) at room temperature for 20 min. The beads can be washed, and the 3' cDNA adapter (5'Phos-NNNNNNNNAGATCGGAAGAGCGTCGTGT-3'SpC3) (nucleic acid sequence as in SEQ ID NO: 26) can be ligated on-bead by T4 RNA ligase 1 (NEB) by incubating at 25 °C for 16 h. The beads can be washed again, and cDNA can be recovered with an elution buffer of 95% (v/v) formamide and 10 mM EDTA (pH 8.0) by boiling at 95 °C for 10 min, followed by ethanol precipitation.

[0652] The library can be obtained by PCR amplification with NGS sequencing primer and gel purification of size between 180 bp and 400 bp. Next-generation sequencing can be carried out either at the University of Chicago Single Cell Immunophenotyping Core on an Illumina NextSeq 550 machine or at the University of Chicago Genomics Facility on an Illumina NovaSeq 6000 platform.

Spatial ARTR-seq

[0653] In addition to identifying RBP binding sites, ARTR-seq can profile translation and RNA modifications. By targeting the RTase to ribosomes, ARTR-seq can identify ribosome binding sites. Additionally, ARTR-seq can capture RNA modification sites through in-situ reverse transcription.

[0654] By introducing single-cell or spatial barcodes, ARTR-seq can achieve single-cell or spatial resolution. These barcodes can be seamlessly incorporated either through the use of barcoded RT primers during the reverse transcription process or through ligation, as exemplified in SPLiT-seq⁷². They can subsequently be employed to assign single-cell identity or spatial localization during data analysis.

[0655] In spatial barcoding-based ARTR-seq, resolution can be fine-tuned by adjusting the density of barcode primers, allowing for cellular and/or subcellular resolution. Apart from the spatial barcoding strategy, in-situ sequencing, such as FISSEQ⁷³, can be applied in spatial ARTR-seq to achieve subcellular resolution.

[0656] Spatial ARTR-seq offers compatibility with imaging techniques, such as FISH or variations on FISH, microfluidics imaging techniques, or any other single-cell profiling techniques. This compatibility provides additional information alongside sequencing data, such as subcellular structure identification and/or cell stage determination.

RNase treatment in ARTR-seq

[0657] RNase treatment can be incorporated into the ARTR-seq procedure with the following adjustments. After permeabilization, cells can be incubated with 1 U/µl RNase I (Thermo Fisher Scientific) at 37 °C for at least 5 min, followed by at least one, two, or more washes with a buffer such as DPBS. For the samples with strong RNase treatment, an additional RNase I treatment can be conducted as described above before reverse transcription.

Dot blot

[0658] In some instances, after the proteinase K digestion step in ARTR-seq, the total nucleic acids can be recovered with Oligo Clean & Concentrator Kits (Zymo) to remove free biotinylated dNTPs. The concentration of nucleic acids can be measured with a Nanodrop 8000 Spectrophotometer and adjusted to 50 ng/µL. Next, 1 µL of nucleic acids can be loaded onto the Amersham Hybond-N+ membrane (GE Healthcare). Membranes can be air-dried and crosslinked with an ultraviolet (UV) Stratalinker 2400 at 150 mJ/cm² twice. The membranes can then be blocked in 5% fatty-acid-free BSA in PBST (PBS with 0.1% Tween-20) at room temperature for 1 h, followed by incubation in streptavidin-HRP (Thermo Fisher Scientific) in PBST supplemented with 5% fatty-acid-free BSA at room temperature for another 1 h. The membrane can be washed with PBST four times before being imaged with the SuperSignal West Dura Extended Duration Substrate kit (Thermo Fisher Scientific) on the FluorChem R machine (ProteinSimple).

ARTR-seq in the mouse embryo

[0659] In some instances, C57 mouse embryo (E11) frozen tissue sections can be purchased from Zyagen. The slide with frozen tissue sections can be brought to room temperature for a 10-minute incubation. A PAP pen can be used to draw a circle around the mouse tissue on the slide, providing a thin film-like hydrophobic barrier for reagent incubation. Then the tissue can be subjected to typical ARTR-seq procedures.

ARTR-seq with low input

[0660] In some instances, ARTR-seq can be applied to 20 to 5,000 HepG2 cells with the following changes. 4% PFA can be employed to minimize cell loss for low input samples. A 2 µM adapter-barcode-RT primer (5'-AGACGTGTGCTCTTCCGATCT-8-nt barcode-NNNNNNNNNN-3') (together as in SEQ ID NO: 25) can be applied for in-situ reverse transcription. After proteinase K digestion, two biological replicates can be pooled together for biotinylated cDNA enrichment, adapter ligation, library amplification and library sequencing. Sequence data can be separated based on the 8-nt barcode in the RT primers.
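
The barcode-based separation of pooled samples can be sketched as follows. This is only an illustrative outline, not the in-house script: it assumes the 8-nt barcode sits at a fixed offset at the start of the read, and the barcode sequences and sample names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_start=0, barcode_len=8):
    """Group (name, sequence) reads by the barcode found at a fixed position.

    Reads whose barcode is not in barcode_to_sample go to "unassigned".
    Only the sequence downstream of the barcode is kept.
    """
    by_sample = defaultdict(list)
    barcode_end = barcode_start + barcode_len
    for name, seq in reads:
        barcode = seq[barcode_start:barcode_end]
        sample = barcode_to_sample.get(barcode, "unassigned")
        by_sample[sample].append((name, seq[barcode_end:]))
    return dict(by_sample)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGTACGT": "replicate_1", "TTGGCCAA": "replicate_2"}
reads = [
    ("read1", "ACGTACGTGGGGAAAA"),
    ("read2", "TTGGCCAACCCCTTTT"),
    ("read3", "NNNNNNNNACACACAC"),
]
groups = demultiplex(reads, barcodes)
print(sorted(groups))
```

Exact matching is used here for simplicity; a real script might tolerate one mismatch in the barcode.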

Genome reference

[0661] The genome and the corresponding reference of Homo sapiens (GRCh38.p13, GENCODE Release 39), Mus musculus (GRCm39, GENCODE Release M29), and Drosophila melanogaster (BDGP6.32, Ensembl Release 107) can be used for mapping the sequencing reads in this study. rRNA reference sequences can be downloaded from NCBI for H. sapiens (NR_003285.3, NR_003286.4, NR_003287.4, NR_023363.1) and M. musculus (NR_003278.3, NR_003279.1, NR_003280.2, NR_046156.1), and from FlyBase for D. melanogaster (5SrRNA-CR33353, 18SrRNA-CR45841, 5.8SrRNA-CR45842, 28SrRNA-CR4584).

ARTR-seq primary data processing

[0662] In some instances, reads from the small cell number libraries containing cell barcodes can first be demultiplexed with an in-house script using read 2. The adaptor sequences can be trimmed with Cutadapt⁵² (v4.2) using the parameters cutadapt --nextseq-trim=20 -a AGATCGGAAGAGCACACGTCTGAACTCCAG (SEQ ID NO: 79); the 8-nt UMI sequences can be moved and added to the read name for subsequent deduplication. An extra 4 nts at the 3' end of each read can be removed from the adapter-free sequence to minimize mapping mismatches caused by imperfect pairing of the random primer.
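
The UMI handling and 3' trimming described above might look like the following sketch for a single adapter-trimmed read. It assumes, without confirmation from the text, that the 8-nt UMI occupies the first 8 positions of the read and that the UMI is appended to the read name after an underscore; the function name and the example read are hypothetical.

```python
def extract_umi_and_trim(name, seq, qual, umi_len=8, trim_3p=4):
    """Move the leading UMI into the read name and clip the 3' end.

    Returns (new_name, trimmed_sequence, trimmed_quality).
    Assumes the UMI is the first umi_len bases of the read.
    """
    umi = seq[:umi_len]
    # Never let the 3' clip run past the end of the UMI on short reads.
    end = max(len(seq) - trim_3p, umi_len)
    return f"{name}_{umi}", seq[umi_len:end], qual[umi_len:end]


# Hypothetical read: 8-nt UMI followed by a 20-nt insert.
read_seq = "AACCGGTT" + "ACGTACGTACGTACGTACGT"
read_qual = "I" * len(read_seq)
new_name, new_seq, new_qual = extract_umi_and_trim("read1", read_seq, read_qual)
print(new_name, new_seq)
```

Keeping the sequence and quality strings the same length matters because downstream aligners reject FASTQ records where they differ.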

[0663] In some instances, the reads can first be mapped to the corresponding rRNA sequences using Bowtie2⁵³ (v2.4.4) with the parameter --seedlen=15, and the mapped reads can be discarded to avoid rRNA contamination. The remaining unmapped reads can be mapped to the corresponding genome using STAR⁵⁴ (v2.7.9a) with parameters: --readFilesCommand zcat --alignEndsType EndToEnd --genomeLoad NoSharedMemory --quantMode TranscriptomeSAM --alignMatesGapMax 15000 --outFilterMultimapNmax 1 --outFilterMultimapScoreRange 1 --outSAMprimaryFlag AllBestScore --outSAMattributes All --outSAMtype BAM SortedByCoordinate --outFilterType BySJout --outReadsUnmapped Fastx --outFilterScoreMin 10 --outFilterMatchNmin 24. Uniquely mapped reads can be deduplicated to get the usable reads using UMI-tools⁵⁵ (v1.1.2) with the parameter --method unique. The usable reads can be assigned to genomic regions with RNASeQC⁵⁶ (v2.4.2) using default parameters. Deduplicated reads can be assigned to genes with featureCounts⁵⁷ (v2.0.3) for the calculation of Pearson’s correlation coefficient. For visualization in IGV⁵⁸ (v2.13.1), BAM files of the usable reads can be converted to bigWig with bamCoverage in the deepTools suite⁵⁹ (v3.5.1) with normalization by the respective sequencing depth using the parameters --normalizeUsing BPM --binSize 1. All the sample tracks can be set to the same scale for display, unless otherwise noted in the legend.

Peak Calling

[0664] In some instances, for peak calling, the usable reads in one library can first be split into two SAM files containing reads aligned to the positive and negative strands, respectively. macs3⁶⁰ can be used to identify peaks with default parameters, except for adding ‘--keep-dup all --nomodel --extsize 30’. The peaks located on the two strands can be called separately using the corresponding strand reads in the input libraries as background. The two peak files for one library can later be combined. To generate the consensus motif for peaks, each peak can first be extended by 20 nts both upstream and downstream, and the overrepresented sequences can be generated using findMotifsGenome.pl in the HOMER suite⁶¹ (v4.11) with parameters: -rna -S 10 -len 5,6,7,8,9. Specifically, for motif generation for peaks in mouse tissue, the peak genomic coordinates can be converted from mm39 to mm10 using liftOver from the UCSC Genome Browser⁶². Peaks can be assigned to specific genomic regions with in-house scripts, and the peaks overlapping two genomic regions can be assigned to the region with the longer overlap. The peaks from the reader YTHDC1 can be further assigned to repeats and other regions with annotatePeaks.pl in the HOMER suite.

Subsampling

[0665] In some instances, to calculate the percentage of usable reads at different sequencing depths, the uniquely mapped reads can be subsampled with samtools view in the Samtools suite⁶³ (v1.16.1). For the comparison between small cell number input libraries for different methods, the sizes of all libraries can be reduced to that of the smallest library. Specifically, instead of directly subsampling the fastq files, the usable reads can be subsampled to match the usable reads percentage of each library.

Alternative splicing identification

[0666] The differential alternative splicing events of each gene can be identified using rMATS (v4.1.2). The BAM files of the RBP-knockdown RNA-seq libraries and of the corresponding control libraries, with the annotation of ENCODE4 v1.2.1 GRCh38 V29, can be downloaded from ENCODE and analyzed by rMATS for the identification of five alternative splicing modes, including SE (skipped exon), MXE (mutually exclusive exons), A3SS (alternative 3' splice site), A5SS (alternative 5' splice site) and RI (retained intron). Events with FDR ≥ 0.05 can be discarded from the subsequent analysis.

ARTR-seq enrichment level at the gene level

[0667] In some instances, to calculate the ARTR-seq enrichment at the gene level, the reads in each library can be divided into two groups according to whether they fall within one specific gene, yielding a pair of in/out read counts for each of the IP and Input libraries. For each gene, two-by-two tables for all the combinations of in/out read counts between IP and Input libraries can be generated. The ARTR-seq enrichment for a gene can be defined as the common odds ratio of the tables, with significance determined by the Cochran-Mantel-Haenszel Chi-Squared test.
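
A minimal sketch of the common odds ratio for one gene is given below, using the standard Mantel-Haenszel estimator over every IP/Input library pairing. The function name and read counts are hypothetical, and the significance test is not reproduced here (in R, `mantelhaen.test` provides it).

```python
from itertools import product


def mantel_haenszel_odds_ratio(ip_counts, input_counts):
    """Common odds ratio across all IP x Input two-by-two tables.

    ip_counts and input_counts are lists of
    (reads_in_gene, reads_outside_gene), one tuple per library.
    Each IP/Input pairing forms one stratum:
        a = IP in,    b = IP out
        c = Input in, d = Input out
    """
    numerator = 0.0
    denominator = 0.0
    for (a, b), (c, d) in product(ip_counts, input_counts):
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical counts for two IP and two Input replicates.
ip = [(200, 9800), (180, 9820)]
inp = [(50, 9950), (60, 9940)]
print(mantel_haenszel_odds_ratio(ip, inp))
```

An odds ratio above 1 indicates that reads in the gene are over-represented in the IP libraries relative to Input.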

Data visualization and statistical analysis

[0668] Read heatmaps and profiles were generated with plotHeatmap and plotProfile in the deepTools suite⁵⁹ (v3.5.1). The splicing regulatory maps of splicing factors were generated by RBP-Maps⁶⁴ with default parameters, and the coordinates of native cassette exons and constitutive exons were downloaded from the software’s GitHub repository. The random regions of the same length as the m⁶A reader protein binding peaks were generated by bedtools shuffle in the BEDTools suite⁶⁵ (v2.30.0).

[0669] The meta-distributions of binding peaks were generated by the R package Guitar⁶⁶. All statistical analyses were performed with R⁶⁷, and all the plots were generated by the R package ggplot2⁶⁸.

Quantification of ARTR-seq signal at the gene level

[0670] In some instances, to analyze G3BP1 binding strength at the gene level, ARTR-seq reads can be counted for genes in both G3BP1 and paired input samples, and fold changes and significance between G3BP1 and input can be determined by DESeq2⁶⁹. Only genes with a read sum equal to or greater than 10 for G3BP1 and input samples can be considered. RNA targets of G3BP1 can be defined as those with fold change ≥ 2 and p-value < 0.05.
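
The target definition can be expressed as a simple filter, sketched below. It assumes the thresholds are a combined G3BP1 plus input read sum of at least 10, a fold change of at least 2 (log2 fold change of at least 1), and a p-value below 0.05; the record field names and example genes are hypothetical.

```python
def select_targets(genes, min_reads=10, min_fold_change=2.0, max_p=0.05):
    """Return the names of genes that pass the read, fold-change and p-value cutoffs.

    Each gene is a dict with hypothetical keys:
    gene, ip_reads, input_reads, log2_fold_change, p_value.
    """
    targets = []
    for g in genes:
        # Assumption: the read sum is taken over IP and input together.
        if g["ip_reads"] + g["input_reads"] < min_reads:
            continue
        fold_change = 2.0 ** g["log2_fold_change"]
        if fold_change >= min_fold_change and g["p_value"] < max_p:
            targets.append(g["gene"])
    return targets


# Hypothetical records, for illustration only.
example_genes = [
    {"gene": "geneA", "ip_reads": 80, "input_reads": 20, "log2_fold_change": 2.0, "p_value": 0.001},
    {"gene": "geneB", "ip_reads": 6, "input_reads": 2, "log2_fold_change": 1.5, "p_value": 0.01},
    {"gene": "geneC", "ip_reads": 40, "input_reads": 35, "log2_fold_change": 0.2, "p_value": 0.6},
    {"gene": "geneD", "ip_reads": 30, "input_reads": 15, "log2_fold_change": 1.0, "p_value": 0.04},
]
print(select_targets(example_genes))
```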

Clustering analysis of G3BP1 ARTR-seq signal

[0671] To track the changing pattern of the G3BP1 binding signal during stress granule assembly, the log2 fold change (G3BP1/input) of genes can be used to represent the G3BP1 binding signal, and fuzzy c-means clustering analysis can be performed on log2FC by the Mfuzz package⁷⁰ (v2.54.0). Only genes within the top 50% by standard deviation (SD) of log2FC can be considered, and the log2FC values can be scaled by z score before clustering. The cluster number can be determined by the ‘Dmin’ function in the Mfuzz package. Clustering can be calculated by the ‘mfuzz’ function in the Mfuzz package with 10,000 iterations, with Euclidean distance as the distance metric.
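
The pre-clustering steps (keeping the most variable half of the genes and z-scoring each gene across time points) can be sketched as follows. The clustering itself is left to Mfuzz and is not reimplemented; the use of the sample standard deviation, the function name and the example values are assumptions for illustration.

```python
from statistics import mean, stdev


def top_variable_zscored(log2fc_by_gene, top_fraction=0.5):
    """Keep the genes with the largest SD of log2FC and z-score each one.

    log2fc_by_gene maps a gene name to its log2FC values across time points.
    Genes with zero SD are skipped because they cannot be scaled.
    """
    sds = {gene: stdev(values) for gene, values in log2fc_by_gene.items()}
    n_keep = max(1, int(len(sds) * top_fraction))
    kept = sorted(sds, key=sds.get, reverse=True)[:n_keep]
    scaled = {}
    for gene in kept:
        if sds[gene] == 0:
            continue
        mu = mean(log2fc_by_gene[gene])
        scaled[gene] = [(x - mu) / sds[gene] for x in log2fc_by_gene[gene]]
    return scaled


# Hypothetical log2FC values at three time points.
example = {
    "geneA": [0.0, 1.0, 2.0],
    "geneB": [1.0, 1.1, 1.2],
    "geneC": [0.0, 2.0, 4.0],
    "geneD": [5.0, 5.0, 5.0],
}
scaled = top_variable_zscored(example)
print(sorted(scaled))
```

After scaling, each retained gene has mean 0 and SD 1, so clustering reflects the shape of the time course rather than its magnitude.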

Functional enrichment analysis

In some instances, KEGG enrichment analysis can be performed to compare G3BP1 RNA targets at different time points using the ‘compareCluster’ function in the clusterProfiler package⁷¹ (v4.4.4). The KEGG terms with adjusted p values less than 0.05 can be visualized.

EXAMPLE 2: Strategy and Development of ARTR-seq

[0672] ARTR-seq relies on in-situ RT to capture binding sites of specific RBPs. In designing ARTR-seq, the inventors started with formaldehyde fixation to rapidly freeze and preserve the cellular structure, followed by permeabilization of cell membranes to facilitate subsequent processing (FIG. 1A-I). The inventors then targeted the reverse transcriptase (RTase) to the RBP of interest with the guidance of specific antibodies (FIG. 1A-II). The inventors first delivered the primary antibody to bind the RBP through antigen-antibody interaction (FIG. 1A-II1). Then, the inventors incubated cells with a secondary antibody that can efficiently bind the fragment crystallizable (Fc) region of the primary antibody (FIG. 1A-II2). As multiple secondary antibodies could bind to a single primary antibody, the incorporation of the secondary antibody increased the local antibody concentration around the targeted RBP. The inventors next incubated cells with a fusion protein of protein A/G and reverse transcriptase (pAG-RTase); the specific binding of protein A/G (pAG) to the Fc regions on both primary and secondary antibodies would allow site-specific delivery of the tethered RTase to the target RBP (FIG. 1A-II3). pAG can interact with various types of antibodies and be easily expressed in bacterial systems with high yield, making it an ideal choice for fusion with RTase. After each delivery of the primary antibody, the secondary antibody, and the pAG-RTase, the inventors conducted multiple wash steps to remove any unbound antibodies or pAG-RTase.

[0673] After localizing RTase to the RBP, the inventors initiated in-situ RT at RBP binding sites by the addition of primers, dNTPs and other components (FIG. 1A-III). To achieve efficient reverse transcription, the inventors screened three commonly used RTases, including engineered Moloney murine leukemia virus (MMLV) RTase (H8Y, D200N, T306K, W313F, T330P, D524G, L603W)²⁴, ²⁵, human immunodeficiency virus (HIV) RTase, and a truncated version of engineered MMLV RTase (25-497) in the pAG-RTase fusion constructs with a linker length of 30 amino acids (FIGs. 7A-B). The RNase H domain and the first 24 N-terminal residues were omitted in MMLV RTase (25-497) to improve its reactivity. Quantitative reverse transcription-polymerase chain reaction (qRT-PCR) was used to evaluate the properties of these pAG-RTase constructs. The inventors found pAG-MMLV RTase (25-497) exhibited the highest RT activity among the three and used this fusion construct for subsequent studies (FIG. 1B and FIG. 7C).

[0674] To identify all RBP binding sites without sequence bias, the inventors next applied random reverse transcription primers with an adapter tagged at their 5' ends for library construction. However, the commonly used random 6-mer primer, when tagged with the adapter, presented a noticeable reduction in reverse transcription efficiency. The inventors therefore increased the primer length to 10 nt (FIG. 7D). Moreover, for effective enrichment of cDNAs produced in ARTR-seq, the inventors tested biotinylated dNTPs that could be incorporated into the final cDNA products. After screening five commercially available biotinylated dNTPs, the inventors found that biotin-16-dUTP and biotin-16-dCTP exhibited the least hindrance on RT efficiency (FIG. 7E). The inventors proceeded with including biotin-16-dUTP and biotin-16-dCTP, in a 1:1 ratio with regular dTTP and dCTP, respectively, in the current ARTR-seq protocol. The inventors could enrich the biotinylated cDNAs with the streptavidin beads, and perform 3' end adapter ligation of cDNAs, library amplification and high-throughput sequencing to acquire the binding profile of the RBP of interest (FIG. 1A-IV). Note that after in-situ reverse transcription, IF imaging could be performed to reveal subcellular localization of RBPs without disturbing the subsequent library construction if the secondary antibody and pAG-RTase delivered to the RBP are fluorophore-modified.

EXAMPLE 3: Validation of ARTR-seq

[0675] To evaluate ARTR-seq in capturing binding sites of RBPs, the inventors applied ARTR-seq to a well-known RBP, PTBP1. PTBP1 is a splicing factor with a variety of published CLIP-seq datasets that can be readily utilized for comparison. To confirm the production of biotinylated cDNA from in-situ RT, the inventors monitored the biotin group in the cDNA product by dot blot. cDNA biotinylation was mostly abolished with the omission of biotin-dNTP, pAG-RTase, or primary antibody, confirming that each of these components is needed for successful cDNA synthesis (FIG. 1C). With further IF staining of biotinylated cDNA, whose signal largely disappeared upon exclusion of the primary antibody, the inventors also confirmed the colocalization of pAG-RTase, the secondary antibody and newly synthesized cDNA, supporting the localized RT reaction performed by pAG-RTase tethered to the RBP of interest (FIG. 1D and FIG. 7F). Note that the utilization of the secondary antibody led to an increased overall yield of biotinylated cDNA (FIG. 1D and FIGs. 7F and 7G).

[0676] Altogether, the inventors showed that ARTR-seq can specifically and effectively reverse transcribe RNAs nearby the targeted protein into biotinylated cDNA products.

[0677] The inventors next proceeded to test ARTR-seq on PTBP1 in 40,000 HepG2 and HeLa cells, respectively. The inventors compared ARTR-seq results with the published data from several known methods, namely CLIP, iCLIP, irCLIP, eCLIP, sCLIP, tRIP, LACE-seq and RT&Tag. By counting the usable reads, defined as reads that uniquely mapped to the genome and remained after PCR deduplication, the inventors observed that ARTR-seq displayed a comparable or higher percentage of usable reads compared to all published methods, suggesting a high complexity of the ARTR-seq libraries (FIGs. 8A-8B). Then, the inventors calculated the correlation between biological replicates based on usable reads per gene normalized to coverage (reads per million reads mapped, RPM), and observed a high correlation (R = 0.98 for both HepG2 and HeLa samples), indicating good reproducibility of ARTR-seq (FIG. 2A).
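
The replicate comparison can be illustrated with the short sketch below, which normalizes per-gene usable read counts to RPM and computes Pearson's correlation coefficient. Whether the values were log-transformed before correlation is not stated in the text, so no transformation is applied; the function names and counts are hypothetical.

```python
import math


def rpm(counts):
    """Normalize per-gene read counts to reads per million mapped reads."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical usable read counts per gene in two replicates.
rep1 = {"geneA": 120, "geneB": 30, "geneC": 850}
rep2 = {"geneA": 100, "geneB": 45, "geneC": 800}
genes = sorted(rep1)
rpm1, rpm2 = rpm(rep1), rpm(rep2)
print(pearson_r([rpm1[g] for g in genes], [rpm2[g] for g in genes]))
```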

[0678] Further, the inventors introduced input samples prepared by ARTR-seq with the omission of the primary antibody as controls to help filter out potential background signals caused by the non-specific binding of RTase (FIG. 8C). In the case of PTBP1, the inventors found that over 70% of usable reads and over 80% of ARTR-seq peaks were annotated to introns, with the majority of exon peaks located within the 3' untranslated region (3' UTR), consistent with results obtained using other methods¹⁰, ¹², ¹³, ²⁷⁻³⁰ (FIG. 2B and FIGs. 8D-8E). The consensus motif of PTBP1 ARTR-seq peaks was identified as the canonical CU-rich sequence known previously³¹ (FIG. 2B). At the genomic scale, the inventors plotted read distribution around the published eCLIP peaks³². ARTR-seq reads for PTBP1 were well aligned at the eCLIP peaks, while the input sample did not show such accumulation (FIGs. 9A-B). Additionally, the inventors observed that over 50% of genes identified by ARTR-seq overlapped with those targeted by other methods (52% for eCLIP, 51% for LACE-seq, and 82% for iCLIP). At the peak level, ARTR-seq successfully identified 41% of eCLIP-targeted peaks (FIG. 9C). Examination of individual PTBP1 binding sites revealed similar read distribution and density between ARTR-seq and eCLIP or iCLIP results (FIG. 2C and FIG. 9D). To further validate PTBP1 binding captured by ARTR-seq, the inventors knocked down PTBP1 in HepG2 cells using two distinct siRNAs and performed ARTR-seq (FIG. 8E). The reads located around the ARTR-seq peaks were reduced accordingly upon PTBP1 knockdown, indicating the high specificity of ARTR-seq (FIG. 2D).

Direct versus indirect binding sites detected by ARTR-seq

[0679] ARTR-seq identifies RBP binding by in-situ RT, which enables the capture of RNAs directly bound by the RBP (direct targets) or potentially those spatially close to the RBP (indirect targets) (FIG. 10A). To evaluate direct versus indirect targets, the inventors employed the splicing factor RBFOX2 as an example; RBFOX2 possesses a well-defined canonical binding motif ‘UGCAUG’. Peaks close to the ‘UGCAUG’ motifs likely represent direct targets, while those farther away have an increasing possibility of being indirect targets. The inventors calculated the distances from the peak center to the nearest ‘UGCAUG’ sequence, and observed that over 70% of ARTR-seq peaks were within 500 nucleotides (nts) of ‘UGCAUG’. This peak percentage of ARTR-seq is slightly higher than that of eCLIP⁹. The two methods are comparable when the distance is set to 200 nts (FIG. 10B). It is important to note that RBFOX2 can have other non-canonical binding sites beyond the ‘UGCAUG’ motif, as suggested by the similar ratio of RBFOX2 eCLIP peaks distant from this motif. Additionally, setting more stringent signal values and q-value cutoffs for peaks increased confidence in identifying the direct targets, albeit at the expense of target numbers (FIGs. 10C-10D). Furthermore, taking advantage of single-nucleotide-resolution m⁶A sequencing results offered by m⁶A-SAC-seq⁴², the inventors also examined YTHDF2, an m⁶A binding protein. The inventors observed that about 80% of YTHDF2 ARTR-seq peaks were within 300 nts of individual m⁶A sites, comparable to that from the PAR-CLIP method³⁷ (FIG. 10E). These results all indicate that the indirect interactions captured in ARTR-seq are likely limited. The ratios of direct targets identified by ARTR-seq are comparable to those observed in CLIP-based methods.
[0680] To further interrogate potential indirect targets identified in ARTR-seq, the inventors limited the movement range of RTase by shortening the linker in pAG-RTase and omitting the secondary antibody (FIGs. 11A-11C). The inventors found that shorter linkers reduced the RT activity of pAG-RTase, implying that shorter linkers might slow the kinetics of RTase (FIG. 11D). In RBFOX2 ARTR-seq, shortening the linker or omitting the secondary antibody resulted in decreased biotinylated cDNA yield but slightly increased read accumulation at RBFOX2 ARTR-seq peaks, indicating reduced RT efficiency and concentrated ARTR-seq signals (FIGs. 11E-11G). Moreover, by calculating the distances from the peak center to the nearest ‘UGCAUG’, the inventors observed a slightly higher percentage (1.9% - 3.4%) of peaks within 500 nts of ‘UGCAUG’ when a shorter linker or the omission of the secondary antibody was applied (FIG. 11H). These findings indicate that restricting the RTase movement range as tested here only moderately reduced potential indirect RNA capture by ARTR-seq. Maintaining high RT efficiency is another factor that needs to be considered when designing optimal linkers.
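
The peak-center-to-motif distance analysis used in the two paragraphs above can be sketched as follows. This is an illustrative example only: it works on a single sequence string with 0-based positions, measures distance to the nearest edge of a motif occurrence, and uses hypothetical function names; the disclosure does not specify these details.

```python
import re
from bisect import bisect_left

def motif_starts(seq, motif="UGCAUG"):
    """Sorted 0-based starts of all (overlapping) motif occurrences.
    DNA input is accepted by converting T to U (assumption)."""
    seq = seq.upper().replace("T", "U")
    return [m.start() for m in re.finditer(f"(?={motif})", seq)]

def nearest_motif_distance(peak_center, starts, motif_len=6):
    """Distance in nt from a peak center to the nearest motif occurrence.
    Returns 0 if the center lies inside a motif, None if there is no motif."""
    if not starts:
        return None
    i = bisect_left(starts, peak_center)
    best = None
    # Only the occurrences just before and just after the center can be nearest.
    for j in (i - 1, i):
        if 0 <= j < len(starts):
            start, end = starts[j], starts[j] + motif_len - 1
            if start <= peak_center <= end:
                d = 0
            else:
                d = min(abs(peak_center - start), abs(peak_center - end))
            best = d if best is None else min(best, d)
    return best

def fraction_within(peak_centers, starts, window=500):
    """Fraction of peak centers lying within `window` nt of a motif."""
    if not peak_centers:
        return 0.0
    dists = (nearest_motif_distance(c, starts) for c in peak_centers)
    return sum(1 for d in dists if d is not None and d <= window) / len(peak_centers)
```

Calling `fraction_within` with `window=500` or `window=200` corresponds to the two distance cutoffs discussed above; a genome-wide version would repeat this per chromosome or transcript and respect strand.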

Resolution of ARTR-seq

[0681] To assess the resolution of ARTR-seq, the inventors examined the distribution of RBFOX2 peak centers around ‘UGCAUG’ sites, and observed a clear enrichment with the majority of peaks positioned within 200 nts flanking the ‘UGCAUG’ motif (FIG. 12A). Furthermore, the inventors conducted a parallel analysis on YTHDF2. Compared to RBFOX2, the inventors observed a similar but more enriched distribution of YTHDF2 around the corresponding m⁶A sites, further supporting ARTR-seq as a method that can capture direct targets and binding sites of RBPs (FIG. 12B).

[0682] In an attempt to improve the resolution of binding site identification by ARTR-seq, the inventors evaluated the impact of RNase treatment in RBFOX2 ARTR-seq. As expected, the stronger RNase treatment reduced the library fragment lengths (FIG. 13A). The inventors observed that the stronger RNase treatment led to a sharper enrichment of RBFOX2 ARTR-seq peaks around ‘UGCAUG’ sites, indicating an improved resolution of ARTR-seq upon RNase treatment (FIG. 13B). The inventors quantified RT efficiency through qPCR of biotinylated cDNA, and found that samples with the stronger RNase treatment exhibited lower RT efficiency (FIG. 13C). By calculating the distances from the peak center to the nearest ‘UGCAUG’, the inventors observed that the stronger RNase treatment resulted in a clearly decreased proportion of peaks located within 500 nts of the canonical ‘UGCAUG’ motif. This observation suggested that the application of RNase can reduce reads from direct target transcripts, thereby potentially elevating the ratio of non-specific or indirect-binding signals (FIG. 13D). Overall, the studies revealed that RNase treatment could improve the resolution of ARTR-seq. The strength of RNase treatment in ARTR-seq needs to be optimized to achieve the desired balance between resolution and sensitivity, especially for samples with limited starting materials.

ARTR-seq specifically detected PTBP1 binding sites with as few as 20 cells

[0683] The in-situ RT-based ARTR-seq bypasses the IP step to minimize sample loss, potentially making it feasible for low cell number samples. To test this, the inventors generated libraries for PTBP1 using different numbers of HepG2 cells and compared the results with published data from LACE-seq and RT&Tag of low cell number samples^{13, 22}. The inventors found that the correlations at the gene level remained strong for ARTR-seq libraries prepared from as few as 20 cells (FIG. 14A). Additionally, ARTR-seq libraries exhibited a much higher percentage of usable reads compared to other methods when using comparable numbers of cells (FIG. 2E and FIGs. 14B-14C). Furthermore, ARTR-seq presented a consistently high percentage of intronic reads for PTBP1, suggesting its effectiveness in capturing informative reads even with limited starting material (FIG. 14D). The inventors further subsampled the libraries from different numbers of cells to an equal sequencing depth and examined their read distribution at peaks identified in the corresponding bulk samples. Compared to LACE-seq, ARTR-seq exhibited a clearer accumulation at the center of peaks with a higher proportion of effective reads (FIG. 2F and FIG. 14E). The visible ARTR-seq signal remained stable for libraries with different numbers of cells, as exemplified in the IGV plot (FIG. 2G).

[0684] Because PTBP1 binds to a canonical CU-rich sequence, the inventors compared the CT percentages in usable reads of PTBP1 libraries constructed by different methods. The inventors found that all the ARTR-seq libraries showed comparable or higher CT percentages compared to those of CLIP, iCLIP, eCLIP, irCLIP or LACE-seq^{10, 13, 27-29} (FIG. 2H). The inventors further assessed the read distribution around CU-rich regions and observed that the stable read accumulation in ARTR-seq libraries of all cell numbers peaked at the center of the regions (FIG. 2I). Taken together, ARTR-seq can effectively and specifically capture the RBP binding sites even with limited starting materials.
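
A minimal sketch of the CT-percentage comparison described above is given here. It assumes the metric is the pooled share of C and T (or U) bases across all usable reads of a library; the disclosure does not define the metric more precisely, so the per-base pooling and the function name are illustrative assumptions.

```python
def ct_percentage(reads):
    """Percentage of bases that are C or T/U, pooled over all reads.
    Assumption: pooled per-base fraction, not a per-read average."""
    total = sum(len(r) for r in reads)
    if total == 0:
        return 0.0
    ct = 0
    for r in reads:
        r = r.upper()
        ct += r.count("C") + r.count("T") + r.count("U")
    return 100.0 * ct / total
```

Comparing this value between libraries (for example, ARTR-seq libraries from different cell numbers versus published CLIP-type data) mirrors the comparison summarized in FIG. 2H.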

Application of ARTR-seq in mouse embryo sections

[0685] RBPs can have strong tissue-specific expression, or are only expressed in certain tissues rather than cultured cells. The identification of RBP binding sites in tissues is still technically challenging³³. IP-based methods can require dissociating tissues into single cells to allow UV-crosslinking, which limits their application to whole tissues, particularly embedded frozen tissues or formalin-fixed tissues. Editing-based methods can require genetic modification and cannot be applied to patient tissues.

[0686] ARTR-seq offers an opportunity for identification of RBP binding sites from tissues. The inventors studied the splicing factor RBFOX2 with a section of OCT-embedded E11 mouse embryo to validate the feasibility of ARTR-seq in tissue samples (FIG. 3A). The inventors first confirmed that RBFOX2 was predominantly localized in the nucleus of mouse embryo cells with the IF imaging built into the ARTR-seq procedure (FIG. 3B). The ARTR-seq reads for mouse embryo tissue showed a high percentage of usable reads and good reproducibility at the gene level between biological replicates (FIGs. 15A-15B). Compared to the input, a higher percentage of usable reads from ARTR-seq of RBFOX2 were mapped to introns, consistent with the known binding preference of RBFOX2³² (FIG. 15C). RBFOX2 binding peaks were mostly located in introns and contained the canonical ‘UGCAUG’ motif⁹ (FIG. 3C). In addition, the inventors calculated the percentage of usable reads containing the ‘UGCAUG’ sequence and found that mouse tissue samples displayed a similar percentage of enriched motif to that of HepG2 cell samples, indicating a comparable signal detection efficiency of ARTR-seq for tissues and cultured cells (FIG. 3D). Examination of individual binding sites supported binding of the ‘UGCAUG’ sequences by RBFOX2 (FIG. 3E). Overall, ARTR-seq can identify RBP binding sites in embedded tissue samples with high specificity.

ARTR-seq profiles regulatory features of splicing factors

[0687] The previously mentioned RBPs, PTBP1 and RBFOX2, are well-known splicing factors, with PTBP1 belonging to the heterogeneous ribonucleoprotein (hnRNP) family³⁴. To show broader applicability, the inventors also studied HNRNPC, another splicing factor belonging to the hnRNP family (FIG. 16A). Consistent with the binding preference of the splicing factors, both reads (over 70%) and peaks (over 80%) from the ARTR-seq libraries of all three splicing factors (PTBP1, HNRNPC, and RBFOX2) were mainly located in introns in HepG2 cells (FIGs. 4A-4B and FIG. 16B). The RNA binding motifs of RBFOX2 and HNRNPC are the canonical ‘UGCAUG’ and U-rich sequence, respectively, consistent with the previous report³² (FIGs. 4A-4B).

[0688] To explore how splicing factor binding is associated with splicing regulation, the inventors identified the alternative splicing (AS) events by comparing the ENCODE RNA-seq data from RBP-knockdown cells with that from control cells³⁵. The inventors found that most of the AS events were categorized as exon skipping (FIG. 4C). The inventors then generated ‘RNA splicing maps’ for exon skipping events, which plot the peak density on alternatively spliced exons upon RBP knockdown and their proximal introns² (FIG. 4D). The corresponding ARTR-seq peaks were predominantly enriched at upstream proximal introns of the included exons for PTBP1, at downstream proximal introns of the excluded exons for RBFOX2, and at both upstream and downstream proximal introns of the included exons for HNRNPC, but not around native cassette exons and constitutive exons. The inventors quantified relative RBP binding strength by ARTR-seq enrichment at the gene level, and divided the genes into three groups of no, low or high ARTR-seq enrichment. The inventors observed that genes with higher ARTR-seq enrichment tend to present a higher splicing difference upon knockdown for all three splicing factors (FIG. 4E and FIG. 16C). In addition to skipped exons (SEs), the number of included retained introns (RIs) upon PTBP1 knockdown (491 events) outnumbered other splicing modes. The inventors further inspected the relationship between ARTR-seq enrichment and splicing difference of RIs (FIG. 16D) and found that higher enrichment corresponded to higher splicing inclusion differences of RIs, similar to the trend observed for SEs. Altogether, ARTR-seq robustly captures distinctive binding patterns for different splicing factors, and the ARTR-seq enrichment could indicate differences in splicing.

Distinct binding features of m⁶A reader proteins identified by ARTR-seq

[0689] In addition to recognizing specific sequences, RBPs can also recognize RNA targets in a chemical modification-dependent manner. m⁶A modification is the most prevalent chemical modification in mammalian mRNA, and m⁶A reader proteins can preferentially bind m⁶A-modified RNAs to regulate their processing and metabolism in both the nucleus and cytoplasm^{36-40}. In addition to YTHDF2, the inventors performed ARTR-seq in HeLa cells for another cytosolic m⁶A reader, YTHDF1, and a nuclear reader, YTHDC1.

[0690] The inventors first verified the subcellular localization of the three readers with the built-in imaging step in ARTR-seq procedure (FIG. 17A). The sequencing data of ARTR-seq remained highly reproducible between replicates for all three proteins (FIG. 17B). Over 80% of the peaks of the two cytoplasmic m⁶A readers (YTHDF1 and YTHDF2) were located in exons, whereas ~ 81% of the peaks of nuclear reader YTHDC1 were located in introns or intergenic regions, consistent with their distinct subcellular localization features (FIG. 5A and FIGs. 17A and 17C). The high unique peak ratios observed for the three reader proteins (84.2% for YTHDC1, 34.3% for YTHDF1, and 47.5% for YTHDF2) can be explained by their unique subcellular localization; YTHDF1 and YTHDF2 display different sequences of the N-terminal low-complexity domain, which most likely affect their binding to different partner proteins and therefore different RNA targets⁴¹ (FIG. 17D). The inventors further investigated the much more abundant non-exonic peaks of YTHDC1, and found more than half of them located in repeat elements, with the most prevalent being long interspersed nuclear elements (LINEs) (~ 45%), consistent with a previous report⁴⁰ (FIG. 5B). The inventors next examined the distribution of exonic peaks along mRNA and found that the profiles for all readers showed enrichment around stop codons, which resembles the meta profile of m⁶A modifications, especially for YTHDF1 and YTHDF2⁴² (FIG. 5C and FIG. 17E).

[0691] Further, the inventors calculated the percentage of exonic peaks overlapping with m⁶A sites in polyadenylated RNA identified by m⁶A-SAC-seq⁴². The peaks for all three readers captured by ARTR-seq showed higher percentages than random peaks, also comparable to the YTHDF2 peaks captured by PAR-CLIP³⁷, supporting the m⁶A-dependent binding features of these three readers (FIG. 5D). The inventors then analyzed the association between the m⁶A fraction and RBP binding strength. By dividing peaks into four groups based on the m⁶A fraction (sum value), the inventors observed that the group with higher m⁶A fractions showed higher RBP enrichment signals for YTHDF1 and YTHDF2, suggesting ARTR-seq can measure the relative binding strength of RBPs (FIG. 5E). However, the association was not strong for YTHDC1 (FIG. 17D). Most of the YTHDC1 peaks were located in introns that lack quantitative m⁶A-seq data, resulting in a lower number of exon peaks being used for analysis, which can explain the reduced association. Overall, ARTR-seq captured different features of three m⁶A binding proteins in cytoplasm and nucleus.

Dynamic RNA binding of G3BP1 during stress granule assembly

[0692] Stress granules (SGs) are membraneless organelles composed of proteins and RNAs that form in response to stress. The RBP G3BP1 is the central node in the network of protein-RNA interaction during SG assembly^{43, 44}. Under sodium arsenite (NaAsO₂) treatment, SGs could be observed after 13 min with a progressive increase in size over time, with most of the SG assembly completed by 40 min, providing a rapid stress response⁴⁵. However, whether RNA targets of G3BP1 vary during SG assembly has yet to be investigated.

[0693] Taking advantage of the potential high temporal resolution offered by fast formaldehyde fixation and low material requirements of ARTR-seq, the inventors performed ARTR-seq for G3BP1 in HeLa cells treated with 0.5 mM NaAsO₂ and monitored the SG assembly process at time points of 0 min, 10 min, 20 min, and 60 min post stress. The inventors first visualized G3BP1 localization using IF imaging, and confirmed the gradual condensation of G3BP1 into granules over time (FIG. 6A). The colocalization of G3BP1 and biotinylated cDNAs produced in ARTR-seq was further verified (FIG. 6B). Subsequently, the same samples examined by imaging were used for ARTR-seq library construction and sequencing. The inventors then analyzed the sequencing data and determined G3BP1 binding strength by calculating the ARTR-seq log2 fold change (log2FC) between G3BP1 and input samples at the gene level. The inventors observed that ~78% of G3BP1 RNA targets (log2FC > 1, P-value < 0.05) were no longer enriched at 60 min (T60) post NaAsO₂ treatment (FIG. 6C). SG enrichment of RNA was previously reported by sequencing RNAs separated from NaAsO₂-induced SGs to quantify the relative degree of RNA SG localization⁴⁶. Through integrative analysis, the inventors observed that G3BP1 targets at T60 showed significantly higher SG enrichment compared to the starting point (without stress, T0) (FIG. 6D). These results supported the accuracy of ARTR-seq and distinct RNA binding of G3BP1 in the presence and absence of stress. The functions of stress-induced G3BP1 targets (T60_only) were enriched in KEGG pathways of protein processing in the endoplasmic reticulum (ER) and human papillomavirus (HPV) infection, consistent with previous results^{47, 48} (FIG. 6E).
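
The gene-level binding-strength calculation described above can be sketched as follows. The library-size normalization, the pseudocount, and all function names are illustrative assumptions; only the thresholds (log2FC > 1, P-value < 0.05) come from the text, and the statistical test that produces the P-value is not specified here and is therefore taken as an input.

```python
import math

def log2_fold_change(ip_count, input_count, ip_total, input_total, pseudocount=1.0):
    """log2 ratio of library-size-normalized counts (RBP sample vs. input).
    Assumption: a pseudocount is added to avoid division by zero."""
    ip_norm = (ip_count + pseudocount) / ip_total
    input_norm = (input_count + pseudocount) / input_total
    return math.log2(ip_norm / input_norm)

def is_target(log2fc, p_value, min_log2fc=1.0, max_p=0.05):
    """Apply the target thresholds stated in the text."""
    return log2fc > min_log2fc and p_value < max_p

def fraction_lost(targets_t0, targets_t60):
    """Fraction of T0 target genes that are no longer targets at T60."""
    targets_t0, targets_t60 = set(targets_t0), set(targets_t60)
    if not targets_t0:
        return 0.0
    return len(targets_t0 - targets_t60) / len(targets_t0)
```

Applying `is_target` per gene at each time point and then `fraction_lost` to the T0 and T60 target sets gives the kind of summary statistic reported above for targets that are no longer enriched after stress.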

[0694] To further explore the dynamic RNA targeting of G3BP1 over time, the inventors calculated pairwise correlations of the G3BP1 binding strength among time points. The correlation coefficients were generally low (R = 0.38-0.57), suggesting distinct G3BP1 binding at different time points (FIG. 18A). RNAs were previously classified into SG-enriched RNAs and SG-depleted RNAs according to their SG enrichment⁴⁶. The inventors found that during SG assembly, the G3BP1 binding strength from ARTR-seq gradually increased for SG-enriched RNAs, and decreased for SG-depleted RNAs, suggesting a shift of G3BP1 targets towards SG-enriched RNAs as SGs assemble (FIGs. 6F-6G). A portion of RNAs captured in ARTR-seq displayed stable G3BP1 binding, while others showed dynamic G3BP1 binding across time points (FIG. 6H and FIGs. 18B-18C). The inventors then grouped these RNAs based on G3BP1 binding strength using the fuzzy c-means clustering algorithm. The inventors found that G3BP1 binding strength for these RNAs displayed not only unidirectional trajectories of increasing or decreasing, but also transient changes during a 60-minute period of NaAsO₂ treatment, suggesting rapid and dynamic responses of cells to stress (FIGs. 6H-6I and FIG. 18D). Taken together, the ARTR-seq performed along the stress progression showcased the highly dynamic nature of G3BP1-RNA interaction during SG assembly. The results also indicate that ARTR-seq allows capturing temporal changes of protein-RNA interactions on a short timescale with limited starting material.

[0695] In summary, the inventors have created methods and compositions for an assay of reverse transcription-based RBP binding sites sequencing (ARTR-seq), which relies on in-situ reverse transcription of RBP-bound RNAs guided by antibodies to identify RBP binding sites. ARTR-seq avoids ultraviolet cross-linking and immunoprecipitation, allowing for efficient and specific identification of RBP binding sites from as few as 20 cells or a tissue section. Taking advantage of rapid formaldehyde fixation, ARTR-seq enables capturing the dynamic binding of RBPs over a short period of time, as demonstrated by the discovery of dynamic RNA binding of G3BP1 during stress granule assembly on a timescale as short as 10 min.

Data availability

[0696] All the sequencing data generated in this study have been deposited in NCBI's Gene Expression Omnibus (GEO) under the accession number GSE226161. Previously published data from CLIP-seq²⁷, eCLIP²⁹, iCLIP²⁸, irCLIP¹⁰, LACE-seq¹³, sCLIP¹¹, tRIP-seq¹² and RT&Tag²² are available under accession numbers of GSE42701, GSE92205, E-MTAB-3108, GSE78832, GSE137925, GSE92995, DRA005743 and GSE195654, respectively. The data were downloaded and processed as described in the articles. The PTBP1, RBFOX2 and HNRNPC knockdown RNA-Seq data were downloaded from ENCODE portal³² under the accession numbers of ENCSR052IYH, ENCSR305XWT, ENCSR634KBO, ENCSR572FFX, ENCSR767LLP, ENCSR104ABF, ENCSR336DFS, ENCSR667PLJ, ENCSR064DXG, ENCSR603TCV, ENCSR527IVX, ENCSR129RWD. The published PAR-CLIP data and the corresponding peaks for YTHDF2 are available under the GEO accession number of GSE49339. The m⁶A modification sites list identified by m⁶A-SAC-seq is available under the GEO accession number of GSE198246.

Code availability

[0697] Code for processing ARTR-seq data is available in the following GitHub repository: https://github.com/mingming-cgz/ARTR-seq.

EXAMPLE 4: Multiplex ARTR-seq

[0698] A multiplexed ARTR-seq approach was developed to simultaneously detect the binding of multiple RNA binding proteins (RBPs) within a single sample. In the design of multiplexed ARTR-seq (FIG. 19), unique DNA barcodes (with alkyne) are covalently ligated to RBP-specific antibodies (Adapter AB in FIG. 19), allowing for decoding the RBP targets during next-generation sequencing (NGS). Multiple DNA-barcoded antibodies that recognize corresponding RBPs are added to the assay system, followed by the application of a secondary antibody and protein A/G-reverse transcriptase (pAG-RTase). After removing unbound pAG-RTase, in situ reverse transcription (RT) is initiated at the binding sites of RBPs by adding azide-labeled random RT primers (NNNN-N3), biotinylated dNTP, and other components.

[0699] To link the antibody barcodes to their corresponding cDNA products, an alkyne group was incorporated into the DNA barcodes of RBP antibodies, enabling in situ copper-catalyzed azide-alkyne cycloaddition (CuAAC) to ligate antibody barcode oligos specifically with their proximal cDNA products. After the biotin enrichment of these cDNAs, the inventors performed adapter ligation for library construction. The heterocycle generated during CuAAC can be processed by Klenow Fragment (3'→5' exo-) DNA polymerase for second-strand synthesis. After library amplification and NGS, RBP binding sites are deconvoluted based on their specific barcode sequences.
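
The barcode-based deconvolution step described above can be illustrated with a short sketch. Everything specific here is hypothetical: the barcode sequences, the assumption that the antibody barcode occupies the first bases of the read, and the tolerance of one mismatch are chosen for illustration and are not taken from the disclosure.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def assign_rbp(read, barcodes, max_mismatch=1):
    """Assign a read to an RBP by its leading antibody barcode.

    barcodes: dict mapping barcode sequence -> RBP name (equal-length barcodes).
    Returns (rbp_name, remaining_read), or (None, read) if the barcode is
    missing, too divergent, or ambiguous between two RBPs.
    """
    bc_len = len(next(iter(barcodes)))
    prefix = read[:bc_len]
    if len(prefix) < bc_len:
        return None, read
    hits = sorted((hamming(prefix, bc), rbp) for bc, rbp in barcodes.items())
    best_dist, best_rbp = hits[0]
    ambiguous = len(hits) > 1 and hits[1][0] == best_dist
    if best_dist > max_mismatch or ambiguous:
        return None, read
    return best_rbp, read[bc_len:]

# Hypothetical barcode table, for illustration only.
EXAMPLE_BARCODES = {"ACGTAC": "PTBP1", "GATTCA": "RBFOX2"}
```

After assignment, the barcode-trimmed reads for each RBP would be aligned and peak-called separately, so that each antibody yields its own binding-site map from the shared library.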

[0700] The encouraging results from these experiments show that related strategies could also be envisioned with the goal of simultaneously mapping the binding sites of multiple RBPs using DNA-barcoded antibodies that recognize the corresponding RBPs.

EXAMPLE 5: Enhanced Spatial ARTR-seq for RNA modifications

[0701] Over 150 distinct chemical modifications occur on RNA molecules, impacting various aspects of gene expression, such as RNA decay and translation. These modifications also play critical roles in physiology and disease. Notably, N⁶-methyladenosine (m⁶A) stands out as the most prevalent modification in mammalian mRNA and chromatin-associated RNA (caRNA), with close associations with disorders and cancers. In addition, m⁶A modification exhibits distinct tissue-specific distribution. Measuring transcriptome-wide m⁶A at single-cell and spatial resolution allows a deeper understanding of epitranscriptomic regulation in heterogeneous cell types within tissues. However, existing methods enable transcriptome-wide m⁶A exploration at the bulk or single-cell level, but they lack spatial information. Identifying spatial m⁶A distribution within tissues at high spatial resolution remains a challenge.

[0702] Next, it was tested whether Spatial ARTR-seq could be used to map RNA modifications spatially. For this, m⁶A modification was used as an example. Building on the ARTR-seq method, which detects binding sites of RNA binding proteins through in situ reverse transcription, and deterministic barcoding in tissue (DBiT) technology, which uses microfluidic chips with parallel channels directly placed against a fixed tissue slide for barcoding, spatial m⁶A-ARTR-seq for de novo spatial profiling of m⁶A modifications across the transcriptome was attempted.

[0703] FIG. 20 provides a schematic design of spatial m⁶A-ARTR-seq. OCT-embedded tissue sections are fixed with formaldehyde. After permeabilization, m⁶A modifications are targeted using an m⁶A-specific antibody, followed by the application of a secondary antibody and the protein A/G-reverse transcriptase (pAG-RTase) fusion protein, locating RTase at m⁶A sites. In situ reverse transcription is then initiated by the addition of RT components. Spatial barcoding is achieved via a microfluidic device with two PDMS chips featuring multiple parallel microchannels, delivering horizontal (A1-An) and vertical (B1-Bn) barcodes sequentially to generate a unique 2D barcode array. After imaging, the tissues are digested for downstream cDNA enrichment and library preparation, followed by high-throughput sequencing to decode spatial m⁶A distribution within tissues.
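
The 2D barcode array described above can be pictured as a lookup from each (A, B) barcode pair to a pixel, as in the following sketch. The barcode sequences, the record layout, the assignment of A barcodes to rows and B barcodes to columns, and the UMI deduplication rule are all assumptions made for illustration.

```python
from collections import Counter

def build_pixel_index(a_barcodes, b_barcodes):
    """Map every (A barcode, B barcode) pair to a (row, column) pixel.
    Assumption: A barcodes index rows and B barcodes index columns."""
    return {
        (a, b): (row, col)
        for row, a in enumerate(a_barcodes)
        for col, b in enumerate(b_barcodes)
    }

def umis_per_pixel(records, pixel_index):
    """Count unique (UMI, gene) molecules per pixel.

    records: iterable of (a_barcode, b_barcode, umi, gene) tuples.
    Reads whose barcode pair is not in the array are skipped.
    """
    seen = set()
    counts = Counter()
    for a, b, umi, gene in records:
        pixel = pixel_index.get((a, b))
        if pixel is None:
            continue
        key = (pixel, umi, gene)
        if key not in seen:
            seen.add(key)
            counts[pixel] += 1
    return counts
```

With n barcodes in each direction the array holds n × n pixels, and the per-pixel counts form the gene-by-pixel matrix on which clustering and spatial mapping can then be performed.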

[0704] In addition to the microfluidic system, note that many other spatial methods could be used as well. For instance, once cDNA is generated, these cDNAs could be recognized by nucleic acid probes and imaged with methods such as MERFISH or STARmap, or any other method that can image or sequence cDNAs in the spatial manner. DBiT is only presented as one example for the approach. These approaches could be used to map RBP binding sites. These RBPs also include ribosomes so one can spatially map translation.

[0705] This procedure was tested experimentally using m⁶A-ARTR-seq. The m⁶A-ARTR-seq was tested in HeLa cells. Immunofluorescence (IF) staining showed that m⁶A was predominantly localized in the cytoplasm, with strong colocalization of pAG-RTase and the secondary antibody, and their signals largely disappeared when the m⁶A antibody was omitted, demonstrating the high specificity of pAG-RTase targeting (FIG. 21A). Correlation analysis between biological replicates showed high reproducibility of m⁶A-ARTR-seq, with a Pearson’s correlation coefficient of 0.999 (FIG. 21B). At the peak level, approximately 80% of m⁶A peaks were shared between replicates, with enrichment around stop codon regions (FIGs. 21C-21D). The majority of m⁶A peaks were annotated to exonic regions and associated with the canonical consensus sequence of ‘GGACU’, consistent with previous reports (FIG. 21E). Additionally, comparison of individual m⁶A sites revealed similar distribution between m⁶A-ARTR-seq, m⁶A-SAC-seq, and GLORI (FIG. 21F). Collectively, these findings confirm the robust performance of m⁶A-ARTR-seq in m⁶A detection.

[0706] To further benchmark this method for spatial m⁶A profiling, spatial m⁶A-ARTR-seq was applied to sagittal tissue sections of embryonic day 11 (E11) mouse embryos using a microfluidic device with a pixel resolution of 50 µm (FIG. 22A). Simultaneously, m⁶A-ARTR-seq was applied to E14 mouse embryonic stem cells (mESCs), and the m⁶A distribution was compared between E11 tissues and mESCs. Correlation analysis showed high reproducibility between replicates from adjacent sections, with a Pearson correlation coefficient of 0.98 (FIG. 22B). In contrast, the correlation between E11 tissues and mESCs was much lower, as reflected by Pearson correlation coefficients of about 0.38, suggesting a dramatic m⁶A distinction. At the m⁶A peak level, only 34% of peaks in E11 tissue overlapped with those in mESCs, with the canonical ‘GGACU’ motif observed in the overlapped peaks (FIG. 22C). Analysis of exonic peak distribution along mRNA showed less enrichment of m⁶A at the stop codon and higher enrichment in the 5' UTR region for E11 tissues (FIG. 22D). The genome browser snapshots showcased the specific m⁶A sites in Fat4 for E11 tissue, Akap12 for mESCs, and shared m⁶A modifications in Arhgap5 (FIG. 22E). The spatial m⁶A-ARTR-seq detected an average of 2359 unique molecular identifiers (UMIs) and 1594 genes per pixel (FIG. 22F). Unsupervised clustering identified 16 m⁶A clusters, and the spatial Uniform Manifold Approximation and Projection (UMAP) closely aligned with the histology from an adjacent hematoxylin and eosin (H&E) stained section, demonstrating distinct m⁶A distribution within mouse embryo tissues and the ability of m⁶A modification to reveal subtle tissue structures (FIGs. 22A-22G).

[0707] Spatial m⁶A profiling was further extended to a coronal mouse brain section (FIG. 23A). Fluorescence imaging of the fluorophore-labeled secondary antibody revealed the overall m⁶A distribution in the same spatial section, which was subsequently used for downstream sequencing (FIG. 23B). On average, 2489 UMIs and 1664 genes were captured in each pixel, with the UMI map closely mirroring the overall m⁶A distribution detected via imaging (FIGs. 23B-23C). Unsupervised clustering of the gene-by-pixel matrix unveiled 20 spatially organized m⁶A clusters, whose spatial distribution closely aligned with the anatomical annotations of a similar brain section in the Allen Mouse Brain Atlas (FIG. 23D). Furthermore, read distributions along mouse mRNA showed clear enrichment at the stop codon region in both biological replicates, recapitulating the canonical m⁶A distribution observed in bulk samples, further confirming the reproducibility and specificity of spatial m⁶A-ARTR-seq in mouse brain (FIG. 23E). To investigate brain region-specific m⁶A modifications, the 20 spatial m⁶A clusters were grouped into 9 brain regions and their m⁶A levels were compared (FIG. 23F). The m⁶A signal of Cbln1, which encodes a secreted glycoprotein regulated by the m⁶A reader protein YTHDF3 to affect synaptic transmission, was predominantly distributed in the thalamus (TH), particularly in the parafascicular nucleus (PF) (FIG. 23G). Another example is Zbtb20, which regulates hippocampus development; its m⁶A signal was mainly enriched in the dentate gyrus (DG) (FIG. 4H). Taken together, these findings demonstrate that spatial m⁶A-ARTR-seq provides a high-resolution map of m⁶A modifications in the mouse brain, revealing region-specific m⁶A distributions and highlighting the potential of this method to uncover spatially organized epitranscriptomic regulation in complex tissues.

* * *

[0708] All of the methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of certain aspects, it will be apparent to those of skill in the art that variations can be applied to the methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related can be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

SEQUENCES

REFERENCES

[0709] All references cited herein, including patent applications, patent publications, and Accession numbers, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically herein incorporated by reference in their entirety, as if each individual reference were specifically and individually indicated to be incorporated by reference.

[0710] 1. Gerstberger, S., Hafner, M. & Tuschl, T. A census of human RNA-binding proteins. Nat Rev Genet 15, 829-845 (2014).

[0711] 2. Gebauer, F., Schwarzl, T., Valcarcel, J. & Hentze, M.W. RNA-binding proteins in human genetic disease. Nat Rev Genet 22, 185-198 (2021).

[0712] 3. Lerner, M.R. & Steitz, J. A. Antibodies to small nuclear RNAs complexed with proteins are produced by patients with systemic lupus erythematosus. Proceedings of the National Academy of Sciences 76, 5495-5499 (1979).

[0713] 4. Tenenbaum, S. A., Carson, C.C., Lager, P. J. & Keene, J.D. Identifying mRNA subsets in messenger ribonucleoprotein complexes by using cDNA arrays. Proceedings of the National Academy of Sciences 97, 14085-14090 (2000).

[0714] 5. Ule, J. et al. CLIP Identifies Nova-Regulated RNA Networks in the Brain. Science 302, 1212-1215 (2003).

[0715] 6. Licatalosi, D.D. et al. HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464-469 (2008).

[0716] 7. Hafner, M. et al. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 141, 129-141 (2010).

[0717] 8. Konig, J. et al. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat Struct Mol Biol 17, 909-915 (2010).

[0718] 9. Van Nostrand, E.L. et al. Robust transcriptome-wide discovery of RNA-binding protein binding sites with enhanced CLIP (eCLIP). Nat Methods 13, 508-514 (2016).

[0719] 10. Zarnegar, B.J. et al. irCLIP platform for efficient characterization of protein-RNA interactions. Nat Methods 13, 489-492 (2016).

[0720] 11. Kargapolova, Y., Levin, M., Lackner, K. & Danckwardt, S. sCLIP-an integrated platform to study RNA-protein interactomes in biomedical research: identification of CSTF2tau in alternative processing of small nuclear RNAs. Nucleic Acids Res 45, 6074-6086 (2017).

[0721] 12. Masuda, A. et al. tRIP-seq reveals repression of premature polyadenylation by co-transcriptional FUS-U1 snRNP assembly. EMBO Rep 21, e49890 (2020).

[0722] 13. Su, R. et al. Global profiling of RNA-binding protein target sites by LACE-seq. Nat Cell Biol 23, 664-675 (2021).

[0723] 14. Blue, S.M. et al. Transcriptome-wide identification of RNA-binding protein binding sites using seCLIP-seq. Nat Protoc 17, 1223-1265 (2022).

[0724] 15. Lorenz, D.A. et al. Multiplexed transcriptome discovery of RNA-binding protein binding sites by antibody-barcode eCLIP. Nat Methods 20, 65-69 (2023).

[0725] 16. McMahon, A.C. et al. TRIBE: Hijacking an RNA-Editing Enzyme to Identify Cell-Specific Targets of RNA-Binding Proteins. Cell 165, 742-753 (2016).

[0726] 17. Brannan, K.W. et al. Robust single-cell discovery of RNA targets of RNA-binding proteins and ribosomes. Nat Methods 18, 507-519 (2021).

[0727] 18. Nguyen, D.T.T. et al. HyperTRIBE uncovers increased MUSASHI-2 RNA binding activity and differential regulation in leukemic stem cells. Nat Commun 11, 2026 (2020).

[0728] 19. Xu, W., Rahman, R. & Rosbash, M. Mechanistic implications of enhanced editing by a HyperTRIBE RNA-binding protein. RNA 24, 173-182 (2018).

[0729] 20. Flamand, M.N., Ke, K., Tamming, R. & Meyer, K.D. Single-molecule identification of the target RNAs of different RNA binding proteins simultaneously in cells. Genes Dev 36, 1002-1015 (2022).

[0730] 21. Meyer, K.D. DART-seq: an antibody-free method for global m6A detection. Nat Methods 16, 1275-1280 (2019).

[0731] 22. Khyzha, N., Henikoff, S. & Ahmad, K. Profiling RNA at chromatin targets in situ by antibody-targeted tagmentation. Nat Methods (2022).

[0732] 23. Kaya-Okur, H.S. et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat Commun 10, 1930 (2019).

[0733] 24. Anzalone, A. V. et al. Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149-157 (2019).

[0734] 25. Potter, R. J., & Rosenthal, K. High fidelity reverse transcriptases and uses thereof. US Patent No. US7056716B2. June 6, 2006.

[0735] 26. Oscorbin, I.P. & Filipenko, M.L. M-MuLV reverse transcriptase: Selected properties and improved mutants. Comput Struct Biotechnol J 19, 6315-6327 (2021).

[0736] 27. Xue, Y. et al. Direct conversion of fibroblasts to neurons by reprogramming PTB-regulated microRNA circuits. Cell 152, 82-96 (2013).

[0737] 28. Coelho, M.B. et al. Nuclear matrix protein Matrin3 regulates alternative splicing and forms overlapping regulatory networks with PTB. EMBO J 34, 653-668 (2015).

[0738] 29. Consortium, E.P. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57-74 (2012).

[0739] 30. Fred, R.G., Tillmar, L. & Welsh, N. The role of PTB in insulin mRNA stability control. Curr Diabetes Rev 2, 363-366 (2006).

[0740] 31. Xue, Y. et al. Genome-wide analysis of PTB-RNA interactions reveals a strategy used by the general splicing repressor to modulate exon inclusion or skipping. Mol Cell 36, 996-1006 (2009).

[0741] 32. Van Nostrand, E.L. et al. A large-scale binding and functional map of human RNA-binding proteins. Nature 583, 711-719 (2020).

[0742] 33. Hafner, M. et al. CLIP and complementary methods. Nature Reviews Methods Primers 1 (2021).

[0743] 34. Dvinge, H. Regulation of alternative mRNA splicing: old players and new perspectives. FEBS Lett 592, 2987-3006 (2018).

[0744] 35. Luo, Y. et al. New developments on the Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids Res 48, D882-D889 (2020).

[0745] 36. Shi, H., Wei, J. & He, C. Where, When, and How: Context-Dependent Functions of RNA Methylation Writers, Readers, and Erasers. Mol Cell 74, 640-650 (2019).

[0746] 37. Wang, X. et al. N6-methyladenosine-dependent regulation of messenger RNA stability. Nature 505, 117-120 (2014).

[0747] 38. Wang, X. et al. N6-methyladenosine Modulates Messenger RNA Translation Efficiency. Cell 161, 1388-1399 (2015).

[0748] 39. Roundtree, I.A. et al. YTHDC1 mediates nuclear export of N6-methyladenosine methylated mRNAs. Elife 6 (2017).

[0749] 40. Liu, J. et al. N6-methyladenosine of chromosome-associated regulatory RNA regulates chromatin state and transcription. Science 367, 580-586 (2020).

[0750] 41. Zou, Z., Sepich-Poore, C., Zhou, X., Wei, J. & He, C. The mechanism underlying redundant functions of the YTHDF proteins. Genome Biol 24, 17 (2023).

[0751] 42. Ge, R. et al. m6A-SAC-seq for quantitative whole transcriptome m6A profiling. Nat Protoc (2022).

[0752] 43. Yang, P. et al. G3BP1 Is a Tunable Switch that Triggers Phase Separation to Assemble Stress Granules. Cell 181, 325-345 e328 (2020).

[0753] 44. Protter, D.S.W. & Parker, R. Principles and Properties of Stress Granules. Trends Cell Biol 26, 668-679 (2016).

[0754] 45. Wheeler, J.R., Matheny, T., Jain, S., Abrisch, R. & Parker, R. Distinct stages in stress granule assembly and disassembly. Elife 5 (2016).

[0755] 46. Khong, A. et al. The Stress Granule Transcriptome Reveals Principles of mRNA Accumulation in Stress Granules. Mol Cell 68, 808-820 e805 (2017).

[0756] 47. Chou, R.H. & Huang, H. Sodium arsenite suppresses human papillomavirus-16 E6 gene and enhances apoptosis in E6-transfected human lymphoblastoid cells. J Cell Biochem 84, 615-624 (2002).

[0757] 48. Sun, H. et al. Sodium Arsenite-Induced Learning and Memory Impairment Is Associated with Endoplasmic Reticulum Stress-Mediated Apoptosis in Rat Hippocampus. Front Mol Neurosci 10, 286 (2017).

[0758] 49. Henikoff, S. & Ahmad, K. In situ tools for chromatin structural epigenomics. Protein Sci 31, e4458 (2022).

[0759] 50. Lopes, I., Altab, G., Raina, P. & de Magalhaes, J.P. Gene Size Matters: An Analysis of Gene Length in the Human Genome. Front Genet 12, 559998 (2021).

[0760] 51. Irgen-Gioro, S., Yoshida, S., Walling, V. & Chong, S. Fixation can change the appearance of phase separation in living cells. Elife 11 (2022).

[0761] 52. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10-12 (2011).

[0762] 53. Langmead, B. & Salzberg, S.L. Fast gapped-read alignment with Bowtie 2. Nat Methods 9, 357-359 (2012).

[0763] 54. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21 (2013).

[0764] 55. Smith, T., Heger, A. & Sudbery, I. UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Res 27, 491-499 (2017).

[0765] 56. Graubert, A., Aguet, F., Ravi, A., Ardlie, K.G. & Getz, G. RNA-SeQC 2: efficient RNA-seq quality control and quantification for large cohorts. Bioinformatics 37, 3048-3050 (2021).

[0766] 57. Liao, Y., Smyth, G.K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923-930 (2014).

[0767] 58. Robinson, J.T. et al. Integrative genomics viewer. Nat Biotechnol 29, 24-26 (2011).

[0768] 59. Ramirez, F., Dundar, F., Diehl, S., Gruning, B.A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res 42, W187-191 (2014).

[0769] 60. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol 9, R137 (2008).

[0770] 61. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38, 576-589 (2010).

[0771] 62. Kent, W.J. et al. The human genome browser at UCSC. Genome Res 12, 996-1006 (2002).

[0772] 63. Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10 (2021).

[0773] 64. Yee, B.A., Pratt, G.A., Graveley, B.R., Van Nostrand, E.L. & Yeo, G.W. RBP-Maps enables robust generation of splicing regulatory maps. RNA 25, 193-204 (2019).

[0774] 65. Quinlan, A.R. & Hall, I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841-842 (2010).

[0775] 66. Cui, X. et al. Guitar: An R/Bioconductor Package for Gene Annotation Guided Transcriptomic Analysis of RNA-Related Genomic Features. Biomed Res Int 2016, 8367534 (2016).

[0776] 67. R Core Team. R: A language and environment for statistical computing. (2013).

[0777] 68. Wickham, H. ggplot2: Elegant Graphics for Data Analysis. New York, NY: Springer (2009).

[0778] 69. Love, M.I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550 (2014).

[0779] 70. Kumar, L. & Futschik, M.E. Mfuzz: a software package for soft clustering of microarray data. Bioinformation 2, 5-7 (2007).

[0780] 71. Wu, T. et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation (Camb) 2, 100141 (2021).

[0781] 72. Rosenberg AB, Roco CM, Muscat RA, Kuchina A, Sample P, Yao Z, Graybuck LT, Peeler DJ, Mukherjee S, Chen W, Pun SH, Sellers DL, Tasic B, Seelig G. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science. 2018 Apr 13;360(6385):176-182. doi: 10.1126/science.aam8999.

[0782] 73. Lee JH, Daugharthy ER, Scheiman J, Kalhor R, Yang JL, Ferrante TC, Terry R, Jeanty SS, Li C, Amamoto R, Peters DT, Turczyk BM, Marblestone AH, Inverso SA, Bernard A, Mali P, Rios X, Aach J, Church GM. Highly multiplexed subcellular RNA sequencing in situ. Science. 2014 Mar 21;343(6177):1360-3. doi: 10.1126/science.1250212.

Claims

What is claimed is:

1. A polypeptide construct comprising: a) a targeting moiety; and b) a reverse transcriptase enzyme, or a functional variant thereof.

2. The polypeptide construct of claim 1, wherein the targeting moiety is an Fc binding protein or variant thereof, an antibody or variant thereof, an oligonucleotide or variant thereof, a receptor or variant thereof, a ligand, a small molecule, an aptamer, a nucleoside, or any combination thereof.

3. The polypeptide construct of claim 2, wherein the targeting moiety comprises an Fc binding protein or a variant thereof.

4. The polypeptide construct of claim 2, wherein the targeting moiety comprises an antibody or variant thereof.

5. The polypeptide construct of claim 2, wherein the targeting moiety comprises an oligonucleotide or a variant thereof.

6. The polypeptide construct of claim 5, wherein the oligonucleotide comprises a barcode, indices, affinity tag, label, a modified nucleotide, or any combination thereof.

7. The polypeptide construct of claim 6, wherein the affinity tag comprises a streptavidin tag or an avidin tag.

8. The polypeptide construct of claim 2, wherein the targeting moiety comprises a small molecule.

9. The polypeptide construct of claim 3, wherein the Fc binding protein comprises protein A, protein G, protein A/G (pAG), protein L, anti-rabbit IgG, anti-mouse IgG, or a variant thereof, or any combination thereof.

10. The polypeptide construct of claim 9, wherein the Fc binding protein comprises pAG.

11. The polypeptide construct of claim 9, wherein the Fc binding protein comprises an amino acid sequence as set forth in any one of SEQ ID NOs: 8, 10, and 12, or an amino acid sequence at least 60% identical thereto.

12. The polypeptide construct of claim 1, wherein the reverse transcriptase comprises Moloney murine leukemia virus (MMLV) RTase, human immunodeficiency virus (HIV) RTase, Avian Myeloblastosis Virus (AMV) RTase, or a functional variant thereof.

13. The polypeptide construct of claim 1, wherein the reverse transcriptase comprises an amino acid sequence as set forth in any one of SEQ ID NOs: 2, 4, and 6, or an amino acid sequence at least 60% identical thereto, or a functional variant thereof.

14. The polypeptide construct of claim 1, further comprising one or more linker sequences directly or indirectly bound to the targeting moiety and the reverse transcriptase.

15. The polypeptide construct of claim 14, wherein the one or more linker sequences are at least, equal to, or at most, 2-100 amino acids in length.

16. The polypeptide construct of claim 15, wherein the one or more linker sequences are 2-100 amino acids in length, 2-10 amino acids in length, 11-20 amino acids in length, 21-30 amino acids in length, 31-40 amino acids in length, 41-50 amino acids in length, 51-60 amino acids in length, 61-70 amino acids in length, 71-80 amino acids in length, 81-90 amino acids in length, or 91-100 amino acids in length.

17. The polypeptide construct of claim 16, wherein the linker comprises an amino acid sequence as set forth in SEQ ID NO: 28, or a sequence at least 80% identical thereto.

18. The polypeptide construct of claim 1, further comprising a fluorophore.

19. The polypeptide construct of claim 18, wherein the fluorophore comprises Green Fluorescent Protein (GFP), eGFP, Red Fluorescent Protein (RFP), Teal Fluorescent Protein (TFP), Blue Fluorescent Protein (BFP), Yellow Fluorescent Protein (YFP), miRFP, cerulean fluorescent protein (CFP), eCyanFP, mCherry, mVenus, mOrange, mTurquoise, tdTomato, aminocoumarin, fluorescein, texas red, Alexa Fluor dyes (e.g. Alexa Fluor 488, Alexa Fluor 555, Alexa Fluor 594, Alexa Fluor 647, Alexa Fluor 350, Alexa Fluor 532, and Alexa Fluor 700), Cy dyes (e.g. Cy3, Cy5), DyLight dyes, FITC, or Rhodamine, or functional variants thereof.

20. The polypeptide construct of claim 1, further comprising a purification and/or a solubilization tag.

21. The polypeptide construct of claim 20, wherein the purification and/or a solubilization tag comprises a maltose binding protein (MBP) tag, a GST-tag, a FLAG tag, an HA tag, a His-tag, a SUMO-tag, a Trx-tag, a Halo-tag, or any combination thereof.

22. The polypeptide construct of claim 21, wherein the purification and/or solubilization tag comprises an amino acid sequence as set forth in SEQ ID NO: 30, or a sequence at least 80% identical thereto.

23. The polypeptide construct of claim 1, further comprising a peptide leader sequence.

24. A transcriptase composition comprising the polypeptide construct of any one of claims 1-23, and a transcriptase mix comprising one or more adapter-RT primers, wherein the one or more adapter-RT primers each comprise an adapter primer sequence and an RT primer sequence.

25. The transcriptase composition of claim 24, wherein at least one of the one or more RT primer sequences is a random RT primer.

26. The transcriptase composition of claim 25, wherein the random RT primer comprises at least 7 nucleotides.

27. The transcriptase composition of claim 25, wherein the random RT primer is at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, or more, nucleotides in length.

28. The transcriptase composition of claim 24, wherein the adapter primer sequence comprises a sequencing barcode.

29. The transcriptase composition of claim 24, wherein the transcriptase mix further comprises non-labeled dNTPs, labeled dNTPs, or any combination thereof.

30. The transcriptase composition of claim 29, wherein the labeled dNTPs are biotinylated dNTPs, optionally wherein the biotinylated dNTPs comprise biotin-16-dUTP, biotin-16-dCTP, or both.

31. The transcriptase composition of claim 29, wherein the labeled dNTPs and the non-labeled dNTPs are at a ratio of at least 0.5:1, 1:1, or 2:1.

32. The transcriptase composition of claim 24, wherein the RT primer sequence further comprises an azide functional group.

33. The transcriptase composition of claim 24, wherein the adapter-RT primer comprises a nucleotide sequence as set forth in SEQ ID NO: 25, or a sequence at least 80% identical thereto.

34. A method of determining one or more RNA interaction sites of an RNA-binding Protein (RBP) in a biological sample, comprising: a) incubating an RBP-targeting agent with the RBP, wherein the RBP-targeting agent specifically binds the RBP to form a primary complex; b) incubating the primary complex with one or more secondary binding agents that specifically bind the RBP-targeting agent, to form a secondary complex; c) incubating the primary or the secondary complex with the transcriptase composition of claim 24, to obtain cDNA; and d) sequencing the cDNA to determine the one or more RNA interaction sites of the RBP.

35. The method of claim 34, wherein the biological sample is a RNA-protein complex, a cell, or a tissue section.

36. The method of claim 34, further comprising fixing the biological sample with a fixing agent.

37. The method of claim 36, wherein the fixing agent comprises formaldehyde, paraformaldehyde, and/or glutaraldehyde.

38. The method of claim 37, wherein the fixing agent is paraformaldehyde at a concentration of about 0.1% to about 5% by volume, or at a concentration of at least, equal to, about, or more than 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1.0%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9%, 2.0%, 2.1%, 2.2%, 2.3%, 2.4%, or 2.5% by volume.

39. The method of claim 36, wherein the fixing comprises incubating the biological sample and the fixing agent together for, or for less than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 minutes.

40. The method of claim 36, further comprising quenching of the fixing agent with a quenching agent.

41. The method of claim 40, wherein the quenching agent comprises glycine.

42. The method of claim 40 or claim 41, wherein the quenching agent is at a concentration of greater than, equal to, at least, at most, or about 25, 50, 75, 100, 125, 150, 200, 225, or 250 mM.

43. The method of claim 34, wherein the biological sample comprises a cell and/or a tissue section, the method further comprising permeabilizing the cell and/or the tissue section with a permeabilizing agent.

44. The method of claim 43, wherein the permeabilizing agent comprises a detergent.

45. The method of claim 44, wherein the detergent comprises Triton X-100, optionally wherein the Triton X-100 is at a concentration of greater than, equal to, at least, at most, or about 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1.0%, 1.1%, 1.2%, 1.3%, 1.4%, or 1.5%.

46. The method of claim 34, further comprising incubating the primary and/or the secondary complex with an RNase enzyme.

47. The method of claim 34, wherein the RBP is a transcription factor, a splicing factor, RNA helicase, ribonuclease, RNA polymerase, translation initiation factor, or ribosomal protein.

48. The method of claim 34, wherein the RBP comprises YTHDF1, YTHDF2, YTHDC1, HuR, PTB, Musashi, eIF4E, FMRP, LARP1, IMP, hnRNP family proteins, Lin28, AUF1, IGF2BP, FUBP1, LIN28B, RBM5, FUS, TIA1, TTP, QKI, MBNL, CELF, NONO, DDX5, RBM10, SAFB, TDP-43, Ataxin-2, hnRNP A/B, C9orf72, hnRNP H/F, Matrin 3 (MATR3), Pur-alpha, TAF15, Huntingtin, RBFOX, SMN, ELAVL, Ro (SSA) and La (SSB) Proteins, hnRNP, Roquin, Staufen1, NF90/NF110, ILF3, SF3B1, SRSF2, U2AF1, ZRSR2, PRPF8, PRPF31, SNRNP200, HNRNPA1, HNRNPA2B1, NELFE, CPEB1, SRSF1, NOVA1, NOVA2, G3BP1, PTBP1, RBFOX2, and/or HNRNPC.

49. The method of claim 34, wherein the RBP-targeting agent specifically binds the RBP, optionally wherein the RBP-targeting agent comprises an antibody or functional variant thereof, optionally wherein the antibody or functional variant thereof comprises a polyclonal antibody, a monoclonal antibody, a chimeric antibody, a human antibody, a veneered antibody, a diabody, a humanized antibody, an antibody derivative, a recombinant antibody, a recombinant humanized antibody, an engineered antibody, single chain antibody, single domain antibody, nanobodies, diabodies, a bi-specific antibody, a multi-specific antibody, a DARPin, or a variant of each thereof.

50. The method of claim 34, wherein the secondary binding agent comprises an antibody or functional variant thereof, optionally wherein the antibody or functional variant thereof comprises a polyclonal antibody, a monoclonal antibody, a chimeric antibody, a human antibody, a veneered antibody, a diabody, a humanized antibody, an antibody derivative, a recombinant antibody, a recombinant humanized antibody, an engineered antibody, single chain antibody, single domain antibody, nanobodies, diabodies, a bi-specific antibody, a multi-specific antibody, a DARPin, or a variant of each thereof.

51. The method of claim 34, wherein the RBP-targeting agent is labeled, optionally wherein the label comprises a radioisotope, a hapten, a fluorescent label, a fluorescent polypeptide, a phosphorescent molecule, a chemiluminescent molecule, a chromophore, a luminescent molecule, a photoaffinity molecule, a colored particle, and/or a ligand.

52. The method of claim 34, wherein the RBP-targeting agent is linked to a functionalized DNA barcode via an amino spacer, optionally wherein the functionalized DNA barcode comprises an alkyne (3'-O-propargyl N 2'-5' linked) functionalized DNA barcode.

53. The method of claim 52, wherein the alkyne functionalized DNA barcode comprises a nucleic acid sequence as set forth in any one of SEQ ID NOs: 31-78, or a nucleic acid sequence at least 80% identical thereto.

54. The method of claim 34, wherein the secondary binding agent is labeled, optionally wherein the label comprises a radioisotope, a hapten, a fluorescent label, a fluorescent polypeptide, a phosphorescent molecule, a chemiluminescent molecule, a chromophore, a luminescent molecule, a photoaffinity molecule, a colored particle, and/or a ligand.

55. The method of claim 54, wherein the fluorescent label comprises Green Fluorescent Protein (GFP), eGFP, Red Fluorescent Protein (RFP), Teal Fluorescent Protein (TFP), Blue Fluorescent Protein (BFP), Yellow Fluorescent Protein (YFP), miRFP, cerulean fluorescent protein (CFP), eCyanFP, mCherry, mVenus, mOrange, mTurquoise, tdTomato, aminocoumarin, fluorescein, texas red, Alexa Fluor dyes (e.g. Alexa Fluor 488, Alexa Fluor 555, Alexa Fluor 594, Alexa Fluor 647, Alexa Fluor 350, Alexa Fluor 532, and Alexa Fluor 700), Cy dyes (e.g. Cy3, Cy5), DyLight dyes, FITC, or Rhodamine, or functional variants thereof.

56. The method of claim 34, wherein the steps (a) - (c) are conducted in-situ.

57. The method of claim 34, wherein the method further comprises imaging the biological sample.

58. The method of claim 34, wherein the biological sample comprises less than or equal to 1000, 750, 500, 100, 50, or 20 cells, wherein the biological sample comprises a single cell, wherein the biological sample comprises fewer than 5 tissue sections, or wherein the biological sample comprises a single tissue section.

59. The method of claim 34, wherein the method does not comprise ultraviolet crosslinking.

60. The method of claim 34, wherein the method does not comprise immunoprecipitation.

61. The method of claim 34, wherein the method does not comprise use of base editing proteins.

62. The method of claim 34, wherein the method does not comprise dissociating a tissue section into single cells.

63. The method of claim 34, wherein the method detects transient and/or dynamic RNA- RBP interactions, optionally wherein the transient and/or dynamic RNA-RBP interactions occur on a timescale within 10 minutes.

64. The method of claim 34, wherein the method does not comprise oligo(dT) primer initiated reverse transcription.

65. The method of claim 34, wherein the method does not comprise Tn5 tagmentation.

66. The method of claim 47, wherein the RBP is a splicing factor.

67. The method of claim 66, wherein the method is used to determine splice variants between one or more biological samples.

68. The method of claim 47, wherein the RBP is a YTH family reader protein, or wherein the RBP is G3BP1.

69. The method of claim 34, wherein the method can be used to determine one or more interaction sites of the RBP with RNA in the cytoplasm, or nucleus, or both.

70. The method of claim 34, wherein the method is used to measure the relative binding strength of the RBP to the RNA in comparison to that of one or more other RBPs to the RNA.

71. The method of claim 34, wherein the cDNA is labeled.

72. The method of claim 71, wherein the cDNA comprises one or more labeled nucleotides.

73. The method of claim 72, wherein the nucleotides are labeled with a fluorescent label.

74. The method of claim 72, wherein the labeled nucleotides comprise a biotinylated nucleotide.

75. The method of claim 74, wherein the method further comprises purifying the cDNA with a streptavidin-comprising agent, optionally wherein the streptavidin-comprising agent comprises a bead, a plate, a magnetic bead, an agarose bead, a microtiter plate, a nanoparticle, and/or a membrane.

76. The method of claim 34, wherein two or more unique RBP targeting agents that interact with one or more RBPs are used in step (a).

77. The method of claim 76, wherein each of the two or more unique RBP targeting agents comprises a unique functionalized DNA barcode linked via an amino spacer.

78. The method of claim 77, wherein the functionalized DNA barcode comprises an alkyne (3'-O-propargyl N 2'-5' linked) functionalized DNA barcode.

79. The method of claim 78, wherein the alkyne functionalized DNA barcode comprises a nucleic acid sequence as set forth in any one of SEQ ID NOs: 31-78, or a sequence at least 80% identical thereto.

80. A method of in-situ imaging of one or more RNA interaction sites of an RNA-binding Protein (RBP) in a biological sample bound to a solid surface, comprising: a) incubating an RBP-targeting agent with the RBP, wherein the RBP-targeting agent specifically binds the RBP to form a primary complex; b) incubating the primary complex with one or more secondary binding agents that specifically bind the RBP-targeting agent, to form a secondary complex; c) incubating the primary or the secondary complex with a transcriptase composition of claim 24, to obtain cDNA; and d) imaging the solid surface.

81. The method of claim 80, wherein the RBP-targeting agent or the one or more secondary binding agents or any combination thereof are labeled, optionally wherein the label comprises radioisotopes, a hapten, a fluorescent label, a fluorescent polypeptide, a phosphorescent molecule, a chemiluminescent molecule, a chromophore, a luminescent molecule, a photoaffinity molecule, a colored particle and/or a ligand.

82. The method of claim 81, wherein the fluorescent label comprises Green Fluorescent Protein (GFP), eGFP, Red Fluorescent Protein (RFP), Teal Fluorescent Protein (TFP), Blue Fluorescent Protein (BFP), Yellow Fluorescent Protein (YFP), miRFP, cerulean fluorescent protein (CFP), eCyanFP, mCherry, mVenus, mOrange, mTurquoise, tdTomato, aminocoumarin, fluorescein, texas red, Alexa Fluor dyes (e.g. Alexa Fluor 488, Alexa Fluor 555, Alexa Fluor 594, Alexa Fluor 647, Alexa Fluor 350, Alexa Fluor 532, and Alexa Fluor 700), Cy dyes (e.g. Cy3, Cy5), DyLight dyes, FITC, or Rhodamine, or functional variants thereof.

83. The method of claim 80, wherein the cDNA is labeled.

84. The method of claim 83, wherein the cDNA comprises one or more labeled nucleotides optionally wherein the nucleotides are labeled with a fluorescent label and/or are biotinylated.

85. The method of claim 80, wherein the imaging is done using fluorescence microscopy.

86. The method of claim 80, wherein the biological sample is a RNA-protein complex, a cell, or a tissue section.

87. The method of claim 80, further comprising fixing the biological sample with a fixing agent.

88. The method of claim 87, wherein the fixing agent comprises formaldehyde, paraformaldehyde, and/or glutaraldehyde.

89. The method of claim 88, wherein the fixing agent is paraformaldehyde at a concentration of about 0.5% to about 5% by volume, or at a concentration of about 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1.0%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9%, 2.0%, 2.1%, 2.2%, 2.3%, 2.4%, or 2.5% by volume.

90. The method of claim 87, wherein the fixing comprises incubating the biological sample and the fixing agent for, or for less than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 minutes.

91. The method of claim 87, further comprising quenching of the fixing agent with a quenching agent, optionally wherein the quenching agent comprises glycine.

92. The method of claim 91, wherein the quenching agent is at a concentration of greater than, equal to, at least, at most, or about 25, 50, 75, 100, 125, 150, 200, 225, or 250 mM.

93. The method of claim 80, further comprising permeabilizing the cell and/or the tissue section with a permeabilizing agent.

94. The method of claim 93, wherein the permeabilizing agent comprises a detergent.

95. The method of claim 94, wherein the detergent comprises Triton X-100, optionally wherein the Triton X-100 is at a concentration of greater than, equal to, at least, at most, or about 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1.0%, 1.1%, 1.2%, 1.3%, 1.4%, or 1.5%.

96. The method of claim 80, wherein the transcriptase mix further comprises an RNase.

97. The method of claim 80, wherein the RNA binding protein is a transcription factor, a splicing factor, RNA helicase, ribonuclease, RNA polymerase, translation initiation factor, or ribosomal protein.

98. The method of claim 80, wherein the RBP comprises YTHDF1, YTHDF2, YTHDC1, HuR, PTB, Musashi, eIF4E, FMRP, LARP1, IMP, hnRNP family proteins, Lin28, AUF1, IGF2BP, FUBP1, LIN28B, RBM5, FUS, TIA1, TTP, QKI, MBNL, CELF, NONO, DDX5, RBM10, SAFB, TDP-43, Ataxin-2, hnRNP A/B, C9orf72, hnRNP H/F, Matrin 3 (MATR3), Pur-alpha, TAF15, Huntingtin, RBFOX, SMN, ELAVL, Ro (SSA) and La (SSB) Proteins, hnRNP, Roquin, Staufen1, NF90/NF110, ILF3, SF3B1, SRSF2, U2AF1, ZRSR2, PRPF8, PRPF31, SNRNP200, HNRNPA1, HNRNPA2B1, NELFE, CPEB1, SRSF1, NOVA1, NOVA2, G3BP1, PTBP1, RBFOX2, and/or HNRNPC.

99. The method of claim 80, wherein the RBP-targeting agent specifically binds the RBP.

100. The method of claim 99, wherein the RBP-targeting agent is an antibody or a functional variant thereof, optionally wherein the antibody or the functional variant thereof comprises a polyclonal antibody, a monoclonal antibody, a chimeric antibody, a human antibody, a veneered antibody, a diabody, a humanized antibody, an antibody derivative, a recombinant antibody, a recombinant humanized antibody, an engineered antibody, single chain antibody, single domain antibody, nanobodies, diabodies, a bi-specific antibody, a multi-specific antibody, a DARPin, or a variant of each thereof.

101. The method of claim 80, wherein the one or more secondary binding agent is an antibody, or a functional variant thereof, optionally wherein the antibody, or the functional variant thereof comprises a polyclonal antibody, a monoclonal antibody, a chimeric antibody, a human antibody, a veneered antibody, a diabody, a humanized antibody, an antibody derivative, a recombinant antibody, a recombinant humanized antibody, an engineered antibody, single chain antibody, single domain antibody, nanobodies, diabodies, a bi-specific antibody, a multi-specific antibody, a DARPin, or a variant of each thereof.

102. The method of claim 80, wherein the solid surface comprises a slide, a multi-well plate, a capillary, or the like.

103. The method of claim 80, further comprising sequencing the cDNA.

104. The method of claim 103, wherein the sequencing is performed using Next Generation Sequencing (NGS) techniques.

105. The method of claim 104, wherein the sequencing is done using a single cell genomic imaging technique.

106. The method of claim 105, wherein the single cell genomic imaging technique comprises, consists essentially of, or consists of spatial transcriptomics, MERFISH, SeqFISH, STARmap, Slide-Seq, Visium Spatial Gene Expression, or deterministic barcoding in tissue for spatial omics sequencing (DBiT-seq).

107. The method of claim 106, wherein the single cell genomic imaging technique is a microfluidic based technique comprising: ligating a first set and a second set of spatial barcodes to the cDNA of step (c), prior to step (d), wherein the first set of spatial barcodes are contacted to the cDNA horizontally using a first multi-channel microfluidic chip, and wherein the second set of spatial barcodes are contacted to the solid surface vertically using a second multi-channel microfluidic chip.

108. The method of claim 107, wherein the first set of spatial barcodes and the second set of spatial barcodes form a 2D spatial barcode array.

109. A kit comprising a polypeptide construct of any one of claims 1-23, or a transcriptase composition of any one of claims 24-32.

110. A method of identifying one or more RNA interaction sites of an RNA-binding Protein (RBP) in a biological sample, comprising:

(a) fixing the biological sample;

(b) incubating the biological sample with an agent that permeabilizes cell membranes;

(c) providing an RBP-targeting agent to the sample, wherein the RBP-targeting agent interacts with the RBP of interest;

(d) providing a transcriptase composition comprising a polypeptide construct comprising a targeting moiety and a reverse transcriptase enzyme, wherein the targeting moiety interacts with the RBP-targeting agent;

(e) incubating the sample with the transcriptase composition to produce cDNA; and

(f) sequencing the cDNA.

111. The method of claim 110, wherein the targeting moiety comprises an Fc binding protein or a variant thereof, an antibody or variant thereof, an oligonucleotide or variant thereof, a receptor, a ligand, a small molecule, or any combination thereof.

112. The method of claim 111, wherein the targeting moiety comprises an Fc binding protein or a variant thereof, an antibody or variant thereof, and/or an oligonucleotide or a variant thereof.

113. A method of determining one or more RNA interaction sites of a first RNA-binding protein (RBP) in a biological sample, comprising: a) incubating a first RBP-targeting agent comprising a functionalized first DNA barcode, with the first RBP, wherein the first RBP-targeting agent specifically binds the first RBP to form a first primary complex; b) incubating the first primary complex with one or more secondary binding agents that specifically bind the first RBP-targeting agent, to form a secondary complex; c) incubating the first primary or the secondary complex with the transcriptase composition of claim 24, to obtain a first barcoded cDNA library; d) amplifying and sequencing the first barcoded cDNA library; and e) obtaining one or more interaction sites of the first RBP by deconvoluting the sequenced cDNA library based on the first DNA barcode.

114. The method of claim 113, wherein the transcriptase composition comprises an RT primer sequence comprising a functional group and biotinylated dNTPs.

115. The method of claim 114, wherein the functional group is an azide functional group.

116. The method of claim 114 or claim 115, wherein the biotinylated dNTPs, and the RT primer sequence comprising the azide functional group, are incorporated into the cDNA to form proximal azide labeled biotinylated cDNAs during reverse transcription in step (c).

117. The method of claim 113, wherein the functionalized first DNA barcode comprises an alkyne (3'-O-propargyl N 2 -5' linked) functionalized DNA barcode.

118. The method of claim 113, wherein the alkyne functionalized barcodes comprise a nucleic acid sequence as set forth in any one of SEQ ID NOs: 31-78, or a sequence at least 80% identical thereto.

119. The method of claim 113, further comprising incorporating the alkyne functionalized first DNA barcode into the cDNA by reacting the alkyne functionalized first DNA barcode with the proximal azide labeled biotinylated cDNA of claim 116, using in-situ copper catalyzed azide-alkyne cycloaddition (CuAAC), to obtain a first barcoded biotinylated cDNA library.

120. The method of claim 119, wherein the method further comprises purifying the barcoded biotinylated cDNA library over a streptavidin column prior to step (d).

121. The method of claim 120, further comprising processing the CuAAC using a Klenow Fragment DNA polymerase for second strand synthesis.

122. The method of claim 121, wherein the one or more interaction sites of the first RBP are obtained by deconvoluting the sequenced data based on the first DNA barcode incorporated into the cDNA.

123. The method of claim 113, further comprising determining the one or more RNA interaction sites of a second RNA-binding protein (RBP) in a biological sample, comprising: a) incubating a second RBP-targeting agent comprising an alkyne functionalized second DNA barcode, with the second RBP, wherein the second RBP-targeting agent specifically binds the second RBP to form a second primary complex; b) incubating the second primary complex with one or more secondary binding agents that specifically bind the second RBP-targeting agent, to form a second secondary complex; c) incubating the second primary or the second secondary complex with the transcriptase composition of claim 24, to obtain a second barcoded cDNA library; d) amplifying and sequencing the second barcoded cDNA library; and e) obtaining one or more interaction sites of the second RBP by deconvoluting the sequenced cDNA library based on the second DNA barcode.

124. The method of claim 113, comprising determining the one or more RNA interaction sites for greater than, equal to, at least, or at most 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, or 1500 RBPs, wherein each RBP-targeting agent has a unique alkyne functionalized DNA barcode.

125. A method of determining spatial distribution of an RNA modification site on a biological sample bound to a solid surface, comprising: a) incubating a modification-targeting agent that specifically binds the modification site on the RNA to form a primary complex; b) incubating the primary complex with a secondary binding agent that specifically binds the primary complex to form a secondary complex; c) incubating the primary complex or the secondary complex with the transcriptase composition of claim 24 to obtain cDNA; d) optionally incorporating labelled barcodes into the cDNA; and e) sequencing and imaging the biological sample using a single cell genomic imaging technique to determine the one or more modification sites.

126. The method of claim 125, wherein the modification-targeting agent is an oligonucleotide, or a variant thereof, or a small molecule.

127. The method of claim 126, wherein the oligonucleotide comprises fluorescent NTPs, or a fluorescent probe.

128. The method of claim 125, wherein the modification-targeting agent is an antibody or a functional variant thereof, optionally wherein the antibody or the functional variant thereof comprises monoclonal antibodies, polyclonal antibodies, recombinant antibody, IgG, Fv, single chain antibody, single domain antibodies, nanobodies, diabodies, multi-specific antibodies (e.g., bispecific antibodies), scFv, Fab, F(ab')2, Fab, or variants thereof.

129. The method of claim 127, wherein the modification-targeting agent specifically binds to a modification comprising m⁶C, m⁵C, m^xA, m⁷G, or a pseudouridine modification.

130. The method of claim 125, wherein the sequencing and imaging is done using a single cell genomic imaging technique.

131. The method of claim 130, wherein the single cell genomic imaging technique comprises spatial transcriptomics, MERFISH, SeqFISH, STARmap, Slide-Seq, Visium Spatial Gene Expression, or deterministic barcoding in tissue for spatial omics sequencing (DBiT-seq).

132. The method of claim 131, wherein the single cell genomic imaging technique comprises deterministic barcoding in tissue for spatial omics sequencing (DBiT-seq) comprising: ligating a first set and a second set of spatial barcodes to the cDNA of step (c), prior to step (d), wherein the first set of spatial barcodes are contacted to the cDNA horizontally using a first multi-channel microfluidic chip, and the second set of spatial barcodes are contacted to the solid surface vertically using a second multi-channel microfluidic chip.

133. The method of claim 131, wherein the first set of spatial barcodes and the second set of spatial barcodes form a 2D spatial barcode array.