CROSS-REFERENCE TO RELATED APPLICATION
The present application is a continuation of U.S. application Ser. No. 09/067,638, filed on Apr. 28, 1998, which claims priority from U.S. application Ser. No. 60/081,483, filed on Apr. 13, 1998, the disclosures of each of which are herein incorporated by reference in their entireties.[0001]
FIELD OF THE INVENTION
The present invention relates generally to the generation of synthetic compounds having defined physical, chemical or bioactive properties. More particularly, the present invention relates to the automated generation of oligonucleotide compounds targeted to a given nucleic acid sequence via computer-based, iterative robotic synthesis of synthetic oligonucleotide compounds and robotic or robot-assisted analysis of the activities of such compounds. Information gathered from assays of such compounds is used to identify nucleic acid sequences that are tractable to a variety of nucleotide sequence-based technologies, for example, antisense drug discovery and target validation.[0002]
BACKGROUND OF THE INVENTION
1. Oligonucleotide Technology
Synthetic oligonucleotides complementary to a target are known to hybridize with particular target nucleic acids. In one example, compounds complementary to the ‘sense’ strand of nucleic acids that encode polypeptides are referred to as “antisense oligonucleotides.” A subset of such compounds may be capable of modulating the expression of the target nucleic acid in vivo; such synthetic compounds are described herein as “active oligonucleotide compounds.”[0003]
Oligonucleotide compounds are also commonly used in vitro as research reagents and diagnostic aids, and in vivo as therapeutic and bioactive agents. Oligonucleotide compounds can exert their effect by a variety of means. One such means is the antisense-mediated direction of an endogenous nuclease, such as RNase H in eukaryotes or RNase P in prokaryotes, to the target nucleic acid (Chiang et al., J. Biol. Chem., 1991, 266, 18162; Forster et al., Science, 1990, 249, 783). Another means involves covalently linking a synthetic moiety having nuclease activity to an oligonucleotide having an antisense sequence. This approach does not rely upon recruitment of an endogenous nuclease to modulate target activity. Synthetic moieties having nuclease activity include, but are not limited to, enzymatic RNAs, lanthanide ion complexes, and other reactive species (Haseloff et al., Nature, 1988, 334, 585; Baker et al., J. Am. Chem. Soc., 1997, 119, 8749).[0004]
Despite the advances made in utilizing antisense technology to date, it is still common to identify sequences amenable to antisense technologies through an empirical approach (Szoka, Nature Biotechnology, 1997, 15, 509). Accordingly, the need exists for systems and methods for efficiently and effectively identifying target nucleotide sequences that are suitable for antisense modulation. The present disclosure answers this need by providing systems and methods for automatically identifying such sequences via in silico, robotic or other automated means.[0005]
2. Identification of Active Oligonucleotide Compounds
Traditionally, new chemical entities with useful properties are generated by (1) identifying a chemical compound (called a ‘lead compound’) with some desirable property or activity, (2) creating variants of the lead compound, and (3) evaluating the property and activity of such variant compounds. The process has been called ‘SAR’, i.e., structure activity relationship. Although ‘SAR’ and its handmaiden, rational drug design, have been utilized with some degree of success, there are a number of limitations to these approaches to lead compound generation, particularly as they pertain to the discovery of bioactive oligonucleotide compounds. In attempting to use SAR with oligonucleotides, it has been recognized that RNA structure can inhibit duplex formation with antisense compounds, so much so that “moving” the target nucleotide sequence even a few bases can drastically decrease the activity of such compounds (Lima et al., Biochemistry, 1992, 31, 12055).[0006]
Heretofore, the search for lead antisense compounds has been limited to the manual synthesis and analysis of such compounds. Consequently, a fundamental limitation of the conventional approach is its dependence upon the availability, number and cost of antisense compounds produced by manual, or at best semi-automated, means. Moreover, the assaying of such compounds has traditionally been performed by tedious manual techniques. Thus, the traditional approach to generating active antisense compounds is limited by the relatively high cost and long time required to synthesize and screen a relatively small number of candidate antisense compounds.[0007]
Accordingly, the need exists for systems and methods for efficiently and effectively generating new active antisense and other oligonucleotide compounds targeted to specific nucleic acid sequences. The present disclosure answers this need by providing systems and methods for automatically generating active antisense compounds via robotic and other automated means.[0008]
3. Gene Function Analysis
Efforts such as the Human Genome Project are making an enormous amount of nucleotide sequence information available in a variety of forms, e.g., genomic sequences, cDNAs, expressed sequence tags (ESTs) and the like. This explosion of information has led one commentator to state that ‘genome scientists are producing more genes than they can put a function to’ (Kahn, Science, 1995, 270, 369). Although some approaches to this problem have been suggested, no solution has yet emerged. For example, methods of looking at gene expression in different disease states or stages of development only provide, at best, an association between a gene and a disease or stage of development (Nowak, Science, 1995, 270, 368). Another approach, looking at the proteins encoded by genes, is developing but ‘this approach is more complex and big obstacles remain’ (Kahn, Science, 1995, 270, 369). Furthermore, neither of these approaches allows one to directly utilize nucleotide sequence information to perform gene function analysis.[0009]
In contrast, antisense technology does allow for the direct utilization of nucleotide sequence information for gene function analysis. Once a target nucleic acid sequence has been selected, antisense sequences hybridizable to the sequence can be generated using techniques known in the art. Typically, a large number of candidate antisense oligonucleotides (ASOs) are synthesized having sequences that are more-or-less randomly spaced across the length of the target nucleic acid sequence (e.g., a ‘gene walk’), and their ability to modulate the expression of the target nucleic acid is assayed. Cells or animals can then be treated with one or more active antisense oligonucleotides, and the resulting effects assessed in order to determine the function(s) of the target gene. Although the practicality and value of this empirical approach to developing active antisense compounds have been acknowledged in the art, it has also been stated that this approach ‘is beyond the means of most laboratories and is not feasible when a new gene sequence is identified, but whose function and therapeutic potential are unknown’ (Szoka, Nature Biotechnology, 1997, 15, 509).[0010]
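The ‘gene walk’ described above can be illustrated with a minimal sketch. It assumes evenly spaced target sites rather than the more-or-less random spacing mentioned in the text, and the oligonucleotide length, step size, function names and example sequence are all hypothetical choices made only for illustration.

```python
# Map each base to its DNA complement; U (RNA input) is complemented to A.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gene_walk(target: str, oligo_len: int = 20, step: int = 10):
    """Yield (start, antisense_oligo) pairs tiled across the sense strand.

    `oligo_len` and `step` are illustrative defaults, not values from the text.
    """
    target = target.upper()
    for start in range(0, len(target) - oligo_len + 1, step):
        site = target[start:start + oligo_len]
        yield start, reverse_complement(site)


# Hypothetical 36-base target used only to exercise the functions.
target = "ATGGCCAAGCTTGGATCCGAATTCTTAGGCCATGCA"
for start, oligo in gene_walk(target, oligo_len=12, step=8):
    print(start, oligo)
```

Each candidate is written 5′ to 3′ and is complementary to the target site beginning at `start`; in practice every candidate would then be synthesized and assayed as the paragraph describes.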
Accordingly, the need exists for systems and methods for efficiently and effectively determining the function of a gene that is uncharacterized except that its nucleotide sequence, or a portion thereof, is known. The present disclosure answers this need by providing systems and methods for automatically generating active antisense compounds to a target nucleotide sequence via robotic means. Such active antisense compounds are contacted with cells, cell-free extracts, tissues or animals capable of expressing the gene of interest and subsequent biochemical or biological parameters are measured. The results are compared to those obtained from a control cell culture, cell-free extract, tissue or animal which has not been contacted with an active antisense compound in order to determine the function of the gene of interest.[0011]
4. Target Validation
Determining the nucleotide sequence of a gene is no longer an end unto itself; rather, it is ‘merely a means to an end. The critical next step is to validate the gene and its [gene] product as a potential drug target’ (Glasser, Genetic Engineering News, 1997, 17, 1). This process, i.e., confirming that modulation of a gene that is suspected of being involved in a disease or disorder actually results in an effect that is consistent with a causal relationship between the gene and the disease or disorder, is known as target validation.[0012]
Efforts such as the Human Genome Project are yielding a vast number of complete or partial nucleotide sequences, many of which might correspond to or encode targets useful for new drug discovery efforts. The challenge represented by this plethora of information is how to use such nucleotide sequences to identify and rank valid targets for drug discovery. Antisense technology provides one means by which this might be accomplished; however, the many manual, labor-intensive and costly steps involved in traditional methods of developing active antisense compounds have limited their use in target validation (Szoka, Nature Biotechnology, 1997, 15, 509). Nevertheless, the great target specificity that is characteristic of antisense compounds makes them ideal choices for target validation, especially when the functional roles of proteins that are highly related are being investigated (Albert et al., Trends in Pharm. Sci., 1994, 15, 250).[0013]
Accordingly, the need exists for systems and methods for developing compounds efficiently and effectively that modulate a gene, wherein such compounds can be directly developed from nucleotide sequence information. Such compounds are needed to confirm that modulation of a gene that is thought to be involved in a disease or disorder will in fact cause an in vitro or in vivo effect indicative of the origin, development, spread or growth of the disease or disorder.[0014]
The present disclosure answers this need by providing systems and methods for automatically generating active oligonucleotide and other compounds, especially antisense compounds, to a target nucleotide sequence via robotic or other automated means. Such active compounds are contacted with a cell culture, cell-free extract, tissue or animal capable of expressing the gene of interest, and subsequent biochemical or biological parameters indicative of the origin, development, spread or growth of the disease or disorder are measured. These results are compared to those obtained with a control cell system, cell-free extract, tissue or animal which has not been contacted with an active antisense compound in order to determine whether modulation of the gene of interest will have a therapeutic benefit. The resulting active antisense compounds may be used as positive controls when other, non-antisense-based agents directed to the same target nucleic acid, or to its gene product, are screened.[0015]
It should be noted that embodiments of the invention drawn to gene function analysis and target validation have parameters that are shared with other embodiments of the invention, but also have unique parameters. For example, antisense drug discovery naturally requires that the toxicity of the antisense compounds be manageable, whereas, for gene function analysis or target validation, overt toxicity resulting from the antisense compounds is acceptable unless it interferes with the assay being used to evaluate the effects of treatment with such compounds.[0016]
U.S. Pat. No. 5,563,036 to Peterson et al. describes systems and methods of screening for compounds that inhibit the binding of a transcription factor to a nucleic acid. In a preferred embodiment, an assay portion of the process is stated to be performed by a computer controlled robot.[0017]
U.S. Pat. No. 5,708,158 to Hoey describes systems and methods for identifying pharmacological agents stated to be useful for diagnosing or treating a disease associated with a gene the expression of which is modulated by a human nuclear factor of activated T cells. The methods are stated to be particularly suited to high-throughput screening wherein one or more steps of the process are performed by a computer controlled robot.[0018]
U.S. Pat. Nos. 5,693,463 and 5,716,780 to Edwards et al. describe systems and methods for identifying non-oligonucleotide molecules that specifically bind to a DNA molecule based on their ability to compete with a DNA-binding protein that recognizes the DNA molecule.[0019]
SUMMARY OF THE INVENTION
The present invention is directed to automated systems and methods for generating active oligonucleotide compounds, i.e., those having desired physical, chemical and/or biological properties. The present invention is also directed to oligonucleotide-sensitive target sequences identified by the systems and methods. For purposes of illustration, the present invention is described herein with respect to the production of antisense oligonucleotides; however, the present invention is not limited to this embodiment.[0020]
The present invention is directed to iterative processes for generating new chemical compounds with prescribed sets of physical, chemical and/or biological properties, and to systems for implementing these processes. During each iteration of a process as contemplated herein, a target nucleic acid sequence is provided or selected, and a library of (candidate) nucleobase sequences is generated in silico (that is, in a computer-manipulable and reliable form) according to defined criteria. A virtual oligonucleotide chemistry is chosen. A library of virtual oligonucleotide compounds having the desired nucleobase sequences is generated. These virtual compounds are reviewed and compounds predicted to have particular desired properties are selected. The selected compounds are synthesized, preferably in a robotic, batchwise manner, and they are then robotically assayed for a desired physical, chemical or biological activity in order to identify compounds with the desired properties. Active compounds are thus generated and, at the same time, preferred sequences and regions of the target nucleic acid that are amenable to modulation are identified.[0021]
In subsequent iterations of the process, second libraries of candidate nucleobase sequences are generated and/or selected to give rise to a second virtual oligonucleotide library. Through multiple iterations of the process, a library of target nucleic acid sequences that are tractable to oligonucleotide technologies is identified. Such technologies include, but are not limited to, antisense technology, gene function analysis and target validation.[0022]
Further features and advantages of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements.[0023]
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be described with reference to the accompanying drawings, wherein:[0024]
FIGS. 1 and 2 are a flow diagram of one method according to the present invention depicting the overall flow of data and materials among various elements of the invention.[0025]
FIG. 3 is a flow diagram depicting the flow of data and materials among elements of step 200 of FIG. 1.[0026]
FIGS. 4 and 5 are a flow diagram depicting the flow of data and materials among elements of step 300 of FIG. 1.[0027]
FIG. 6 is a flow diagram depicting the flow of data and materials among elements of step 306 of FIG. 4.[0028]
FIG. 7 is another flow diagram depicting the flow of data and materials among elements of step 306 of FIG. 4.[0029]
FIG. 8 is another flow diagram depicting the flow of data and materials among elements of step 306 of FIG. 4.[0030]
FIG. 9 is a flow diagram depicting the flow of data and materials among elements of step 350 of FIG. 5.[0031]
FIGS. 10 and 11 are flow diagrams depicting a logical analysis of data and materials among elements of step 400 of FIG. 1.[0032]
FIG. 12 is a flow diagram depicting the flow of data and materials among the elements of step 400 of FIG. 1.[0033]
FIGS. 13 and 14 are flow diagrams depicting the flow of data and materials among elements of step 500 of FIG. 1.[0034]
FIG. 15 is a flow diagram depicting the flow of data and materials among elements of step 600 of FIG. 1.[0035]
FIG. 16 is a flow diagram depicting the flow of data and materials among elements of step 700 of FIG. 1.[0036]
FIG. 17 is a flow diagram depicting the flow of data and materials among the elements of step 1100 of FIG. 2.[0037]
FIG. 18 is a block diagram showing the interconnecting of certain devices utilized in conjunction with a preferred method of the invention;[0038]
FIG. 19 is a flow diagram showing a representation of data storage in a relational database utilized in conjunction with one method of the invention;[0039]
FIG. 20 is a flow diagram depicting the flow of data and materials in effecting a preferred embodiment of the invention as set forth in Example 14;[0040]
FIG. 21 is a flow diagram depicting the flow of data and materials in effecting a preferred embodiment of the invention as set forth in Example 15;[0041]
FIG. 22 is a flow diagram depicting the flow of data and materials in effecting a preferred embodiment of the invention as set forth in Example 2;[0042]
FIG. 23 is a pictorial elevation view of a preferred apparatus used to robotically synthesize oligonucleotides; and[0043]
FIG. 24 is a pictorial plan view of an apparatus used to robotically synthesize oligonucleotides.[0044]
Certain preferred methods of this invention are now described with reference to the flow diagram of FIGS. 1 and 2.[0045]
1. Target Nucleic Acid Selection
The target selection process, process step 100, provides a target nucleotide sequence that is used to help guide subsequent steps of the process. It is generally desired to modulate the expression of the target nucleic acid for any of a variety of purposes, such as, e.g., drug discovery, target validation and/or gene function analysis.[0046]
The primary objectives of the target selection process, step 100, are to identify molecular targets that represent significant therapeutic opportunities, to provide new medicines to the medical community to fill therapeutic voids or improve upon existing therapies, to provide new and efficacious means of drug discovery, and to determine the function of genes that are uncharacterized except for nucleotide sequence. To meet these objectives, genes are classified based upon specific sets of selection criteria.[0047]
One such set of selection criteria concerns the quantity and quality of target nucleotide sequence. There must be sufficient target nucleic acid sequence information available for oligonucleotide design. Moreover, such information must be of sufficient quality to give rise to an acceptable level of confidence in the data to perform the methods described herein. Thus, the data must not contain too many missing or incorrect base entries. In the case of a target sequence that encodes a polypeptide, such errors can be detected by virtually translating all three reading frames of the sense strand of the target sequence and confirming the presence of a continuous polypeptide sequence having predictable attributes, e.g., encoding a polypeptide of known size, or encoding a polypeptide that is about the same length as a homologous protein. In any event, only a very high frequency of sequence errors will frustrate the methods of the invention; most oligonucleotides to the target sequence will avoid such errors unless such errors occur frequently throughout the entire target sequence.[0048]
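The three-frame translation check described above might be sketched as follows. This is only an illustration under stated assumptions: it uses the standard genetic code, reports the longest stop-free stretch in each frame so that it can be compared with an expected polypeptide length, and marks codons containing ambiguous bases as ‘X’. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int) -> str:
    """Translate the sense strand in reading frame 0, 1 or 2.

    Stop codons become '*'; codons with unrecognized bases become 'X'.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_open_stretch(seq: str) -> dict:
    """Return {frame: residues in the longest stop-free stretch of that frame}."""
    return {
        frame: max(len(part) for part in translate(seq, frame).split("*"))
        for frame in range(3)
    }


# Hypothetical short coding sequence used only to exercise the functions.
example = "ATGGCTGCTAAAGGTTAA"
print(translate(example, 0))
print(longest_open_stretch(example))
```

A frame whose longest open stretch is close to the expected polypeptide length supports the quality of the sequence, whereas premature stops or many ‘X’ residues in every frame would suggest missing or incorrect base entries.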
Another preferred criterion is that appropriate culturable cell lines or other source of reproducible genetic expression should be available. Such cell lines express, or can be induced to express, the gene comprising the target nucleic acid sequence. The oligonucleotide compounds generated by the process of the invention are assayed using such cell lines and, if such assaying is performed robotically, the cell line is preferably tractable to robotic manipulation such as by growth in 96 well plates. Those skilled in the art will recognize that if an appropriate cell line does not exist, it will nevertheless be possible to construct an appropriate cell line. For example, a cell line can be transfected with an expression vector comprising the target gene in order to generate an appropriate cell line for assay purposes.[0049]
For gene function analysis, it is possible to operate upon a genetic system having a lack of information regarding, or incomplete characterization of, the biological function(s) of the target nucleic acid or its gene product(s). This is a powerful aspect of the invention. A target nucleic acid for gene function analysis might be absolutely uncharacterized, or might be thought to have a function based on minimal data or homology to another gene. By application of the process of the invention to such a target, active compounds that modulate the expression of the gene can be developed and applied to cells. The resulting cellular, biochemical or molecular biological responses are observed, and this information is used by those skilled in the art to elucidate the function of the target gene.[0050]
For target validation and drug discovery, another selection criterion is disease association. Candidate target genes are placed into one of several broad categories of known or deduced disease association. Level 1 Targets are target nucleic acids for which there is a strong correlation with disease. This correlation can come from multiple scientific disciplines including, but not limited to, epidemiology, wherein frequencies of gene abnormalities are associated with disease incidence; molecular biology, wherein gene expression and function are associated with cellular events correlated with a disease; and biochemistry, wherein the in vitro activities of a gene product are associated with disease parameters. Because there is a strong therapeutic rationale for focusing on Level 1 Targets, these targets are most preferred for drug discovery and/or target validation.[0051]
Level 2 Targets are nucleic acid targets for which the combined epidemiological, molecular biological, and/or biochemical correlation with disease is not so clear as for Level 1. Level 3 Targets are targets for which there is little or no data to directly link the target with a disease process, but there is indirect evidence for such a link, i.e., homology with a Level 1 or Level 2 target nucleic acid sequence or with the gene product thereof.[0052]
In order not to prejudice the target selection process, and to ensure that the maximum number of nucleic acids actually involved in the causation, potentiation, aggravation, spread, continuance or after-effects of disease states are investigated, it is preferred to examine a balanced mix of Level 1, 2 and 3 target nucleic acids.[0055]
In order to carry out drug discovery, experimental systems and reagents should be available for evaluating the therapeutic potential of active compounds generated by the process of the invention. Such systems may be operable in vitro (e.g., in vitro models of cell:cell association) or in vivo (e.g., animal models of disease states). It is also desirable, but not obligatory, to have available animal model systems which can be used to evaluate drug pharmacology.[0056]
Candidate target nucleic acids can also be classified by biological processes. For example, programmed cell death (‘apoptosis’) has recently emerged as an important biological process that is perturbed in a wide variety of diseases. Accordingly, nucleic acids that encode factors that play a role in the apoptotic process are identified as candidate targets. Similarly, potential target nucleic acids can be classified as being involved in inflammation, autoimmune disorders, cancer, or other pathological or dysfunctional processes.[0057]
Moreover, genes can often be grouped into families based on sequence homology and biological function. Individual family members can act redundantly, or can provide specificity through diversity of interactions with down-stream effectors, or through expression being restricted to specific cell types. When one member of a gene family is associated with a disease process then the rationale for targeting other members of the same family is reasonably strong. Therefore, members of such gene families are preferred target nucleic acids to which the methods and systems of the invention may be applied. Indeed, the potent specificity of antisense compounds for different gene family members makes the invention particularly suited for such targets (Albert et al., Trends Pharm. Sci., 1994, 15, 250). Those skilled in the art will recognize that a partial or complete nucleotide sequence of such family members can be obtained using the polymerase chain reaction (PCR) and ‘universal’ primers, i.e., primers designed to be common to all members of a given gene family.[0058]
PCR products generated from universal primers can be cloned and sequenced or directly sequenced using techniques known in the art. Thus, although nucleotide sequences from cloned DNAs, or from complementary DNAs (cDNAs) derived from mRNAs, may be used in the process of the invention, there is no requirement that the target nucleotide sequence be isolated from a cloned nucleic acid. Any nucleotide sequence, no matter how determined, of any nucleic acid, isolated or prepared in any fashion, may be used as a target nucleic acid in the process of the invention.[0059]
Furthermore, although polypeptide-encoding nucleic acids provide the target nucleotide sequences in one embodiment of the invention, other nucleic acids may be targeted as well. Thus, for example, the nucleotide sequences of structural or enzymatic RNAs may be utilized for drug discovery and/or target validation when such RNAs are associated with a disease state, or for gene function analysis when their biological role is not known.[0060]
2. Assembly of Target Nucleotide Sequence
FIG. 3 is a block diagram detailing the steps of the target nucleotide sequence assembly process, process step 200, in accordance with one embodiment of the invention. The oligonucleotide design process, process step 300, is facilitated by the availability of accurate target sequence information. Because of limitations of automated genome sequencing technology, gene sequences are often accumulated in fragments. Further, because individual genes are often being sequenced by independent laboratories using different sequencing strategies, sequence information corresponding to different fragments is often deposited in different databases. The target nucleic acid assembly process takes advantage of computerized homology search algorithms and sequence fragment assembly algorithms to search available databases for related sequence information and incorporate available sequence information into the best possible representation of the target nucleic acid molecule, for example an RNA transcript. This representation is then used to design oligonucleotides, process step 300, which can be tested for biological activity in process step 700.[0061]
In the case of genes directing the synthesis of multiple transcripts, e.g., by alternative splicing, each distinct transcript is a unique target nucleic acid for purposes of step 300. In one embodiment of the invention, if active compounds specific for a given transcript isoform are desired, the target nucleotide sequence is limited to those sequences that are unique to that transcript isoform. In another embodiment of the invention, if it is desired to modulate two or more transcript isoforms in concert, the target nucleotide sequence is limited to sequences that are shared between the two or more transcripts.[0062]
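One simple way to picture the isoform-specific and shared target sequences described above is to compare the sets of fixed-length subsequences (k-mers) of each transcript. The sketch below is an assumption-laden illustration: exact k-mer matching stands in for whatever comparison a real implementation would use, and the window length, function names and toy isoform sequences are hypothetical.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k subsequences of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def isoform_specific_sites(isoforms: dict, name: str, k: int = 20) -> set:
    """k-mers found in isoform `name` and in no other isoform."""
    others = set().union(
        *(kmers(s, k) for n, s in isoforms.items() if n != name)
    )
    return kmers(isoforms[name], k) - others


def shared_sites(isoforms: dict, k: int = 20) -> set:
    """k-mers common to every isoform (requires at least one isoform)."""
    return set.intersection(*(kmers(s, k) for s in isoforms.values()))


# Toy isoforms: 'short' lacks the middle GGG segment present in 'long'.
isoforms = {"long": "AAACCCGGGTTT", "short": "AAACCCTTT"}
print(sorted(shared_sites(isoforms, k=6)))
print(sorted(isoform_specific_sites(isoforms, "short", k=6)))
```

In this toy case the sites specific to the shorter isoform are those spanning the junction created when the middle segment is absent, which matches the intuition that splice junctions are a natural source of isoform-unique target sequence.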
In the case of a polypeptide-encoding nucleic acid, it is generally preferred that full-length cDNA be used in the oligonucleotide design process step 300 (with full-length cDNA being defined as reading from the 5′ cap to the poly A tail). Although full-length cDNA is preferred, it is possible to design oligonucleotides using partial sequence information. Therefore it is not necessary for the assembly process to generate a complete cDNA sequence. Further, in some cases it may be desirable to design oligonucleotides targeting introns. In this case the process can be used to identify individual introns at process step 220.[0063]
The process can be initiated by entering initial sequence information on a selected molecular target at process step 205. In the case of a polypeptide-encoding nucleic acid, the full-length cDNA sequence is generally preferred for use in oligonucleotide design strategies at process step 300. The first step is to determine if the initial sequence information represents the full-length cDNA, decision step 210. In the case where the full-length cDNA sequence is available the process advances directly to the oligonucleotide design step 300. When the full-length cDNA sequence is not available, databases are searched at process step 212 for additional sequence information.[0064]
The algorithm preferably used in process steps 212 and 230 is BLAST (Altschul et al., J. Mol. Biol., 1990, 215, 403), or ‘Gapped BLAST’ (Altschul et al., Nucl. Acids Res., 1997, 25, 3389). These are database search tools based on sequence homology used to identify related sequences in a sequence database. The BLAST search parameters are set to only identify closely related sequences. Some preferred databases searched by BLAST are a combination of public domain and proprietary databases. The databases, their contents, and sources are listed in Table 1.[0065]
[0066] When genomic sequence information is available at decision step 215, introns are removed and exons are assembled into a continuous sequence representing the cDNA sequence in process step 220. Exon assembly occurs using the Phragment Assembly Program ‘Phrap’ (Copyright University of Washington Genome Center, Seattle, Wash.). The Phrap algorithm analyzes sets of overlapping sequences and assembles them into one continuous sequence referred to as a ‘contig’. The resulting contig is preferably used to search databases for additional sequence information at process step 230. When genomic information is not available, the results of process step 212 are analyzed for individual exons at decision step 225. Exons are frequently recorded individually in databases. If multiple complete exons are identified, they are preferably assembled into a contig using Phrap at process step 250. If multiple complete exons are not identified at decision step 225, then sequences can be analyzed for partial sequence information in decision step 228. ESTs identified in the database dbEST are examples of such partial sequence information. If additional partial information is not found, then the process is advanced to process step 230 at decision step 228. If partial sequence information is found in process step 212, then that information is advanced to process step 230 via decision step 228.
[0067] Process step 230, decision step 240, decision step 260 and process step 250 define a loop designed to extend iteratively the amount of sequence information available for targeting. At the end of each iteration of this loop, the results are analyzed in decision steps 240 and 260. If no new information is found, then the process advances at decision step 240 to process step 300. If there is an unexpectedly large amount of sequence information identified, then the process is preferably cycled back one iteration and that sequence is advanced at decision step 240 to process step 300. If a small amount of new sequence information is identified, then the loop is iterated, such as by taking the 100 most 5-prime (5′) and 100 most 3-prime (3′) bases and iterating them through the BLAST homology search at process step 230. New sequence information is added to the existing contig at process step 250.
[0068] This loop is iterated until either no new sequence information is identified at decision step 240, or an unexpectedly large amount of new information is found at decision step 260, suggesting that the process moved outside the boundary of the gene into repetitive genomic sequence. In either of these cases, iteration of this loop is preferably stopped and the process advanced to the oligonucleotide design at process step 300.
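The control flow of this extension loop can be sketched as follows. This is an illustrative outline only: `search` is a hypothetical caller-supplied function standing in for the homology search of step 230, and the `max_gain` threshold for an "unexpectedly large" amount of new sequence and the `max_rounds` cap are assumed values, not values taken from this specification.

```python
def extend_contig(contig, search, flank=100, max_gain=5000, max_rounds=50):
    """Iteratively extend `contig` at both ends (hypothetical sketch).

    `search(five_prime_end, three_prime_end)` must return a pair of strings
    (left_extension, right_extension) holding any newly found sequence.
    """
    for _ in range(max_rounds):
        # Query with the most 5' and most 3' bases of the current contig.
        left, right = search(contig[:flank], contig[-flank:])
        gained = len(left) + len(right)
        if gained == 0:
            break  # no new sequence information: stop iterating
        if gained > max_gain:
            break  # unexpectedly large gain: keep the previous contig
        contig = left + contig + right  # add new sequence to the contig
    return contig
```

Breaking out of the loop before the contig is updated corresponds to cycling back one iteration when the search appears to have left the gene boundary.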
3. In Silico Generation of a Set of Nucleobase Sequences and Virtual Oligonucleotides
[0069] The following steps 300 and 400 may be performed in the order described below, i.e., step 300 before step 400, or, in an alternative embodiment of the invention, step 400 before step 300. In this alternative embodiment, each oligonucleotide chemistry is first assigned to each oligonucleotide sequence. Then, each combination of oligonucleotide chemistry and sequence is evaluated according to the parameters of step 300. This embodiment has the desirable feature of taking into account the effect of alternative oligonucleotide chemistries on such parameters. For example, substitution of 5-methyl cytosine (5MeC or m5c) for cytosine in an antisense compound may enhance the stability of a duplex formed between that compound and its target nucleic acid. Other oligonucleotide chemistries that enhance oligonucleotide:[target nucleic acid] duplexes are known in the art (see, for example, Freier et al., Nucleic Acids Research, 1997, 25, 4429). As will be appreciated by those skilled in the art, different oligonucleotide chemistries may be preferred for different target nucleic acids. That is, the optimal oligonucleotide chemistry for binding to a target DNA might be suboptimal for binding to a target RNA having the same nucleotide sequence.
[0070] In effecting the process of the invention in the order step 300 before step 400 as seen in FIG. 1, from a target nucleic acid sequence assembled at step 200, a list of oligonucleotide sequences is generated as represented in the flowchart shown in FIGS. 4 and 5. In step 302, the desired oligonucleotide length is chosen. In a preferred embodiment, oligonucleotide length is from about 8 to about 30, more preferably from about 12 to about 25, nucleotides. In step 304, all possible oligonucleotide sequences of the desired length capable of hybridizing to the target sequence obtained in step 200 are generated. In this step, a series of oligonucleotide sequences is generated simply by determining the most 5′ oligonucleotide possible and ‘walking’ the target sequence in increments of one base until the most 3′ oligonucleotide possible is reached.
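The ‘walking’ of step 304 amounts to a sliding window over the target, as the following minimal sketch suggests. It assumes that the oligonucleotides are written 5′ to 3′ as DNA (T rather than U) and are the reverse complements of each target site; the function name is hypothetical.

```python
# Map each target base to its Watson-Crick partner in a DNA oligonucleotide.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligos(target, length):
    """Yield (start, oligo) for every window of `length` bases in `target`.

    `start` is the 0-based position of the target site counted from the
    5' end of the target; `oligo` is the complementary sequence, 5' to 3'.
    """
    target = target.upper()
    for start in range(len(target) - length + 1):
        site = target[start:start + length]
        yield start, site.translate(COMPLEMENT)[::-1]
```

For a target of N bases and an oligonucleotide length of L, this yields N − L + 1 candidate sequences.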
[0071] In step 305, a virtual oligonucleotide chemistry is applied to the nucleobase sequences of step 304 in order to yield a set of virtual oligonucleotides that can be evaluated in silico. Default virtual oligonucleotide chemistries include those that are well-characterized in terms of their physical and chemical properties, e.g., 2′-deoxyribonucleic acid having naturally occurring bases (A, T, C and G), unmodified sugar residues and a phosphodiester backbone.
4. In Silico Evaluation of Thermodynamic Properties of Virtual Oligonucleotides
[0072] In step 306, a series of thermodynamic, sequence, and homology scores are preferably calculated for each virtual oligonucleotide obtained from step 305. Thermodynamic properties are calculated as represented in FIG. 6. In step 308, the desired thermodynamic properties are selected. This will typically include step 309, calculation of the free energy of the target structure. If the oligonucleotide is a DNA molecule, then steps 310, 312, and 314 are performed. If the oligonucleotide is an RNA molecule, then steps 311, 313 and 315 are performed. In both cases, these steps correspond to calculation of the free energy of intramolecular oligonucleotide interactions, intermolecular interactions and duplex formation. In addition, a free energy of oligonucleotide-target binding is preferably calculated at step 316.
[0073] Other thermodynamic and kinetic properties may be calculated for oligonucleotides as represented at step 317. Such other thermodynamic and kinetic properties may include melting temperatures, association rates, dissociation rates, or any other physical property that may be predictive of oligonucleotide activity.
[0074] The free energy of the target structure is defined as the free energy needed to disrupt any secondary structure in the target binding site of the targeted nucleic acid. This region includes any intra-target nucleotide base pairs that need to be disrupted in order for an oligonucleotide to bind to its complementary sequence. The effect of this localized disruption of secondary structure is to provide accessibility by the oligonucleotide. Such structures will include double helices, terminal unpaired and mismatched nucleotides, and loops, including hairpin loops, bulge loops, internal loops and multibranch loops (Serra et al., Methods in Enzymology, 1995, 259, 242).
[0075] The intermolecular free energies refer to the inherent energy due to the most stable structure formed by two oligonucleotides; such structures include dimer formation. Intermolecular free energies should also be taken into account when, for example, two or more oligonucleotides of different sequence are to be administered to the same cell in an assay.
[0076] The intramolecular free energies refer to the energy needed to disrupt the most stable secondary structure within a single oligonucleotide. Such structures include, for example, hairpin loops, bulges and internal loops. The degree of intramolecular base pairing is indicative of the energy needed to disrupt such base pairing.
[0077] The free energy of duplex formation is the free energy of denatured oligonucleotide binding to its denatured target sequence. The oligonucleotide-target binding is the total binding involved, and includes the energies involved in opening up intra- and inter-molecular oligonucleotide structures, opening up target structure, and duplex formation.
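One way the terms just described might be combined is sketched below. The sign convention and units are assumptions made for this illustration: the duplex-formation term is taken to be negative when favorable, each structure-opening term is taken to be a positive cost, and all values are in kcal/mol. The function and parameter names are hypothetical.

```python
def overall_binding_free_energy(duplex, open_target, open_intra, open_inter=0.0):
    """Estimate the oligonucleotide-target binding free energy (kcal/mol).

    Assumed convention: `duplex` is negative when duplex formation is
    favorable; `open_target`, `open_intra` and `open_inter` are positive
    costs of disrupting target structure and intra- and intermolecular
    oligonucleotide structure, respectively.
    """
    return duplex + open_target + open_intra + open_inter
```

Under this convention, a more negative result indicates more favorable overall binding.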
[0078] The most stable RNA structure is predicted based on nearest neighbor analysis (Serra et al., Methods in Enzymology, 1995, 259, 242). This analysis is based on the assumption that the stability of a given base pair is determined by the adjacent base pairs. For each possible nearest neighbor combination, thermodynamic properties have been determined and are provided. For double helical regions, two additional factors need to be considered: an entropy change required to initiate a helix and an entropy change associated with self-complementary strands only. Thus, the free energy of a duplex can be calculated using the equation:
ΔG°_T = ΔH° − TΔS°
[0079] where:
[0080] ΔG°_T is the free energy of duplex formation at temperature T,
[0081] ΔH° is the enthalpy change for each nearest neighbor,
[0082] ΔS° is the entropy change for each nearest neighbor, and T is temperature.
[0083] The ΔH and ΔS for each possible nearest neighbor combination have been experimentally determined. These latter values are often available in published tables. For terminal unpaired and mismatched nucleotides, enthalpy and entropy measurements for each possible nucleotide combination are also available in published tables. Such results are added directly to values determined for duplex formation. For loops, while the available data are not as complete or accurate as for base pairing, one known model determines the free energy of loop formation as the sum of free energy terms based on loop size, the closing base pair, the interactions between the first mismatch of the loop and the closing base pair, and additional factors including being closed by AU or UA or a first mismatch of GA or UU. Such equations may also be used for oligoribonucleotide-target RNA interactions.
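The nearest neighbor summation described above can be sketched as follows. This is a simplified illustration under stated assumptions: the table is supplied by the caller (it should be populated from published parameters, which are not reproduced here), it is keyed by the 5′ to 3′ dinucleotide step of one strand, ΔH is in kcal/mol and ΔS in cal/(mol·K). Terminal mismatch and loop contributions are omitted, and all names are hypothetical.

```python
def duplex_free_energy(seq, nn_table, initiation=(0.0, 0.0),
                       symmetry_ds=0.0, temp_k=310.15):
    """Return (dH, dS, dG) for the duplex of `seq` with its complement.

    nn_table    maps each dinucleotide step, e.g. "AC", to (dH, dS)
    initiation  (dH, dS) contribution for initiating the helix
    symmetry_ds entropy term applied to self-complementary strands only
    temp_k      temperature in kelvin (310.15 K is 37 degrees C)
    """
    dh, ds = initiation
    ds += symmetry_ds
    for i in range(len(seq) - 1):
        step_dh, step_ds = nn_table[seq[i:i + 2]]
        dh += step_dh
        ds += step_ds
    # dG = dH - T * dS, converting dS from cal to kcal.
    dg = dh - temp_k * ds / 1000.0
    return dh, ds, dg
```

For example, with a placeholder table assigning every step ΔH = −10 kcal/mol and ΔS = −20 cal/(mol·K) (illustrative numbers only, not published values), a three-base sequence at 300 K gives ΔH = −20, ΔS = −40 and ΔG = −20 − 300 × (−0.040) = −8 kcal/mol.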
[0084] The stability of DNA duplexes is used in the case of intra- or intermolecular oligodeoxyribonucleotide interactions. DNA duplex stability is calculated using equations similar to those used for RNA stability, except that experimentally determined values differ between nearest neighbors in DNA and RNA, and helix initiation tends to be more favorable in DNA than in RNA (SantaLucia et al., Biochemistry, 1996, 35, 3555).
[0085] Additional thermodynamic parameters are used in the case of RNA/DNA hybrid duplexes. This would be the case for an RNA target and an oligodeoxynucleotide. Such parameters were determined by Sugimoto et al. (Biochemistry, 1995, 34, 11211). In addition to values for nearest neighbors, differences were seen in values for the enthalpy of helix initiation.
5. In Silico Evaluation of Target Accessibility
[0086] Target accessibility is believed to be an important consideration in selecting oligonucleotides. An accessible target site will possess minimal secondary structure and thus will require minimal energy to disrupt such structure. In addition, secondary structure in oligonucleotides, whether inter- or intra-molecular, is undesirable due to the energy required to disrupt such structures. Oligonucleotide-target binding is dependent on both of these factors. It is desirable to minimize the contributions of secondary structure based on these factors. The other contribution to oligonucleotide-target binding is binding affinity. Favorable binding affinities based on tighter base pairing at the target site are desirable.
[0087] Following the calculation of thermodynamic properties ending at step 317, the desired sequence properties to be scored are selected at step 324. These properties include the number of strings of four guanosine residues in a row (step 325) or three guanosines in a row (step 326), the length of the longest string of adenosines (step 327), cytidines (step 328) or uridines or thymidines (step 329), the length of the longest string of purines (step 330) or pyrimidines (step 331), the percent composition of adenosine (step 332), cytidine (step 333), guanosine (step 334) or uridines or thymidines (step 335), the percent composition of purines (step 336) or pyrimidines (step 337), the number of CG dinucleotide repeats (step 338), CA dinucleotide repeats (step 339) or UA or TA dinucleotide repeats (step 340). In addition, other sequence properties may be used as found to be relevant and predictive of antisense efficacy, as represented at step 341.
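A few of the sequence properties listed above can be computed as in the following sketch. Several choices here are assumptions made for illustration: guanosine strings are counted without overlap, U is treated as T, the dinucleotide score is taken to be the largest number of consecutive tandem copies, and the input sequence is assumed to be non-empty. The function and key names are hypothetical.

```python
import re


def longest_run(seq, chars):
    """Length of the longest uninterrupted run of any base in `chars`."""
    runs = re.findall(f"[{chars}]+", seq)
    return max((len(r) for r in runs), default=0)


def tandem_repeats(seq, unit):
    """Largest number of consecutive copies of `unit` found in `seq`."""
    runs = re.findall(f"(?:{unit})+", seq)
    return max((len(r) // len(unit) for r in runs), default=0)


def sequence_scores(seq):
    """Compute a subset of the sequence scores for one oligonucleotide."""
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    return {
        "g4_strings": seq.count("GGGG"),   # non-overlapping count
        "g3_strings": seq.count("GGG"),    # non-overlapping count
        "longest_A": longest_run(seq, "A"),
        "longest_purine": longest_run(seq, "AG"),
        "longest_pyrimidine": longest_run(seq, "CT"),
        "pct_G": 100.0 * seq.count("G") / n,
        "pct_purine": 100.0 * (seq.count("A") + seq.count("G")) / n,
        "max_CG_repeats": tandem_repeats(seq, "CG"),
    }
```

The remaining properties (for example, percent composition of the other bases, or CA and TA repeats) follow the same pattern.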
[0088] These sequence properties may be important in predicting oligonucleotide activity, or lack thereof. For example, U.S. Pat. No. 5,523,389 discloses oligonucleotides containing stretches of three or four guanosine residues in a row. Oligonucleotides having such sequences may act in a sequence-independent manner. For an antisense approach, such a mechanism is not usually desired. In addition, high numbers of dinucleotide repeats may be indicative of low complexity regions, which may be present in large numbers of unrelated genes. Unequal base composition, for example, 90% adenosine, can also give non-specific effects. From a practical standpoint, it may be desirable to remove oligonucleotides that possess long stretches of other nucleotides due to synthesis considerations. Other sequence properties, either listed above or later found to be of predictive value, may be used to select oligonucleotide sequences.
[0089] Following step 341, the homology scores to be calculated are selected in step 342. Homology to nucleic acids encoding protein isoforms of the target, as represented at step 343, may be desired. For example, oligonucleotides specific for an isoform of protein kinase C can be selected. Also, oligonucleotides can be selected to target multiple isoforms of such genes. Homology to analogous target sequences, as represented at step 344, may also be desired. For example, an oligonucleotide can be selected to a region common to both humans and mice to facilitate testing of the oligonucleotide in both species. Homology to splice variants of the target nucleic acid, as represented at step 345, may be desired. In addition, it may be desirable to determine homology to other sequence variants as necessary, as represented in step 346.
[0090] Following step 346, by which point scores have been obtained for each selected parameter, a desired range of scores is chosen for each parameter in order to select the most promising oligonucleotides, as represented at step 347. Typically, only several parameters will be used to select oligonucleotide sequences. As structure prediction improves, additional parameters may be used. Once the desired score ranges are chosen, a list of all oligonucleotides having parameters falling within those ranges will be generated, as represented at step 348.
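The selection by score range can be sketched as a simple filter. The data layout is an assumption made for this illustration: each oligonucleotide is paired with a dictionary of its scores, and each selected parameter is given an inclusive (low, high) range. The function name is hypothetical.

```python
def select_by_ranges(scored, ranges):
    """Return the oligos whose selected scores all fall within range.

    scored  list of (oligo, scores) pairs, where scores is a dict
    ranges  dict mapping a score name to an inclusive (low, high) pair;
            scores not named in `ranges` are ignored
    """
    return [
        oligo
        for oligo, scores in scored
        if all(low <= scores[name] <= high
               for name, (low, high) in ranges.items())
    ]
```

Because only the parameters named in `ranges` are checked, using "only several parameters" simply means supplying a short `ranges` dictionary.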
6. Targeting Oligonucleotides to Functional Regions of a Nucleic Acid
[0091] It may be desirable to target oligonucleotide sequences to specific functional regions of the target nucleic acid. A decision is made whether to target such regions, as represented in decision step 349. If it is desired to target functional regions, then process step 350 occurs, as seen in greater detail in FIG. 9. If it is not desired, then the process proceeds to step 375.
[0092] In step 350, as seen in FIG. 9, the desired functional regions are selected. Such regions include the transcription start site or 5′ cap (step 353), the 5′ untranslated region (step 354), the start codon (step 355), the coding region (step 356), the stop codon (step 357), the 3′ untranslated region (step 358), 5′ splice sites (step 359) or 3′ splice sites (step 360), specific exons (step 361) or specific introns (step 362), mRNA stabilization signal (step 363), mRNA destabilization signal (step 364), poly-adenylation signal (step 365), poly-A addition site (step 366), poly-A tail (step 367), or the gene sequence 5′ of known pre-mRNA (step 368). In addition, additional functional sites may be selected, as represented at step 369.
[0093] Many functional regions are important to the proper processing of the gene and are attractive targets for antisense approaches. For example, the AUG start codon is commonly targeted because it is necessary to initiate translation. In addition, splice sites are thought to be attractive targets because these regions are important for processing of the mRNA. Other known sites may be more accessible because of interactions with protein factors or other regulatory molecules.
[0094] After the desired functional regions are selected and determined, a subset of all previously selected oligonucleotides is selected based on hybridization to only those desired functional regions, as represented by step 370.
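The subsetting of step 370 can be sketched as an interval-overlap test. The coordinate conventions are assumptions made for this illustration: positions are 0-based, regions are half-open (start, end) intervals on the target, and an oligonucleotide is kept if its target site overlaps a selected region by at least one base. A stricter criterion, such as full containment, could be substituted. The names are hypothetical.

```python
def oligos_in_regions(oligos, regions):
    """Keep oligos whose target site overlaps any selected region.

    oligos   list of (start, length) target sites
    regions  dict mapping a region name to a half-open (start, end) pair
    """
    selected = []
    for start, length in oligos:
        end = start + length
        if any(start < r_end and end > r_start
               for r_start, r_end in regions.values()):
            selected.append((start, length))
    return selected
```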
7. Uniform Distribution of Oligonucleotides
[0095] Whether or not targeting functional sites is desired, a large number of oligonucleotide sequences may result from the process thus far. In order to reduce the number of oligonucleotide sequences to a manageable number, a decision is made whether to uniformly distribute selected oligonucleotides along the target, as represented in step 375. A uniform distribution of oligonucleotide sequences will aim to provide complete coverage throughout the complete target nucleic acid or the selected functional regions. A utility is used to automate the distribution of sequences, as represented in step 380. Such a utility factors in parameters such as the length of the target nucleic acid, the total number of oligonucleotide sequences desired, the number of oligonucleotide sequences per unit length, and the number of oligonucleotide sequences per functional region. Manual selection of oligonucleotide sequences is also provided for by step 385. In some cases, it may be desirable to manually select oligonucleotide sequences. For example, it may be useful to determine the effect of small base shifts on activity. Once the desired number of oligonucleotide sequences is obtained from either step 380 or step 385, these oligonucleotide sequences are passed on to step 400 of the process, where oligonucleotide chemistries are assigned.
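One simple way such a distribution utility might work is sketched below. This is not the utility of step 380 but an assumed strategy for illustration: the target is divided into equal segments, and for each segment the remaining candidate whose start position lies closest to the segment center is chosen. The function and parameter names are hypothetical.

```python
def distribute_uniformly(starts, target_length, count):
    """Pick up to `count` start positions spread evenly along the target.

    starts         candidate oligo start positions (0-based)
    target_length  length of the target nucleic acid, in bases
    count          total number of oligonucleotide sequences desired
    """
    chosen = []
    for k in range(count):
        center = (k + 0.5) * target_length / count
        remaining = [s for s in starts if s not in chosen]
        if not remaining:
            break  # fewer candidates than requested
        # Ties are broken in favor of the more 5' position.
        chosen.append(min(remaining, key=lambda s: (abs(s - center), s)))
    return sorted(chosen)
```

The same function could be applied to each functional region separately to control the number of sequences per region.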
8. Assignment of Actual Oligonucleotide Chemistry
[0096] Once a set of select nucleobase sequences has been generated according to the preceding process and decision steps, actual oligonucleotide chemistry is assigned to the sequences. An ‘actual oligonucleotide chemistry’ or simply ‘chemistry’ is a chemical motif that is common to a particular set of robotically synthesized oligonucleotide compounds. Preferred chemistries include, but are not limited to, oligonucleotides in which every linkage is a phosphorothioate linkage, and chimeric oligonucleotides in which a defined number of 5′ and/or 3′ terminal residues have a 2′-methoxyethoxy modification.
[0097] Chemistries can be assigned to the nucleobase sequences during general procedure step 400 (FIG. 1). The logical basis for chemistry assignment is illustrated in FIGS. 10 and 11, and an iterative routine for stepping through an oligonucleotide nucleoside by nucleoside is illustrated in FIG. 12. Chemistry assignment can be effected by assignment directly into a word processing program, via an interactive word processing program or via automated programs and devices. In each of these instances, the output file is selected to be in a format that can serve as an input file to automated synthesis devices.
9. Oligonucleotide Compounds
[0098] In the context of this invention, the term ‘oligonucleotide’ is used to refer to an oligomer or polymer of ribonucleic acid (RNA) or deoxyribonucleic acid (DNA) or mimetics thereof. Thus, this term includes oligonucleotides composed of naturally-occurring nucleobases, sugars and covalent internucleoside (backbone) linkages as well as oligonucleotides having non-naturally-occurring portions which function similarly. Such modified or substituted oligonucleotides are often preferred over native forms, i.e., phosphodiester linked A, C, G, T and U nucleosides, because of desirable properties such as, for example, enhanced cellular uptake, enhanced affinity for nucleic acid target and increased stability in the presence of nucleases.
[0099] The oligonucleotide compounds in accordance with this invention can be of various lengths depending on various parameters, including but not limited to those discussed above in reference to the selection criteria of general procedure 300. For use as antisense oligonucleotides, compounds of the invention preferably are from about 8 to about 30 nucleobases in length. Particularly preferred are antisense oligonucleotides comprising from about 12 to about 25 nucleobases (i.e., from about 12 to about 25 linked nucleosides). A discussion of antisense oligonucleotides and some desirable modifications can be found in De Mesmaeker et al., Acc. Chem. Res., 1995, 28, 366. Other lengths of oligonucleotides might be selected for non-antisense targeting strategies, for instance using the oligonucleotides as ribozymes. Such ribozymes normally require oligonucleotides of longer length, as is known in the art.
[0100] A nucleoside is a base-sugar combination. The base portion of the nucleoside is normally a heterocyclic base. The two most common classes of such heterocyclic bases are the purines and the pyrimidines. Nucleotides are nucleosides that further include a phosphate group covalently linked to the sugar portion of the nucleoside. For those nucleosides that include a normal (where normal is defined as being found in RNA and DNA) pentofuranosyl sugar, the phosphate group can be linked to either the 2′, 3′ or 5′ hydroxyl moiety of the sugar. In forming oligonucleotides, the phosphate groups covalently link adjacent nucleosides to one another to form a linear polymeric compound. In turn, the respective ends of this linear polymeric structure can be further joined to form a circular structure; however, open linear structures are generally preferred. Within the oligonucleotide structure, the phosphate groups are commonly referred to as forming the internucleoside backbone of the oligonucleotide. The normal linkage or backbone of RNA and DNA is a 3′ to 5′ phosphodiester linkage.
[0101] Specific examples of preferred oligonucleotides useful in this invention include oligonucleotides containing modified backbones or non-natural internucleoside linkages. As defined in this specification, oligonucleotides having modified backbones include those that retain a phosphorus atom in the backbone and those that do not have a phosphorus atom in the backbone. For the purposes of this specification, and as sometimes referenced in the art, modified oligonucleotides that do not have a phosphorus atom in their internucleoside backbone can also be considered to be oligonucleosides.
10. Selection of Oligonucleotide Chemistries
[0102] In a general logic scheme as illustrated in FIGS. 10 and 11, for each nucleoside position, the user or automated device is interrogated first for a base assignment, followed by a sugar assignment, a linker assignment and finally a conjugate assignment. Thus, for each nucleoside, at process step 410 a base is selected. In selecting the base, base chemistry 1 can be selected at process step 412, or one or more alternative bases are selected at process steps 414, 416 and 418. After base selection is effected, the sugar portion of the nucleoside is selected. Thus, for each nucleoside, at process step 420 a sugar is selected that together with the select base will complete the nucleoside. In selecting the sugar, sugar chemistry 1 can be selected at process step 422, or one or more alternative sugars are selected at process steps 424, 426 and 428. For each two adjacent nucleoside units, at process step 430, the internucleoside linker is selected. The linker chemistry for the internucleoside linker can be linker chemistry 1 selected at process step 432, or one or more alternative internucleoside linker chemistries are selected at process steps 434, 436 and 438.
[0103] In addition to the base, sugar and internucleoside linkage, at each nucleoside position, one or more conjugate groups can be attached to the oligonucleotide via attachment to the nucleoside or attachment to the internucleoside linkage. The addition of a conjugate group is integrated at process step 440 and the assignment of the conjugate group is effected at process step 450.
[0104] For illustrative purposes in FIGS. 10 and 11, for each of the base, the sugar, the internucleoside linkers, or the conjugate, chemistries 1 through n are illustrated. As described in this specification, it is understood that the number of alternate chemistries between chemistry 1 and alternative chemistry n, for each of the base, the sugar, the internucleoside linkage and the conjugate, is variable and includes, but is not limited to, each of the specific alternative bases, sugars, internucleoside linkers and conjugates identified in this specification as well as equivalents known in the art.
[0105] Utilizing the logic as described in conjunction with FIGS. 10 and 11, chemistry is assigned, as is shown in FIG. 12, to the list of oligonucleotides from general procedure 300. In assigning chemistries to the oligonucleotides in this list, a pointer can be set at process step 452 to the first oligonucleotide in the list and at step 453 to the first nucleotide of that first oligonucleotide. The base chemistry is selected at step 410, as described above, the sugar chemistry is selected at step 420, also as described above, followed by selection of the internucleoside linkage at step 430, also as described above. At decision step 440, the process branches depending on whether a conjugate will be added at the current nucleotide position. If a conjugate is desired, the conjugate is selected at step 450, also as described above.
[0106] Whether or not a conjugate was added at decision step 440, an inquiry is made at decision step 454. This inquiry asks if the pointer resides at the last nucleotide in the current oligonucleotide. If the result at decision step 454 is ‘No’, the pointer is moved to the next nucleotide in the current oligonucleotide and the loop including steps 410, 420, 430, 440 and 454 is repeated. This loop is reiterated until the result at decision step 454 is ‘Yes.’
[0107] When the result at decision step 454 is ‘Yes’, a query is made at decision step 460 concerning the location of the pointer in the list of oligonucleotides. If the pointer is not at the last oligonucleotide of the list, the ‘No’ path of decision step 460 is followed and the pointer is moved to the next oligonucleotide in the list at process step 458. With the pointer set to the next oligonucleotide in the list, the loop that starts at process step 453 is reiterated. When the result at decision step 460 is ‘Yes’, chemistry has been assigned to all of the nucleotides in the list of oligonucleotides.
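The two nested pointer loops described above can be sketched as follows. The particular chemistry chosen here is only one illustrative combination drawn from chemistries named in this specification (5-methylcytosine in place of cytosine, 2′-methoxyethoxy sugars on a defined number of terminal residues, phosphorothioate linkages throughout); the `wing` parameter, the dictionary layout and the treatment of the last nucleoside as having no following linker are assumptions of this sketch.

```python
def assign_chemistry(oligos, wing=2):
    """Assign an illustrative chemistry to every nucleoside of every oligo.

    oligos  list of nucleobase sequences written 5' to 3'
    wing    number of 5' and 3' terminal residues given a 2'-MOE sugar
    """
    assigned = []
    for sequence in oligos:                  # pointer over the list (452, 458, 460)
        last = len(sequence) - 1
        nucleosides = []
        for i, base in enumerate(sequence):  # pointer over nucleotides (453, 454)
            in_wing = i < wing or i > last - wing
            nucleosides.append({
                # step 410: base selection
                "base": "5-methylcytosine" if base == "C" else base,
                # step 420: sugar selection
                "sugar": "2'-MOE" if in_wing else "2'-deoxy",
                # step 430: linker to the next nucleoside, if there is one
                "linker": "phosphorothioate" if i < last else None,
                # steps 440 and 450: no conjugate in this example
                "conjugate": None,
            })
        assigned.append(nucleosides)
    return assigned
```

The resulting nested list could then be serialized into whatever input format a given automated synthesizer expects.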
11. Description of Oligonucleotide Chemistries
[0108] As is illustrated in FIG. 10, for each nucleoside of an oligonucleotide, chemistry selection includes selection of the base forming the nucleoside from a large palette of different base units available. These may be ‘modified’ or ‘natural’ bases (also referred to herein as nucleobases), including the natural purine bases adenine (A) and guanine (G), and the natural pyrimidine bases thymine (T), cytosine (C) and uracil (U). They further can include modified nucleobases, including other synthetic and natural nucleobases such as 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo uracils and cytosines, particularly 5-bromo, 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine and 7-methyladenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine and 7-deazaadenine and 3-deazaguanine and 3-deazaadenine. Further nucleobases include those disclosed in U.S. Pat. No. 3,687,808, those disclosed in the Concise Encyclopedia Of Polymer Science And Engineering, pages 858-859, Kroschwitz, J. I., ed., John Wiley & Sons, 1990, those disclosed by Englisch et al., Angewandte Chemie, International Edition, 1991, 30, 613, and those disclosed by Sanghvi, Y. S., Chapter 15, Antisense Research and Applications, pages 289-302, Crooke, S. T. and Lebleu, B., ed., CRC Press, 1993. Certain of these nucleobases are particularly useful for increasing the binding affinity of the oligomeric compounds of the invention. These include 5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine. 5-Methylcytosine substitutions have been shown to increase nucleic acid duplex stability by 0.6-1.2° C. (Sanghvi, Y. S., Crooke, S. T. and Lebleu, B., eds., Antisense Research and Applications, CRC Press, Boca Raton, 1993, pp. 276-278) and are presently preferred for selection as the base. These are particularly useful when combined with 2′-methoxyethyl sugar modifications, described below.
[0109] Representative United States patents that teach the preparation of certain of the above noted modified nucleobases as well as other modified nucleobases include, but are not limited to, the above noted U.S. Pat. No. 3,687,808, as well as U.S. Pat. Nos. 4,845,205; 5,130,302; 5,134,066; 5,175,273; 5,367,066; 5,432,272; 5,457,187; 5,459,255; 5,484,908; 5,502,177; 5,525,711; 5,552,540; 5,587,469; 5,594,121; 5,596,091; 5,614,617; and 5,681,941. Reference is also made to allowed U.S. patent application Ser. No. 08/762,488, filed on Dec. 10, 1996, commonly owned with the present application and herein incorporated by reference.
[0110] In selecting the base for any particular nucleoside of an oligonucleotide, consideration is first given to the need of a base for a particular specificity for hybridization to an opposing strand of a particular target. Thus, if an ‘A’ base is required, adenine might be selected; however, other alternative bases that can effect hybridization in a manner mimicking an ‘A’ base, such as 2,6-diaminopurine, might be selected should other considerations, e.g., stronger hybridization (relative to hybridization achieved with adenine), be desired.
[0111] As is illustrated in FIG. 10, for each nucleoside of an oligonucleotide, chemistry selection includes selection of the sugar forming the nucleoside from a large palette of different sugar or sugar surrogate units available. These may be modified sugar groups, for instance sugars containing one or more substituent groups. Preferred substituent groups comprise the following at the 2′ position: OH; F; O-, S-, or N-alkyl, O-, S-, or N-alkenyl, or O-, S- or N-alkynyl, wherein the alkyl, alkenyl and alkynyl may be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl. Particularly preferred are O[(CH2)nO]mCH3, O(CH2)nOCH3, O(CH2)nNH2, O(CH2)nCH3, O(CH2)nONH2, and O(CH2)nON[(CH2)nCH3]2, where n and m are from 1 to about 10. Other preferred substituent groups comprise one of the following at the 2′ position: C1 to C10 lower alkyl, substituted lower alkyl, alkaryl, aralkyl, O-alkaryl or O-aralkyl, SH, SCH3, OCN, Cl, Br, CN, CF3, OCF3, SOCH3, SO2CH3, ONO2, NO2, N3, NH2, heterocycloalkyl, heterocycloalkaryl, aminoalkylamino, polyalkylamino, substituted silyl, an RNA cleaving group, a reporter group, an intercalator, a group for improving the pharmacokinetic properties of an oligonucleotide, or a group for improving the pharmacodynamic properties of an oligonucleotide, and other substituents having similar properties. A preferred modification includes 2′-methoxyethoxy (2′-O-CH2CH2OCH3, also known as 2′-O-(2-methoxyethyl) or 2′-MOE) (Martin et al., Helv. Chim. Acta, 1995, 78, 486), i.e., an alkoxyalkoxy group. A further preferred modification includes 2′-dimethylaminooxyethoxy, i.e., a O(CH2)2ON(CH3)2 group, also known as 2′-DMAOE, as described in co-owned U.S. patent application Ser. No. 09/016,520, filed on Jan. 30, 1998, the contents of which are herein incorporated by reference.
Other preferred modifications include 2′-methoxy (2′-O—CH3), 2′-aminopropoxy (2′-OCH2CH2CH2NH2) and 2′-fluoro (2′-F). Similar modifications may also be made at other positions on the sugar group, particularly the 3′ position of the sugar on the 3′ terminal nucleotide or in 2′-5′ linked oligonucleotides and the 5′ position of the 5′ terminal nucleotide. The nucleosides of the oligonucleotides may also have sugar mimetics such as cyclobutyl moieties in place of the pentofuranosyl sugar.[0112]
Representative United States patents that teach the preparation of such modified sugar structures include, but are not limited to, U.S. Pat. Nos. 4,981,957; 5,118,800; 5,319,080; 5,359,044; 5,393,878; 5,446,137; 5,466,786; 5,514,785; 5,519,134; 5,567,811; 5,576,427; 5,591,722; 5,597,909; 5,610,300; 5,627,053; 5,639,873; 5,646,265; 5,658,873; 5,670,633; and 5,700,920, certain of which are commonly owned with the present application, each of which is herein incorporated by reference, together with allowed U.S. patent application Ser. No. 08/468,037, filed on Jun. 5, 1995, which is commonly owned with the present application and is herein incorporated by reference.[0113]
As is illustrated in FIG. 10, for each adjacent pair of nucleosides of an oligonucleotide, chemistry selection includes selection of the internucleoside linkage. These internucleoside linkages are also referred to as linkers, backbones or oligonucleotide backbones. For forming these nucleoside linkages, a palette of different internucleoside linkages or backbones is available. These include modified oligonucleotide backbones, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates including 3′-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates including 3′-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3′-5′ linkages, 2′-5′ linked analogs of these, and those having inverted polarity wherein the adjacent pairs of nucleoside units are linked 3′-5′ to 5′-3′ or 2′-5′ to 5′-2′. Various salts, mixed salts and free acid forms are also included.[0114]
Representative United States patents that teach the preparation of the above phosphorus containing linkages include, but are not limited to, U.S. Pat. Nos. 3,687,808; 4,469,863; 4,476,301; 5,023,243; 5,177,196; 5,188,897; 5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131; 5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677; 5,476,925; 5,519,126; 5,536,821; 5,541,306; 5,550,111; 5,563,253; 5,571,799; 5,587,361; 5,625,050; and 5,697,248, certain of which are commonly owned with this application, and each of which is herein incorporated by reference.[0115]
Preferred internucleoside linkages for oligonucleotides that do not include a phosphorus atom therein, i.e., for oligonucleosides, have backbones that are formed by short chain alkyl or cycloalkyl intersugar linkages, mixed heteroatom and alkyl or cycloalkyl intersugar linkages, or one or more short chain heteroatomic or heterocyclic intersugar linkages. These include those having morpholino linkages (formed in part from the sugar portion of a nucleoside); siloxane backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl backbones; methylene formacetyl and thioformacetyl backbones; alkene containing backbones; sulfamate backbones; methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide backbones; amide backbones; and others having mixed N, O, S and CH2 component parts.[0116]
Representative United States patents that teach the preparation of the above oligonucleosides include, but are not limited to, U.S. Pat. Nos. 5,034,506; 5,166,315; 5,185,444; 5,214,134; 5,216,141; 5,235,033; 5,264,562; 5,264,564; 5,405,938; 5,434,257; 5,466,677; 5,470,967; 5,489,677; 5,541,307; 5,561,225; 5,596,086; 5,602,240; 5,608,046; 5,610,289; 5,618,704; 5,623,070; 5,663,312; 5,633,360; 5,677,437; and 5,677,439, certain of which are commonly owned with this application, and each of which is herein incorporated by reference.[0117]
In other preferred oligonucleotides, i.e., oligonucleotide mimetics, both the sugar and the intersugar linkage, i.e., the backbone, of the nucleotide units are replaced with novel groups. The base units are maintained for hybridization with an appropriate nucleic acid target compound. One such oligomeric compound, an oligonucleotide mimetic that has been shown to have excellent hybridization properties, is referred to as a peptide nucleic acid (PNA). In PNA compounds, the sugar-phosphate backbone of an oligonucleotide is replaced with an amide-containing backbone, in particular an aminoethylglycine backbone. The nucleobases are retained and are bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone. Representative United States patents that teach the preparation of PNA compounds include, but are not limited to, U.S. Pat. Nos. 5,539,082; 5,714,331; and 5,719,262, each of which is herein incorporated by reference. Further teaching of PNA compounds can be found in Nielsen et al., Science, 1991, 254, 1497.[0118]
For the internucleoside linkages, the most preferred embodiments of the invention are oligonucleotides with phosphorothioate backbones and oligonucleosides with heteroatom backbones, and in particular —CH2—NH—O—CH2—, —CH2—N(CH3)—O—CH2— [known as a methylene (methylimino) or MMI backbone], —CH2—O—N(CH3)—CH2—, —CH2—N(CH3)—N(CH3)—CH2— and —O—N(CH3)—CH2—CH2— [wherein the native phosphodiester backbone is represented as —O—P—O—CH2—] of the above referenced U.S. Pat. No. 5,489,677, and the amide backbones of the above referenced U.S. Pat. No. 5,602,240. Also preferred are oligonucleotides having morpholino backbone structures of the above-referenced U.S. Pat. No. 5,034,506.[0119]
In attaching a conjugate group to one or more nucleosides or internucleoside linkages of an oligonucleotide, various properties of the oligonucleotide are modified. Thus modification of the oligonucleotides of the invention to chemically link one or more moieties or conjugates to the oligonucleotide is intended to enhance the activity, cellular distribution or cellular uptake of the oligonucleotide. Such moieties include but are not limited to lipid moieties such as a cholesterol moiety (Letsinger et al., Proc. Natl. Acad. Sci. USA, 1989, 86, 6553), cholic acid (Manoharan et al., Bioorg. Med. Chem. Let., 1994, 4, 1053), a thioether, e.g., hexyl-S-tritylthiol (Manoharan et al., Ann. N.Y. Acad. Sci., 1992, 660, 306; Manoharan et al., Bioorg. Med. Chem. Let., 1993, 3, 2765), a thiocholesterol (Oberhauser et al., Nucl. Acids Res., 1992, 20, 533), an aliphatic chain, e.g., dodecandiol or undecyl residues (Saison-Behmoaras et al., EMBO J., 1991, 10, 111; Kabanov et al., FEBS Lett., 1990, 259, 327; Svinarchuk et al., Biochimie, 1993, 75, 49), a phospholipid, e.g., di-hexadecyl-rac-glycerol or triethylammonium 1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate (Manoharan et al., Tetrahedron Lett., 1995, 36, 3651; Shea et al., Nucl. Acids Res., 1990, 18, 3777), a polyamine or a polyethylene glycol chain (Manoharan et al., Nucleosides & Nucleotides, 1995, 14, 969), or adamantane acetic acid (Manoharan et al., Tetrahedron Lett., 1995, 36, 3651), a palmityl moiety (Mishra et al., Biochim. Biophys. Acta, 1995, 1264, 229), or an octadecylamine or hexylamino-carbonyl-oxycholesterol moiety (Crooke et al., J. Pharmacol. Exp. Ther., 1996, 277, 923).[0120]
Representative United States patents that teach the preparation of such oligonucleotide conjugates include, but are not limited to, U.S. Pat. Nos. 4,828,979; 4,948,882; 5,218,105; 5,525,465; 5,541,313; 5,545,730; 5,552,538; 5,578,717; 5,580,731; 5,591,584; 5,109,124; 5,118,802; 5,138,045; 5,414,077; 5,486,603; 5,512,439; 5,578,718; 5,608,046; 4,587,044; 4,605,735; 4,667,025; 4,762,779; 4,789,737; 4,824,941; 4,835,263; 4,876,335; 4,904,582; 4,958,013; 5,082,830; 5,112,963; 5,214,136; 5,245,022; 5,254,469; 5,258,506; 5,262,536; 5,272,250; 5,292,873; 5,317,098; 5,371,241; 5,391,723; 5,416,203; 5,451,463; 5,510,475; 5,512,667; 5,514,785; 5,565,552; 5,567,810; 5,574,142; 5,585,481; 5,587,371; 5,595,726; 5,597,696; 5,599,923; 5,599,928 and 5,688,941, certain of which are commonly owned with the present application, and each of which is herein incorporated by reference.[0121]
12. Chimeric Compounds
It is not necessary for all positions in a given compound to be uniformly modified. In fact, more than one of the aforementioned modifications may be incorporated in a single compound or even at a single nucleoside within an oligonucleotide. The present invention also includes compounds which are chimeric compounds. ‘Chimeric’ compounds or ‘chimeras,’ in the context of this invention, are compounds, particularly oligonucleotides, which contain two or more chemically distinct regions, each made up of at least one monomer unit, i.e., a nucleotide in the case of an oligonucleotide compound. These oligonucleotides typically contain at least one region wherein the oligonucleotide is modified so as to confer upon the oligonucleotide increased resistance to nuclease degradation, increased cellular uptake, and/or increased binding affinity for the target nucleic acid. An additional region of the oligonucleotide may serve as a substrate for enzymes capable of cleaving RNA:DNA or RNA:RNA hybrids.[0122]
By way of example, RNase H is a cellular endonuclease which cleaves the RNA strand of an RNA:DNA duplex. Activation of RNase H, therefore, results in cleavage of the RNA target, thereby greatly enhancing the efficiency of oligonucleotide inhibition of gene expression. Consequently, comparable results can often be obtained with shorter oligonucleotides when chimeric oligonucleotides are used, compared to phosphorothioate deoxyoligonucleotides hybridizing to the same target region. Cleavage of the RNA target can be routinely detected by gel electrophoresis and, if necessary, associated nucleic acid hybridization techniques known in the art.[0123]
Chimeric antisense compounds of the invention may be formed as composite structures representing the union of two or more oligonucleotides, modified oligonucleotides, oligonucleosides and/or oligonucleotide mimetics as described above. Such compounds have also been referred to in the art as “hybrids” or “gapmers”. Representative United States patents that teach the preparation of such hybrid structures include, but are not limited to, U.S. Pat. Nos. 5,013,830; 5,149,797; 5,220,007; 5,256,775; 5,366,878; 5,403,711; 5,491,133; 5,565,350; 5,623,065; 5,652,355; 5,652,356; and 5,700,922, certain of which are commonly owned with the present application and each of which is herein incorporated by reference, together with commonly owned and allowed U.S. patent application Ser. No. 08/465,880, filed on Jun. 6, 1995, also herein incorporated by reference.[0124]
13. Description of Automated Oligonucleotide Synthesis
In the next step of the overall process (illustrated in FIGS. 1 and 2), oligonucleotides are synthesized on an automated synthesizer. Although many devices may be employed, the synthesizer is preferably a variation of the synthesizer described in U.S. Pat. Nos. 5,472,672 and 5,529,756, the entire contents of which are herein incorporated by reference. The synthesizer described in those patents is modified to include movement along the Y axis in addition to movement along the X axis. As so modified, a 96-well array of compounds can be synthesized by the synthesizer. The synthesizer further includes temperature control and the ability to maintain an inert atmosphere during all phases of synthesis. The reagent array delivery format employs orthogonal X-axis motion of a matrix of reaction vessels and Y-axis motion of an array of reagents. Each reagent has its own dedicated plumbing system to eliminate the possibility of cross-contamination of reagents and the need for line flushing and/or pipette washing. This, combined with a high delivery speed obtained with a reagent mapping system, allows for the extremely rapid delivery of reagents. This further allows long and complex reaction sequences to be performed in an efficient and facile manner.[0125]
The software that operates the synthesizer allows the straightforward programming of the parallel synthesis of a large number of compounds. The software utilizes a general synthetic procedure in the form of a command (.cmd) file, which calls upon certain reagents to be added to certain wells via lookup in a sequence (.seq) file. The bottle position, flow rate, and concentration of each reagent are stored in a lookup table (.tab) file. Thus, once any synthetic method has been outlined, a plate of compounds is made by permuting a set of reagents and writing the resulting output to a text file. The text file is input directly into the synthesizer and used for the synthesis of the plate of compounds. The synthesizer is interfaced with a relational database, allowing data output related to the synthesized compounds to be registered in a highly efficient manner.[0126]
Building of the .seq, .cmd and .tab files is illustrated in FIG. 13. Thus as a part of the general oligonucleotide synthesis procedure 500, for each linker chemistry at process step 502, a synthesis file, i.e., a .cmd file, is built at process step 504. This file can be built fresh to reflect a completely new set of machine commands reflecting a set of chemical synthesis steps or it can modify an existing file stored at process step 504 by editing that stored file in process step 508. The .cmd files are built using a word processor and a command set of instructions as outlined below.[0127]
It will be appreciated that the preparation of control software and data files is within the routine skill of persons skilled in automated nucleotide synthesis. The same will depend upon the hardware employed, the chemistries adopted and the design paradigm selected by the operator.[0128]
In like manner to the building of the .cmd files, .tab files are built to reflect the necessary reagents used in the automatic synthesizer for the particular chemistries that have been selected for the linkages, bases, sugars and conjugate chemistries. Thus for each of a set of these chemistries at process step 510, a .tab file is built at process step 512 and stored at process step 514. As with the .cmd files, an existing .tab file can be edited at process step 516.[0129]
Both the .cmd files and the .tab files are linked together at process step 518 and stored for later retrieval in an appropriate sample database 520. Linking can be as simple as using like file names to associate a .cmd file with its appropriate .tab file, e.g., synthesis_1.cmd is linked to synthesis_1.tab by use of the same preamble in their names.[0130]
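The name-based linking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `link_by_stem` and the example file names are hypothetical, and the sample database of process step 520 is stood in for by a plain dictionary.

```python
from pathlib import PurePath

def link_by_stem(filenames):
    """Pair each .cmd file with the .tab file that shares its base name."""
    by_stem = {}
    for name in filenames:
        path = PurePath(name)
        suffix = path.suffix.lower()
        if suffix in (".cmd", ".tab"):
            by_stem.setdefault(path.stem, {})[suffix] = name
    # Keep only base names for which both files are present.
    return {
        stem: (files[".cmd"], files[".tab"])
        for stem, files in by_stem.items()
        if ".cmd" in files and ".tab" in files
    }

# Hypothetical file names; "wash_only.cmd" has no partner and is left unlinked.
links = link_by_stem(["synthesis_1.cmd", "synthesis_1.tab", "wash_only.cmd"])
```

A command file without a matching reagent table is simply omitted from the result, which makes an incomplete pair easy to detect before a synthesis is started.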
The automated, multi-well parallel array synthesizer employs a reagent array delivery format, in which each reagent utilized has a dedicated plumbing system. As seen in FIGS. 23 and 24, an inert atmosphere 522 is maintained during all phases of a synthesis. Temperature is controlled via a thermal transfer plate 524, which holds an injection molded reaction block 526. The reaction plate assembly slides in the X-axis direction, while, for example, eight nozzle blocks (528, 530, 532, 534, 536, 538, 540 and 542) holding the reagent lines slide in the Y-axis direction, allowing for the extremely rapid delivery of any of 64 reagents to 96 wells. In addition, there are, for example, six banks of fixed nozzle blocks (544, 546, 548, 550, 552 and 554) which deliver the same reagent or solvent to eight wells at once, for a total of 72 possible reagents.[0131]
In synthesizing oligonucleotides for screening, the target reaction vessels, a 96 well plate 556 (a 2-dimensional array), move in one direction along the X axis, while the series of independently controlled reagent delivery nozzles (528, 530, 532, 534, 536, 538, 540 and 542) move along the Y-axis relative to the reaction vessel 558. As the reaction plate 556 and reagent nozzles (528, 530, 532, 534, 536, 538, 540 and 542) can be moved independently at the same time, this arrangement facilitates the extremely rapid delivery of up to 72 reagents independently to each of the 96 reaction vessel wells.[0132]
The system software allows the straightforward programming of the synthesis of a large number of compounds. The general synthetic procedure is supplied in the form of the command file, which calls upon certain reagents to be added to specific wells via lookup in the sequence file, with the bottle position, flow rate, and concentration of each reagent being stored in the separate reagent table file. Compounds can be synthesized on various scales. For oligonucleotides, a 200 nmole scale is typically selected, while for other compounds larger scales, for example a 10 μmole scale (3-5 mg), might be utilized. The resulting crude compounds are generally >80% pure, and are utilized directly for high throughput screening assays. Alternatively, prior to use the plates can be subjected to quality control (see general procedure 600 and Example 9) to ascertain their exact purity. Use of the synthesizer results in a very efficient means for the parallel synthesis of compounds for screening.[0133]
The software inputs accept tab delimited text files (as discussed above for files 504 and 512) from any text editor. A typical command file, a .cmd file, is shown in Example 3 at Table 2. Typical sequence files, .seq files, are shown in Example 3 at Tables 3 and 4, and a typical reagent file, a .tab file, is shown in Example 3 at Table 5. Table 3 illustrates the sequence file for an oligonucleotide having 2′-deoxy nucleotides at each position with a phosphorothioate backbone throughout. Table 4 illustrates the sequence file for an oligonucleotide again having a phosphorothioate backbone throughout; however, certain modified nucleosides are utilized in portions of the oligonucleotide. As shown in this table, 2′-O-(methoxyethyl) modified nucleosides are utilized in a first region (a wing) of the oligonucleotide, followed by a second region (a gap) of 2′-deoxy nucleotides and finally a third region (a further wing) that has the same chemistry as the first region. Typically some of the wells of the 96 well plate 556 may be left empty (depending on the number of oligonucleotides to be made during an individual synthesis) or some of the wells may have oligonucleotides that will serve as standards for comparison or analytical purposes.[0134]
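The wing-gap-wing arrangement just described can be illustrated with a short Python sketch that assigns one sugar chemistry to each position. The function name, the chemistry labels, and the example lengths (a 20-mer with 5-nucleoside wings) are assumptions chosen for illustration and are not taken from Tables 3 and 4.

```python
def gapmer_chemistries(length, wing, wing_chem="2'-MOE", gap_chem="2'-deoxy"):
    """Return one sugar-chemistry label per position for a wing-gap-wing design."""
    if wing < 0 or 2 * wing > length:
        raise ValueError("wings do not fit in the oligonucleotide")
    gap = length - 2 * wing
    # First wing, central gap, then a further wing with the same chemistry as the first.
    return [wing_chem] * wing + [gap_chem] * gap + [wing_chem] * wing

# Illustrative values only: 5 modified positions, 10 deoxy positions, 5 modified positions.
layout = gapmer_chemistries(length=20, wing=5)
```

Setting `wing=0` gives the uniform 2′-deoxy case, so the same helper covers both kinds of sequence file discussed above.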
Prior to loading reagents, moisture sensitive reagent lines are purged with argon at 522 for 20 minutes. Reagents are dissolved to appropriate concentrations and installed on the synthesizer. Large bottles, collectively identified as 558 in FIG. 23 (containing 8 delivery lines), are used for wash solvents and the delivery of general activators, trityl group cleaving reagents and other reagents that may be used in multiple wells during any particular synthesis. Small septa bottles, collectively identified as 560 in FIG. 23, are utilized to contain individual nucleotide amidite precursor compounds. This allows for anhydrous preparation and efficient installation of multiple reagents by using needles to pressurize the bottle, and as a delivery path. After all reagents are installed, the lines are primed with reagent, and flow rates are measured and then entered into the reagent table (.tab file). A dry resin loaded plate is removed from vacuum and installed in the machine for the synthesis.[0135]
The modified 96 well polypropylene plate 556 is utilized as the reaction vessel. The working volume in each well is approximately 700 μl. The bottom of each well is provided with a press-fit 20 μm polypropylene frit and a long capillary exit into a lower collection chamber as is illustrated in FIG. 5 of the above referenced U.S. Pat. No. 5,472,672. The solid support for use in holding the growing oligonucleotide during synthesis is loaded into the wells of the synthesis plate 556 by pipetting the desired volume of a balanced density slurry of the support suspended in an appropriate solvent, typically an acetonitrile-methylene chloride mixture. Reactions can be run on various scales, as for instance the above noted 200 nmole and 10 μmole scales. For oligonucleotide synthesis a CPG support is preferred; however, other medium loading polystyrene-PEG supports such as TentaGel™ or ArgoGel™ can also be used.[0136]
As seen in FIG. 24, the synthesis plate is transported back and forth in the X-direction under an array of 8 movable banks (528, 530, 532, 534, 536, 538, 540 and 542) of 8 nozzles (64 total) in the Y-direction, and 6 banks (544, 546, 548, 550, 552 and 554) of 48 fixed nozzles, so that each well can receive the appropriate amounts of reagents and/or solvents from any reservoir (large bottle or smaller septa bottle). A sliding balloon-type seal 562 surrounds this nozzle array and joins it to the reaction plate headspace 564. A slow sweep of nitrogen or argon 522 at ambient pressure across the plate headspace is used to preserve an anhydrous environment.[0137]
The liquid contents in each well do not drip out until the headspace pressure exceeds the capillary forces on the liquid in the exit nozzle. A slight positive pressure in the lower collection chamber can be added to eliminate residual slow leakage from filled wells, or to effect agitation by bubbling inert gas through the suspension. In order to empty the wells, the headspace gas outlet valve is closed and the internal pressure raised to about 2 psi. Normally, liquid contents are blown directly to waste 566. However, a 96 well microtiter plate can be inserted into the lower chamber beneath the synthesis plate in order to collect the individual well eluents for spectrophotometric monitoring (trityl, etc.) of reaction progress and yield.[0138]
The basic plumbing scheme for the machine is the gas-pressurized delivery of reagents. Each reagent is delivered to the synthesis plate through a dedicated supply line, collectively identified at 568, a solenoid valve, collectively identified at 570, and a nozzle, collectively identified at 572. Reagents never cross paths until they reach the reaction well. Thus, no line needs to be washed or flushed prior to its next use and there is no possibility of cross-contamination of reagents. The liquid delivery velocity is sufficiently energetic to thoroughly mix the contents within a well to form a homogeneous solution, even when employing solutions having drastically different densities. With this mixing, once reactants are in homogeneous solution, diffusion carries the individual components into and out of the solid support matrix where the desired reaction takes place. Each reagent reservoir can be plumbed to either a single nozzle or any combination of up to 8 nozzles. Each nozzle is also provided with a concentric nozzle washer to wash the outside of the delivery nozzles in order to eliminate problems of crystallized reactant buildup due to slow evaporation of solvent at the tips of the nozzles. The nozzles and supply lines can be primed into a set of dummy wells directly to waste at any time.[0139]
The entire plumbing system is fabricated with Teflon tubing, and reagent reservoirs are accessed via syringe needle/septa or direct connection into the higher capacity bottles. The septum vials 560 are held in removable 8-bottle racks to facilitate easy setup and cleaning. The priming volume for each line is about 350 μl. The minimum delivery volume is about 2 μl, and flow rate accuracy is ±5%. The actual amount of material delivered depends on a timed flow of liquid. The flow rate for a particular solvent will depend on its viscosity and wetting characteristics of the Teflon tubing. The flow rate (typically 200-350 μl per sec) is experimentally determined, and this information is contained in the reagent table setup file.[0140]
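Because delivery is by timed flow, the valve-open time follows directly from the requested volume and the measured flow rate. The following Python sketch shows that arithmetic; the function name and the way the roughly 2 μl minimum is enforced are assumptions made for illustration, not a description of the actual controller.

```python
def valve_open_time(volume_ul, flow_rate_ul_per_s, min_volume_ul=2.0):
    """Seconds a valve stays open to deliver volume_ul at the measured flow rate."""
    if flow_rate_ul_per_s <= 0:
        raise ValueError("flow rate must be positive")
    if volume_ul < min_volume_ul:
        raise ValueError("requested volume is below the minimum delivery volume")
    # Timed flow: time = volume / flow rate.
    return volume_ul / flow_rate_ul_per_s

# Example: 300 ul of a solvent whose line was measured at 250 ul per second.
open_time_s = valve_open_time(300.0, 250.0)
```

With flow rates of 200-350 μl per sec, a 300 μl wash corresponds to a valve-open time on the order of one second, which is why the flow rate must be measured for each line and recorded in the reagent table.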
Heating and cooling of the reaction block 526 is effected utilizing a recirculating heat exchanger plate 524, similar to that found in PCR thermocyclers, that nests with the polypropylene synthesis plate 556 to provide good thermal contact. The liquid contents in a well can be heated or cooled at about 10° C. per minute over a range of +5 to +80° C., as polypropylene begins to soften and deform at about 80° C. For temperatures greater than this, a non-disposable synthesis plate machined from stainless steel or Monel with replaceable frits can be utilized.[0141]
The hardware controller can be any of a wide variety, but conveniently can be designed around a set of three 1 MHz 86332 chips. This controller is used to drive the single X-axis and 8 Y-axis stepper motors as well as provide the timing functions for a total of 154 solenoid valves. Each chip has 16 bidirectional timer I/O and 8 interrupt channels in its timer processing unit (TPU). These are used to provide the step and direction signals, and to read 3 encoder inputs and 2 limit switches for controlling up to three motors per chip. Each 86332 chip also drives a serial chain of 8 UNC5891A darlington array chips to provide power to 64 valves with msec resolution. The controller communicates with the Windows software interface program running on a PC via a 19200 Hz serial channel, and uses an elementary instruction set to communicate valve_number, time_open, motor_number and position_data.[0142]
The software program that runs the array synthesizer has three components: the generalized procedure or command (.cmd) file, which specifies the synthesis instructions to be performed; the sequence (.seq) file, which specifies the scale of the reaction and the order in which variable groups will be added to the core synthon; and the reagent table (.tab) file, which specifies the name of a chemical, its location (bottle number), flow rate, and concentration. These components are utilized in conjunction with a basic set of command instructions.[0143]
One basic set of command instructions can be:
[0144]
| ADD |
| IF | {block of instructions} | END_IF |
| REPEAT | {block of instructions} | END_REPEAT |
| PRIME, NOZZLE_WASH |
| WAIT, DRAIN |
| LOAD, REMOVE |
| NEXT_SEQUENCE |
| LOOP_BEGIN, LOOP_END |
| END |
The ADD instruction has two forms, and is intended to have the look and feel of a standard chemical equation. Reagents are specified to be added by a molar amount if the number precedes the name identifier, or by an absolute volume in microliters if the number follows the identifier. The reagents to be added are given as a parsed list, separated by the ‘+’ sign. For variable reagent identifiers, the key word, <seq>, means look in the sequence table for the identity of the reagent to be added, while the key word, <act>, means add the reagent which is associated with that particular <seq>. Reagents are delivered in the order specified in the list.[0145]
Thus:
[0146]
| ADD | ACN | 300 |
means: Add 300 μl of the named reagent acetonitrile (ACN) to each well of active synthesis.
| ADD | <seq> | 300 |
means: If the sequence pointer in the .seq file is to a reagent in the list of reagents, independent of scale, add 300 μl of that particular reagent specified for that well.
| ADD | 1.1 PYR + 1.0 <seq> + 1.1 <act1> |
means: If the sequence pointer in the .seq file is to a reagent in the list of acids in the class ACIDS_1, and PYR is the name of pyridine, and ethyl chloroformate is defined in the .tab file to activate the class ACIDS_1, then this instruction means:
| Add | 1.1 equiv. pyridine |
| | 1.0 equiv. of the acid specified for that well and |
| | 1.1 equiv. of the activator, ethyl chloroformate |
The IF command allows one to test what type of reagent is specified in the <seq> variable and process the succeeding block of commands accordingly.
Thus:
| ACYLATION {the procedure name} |
| ADD 1.0 <seq> + 1.1 <act1> + 1.1 PYR |
| WAIT 60 |
| ADD 1.0 <seq> + 1.2 <act1> + 1.2 TEA |
means: Operate on those wells for which reagents contained in the Acid_1 class are specified, WAIT 60 sec, then operate on those wells for which reagents contained in the Acid_2 class are specified, then WAIT 60 sec longer, then DRAIN the whole plate. Note that the Acid_1 group has reacted for a total of 120 sec, while the Acid_2 group has reacted for only 60 sec.[0147]
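The two forms of the ADD instruction described above (number before the name for a molar amount, number after the name for an absolute volume) lend themselves to a small parser. The Python sketch below is a hypothetical illustration only: the `AddTerm` type, the `parse_add` function and the unit labels are assumptions, and the synthesizer's real software is not shown in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AddTerm:
    reagent: str   # reagent name, or a key word such as <seq>
    amount: float
    unit: str      # "equiv" if the number precedes the name, "ul" if it follows

def parse_add(line):
    """Parse one ADD instruction into a list of AddTerm, in delivery order."""
    parts = line.split(None, 1)
    if len(parts) != 2 or parts[0] != "ADD":
        raise ValueError(f"not an ADD instruction: {line!r}")
    terms = []
    for chunk in parts[1].split("+"):
        fields = chunk.split()
        if len(fields) != 2:
            raise ValueError(f"expected a number and a name in {chunk!r}")
        first, second = fields
        try:
            # Number first: molar amount (equivalents).
            terms.append(AddTerm(second, float(first), "equiv"))
        except ValueError:
            # Name first: absolute volume in microliters.
            terms.append(AddTerm(first, float(second), "ul"))
    return terms

wash = parse_add("ADD ACN 300")
acylation = parse_add("ADD 1.1 PYR + 1.0 <seq> + 1.1 <act1>")
```

Keeping the terms in list order mirrors the rule that reagents are delivered in the order specified; resolving `<seq>` and the activator key word against the .seq and .tab files would be a separate lookup step.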
The REPEAT command is a simple way to execute the same block of commands multiple times.[0148]
Thus:
[0149]
| WASH_1 {the procedure name} |
| BEGIN |
| REPEAT 3 |
| ADD ACN 300 |
| DRAIN 15 |
| END_REPEAT |
| END |
means: repeats the add acetonitrile and drain sequence for each well three times.[0150]
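One way to picture the REPEAT command is as a simple unrolling of the enclosed block. The Python sketch below assumes that the repeat count is written on the same line as REPEAT and that the block is closed by END_REPEAT; the function name `expand_repeats` is hypothetical and the handling of nested blocks is an added assumption.

```python
def expand_repeats(lines):
    """Unroll REPEAT n ... END_REPEAT blocks into a flat list of commands."""
    out = []
    i = 0
    while i < len(lines):
        tokens = lines[i].split()
        if tokens and tokens[0] == "REPEAT":
            count = int(tokens[1])
            # Find the matching END_REPEAT, allowing nested blocks.
            depth, j = 1, i + 1
            while j < len(lines):
                head = lines[j].split()[:1]
                if head == ["REPEAT"]:
                    depth += 1
                elif head == ["END_REPEAT"]:
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            if depth != 0:
                raise ValueError("REPEAT without matching END_REPEAT")
            out.extend(expand_repeats(lines[i + 1:j]) * count)
            i = j + 1
        else:
            if tokens:
                out.append(" ".join(tokens))
            i += 1
    return out

flat = expand_repeats(["REPEAT 3", "ADD ACN 300", "DRAIN 15", "END_REPEAT"])
```

For the wash procedure above, the unrolled list contains the add and drain pair three times in succession.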
The PRIME command will operate either on specific named reagents or on nozzles which will be used in the next associated <seq> operation. The μl amount dispensed into a prime port is a constant that can be specified in a config.dat file.[0151]
The NOZZLE_WASH command for washing the outside of reaction nozzles free from residue due to evaporation of reagent solvent will operate either on specific named reagents or on nozzles which have been used in the preceding associated <seq> operation. The machine is plumbed such that if any nozzle in a block has been used, all the nozzles in that block will be washed into the prime port.[0152]
The WAIT and DRAIN commands each take a duration in seconds, with the DRAIN command applying a gas pressure over the top surface of the plate in order to drain the wells.[0153]
The LOAD and REMOVE commands are instructions for the machine to pause for operator action.[0154]
The NEXT_SEQUENCE command increments the sequence pointer to the next group of substituents to be added in the sequence file. The general form of a .seq file entry is the definition:[0155]
Well_No Well_ID Scale Sequence[0156]
The sequence information is conveyed by a series of columns, each of which represents a variable reagent to be added at a particular position. The scale (μmole) variable is included so that reactions of different scale can be run at the same time if desired. The reagents are defined in a lookup table (the .tab file), which specifies the name of the reagent as referred to in the sequence and command files, its location (bottle number), flow rate, and concentration. This information is then used by the controller software and hardware to determine both the appropriate slider motion to position the plate and slider arms for delivery of a specific reagent, as well as the specific valve and time required to deliver the appropriate reagents. The adept classification of reagents allows the use of conditional IF loops from within a command file to perform addition of different reagents differently during a ‘single step’ performed across 96 wells simultaneously. The special class ACTIVATORS defines certain reagents that always get added with a particular class of reagents (for example tetrazole during a phosphitylation reaction in adding the next nucleotide to a growing oligonucleotide).[0157]
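A .seq entry of the general form given above can be read as a tab delimited record whose trailing columns are the variable reagents. The Python sketch below is an illustration under assumptions: the `SeqEntry` type, the field names, and the example well identifier and reagent names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SeqEntry:
    well_no: int
    well_id: str
    scale_umol: float
    sequence: list  # one reagent name per addition step, in order of addition

def parse_seq_line(line):
    """Parse one tab delimited entry: Well_No, Well_ID, Scale, then the sequence columns."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) < 4:
        raise ValueError("expected Well_No, Well_ID, Scale and at least one reagent")
    reagents = [f for f in fields[3:] if f]
    return SeqEntry(int(fields[0]), fields[1], float(fields[2]), reagents)

# Hypothetical entry: well 1, a 0.2 umole (200 nmole) scale, four additions.
entry = parse_seq_line("1\tOLIGO_001\t0.2\tT\tC\tG\tA\n")
```

Carrying the scale on each entry is what allows reactions of different scale to share one plate, since molar additions can then be sized per well.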
The general form of the .tab file is the definition:[0158]
Class Bottle Reagent Name Flow_rate Conc.[0159]
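A reagent table row and the conversion from a molar amount to a delivered volume might be sketched as follows. The `Reagent` type and function names are hypothetical, and the units are assumptions: the concentration is taken to be in mol/L and the flow rate in μl per sec, so that a scale in μmole divided by the concentration gives μl directly.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reagent:
    reagent_class: str
    bottle: int
    name: str
    flow_rate_ul_per_s: float
    conc_molar: float  # assumed unit: mol/L

def parse_tab_line(line):
    """Parse one tab delimited row: Class, Bottle, Reagent Name, Flow_rate, Conc."""
    cls, bottle, name, flow, conc = [f.strip() for f in line.rstrip("\n").split("\t")]
    return Reagent(cls, int(bottle), name, float(flow), float(conc))

def volume_for_equivalents(reagent, equivalents, scale_umol):
    """Volume in ul that supplies the requested equivalents at the given scale."""
    # umol / (mol/L) = ul
    return equivalents * scale_umol / reagent.conc_molar

# Hypothetical row: pyridine in bottle 3, 200 ul/s, 0.1 M.
pyr = parse_tab_line("BASES\t3\tPYR\t200\t0.1")
pyr_volume_ul = volume_for_equivalents(pyr, 1.1, 0.2)
```

In this example 1.1 equivalents at a 0.2 μmole scale of a 0.1 M solution works out to 2.2 μl, just above the minimum delivery volume quoted for the plumbing.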
The LOOP_BEGIN and LOOP_END commands define the block of commands which will continue to operate until a NEXT_SEQUENCE command points past the end of the longest list of reactants in any well.[0160]
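The loop behavior described above, in which the block runs until the sequence pointer passes the end of the longest list, can be sketched in Python. The function `run_coupling_loop` and its `couple` callback are hypothetical stand-ins for the block of commands between LOOP_BEGIN and LOOP_END.

```python
def run_coupling_loop(sequences, couple):
    """Call couple(well, reagent) position by position until every list is exhausted."""
    pointer = 0
    longest = max((len(seq) for seq in sequences.values()), default=0)
    while pointer < longest:                  # LOOP_BEGIN
        for well, seq in sequences.items():
            if pointer < len(seq):            # wells with shorter lists sit out later passes
                couple(well, seq[pointer])
        pointer += 1                          # NEXT_SEQUENCE
    return pointer                            # pointer is now past the longest list

log = []
passes = run_coupling_loop({1: ["T", "C", "G"], 2: ["A", "G"]},
                           lambda well, reagent: log.append((well, reagent)))
```

Here well 2 receives two additions and well 1 receives three, and the loop ends after the third pass.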
Not included in the command set is a MOVE command. For all of the above commands, if any plate or nozzle movement is required, this is automatically executed in order to perform the desired solvent or reagent delivery operation. This is accomplished by the controller software and hardware, which determines the correct nozzle(s) and well(s) required for a particular reagent addition, then synchronizes the position of the requisite nozzle and well prior to adding the reagent.[0161]
A MANUAL mode can also be utilized in which the synthesis plate and nozzle blocks can be ‘homed’ or moved to any position by the operator, the nozzles primed or washed, the various reagent bottles depressurized or washed with solvent, the chamber pressurized, etc. The automatic COMMAND mode can be interrupted at any point, MANUAL commands executed, and then operation resumed at the appropriate location. The sequence pointer can be incremented to restart a synthesis anywhere within a command file.[0162]
In reference to FIG. 14, the list of oligonucleotides for synthesis can be rearranged or grouped for optimization of synthesis. Thus at process step 574, the oligonucleotides are grouped according to a factor on which to base the optimization of synthesis. As illustrated in the Examples below, one such factor is the 3′ most nucleoside of the oligonucleotide. Using the amidite approach for oligonucleotide synthesis, a nucleotide bearing a 3′ phosphoramidite is added to the 5′ hydroxyl group of a growing nucleotide chain. The first nucleotide (at the 3′ terminus of the oligonucleotide, i.e., the 3′ most nucleoside) is first connected to a solid support. This is normally done batchwise on a large scale, as is standard practice during oligonucleotide synthesis.[0163]
Such solid supports pre-loaded with a nucleoside are commercially available. In utilizing the multi well format for oligonucleotide synthesis, for each oligonucleotide to be synthesized, an aliquot of a solid support bearing the proper nucleoside thereon is added to the well for synthesis. Prior to loading the sequence of oligonucleotides to be synthesized in the .seq file, they are sorted by the 3′ terminal nucleotide. Based on that sorting, all of the oligonucleotide sequences having an ‘A’ nucleoside at their 3′ end are grouped together, those with a ‘C’ nucleoside are grouped together as are those with ‘G’ or ‘T’ nucleosides. Thus in loading the nucleoside-bearing solid support into the synthesis wells, machine movements are conserved.[0164]
The oligonucleotides can be grouped by the above described parameter or other parameters that facilitate the synthesis of the oligonucleotides. Thus in FIG. 14, sorting is noted as being effected by some parameter of type 1, as for instance the above described 3′ most nucleoside, or other types of parameters from type 2 to type n at process steps 576, 578 and 580. Since synthesis will be from the 3′ end of the oligonucleotides to the 5′ end, the oligonucleotide sequences are reverse sorted to read 3′ to 5′. The oligonucleotides are entered in the .seq file in this form, i.e., reading 3′ to 5′.[0165]
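The grouping and reversal described above can be sketched as follows. This is a minimal Python illustration, assuming that the input sequences are plain strings written 5′ to 3′; the function name `order_for_seq_file` is hypothetical and is not part of the synthesizer software.

```python
def order_for_seq_file(oligos_5_to_3):
    """Reverse each sequence to read 3' to 5', then group by 3' nucleoside.

    After reversal the first character of each string is the 3' most
    nucleoside, so a stable sort on that character groups all 'A', 'C',
    'G' and 'T' supports together while keeping the input order within
    each group.
    """
    reversed_oligos = [seq[::-1] for seq in oligos_5_to_3]
    return sorted(reversed_oligos, key=lambda seq: seq[:1])


if __name__ == "__main__":
    for entry in order_for_seq_file(["CCAT", "GGCA", "TTGA", "AAAC"]):
        print(entry)
```

Grouping in this way means that wells needing the same nucleoside-bearing solid support are adjacent in the list, which is the stated purpose of the sort.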
Once sorted into types, the position of the oligonucleotides on the synthesis plates is specified at process step 582 by the creation of a .seq file as described above. The .seq file is associated with the respective .cmd and .tab files needed for synthesis of the particular chemistries specified for the oligonucleotides at process step 584 by retrieval of the .cmd and .tab files at process step 586 from the sample database 520. These files are then input into the multi well synthesizer at process step 588 for oligonucleotide synthesis. Once physically synthesized, the list of oligonucleotides again enters the general procedure flow as indicated in FIG. 1. For shipping, storage or other handling purposes, the plates can be lyophilized at this point if desired. Upon lyophilization, each well contains the oligonucleotides located therein as a dry compound.[0166]
14. Quality Control
In an optional step, quality control is performed on the oligonucleotides at process step 600 after a decision is made (decision step 550) to perform quality control. Although optional, quality control may be desired when there is some reason to suspect that some aspect of the synthetic process step 500 has been compromised. Alternatively, samples of the oligonucleotides may be taken and stored in the event that the assays conducted using the oligonucleotides (process step 700) yield confusing results or suboptimal data. In the latter event, for example, quality control might be performed after decision step 800 if no oligonucleotides with sufficient activity are identified. In either event, decision step 650 follows quality control process step 600. If one or more of the oligonucleotides do not pass quality control, process step 500 can be repeated, i.e., the oligonucleotides are synthesized for a second time.[0167]
The operation of the quality control system general procedure 600 is detailed in steps 610-660 of FIG. 15. Also referenced in the following discussion are the robotics and associated analytical instrumentation as shown in FIG. 18.[0168]
During step 610 (FIG. 15), sterile, double-distilled water is transferred by an automated liquid handler (2040 of FIG. 18) to each well of a multi-well plate containing a set of lyophilized antisense oligonucleotides. The automated liquid handler (2040 of FIG. 18) reads the barcode sticker on the multi-well plate to obtain the plate's identification number. Automated liquid handler 2040 then queries Sample Database 520 (which resides in Database Server 2002 of FIG. 18) for the quality control assay instruction set for that plate and executes the appropriate steps. Three quality control processes are illustrated; however, it is understood that other quality control processes or steps may be practiced in addition to or in place of the processes illustrated.[0169]
The first illustrative quality control process (steps 622 to 626) quantitates the concentration of oligonucleotide in each well. If this quality control step is performed, an automated liquid handler (2040 of FIG. 18) is instructed to remove an aliquot from each well of the master plate and generate a replicate daughter plate for transfer to the UV spectrophotometer (2016 of FIG. 18). The UV spectrophotometer (2016 of FIG. 18) then measures the optical density of each well at a wavelength of 260 nanometers. Using standardized conversion factors, a microprocessor within the UV spectrophotometer (2016 of FIG. 18) then calculates a concentration value from the measured absorbance value for each well and outputs the results to Sample Database 520.[0170]
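The conversion from absorbance to concentration can be illustrated with a short Python sketch. The conversion factors actually used by the instrument are not specified here; the values below (about 33 micrograms per milliliter per absorbance unit for single-stranded oligonucleotide at a 1 cm path length, and an average of about 330 grams per mole per nucleotide) are common rule-of-thumb approximations adopted purely as assumptions, and the function name `a260_to_micromolar` is hypothetical.

```python
def a260_to_micromolar(a260, length_nt, dilution_factor=1.0,
                       ug_per_ml_per_a260=33.0, g_per_mol_per_nt=330.0):
    """Estimate oligonucleotide concentration (micromolar) from A260.

    All default factors are assumed approximations, not values taken
    from the instrument; substitute sequence-specific factors if known.
    """
    if length_nt <= 0:
        raise ValueError("length_nt must be positive")
    # Mass concentration of the undiluted sample in ug/mL.
    ug_per_ml = a260 * ug_per_ml_per_a260 * dilution_factor
    # Approximate molar mass of the oligonucleotide in g/mol.
    molar_mass = g_per_mol_per_nt * length_nt
    # ug/mL divided by g/mol gives umol/mL; multiply by 1000 for umol/L.
    return ug_per_ml / molar_mass * 1000.0


if __name__ == "__main__":
    print(round(a260_to_micromolar(1.0, 18), 2))
```

Under these assumed factors, an 18-mer reading 1.0 absorbance unit corresponds to roughly 5.6 micromolar; modified chemistries such as phosphorothioates would shift the molar mass somewhat.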
The second illustrative quality control process (steps 632 to 636) quantitates the percent of total oligonucleotide in each well that is full length. If this quality control step is performed, an automated liquid handler (2040 of FIG. 18) is instructed to remove an aliquot from each well of the master plate and generate a replicate daughter plate for transfer to the multichannel capillary gel electrophoresis apparatus (2022 of FIG. 18). The apparatus electrophoretically resolves in capillary tube gels the oligonucleotide product in each well. As the product reaches the distal end of the tube gel during electrophoresis, a detection window dynamically measures the optical density of the product that passes by it. Following electrophoresis, the value of percent product that passed by the detection window with respect to time is utilized by a built-in microprocessor to calculate the relative size distribution of oligonucleotide product in each well. These results are then output to Sample Database 520.[0171]
The third illustrative quality control process (steps 632 to 636) quantitates the mass of total oligonucleotide in each well that is full length. If this quality control step is performed, an automated liquid handler (2040 of FIG. 18) is instructed to remove an aliquot from each well of the master plate and generate a replicate daughter plate for transfer to the multichannel liquid electrospray mass spectrometer (2018 of FIG. 18). The apparatus then uses electrospray technology to inject the oligonucleotide product into the mass spectrometer. A built-in microprocessor calculates the mass-to-charge ratio to arrive at the mass of oligonucleotide product in each well. The results are then output to Sample Database 520.[0172]
Following completion of the selected quality control processes, the output data is manually examined or is examined using an appropriate algorithm, and a decision is made as to whether the plate receives ‘Pass’ or ‘Fail’ status. The current criterion for acceptance is that at least 85% of the oligonucleotides in a multi-well plate must be 85% or greater full length product as measured by both capillary gel electrophoresis and mass spectrometry. An input (manual or automated) is then made into Sample Database 520 as to the pass/fail status of the plate. If a plate fails, the process cycles back to step 500, and a new plate of the same oligonucleotides is automatically placed in the plate synthesis request queue (process 554 of FIG. 15). If a plate receives ‘Pass’ status, an automated liquid handler (2040 of FIG. 18) is instructed to remove appropriate aliquots from each well of the master plate and generate two replicate daughter plates in which the oligonucleotide in each well is at a concentration of 30 micromolar. The plate then moves on to process 700 for oligonucleotide activity evaluation.[0173]
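One possible form of an algorithm applying the acceptance criterion is sketched below in Python. The function name `plate_passes` and the representation of the measurements as two parallel lists of percent full-length values (one from capillary gel electrophoresis, one from mass spectrometry) are assumptions made for this illustration only.

```python
def plate_passes(cge_percent, ms_percent,
                 well_threshold=85.0, plate_fraction=0.85):
    """Return True if the plate meets the 85%/85% acceptance criterion.

    A well is acceptable only when it is at least `well_threshold`
    percent full length by BOTH measurements; the plate passes when at
    least `plate_fraction` of its wells are acceptable.
    """
    if len(cge_percent) != len(ms_percent):
        raise ValueError("both measurement lists must cover the same wells")
    if not cge_percent:
        return False  # assumption: an empty plate cannot pass
    acceptable = sum(
        1
        for cge, ms in zip(cge_percent, ms_percent)
        if cge >= well_threshold and ms >= well_threshold
    )
    return acceptable >= plate_fraction * len(cge_percent)


if __name__ == "__main__":
    cge = [90.0] * 82 + [70.0] * 14
    ms = [88.0] * 96
    print("Pass" if plate_passes(cge, ms) else "Fail")
```

For a 96-well plate this criterion requires at least 82 acceptable wells, since 85% of 96 is 81.6.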
15. Cell Lines for Assaying Oligonucleotide Activity
The effect of antisense compounds on target nucleic acid expression can be tested in any of a variety of cell types provided that the target nucleic acid, or its gene product, is present at measurable levels. This can be routinely determined using, for example, PCR or Northern blot analysis. The following four cell types are provided for illustrative purposes, but other cell types can be routinely used.[0174]
T-24 cells: The transitional cell bladder carcinoma cell line T-24 is obtained from the American Type Culture Collection (ATCC) (Manassas, Va.). T-24 cells were routinely cultured in complete McCoy's 5A basal media (Life Technologies, Gaithersburg, Md.) supplemented with 10% fetal calf serum, penicillin 100 units per milliliter, and streptomycin 100 micrograms per milliliter (all from Life Technologies). Cells are routinely passaged by trypsinization and dilution when they reach 90% confluence. Cells are routinely seeded into 96-well plates (Falcon-Primaria #3872) at a density of 7000 cells/well for use in RT-PCR analysis. For Northern blotting or other analysis, cells are seeded onto 100 mm or other standard tissue culture plates and treated similarly, using appropriate volumes of medium and oligonucleotide.[0175]
A549 cells: The human lung carcinoma cell line A549 is obtained from the ATCC (Manassas, Va.). A549 cells were routinely cultured in DMEM basal media (Life Technologies) supplemented with 10% fetal calf serum, penicillin 100 units per milliliter, and streptomycin 100 micrograms per milliliter (all from Life Technologies). Cells are routinely passaged by trypsinization and dilution when they reach 90% confluence.[0176]
NHDF cells: Human neonatal dermal fibroblasts (NHDF) were obtained from the Clonetics Corporation (Walkersville, Md.). NHDFs were routinely maintained in Fibroblast Growth Medium (Clonetics Corp.) as provided by the supplier. Cells are maintained for up to 10 passages as recommended by the supplier.[0177]
HEK cells: Human embryonic keratinocytes (HEK) were obtained from the Clonetics Corp. HEKs were routinely maintained in Keratinocyte Growth Medium (Clonetics Corp.) as provided by the supplier. Cells are routinely maintained for up to 10 passages as recommended by the supplier.[0178]
16. Treatment of Cells with Candidate Compounds
When cells reach about 80% confluency, they are treated with oligonucleotide. For cells grown in 96-well plates, wells are washed once with 200 μl Opti-MEM-I™ reduced-serum medium (Life Technologies) and then treated with 130 μl of Opti-MEM-I™ containing 3.75 μg/ml LIPOFECTIN (Life Technologies) and the desired oligonucleotide at a final concentration of 150 nM. After 4 hours of treatment, the medium was replaced with fresh medium. Cells were harvested 16 hours after oligonucleotide treatment.[0179]
17. Assaying Oligonucleotide Activity
Oligonucleotide-mediated modulation of expression of a target nucleic acid can be assayed in a variety of ways known in the art. For example, target RNA levels can be quantitated by, e.g., Northern blot analysis, competitive PCR, or reverse transcriptase polymerase chain reaction (RT-PCR). RNA analysis can be performed on total cellular RNA or, preferably in the case of polypeptide-encoding nucleic acids, poly(A)+ mRNA. For RT-PCR, poly(A)+ mRNA is preferred. Methods of RNA isolation are taught in, for example, Ausubel et al. (Short Protocols in Molecular Biology, 2nd Ed., pp. 4-1 to 4-13, Greene Publishing Associates and John Wiley & Sons, New York, 1992). Northern blot analysis is routine in the art (Id., pp. 4-14 to 4-29). Reverse transcriptase polymerase chain reaction (RT-PCR) can be conveniently accomplished using the commercially available ABI PRISM 7700 Sequence Detection System (PE-Applied Biosystems, Foster City, Calif.) according to manufacturer's instructions. Other methods of PCR are also known in the art.[0180]
Target protein levels can be quantitated in a variety of ways well known in the art, such as immunoprecipitation, Western blot analysis (immunoblotting), enzyme-linked immunosorbent assay (ELISA) or fluorescence-activated cell sorting (FACS). Antibodies directed to a protein encoded by a target nucleic acid can be identified and obtained from a variety of sources, such as the MSRS catalog of antibodies (Aerie Corporation, Birmingham, Mich. or via the internet at http://www.ANTIBODIESPROBES.com/), or can be prepared via conventional antibody generation methods. Methods for preparation of polyclonal, monospecific (‘antipeptide’) and monoclonal antisera are taught by, for example, Ausubel et al. (Short Protocols in Molecular Biology, 2nd Ed., pp. 11-3 to 11-54, Greene Publishing Associates and John Wiley & Sons, New York, 1992).[0181]
Immunoprecipitation methods are standard in the art and are described by, for example, Ausubel et al. (Id., pp. 10-57 to 10-63). Western blot (immunoblot) analysis is standard in the art (Id., pp. 10-32 to 10-35). Enzyme-linked immunosorbent assays (ELISA) are standard in the art (Id., pp. 11-5 to 11-17).[0182]
Because it is preferred to assay the compounds of the invention in a batchwise fashion, i.e., in parallel to the automated synthesis process described above, preferred means of assaying are suitable for use in 96-well plates and with robotic means. Accordingly, automated RT-PCR is preferred for assaying target nucleic acid levels, and automated ELISA is preferred for assaying target protein levels.[0183]
The assaying step, general procedure step 700, is described in detail in FIG. 16. After an appropriate cell line is selected at process step 710, a decision is made at decision step 714 as to whether RT-PCR will be the only method by which the activity of the compounds is evaluated. In some instances, it is desirable to run alternative assay methods at process step 718; for example, when it is desired to assess target polypeptide levels as well as target RNA levels, an immunoassay such as an ELISA is run in parallel with the RT-PCR assays. Preferably, such assays are tractable to semi-automated or robotic means.[0184]
When RT-PCR is used to evaluate the activities of the compounds, cells are plated into multi-well plates (typically, 96-well plates) in process step 720 and treated with test or control oligonucleotides in process step 730. Then, the cells are harvested and lysed in process step 740 and the lysates are introduced into an apparatus where RT-PCR is carried out in process step 750. A raw data file is generated, and the data is downloaded and compiled at step 760. Spreadsheet files with data charts are generated at process step 770, and the experimental data is analyzed at process step 780. Based on the results, a decision is made at process step 785 as to whether it is necessary to repeat the assays and, if so, the process begins again with step 720. In any event, data from all the assays on each oligonucleotide are compiled and statistical parameters are automatically determined at process step 790.[0185]
18. Classification of Compounds Based on Their Activity
Following assaying, general procedure step 700, oligonucleotide compounds are classified according to one or more desired properties. Typically, three classes of compounds are used: active compounds, marginally active (or ‘marginal’) compounds and inactive compounds. To some degree, the selection criteria for these classes vary from target to target, and members of one or more classes may not be present for a given set of oligonucleotides.[0186]
However, some criteria are constant. For example, inactive compounds will typically comprise those compounds having 5% or less inhibition of target expression (relative to basal levels). Active compounds will typically cause at least 30% inhibition of target expression, although lower levels of inhibition are acceptable in some instances. Marginal compounds will have activities intermediate between active and inactive compounds, with preferred marginal compounds having activities more like those of active compounds.[0187]
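The typical thresholds given above can be expressed as a short Python sketch. The function name `classify_compound` and the use of percent inhibition relative to basal levels as the sole input are assumptions for illustration; as noted, the actual criteria may vary from target to target.

```python
def classify_compound(percent_inhibition, inactive_max=5.0, active_min=30.0):
    """Classify a compound by percent inhibition of target expression.

    Defaults follow the typical values in the text: 5% or less is
    inactive, at least 30% is active, anything in between is marginal.
    """
    if percent_inhibition <= inactive_max:
        return "inactive"
    if percent_inhibition >= active_min:
        return "active"
    return "marginal"


if __name__ == "__main__":
    for value in (2.0, 18.0, 45.0):
        print(value, classify_compound(value))
```

The thresholds are exposed as parameters so that a lower activity cutoff can be used in those instances where lower levels of inhibition are acceptable.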
19. Optimization of Lead Compounds by Sequence
One means by which oligonucleotide compounds are optimized for activity is by varying their nucleobase sequences so that different regions of the target nucleic acid are targeted. Some such regions will be more accessible to oligonucleotide compounds than others, and ‘sliding’ a nucleobase sequence along a target nucleic acid only a few bases can have significant effects on activity. Accordingly, varying or adjusting the nucleobase sequences of the compounds of the invention is one means by which suboptimal compounds can be made optimal, or by which new active compounds can be generated.[0188]
The operation of the gene walk process 1100, shown in steps 1104-1112 of FIG. 17, is as follows. As used herein, the term ‘gene walk’ is defined as the process by which a specified oligonucleotide sequence x that binds to a specified nucleic acid target y is used as a frame of reference around which a series of new oligonucleotide sequences capable of hybridizing to nucleic acid target y are generated that are sequence shifted increments of oligonucleotide sequence x. Gene walking can be done ‘downstream’, ‘upstream’ or in both directions from a specified oligonucleotide.[0189]
During step 1104 the user manually enters the identification number of the oligonucleotide sequence around which it is desired to execute gene walk process 1100 and the name of the corresponding target nucleic acid. The user then enters the scope of the gene walk at step 1104, by which is meant the number of oligonucleotide sequences that it is desired to generate. The user then enters in step 1108 a positive integer value for the sequence shift increment. Once this data is generated, the gene walk is effected. This causes a subroutine to be executed that automatically generates the desired list of sequences by walking along the target sequence. At that point, the user proceeds to process 400 to assign chemistries to the selected oligonucleotides.[0190]
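A gene walk subroutine of the kind described can be sketched in Python as follows. This is an illustration under stated assumptions rather than the subroutine actually used: the target is assumed to be a DNA-alphabet string written 5′ to 3′, the reference oligonucleotide is identified by the zero-based start of its binding site on the target, `scope` is interpreted as the number of shifted sequences requested in each direction, and the names `reverse_complement` and `gene_walk` are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gene_walk(target, ref_start, length, scope, shift, direction="both"):
    """Return (site_start, antisense_oligo) pairs shifted around a reference.

    Sites are moved in multiples of `shift` bases upstream (toward the
    5' end of the target), downstream, or both. Sites that would run
    off either end of the target are skipped.
    """
    if shift <= 0:
        raise ValueError("shift must be a positive integer")
    signs = {"downstream": (1,), "upstream": (-1,), "both": (-1, 1)}[direction]
    results = []
    for sign in signs:
        for k in range(1, scope + 1):
            start = ref_start + sign * k * shift
            if start < 0 or start + length > len(target):
                continue  # shifted site falls outside the target
            site = target[start:start + length]
            results.append((start, reverse_complement(site)))
    return sorted(results)


if __name__ == "__main__":
    for start, oligo in gene_walk("AAACCCGGGTTT", 4, 4, 2, 2):
        print(start, oligo)
```

Each returned oligonucleotide is the reverse complement of its target site, so that it is capable of hybridizing to the target at the shifted position.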
Example 16, below, details a gene walk. In subsequent steps, this new set of nucleobase sequences generated by the gene walk is used to direct the automated synthesis at general procedure step 500 of a second set of candidate oligonucleotides. These compounds are then taken through subsequent process steps to yield active compounds or reiterated as necessary to optimize activity of the compounds.[0191]
20. Optimization of Lead Compounds by Chemistry
Another means by which oligonucleotide compounds of the invention are optimized is by reiterating portions of the process of the invention using marginal compounds from the first iteration and assigning additional chemistries to the nucleobase sequences thereof.[0192]
Thus, for example, an oligonucleotide chemistry different from that of the first set of oligonucleotides is assigned at general procedure step 400. The nucleobase sequences of marginal compounds are used to direct the synthesis at general procedure step 500 of a second set of oligonucleotides having the second assigned chemistry. The resulting second set of oligonucleotide compounds is assayed in the same manner as the first set at general procedure step 700 and the results are examined to determine if compounds having sufficient activity have been generated at decision step 800.[0193]
21. Identification of Sites Amenable to Antisense Technologies
In a related process, a second oligonucleotide chemistry is assigned at procedure step 400 to the nucleobase sequences of all of the oligonucleotides (or, at least, all of the active and marginal compounds) and a second set of oligonucleotides is synthesized at procedure step 500 having the same nucleobase sequences as the first set of compounds. The resulting second set of oligonucleotide compounds is assayed in the same manner as the first set at procedure step 700 and active and marginal compounds are identified at procedure steps 800 and 1000.[0194]
In order to identify sites on the target nucleic acid that are amenable to a variety of antisense technologies, the following mathematically simple steps are taken. The sequences of active and marginal compounds from two or more such automated syntheses/assays are compared and a set of nucleobase sequences that are active, or marginally so, in both sets of compounds is identified. The reverse complements of these nucleobase sequences correspond to sequences of the target nucleic acid that are tractable to a variety of antisense and other sequence-based technologies. These antisense-sensitive sites are assembled into contiguous sequences (contigs) using the procedures described for assembling target nucleotide sequences (at procedure step 200).[0195]
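The comparison described above amounts to a set intersection followed by reverse complementation, which can be sketched in Python as follows. The function names and the representation of each assay result as a list of DNA-alphabet nucleobase sequence strings (active and marginal compounds only) are assumptions for illustration; assembly of the resulting sites into contigs is not shown.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def antisense_sensitive_sites(hits_first_chemistry, hits_second_chemistry):
    """Return target-site sequences hit under both chemistries.

    Each argument lists the nucleobase sequences of the active and
    marginal compounds from one synthesis/assay round. Sequences found
    in both rounds are reverse complemented to give the corresponding
    sequences of the target nucleic acid.
    """
    shared = set(hits_first_chemistry) & set(hits_second_chemistry)
    return sorted(reverse_complement(seq) for seq in shared)


if __name__ == "__main__":
    first = ["ACCC", "GTTT", "GGGT"]
    second = ["GTTT", "ACCC", "TTTT"]
    print(antisense_sensitive_sites(first, second))
```

The same function extends to more than two rounds by intersecting additional sets before the reverse complement is taken.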
22. Systems for Executing Preferred Methods of the Invention
An embodiment of computer, network and instrument resources for effecting the methods of the invention is shown in FIG. 18. In this embodiment, four computer servers are provided. First, a large database server 2002 stores all chemical structure, sample tracking and genomic, assay, quality control, and program status data. Further, this database server serves as the platform for a document management system. Second, a compute engine 2004 runs computational programs including RNA folding, oligonucleotide walking, and genomic searching. Third, a file server 2006 allows raw instrument output storage and sharing of robot instructions. Fourth, a groupware server 2008 enhances staff communication and process scheduling.[0196]
A redundant high-speed network system is provided between the main servers and the bridges 2026, 2028 and 2030. These bridges provide reliable network access to the many workstations and instruments deployed for this process. The instruments selected to support this embodiment are all designed to sample directly from standard 96 well microtiter plates, and include an optical density reader 2016, a combined liquid chromatography and mass spectroscopy instrument 2018, a gel fluorescence and scintillation imaging system 2032 and 2042, a capillary gel electrophoresis system 2022 and a real-time PCR system 2034.[0197]
Most liquid handling is accomplished automatically using robots with individually controllable robotic pipetters 2038 and 2020 as well as a 96-well pipette system 2040 for duplicating plates. Windows NT or Macintosh workstations 2044, 2024, and 2036 are deployed for instrument control, analysis and productivity support.[0198]
23. Relational Database
Data is stored in an appropriate database. For use with the methods of the invention, a relational database is preferred. FIG. 19 illustrates the data structure of a sample relational database. Various elements of data are segregated among linked storage elements of the database.[0199]
EXAMPLES
The following examples illustrate the invention and are not intended to limit the same. Those skilled in the art will recognize, or be able to ascertain through routine experimentation, numerous equivalents to the specific procedures, materials and devices described herein. Such equivalents are considered to be within the scope of the present invention.[0200]
Example 1
Selection of CD40 as a Target
Cell-cell interactions are a feature of a variety of biological processes. In the activation of the immune response, for example, one of the earliest detectable events in a normal inflammatory response is adhesion of leukocytes to the vascular endothelium, followed by migration of leukocytes out of the vasculature to the site of infection or injury. The adhesion of leukocytes to vascular endothelium is an obligate step in their migration out of the vasculature (for a review, see Albelda et al., FASEB J., 1994, 8, 504). As is well known in the art, cell-cell interactions are also critical for propagation of both B-lymphocytes and T-lymphocytes resulting in enhanced humoral and cellular immune responses, respectively (for reviews, see Makgoba et al., Immunol. Today, 1989, 10, 417; Janeway, Sci. Amer., 1993, 269, 72).[0201]
CD40 was first characterized as a receptor expressed on B-lymphocytes. It was later found that engagement of B-cell CD40 with CD40L expressed on activated T-cells is essential for T-cell dependent B-cell activation (i.e., proliferation, immunoglobulin secretion, and class switching) (for a review, see Gruss et al., Leuk. Lymphoma, 1997, 24, 393). A full cDNA sequence for CD40 is available (GenBank accession number X60592, incorporated herein as SEQ ID NO:85).[0202]
As interest in CD40 mounted, it was subsequently revealed that functional CD40 is expressed on a variety of cell types other than B-cells, including macrophages, dendritic cells, thymic epithelial cells, Langerhans cells, and endothelial cells (Ibid.). These studies have led to the current belief that CD40 plays a much broader role in immune regulation by mediating interactions of T-cells with cell types other than B-cells. In support of this notion, it has been shown that stimulation of CD40 in macrophages and dendritic cells is required for T-cell activation during antigen presentation (Id.). Recent evidence points to a role for CD40 in tissue inflammation as well. Production of the inflammatory mediators IL-12 and nitric oxide by macrophages has been shown to be CD40 dependent (Buhlmann et al., J. Clin. Immunol., 1996, 16, 83). In endothelial cells, stimulation of CD40 by CD40L has been found to induce surface expression of E-selectin, ICAM-1, and VCAM-1, promoting adhesion of leukocytes to sites of inflammation (Buhlmann et al., J. Clin. Immunol., 1996, 16, 83; Gruss et al., Leuk. Lymphoma, 1997, 24, 393). Finally, a number of reports have documented overexpression of CD40 in epithelial and hematopoietic tumors as well as tumor infiltrating endothelial cells, indicating that CD40 may play a role in tumor growth and/or angiogenesis as well (Gruss et al., Leuk. Lymphoma, 1997, 24, 393-422; Kluth et al., Cancer Res., 1997, 57, 891).[0203]
Due to the pivotal role that CD40 plays in humoral immunity, the potential exists that therapeutic strategies aimed at downregulating CD40 may provide a novel class of agents useful in treating a number of immune associated disorders, including but not limited to graft versus host disease, graft rejection, and autoimmune diseases such as multiple sclerosis, systemic lupus erythematosus, and certain forms of arthritis. Inhibitors of CD40 may also prove useful as an anti-inflammatory compound, and could therefore be useful as treatment for a variety of diseases with an inflammatory component such as asthma, rheumatoid arthritis, allograft rejections, inflammatory bowel disease, and various dermatological conditions, including psoriasis. Finally, as more is learned about the association between CD40 overexpression and tumor growth, inhibitors of CD40 may prove useful as anti-tumor agents as well.[0204]
Currently, there are no known therapeutic agents which effectively inhibit the synthesis of CD40. To date, strategies aimed at inhibiting CD40 function have involved the use of a variety of agents that disrupt CD40/CD40L binding. These include monoclonal antibodies directed against either CD40 or CD40L, soluble forms of CD40, and synthetic peptides derived from a second CD40 binding protein, A20. The use of neutralizing antibodies against CD40 and/or CD40L in animal models has provided evidence that inhibition of CD40 stimulation would have therapeutic benefit for GVHD, allograft rejection, rheumatoid arthritis, SLE, MS, and B-cell lymphoma (Buhlmann et al., J. Clin. Immunol., 1996, 16, 83). However, due to the expense, short half-life, and bioavailability problems associated with the use of large proteins as therapeutic agents, there is a long felt need for additional agents capable of effectively inhibiting CD40 function. Oligonucleotide compounds avoid many of the pitfalls of current agents used to block CD40/CD40L interactions and may therefore prove to be uniquely useful in a number of therapeutic applications.[0205]
Example 2
Generation of Virtual Oligonucleotides Targeted to CD40
The process of the invention was used to select oligonucleotides targeted to CD40, generating the list of oligonucleotide sequences with desired properties as shown in FIG. 22. From the assembled CD40 sequence, the process began with determining the desired oligonucleotide length to be eighteen nucleotides, as represented in step 2500. All possible oligonucleotides of this length were generated by Oligo 5.0™, as represented in step 2504. Desired thermodynamic properties were selected in step 2508. The single parameter used was melting temperature: oligonucleotides with a melting temperature less than or equal to 40° C. were discarded. In step 2512, oligonucleotide melting temperatures were calculated by Oligo 5.0™. Oligonucleotide sequences possessing an undesirable score were discarded. It is believed that oligonucleotides with melting temperatures near or below physiological and cell culture temperatures will bind poorly to target sequences. All oligonucleotide sequences remaining were exported into a spreadsheet. In step 2516, desired sequence properties were selected. These included discarding oligonucleotides with at least one stretch of four guanosines in a row or a stretch of six of any other nucleotide in a row. In step 2520, a spreadsheet macro removed all oligonucleotides containing the text string ‘GGGG’. In step 2524, another spreadsheet macro removed all oligonucleotides containing the text strings ‘AAAAAA’ or ‘CCCCCC’ or ‘TTTTTT’. From the remaining oligonucleotide sequences, 84 sequences were selected manually with the criterion of having a uniform distribution of oligonucleotide sequences throughout the target sequence, as represented in step 2528. These oligonucleotide sequences were then passed to the next step in the process, assigning actual oligonucleotide chemistries to the sequences.[0206]
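The enumeration and text-string filtering steps of this Example can be sketched in Python. This sketch does not reproduce Oligo 5.0™ or the spreadsheet macros; it assumes that candidate oligonucleotides are the reverse complements of every window of the target, and it leaves the melting temperature calculation to an optional, user-supplied function (`tm_func`, a hypothetical parameter). All function names are likewise assumptions.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Text strings removed in steps 2520 and 2524.
FORBIDDEN_RUNS = ("GGGG", "AAAAAA", "CCCCCC", "TTTTTT")


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def candidate_oligos(target, length=18):
    """Return the antisense oligo for every window of `length` bases."""
    return [
        reverse_complement(target[i:i + length])
        for i in range(len(target) - length + 1)
    ]


def passes_sequence_filters(oligo):
    """Return True if the oligo contains none of the forbidden runs."""
    return not any(run in oligo for run in FORBIDDEN_RUNS)


def select_oligos(target, length=18, tm_func=None, min_tm=40.0):
    """Apply the optional Tm cutoff, then the text-string filters."""
    kept = []
    for oligo in candidate_oligos(target, length):
        # Discard oligos at or below the cutoff when a Tm function is given.
        if tm_func is not None and tm_func(oligo) <= min_tm:
            continue
        if passes_sequence_filters(oligo):
            kept.append(oligo)
    return kept


if __name__ == "__main__":
    print(select_oligos("ACGTACGGTACCATGCAAGT", length=18))
```

The final manual selection of 84 evenly distributed sequences is not automated in this sketch.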
Example 3
Input Files for Automated Oligonucleotide Synthesis
Command File (.cmd File)[0207]
Table 2 is a command file for synthesis of an oligonucleotide having regions of 2′-O-(methoxyethyl) nucleosides and a region of 2′-deoxy nucleosides, each linked by phosphorothioate internucleotide linkages.
[0208] | TABLE 2 |
| |
| |
| SOLD_SUPPORT_SKIP |
| BEGIN |
| Next_Sequence |
| END |
| INITIAL-WASH |
| BEGIN |
| Add ACN |
| 300 |
| Drain 10 |
| END |
| LOOP-BEGIN |
| DEBLOCK |
| BEGIN |
| Prime TCA |
| Load Tray |
| Repeat |
| 2 |
| Add TCA 150 |
| Wait 10 |
| Drain 8 |
| End_Repeat |
| Remove Tray |
| Add TCA 125 |
| Wait 10 |
| Drain 8 |
| END |
| WASH_AFTER_DEBLOCK |
| BEGIN |
| Repeat |
| 3 |
| Add ACN 250 To_All |
| Drain 10 |
| End_Repeat |
| END |
| COUPLING |
| BEGIN |
| if class = DEOXY_THIOATE |
| Nozzle wash <act1> |
| prime <act1> |
| prime <seq> |
| Add <act1> 70 + <seq> 70 |
| Wait 40 |
| Drain 5 |
| end-if |
| if class = MOE_THIOATE |
| Nozzle wash <act1> |
| Prime <act1> |
| prime <seq> |
| Add <act1> 120 + <seq> 120 |
| Wait 230 |
| Drain 5 |
| End_if |
| END |
| WASH_AFTER_COUPLING |
| BEGIN |
| Add ACN |
| 200 To_All |
| Drain 10 |
| END |
| OXIDIZE |
| BEGIN |
| if class = DEOXY_THIOATE |
| Add BEAU 180 |
| Wait 40 |
| Drain 7 |
| end_if |
| if class =MOE_THIOATE |
| Add BEAU |
| 200 |
| Wait 120 |
| Drain 7 |
| end_if |
| END |
| CAP |
| BEGIN |
| Add CAP_B 80 + CAP_A 80 |
| Wait 20 |
| Drain 7 |
| END |
| WASH_AFTER_CAP |
| BEGIN |
| Add ACN 150To_All |
| Drain |
| 5 |
| Add ACN 250 To_All |
| Drain 11 |
| END |
| BASE_COUNTER |
| BEGIN |
| Next_Sequence |
| END |
| LOOP_END |
| DEBLOCK_FINAL |
| BEGIN |
| Prime TCA |
| Load Tray |
| Repeat 2 |
| Add TCA 150 To_All |
| Wait 10 |
| Drain 8 |
| End_Repeat |
| Remove Tray |
| Add TCA 125 To_All |
| Wait 10 |
| Drain 10 |
| END |
| FINAL_WASH |
| BEGIN |
| Repeat 4 |
| Add ACN 300 To_All |
| Drain 12 |
| End_Repeat |
| END |
| ENDALL |
| BEGIN |
| Wait 3 |
| END |
| |
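As an illustration of how a command file like Table 2 is organized, the following Python sketch reads the named BEGIN/END sections and flattens one section for a given reagent class. It is a hypothetical reader, not the synthesizer's own software. It assumes that the table's pipe delimiters have been stripped, that each repeat count sits on the same line as its `Repeat` keyword, and that repeat blocks are not nested. The names `parse_sections` and `expand` are introduced here for illustration only.

```python
import re
from typing import Dict, List, Optional

_IF = re.compile(r"if\s+class\s*=\s*(\w+)", re.IGNORECASE)
_END_IF = re.compile(r"end[-_]if", re.IGNORECASE)
_REPEAT = re.compile(r"repeat\s+(\d+)", re.IGNORECASE)
_END_REPEAT = re.compile(r"end_repeat", re.IGNORECASE)


def parse_sections(text: str) -> Dict[str, List[str]]:
    """Map each section name to the command lines between its BEGIN and END."""
    sections: Dict[str, List[str]] = {}
    name: Optional[str] = None
    body: List[str] = []
    previous = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN":
            # The section name is the line just before BEGIN.
            name, body = previous, []
        elif line == "END" and name is not None:
            sections[name] = body
            name = None
        elif name is not None:
            body.append(line)
        previous = line
    return sections


def expand(commands: List[str], reagent_class: str) -> List[str]:
    """Resolve 'if class = X' blocks for one class and unroll Repeat blocks."""
    out: List[str] = []
    active = True
    i = 0
    while i < len(commands):
        line = commands[i]
        m_if = _IF.fullmatch(line)
        m_rep = _REPEAT.fullmatch(line)
        if m_if:
            active = m_if.group(1) == reagent_class
        elif _END_IF.fullmatch(line):
            active = True
        elif m_rep and active:
            # Find the closing End_Repeat (ASSUMPTION: no nested repeats).
            end = next(
                j for j in range(i + 1, len(commands))
                if _END_REPEAT.fullmatch(commands[j])
            )
            out.extend(commands[i + 1:end] * int(m_rep.group(1)))
            i = end
        elif active:
            out.append(line)
        i += 1
    return out
```

Expanding the COUPLING section for the class `MOE_THIOATE` would, under these assumptions, keep only the commands inside the second conditional block of that section.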
Sequence Files (.seq Files)[0209]
Table 3 is a .seq file for oligonucleotides having 2′-deoxy nucleosides linked by phosphorothioate internucleotide linkages.
[0210]| TABLE 3 |
|
|
| Identity of columns: Syn #, Well, Scale, | |
| Nucleotide at particular position (identified |
| using base identifier followed by backbone |
| identifier where ‘s’ is phosphorothioate). |
| Note the columns wrap around to next line when |
| longer than one line. |
|
|
| 1 | A01 | 200 | As Cs Cs As Gs Gs As Cs Gs | |
|
| | | Gs Cs Gs Gs As Cs Cs As Gs |
|
| 2 | A02 | 200 | As Cs Gs Gs Cs Gs Gs As Cs |
|
| | | Cs As Gs As Gs Ts Gs Gs As |
|
| 3 | A03 | 200 | As Cs Cs As As Gs Cs As Gs |
|
| | | As Cs Gs Gs As Gs As Cs Gs |
|
| 4 | A04 | 200 | As Gs Gs As Gs As Cs Cs Cs |
|
| | | Cs Gs As Cs Gs As As Cs Gs |
|
| 5 | A05 | 200 | As Cs Cs Cs Cs Gs As Cs Gs |
|
| | | As As Cs Gs As Cs Ts Gs Gs |
|
| 6 | A06 | 200 | As Cs Gs As As Cs Gs As Cs |
|
| | | Ts Gs Gs Cs Gs As Cs As Gs |
|
| 7 | A07 | 200 | As Cs Gs As Cs Ts Gs Gs Cs |
|
| | | Gs As Cs As Gs Gs Ts As Gs |
|
| 8 | A08 | 200 | As Cs As Gs Gs Ts As Gs Gs |
|
| | | Ts Cs Ts Ts Gs Gs Ts Gs Gs |
|
| 9 | A09 | 200 | As Gs Gs Ts Cs Ts Ts Gs Gs |
|
| | | Ts Gs Gs Gs Ts Gs As Cs Gs |
|
| 10 | A10 | 200 | As Gs Ts Cs As Cs Gs As Cs |
|
| | | As As Gs As As As Cs As Cs |
|
| 11 | A11 | 200 | As Cs Gs As Cs As As Gs As |
|
| | | As As Cs As Cs Gs Gs Ts Cs |
|
| 12 | A12 | 200 | As Gs As As As Cs As Cs Gs |
|
| | | Gs Ts Cs Gs Gs Ts Cs Cs Ts |
|
| 13 | B01 | 200 | As As Cs As Cs Gs Gs Ts Cs |
|
| | | Gs Gs Ts Cs Cs Ts Gs Ts Cs |
|
| 14 | B02 | 200 | As Cs Ts Cs As Cs Ts Gs As |
|
| | | Cs Gs Ts Gs Ts Cs Ts Cs As |
|
| 15 | B03 | 200 | As Cs Gs Gs As As Gs Gs As |
|
| | | As Cs Gs Cs Cs As Cs Ts Ts |
|
| 16 | B04 | 200 | As Ts Cs Ts Gs Ts Gs Gs As |
|
| | | Cs Cs Ts Ts Gs Ts Cs Ts Cs |
|
| 17 | B05 | 200 | As Cs As Cs Ts Ts Cs Ts Ts |
|
| | | Cs Cs Gs As Cs Cs Gs Ts Gs |
|
| 18 | B06 | 200 | As Cs Ts Cs Ts Cs Gs As Cs |
|
| | | As Cs As Gs Gs As Cs Gs Ts |
|
| 19 | B07 | 200 | As As As Cs Cs Cs Cs As Gs |
|
| | | Ts Ts Cs Gs Ts Cs Ts As As |
|
| 20 | B08 | 200 | As Ts Gs Ts Cs Cs Cs Cs As |
|
| | | As As Gs As Cs Ts As Ts Gs |
|
| 21 | B09 | 200 | As Cs Gs Cs Ts Cs Gs Gs Gs |
|
| | | As Cs Gs Gs Gs Ts Cs As Gs |
|
| 22 | B10 | 200 | As Gs Cs Cs Gs As As Gs As |
|
| | | As Gs As Gs Gs Ts Ts As Cs |
|
| 23 | B11 | 200 | As Cs As Cs As Gs Ts As Gs |
|
| | | As Cs Gs As As As Gs Cs Ts |
|
| 24 | B12 | 200 | As Cs As Cs Ts Cs Ts Gs Gs |
|
| | | Ts Ts Ts Cs Ts Gs Gs As Cs |
|
| 25 | C01 | 200 | As Cs Gs As Cs Cs As Gs As |
|
| | | As As Ts As Gs Ts Ts Ts Ts |
|
| 26 | C02 | 200 | As Gs Ts Ts As As As As Gs |
|
| | | Gs Gs Cs Ts Gs Cs Ts As Gs |
|
| 27 | C03 | 200 | As Gs Gs Ts Ts Gs Ts Gs As |
|
| | | Cs Gs As Cs Gs As Gs Gs Ts |
|
| 28 | C04 | 200 | As As Ts Gs Ts As Cs Cs Ts |
|
| | | As Gs Gs Cs Ts Ts Gs Gs Cs |
|
| 29 | C05 | 200 | As Gs Ts Cs As Cs Gs Ts Cs |
|
| | | Cs Ts Cs Ts Cs Ts Gs Ts Cs |
|
| 30 | C06 | 200 | Cs Ts Gs Gs Cs Gs As Cs As |
|
| | | Gs Gs Ts As Gs Gs Ts Cs Ts |
|
| 31 | C07 | 200 | Cs Ts Cs Ts Gs Ts Gs Ts Gs |
|
| | | As Cs Gs Gs Ts Gs Gs Ts Cs |
|
| 32 | C08 | 200 | Cs As Gs Gs Ts Cs Gs Ts Cs |
|
| | | Ts Ts Cs Cs Cs Gs Ts Gs Gs |
|
| 33 | C09 | 200 | Cs Ts Gs Ts Gs Gs Ts As Gs |
|
| | | As Cs Gs Ts Gs Gs As Cs As |
|
| 34 | C10 | 200 | Cs Ts As As Cs Gs As Ts Gs |
|
| | | Ts Cs Cs Cs Cs As As As Gs |
|
| 35 | C11 | 200 | Cs Ts Gs Ts Ts Cs Gs As Cs |
|
| | | As Cs Ts Cs Ts Gs Gs Ts Ts |
|
| 36 | C12 | 200 | Cs Ts Gs Gs As Cs Cs As As |
|
| | | Cs As Cs Gs Ts Ts Gs Ts Cs |
|
| 37 | D01 | 200 | Cs Cs Gs Ts Cs Cs Gs Ts Gs |
|
| | | Ts Ts Ts Gs Ts Ts Cs Ts Gs |
|
| 38 | D02 | 200 | Cs Ts Gs As Cs Ts As Cs As |
|
| | | As Cs As Gs As Cs As Cs Cs |
|
| 39 | D03 | 200 | Cs As As Cs As Gs As Cs As |
|
| | | Cs Cs As Gs Gs Gs Gs Ts Cs |
|
| 40 | D04 | 200 | Cs As Gs Gs Gs Gs Ts Cs Cs |
|
| | | Ts As Gs Cs Cs Gs As Cs Ts |
|
| 41 | D05 | 200 | Cs Ts Cs Ts As Gs Ts Ts As |
|
| | | As As As Gs Gs Gs Cs Ts Gs |
|
| 42 | D06 | 200 | Cs Ts Gs Cs Ts As Gs As As |
|
| | | Gs Gs As Cs Cs Gs As Gs Gs |
|
| 43 | D07 | 200 | Cs Ts Gs As As As Ts Gs Ts |
|
| | | As Cs Cs Ts As Cs Gs Gs Ts |
|
| 44 | D08 | 200 | Cs As Cs Cs Cs Gs Ts Ts Ts |
|
| | | Gs Ts Cs Cs Gs Ts Cs As As |
|
| 45 | D09 | 200 | Cs Ts Cs Gs As Ts As Cs Gs |
|
| | | Gs Gs Ts Cs As Gs Ts Cs As |
|
| 46 | D10 | 200 | Gs Gs Ts As Gs Gs Ts Cs Ts |
|
| | | Ts Gs Gs Ts Gs Gs Gs Ts Gs |
|
| 47 | D11 | 200 | Gs As Cs Ts Ts Ts Gs Cs Cs |
|
| | | Ts Ts As Cs Gs Gs As As Gs |
|
| 48 | D12 | 200 | Gs Ts Gs Gs As Gs Ts Cs Ts |
|
| | | Ts Ts Gs Ts Cs Ts Gs Ts Gs |
|
| 49 | E01 | 200 | Gs Gs As Gs Ts Cs Ts Ts Ts |
|
| | | Gs Ts Cs Ts Gs Ts Gs Gs Ts |
|
| 50 | E02 | 200 | Gs Gs As Cs As Cs Ts Cs Ts |
|
| | | Cs Gs As Cs As Cs As Gs Gs |
|
| 51 | E03 | 200 | Gs As Cs As Cs As Gs Gs As |
|
| | | Cs Gs Ts Gs Gs Cs Gs As Gs |
|
| 52 | E04 | 200 | Gs As Gs Ts As Cs Gs As Gs |
|
| | | Cs Gs Gs Gs Cs Cs Gs As As |
|
| 53 | E05 | 200 | Gs As Cs Ts As Ts Gs Gs Ts |
|
| | | As Gs As Cs Gs Cs Ts Cs Gs |
|
| 54 | E06 | 200 | Gs As As Gs As Gs Gs Ts Ts |
|
| | | As Cs As Cs As Gs Ts As Gs |
|
| 55 | E07 | 200 | Gs As Gs Gs Ts Ts As Cs As |
|
| | | Cs As Gs Ts As Gs As Cs Gs |
|
| 56 | E08 | 200 | Gs Ts Ts Gs Ts Cs Cs Gs Ts |
|
| | | Cs Cs Gs Ts Gs Ts Ts Ts Gs |
|
| 57 | E09 | 200 | Gs As Cs Ts Cs Ts Cs Gs Gs |
|
| | | Gs As Cs Cs As Cs Cs As Cs |
|
| 58 | E10 | 200 | Gs Ts As Gs Gs As Gs As As |
|
| | | Cs Cs As Cs Gs As Cs Cs As |
|
| 59 | E11 | 200 | Gs Gs Ts Ts Cs Ts Ts Cs Gs |
|
| | | Gs Ts Ts Gs Gs Ts Ts As Ts |
|
| 60 | E12 | 200 | Gs Ts Gs Gs Gs Gs Ts Ts Cs |
|
| | | Gs Ts Cs Cs Ts Ts Gs Gs Gs |
|
| 61 | F01 | 200 | Gs Ts Cs As Cs Gs Ts Cs Cs |
|
| | | Ts Cs Ts Gs As As As Ts Gs |
|
| 62 | F02 | 200 | Gs Ts Cs Cs Ts Cs Cs Ts As |
|
| | | Cs Cs Gs Ts Ts Ts Cs Ts Cs |
|
| 63 | F03 | 200 | Gs Ts Cs Cs Cs Cs As Cs Gs |
|
| | | Ts Cs Cs Gs Ts Cs Ts Ts Cs |
|
| 64 | F04 | 200 | Ts Cs As Cs Cs As Gs Gs As |
|
| | | Cs Gs Gs Cs Gs Gs As Cs Cs |
|
| 65 | F05 | 200 | Ts As Cs Cs As As Gs Cs As |
|
| | | Gs As Cs Gs Gs As Gs As Cs |
|
| 66 | F06 | 200 | Ts Cs Cs Ts Gs Ts Cs Ts Ts |
|
| | | Ts Gs As Cs Cs As Cs Ts Cs |
|
| 67 | F07 | 200 | Ts Gs Ts Cs Ts Ts Ts Gs As |
|
| | | Cs Cs As Cs Ts Cs As Cs Ts |
|
| 68 | F08 | 200 | Ts Gs As Cs Cs As Cs Ts Cs |
|
| | | As Cs Ts Gs As Cs Gs Ts Gs |
|
| 69 | F09 | 200 | Ts Gs As Cs Gs Ts Gs Ts Cs |
|
| | | Ts Cs As As Gs Ts Gs As Cs |
|
| 70 | F10 | 200 | Ts Cs As As Gs Ts Gs As Cs |
|
| | | Ts Ts Ts Gs Cs Cs Ts Ts As |
|
| 71 | F11 | 200 | Ts Gs Ts Ts Ts As Ts Gs As |
|
| | | Cs Gs Cs Ts Gs Gs Gs Gs Ts |
|
| 72 | F12 | 200 | Ts Ts As Ts Gs As Cs Gs Cs |
|
| | | Ts Gs Gs Gs Gs Ts Ts Gs Gs |
|
| 73 | G01 | 200 | Ts Gs As Cs Gs Cs Ts Gs Gs |
|
| | | Gs Gs Ts Ts Gs Gs As Ts Cs |
|
| 74 | G02 | 200 | Ts Cs Gs Ts Cs Ts Ts Cs Cs |
|
| | | Cs Gs Ts Gs Gs As Gs Ts Cs |
|
| 75 | G03 | 200 | Ts Gs Gs Ts As Gs As Cs Gs |
|
| | | Ts Gs Gs As Cs As Cs Ts Ts |
|
| 76 | G04 | 200 | Ts Ts Cs Ts Ts Cs Cs Gs As |
|
| | | Cs Cs Gs Ts Gs As Cs As Ts |
|
| 77 | G05 | 200 | Ts Gs Gs Ts As Gs As Cs Gs |
|
| | | Cs Ts Cs Gs Gs Gs As Cs Gs |
|
| 78 | G06 | 200 | Ts As Gs As Cs Gs Cs Ts Cs |
|
| | | Gs Gs Gs As Cs Gs Gs Gs Ts |
|
| 79 | G07 | 200 | Ts Ts Ts Ts As Cs As Gs Ts |
|
| | | Gs Gs Gs As As Cs Cs Ts Gs |
|
| 80 | G08 | 200 | Ts Gs Gs Gs As As Cs Cs Ts |
|
| | | Gs Ts Ts Cs Gs As Cs As Cs |
|
| 81 | G09 | 200 | Ts Cs Gs Gs Gs As Cs Cs As |
|
| | | Cs Cs As Cs Ts As Gs Gs Gs |
|
| 82 | G10 | 200 | Ts As Gs Gs As Cs As As As |
|
| | | Cs Gs Gs Ts As Gs Gs As Cs |
|
| 83 | G11 | 200 | Ts Gs Cs Ts As Gs As As Gs |
|
| | | Gs As Cs Cs Gs As Gs Gs Ts |
|
| 84 | G12 | 200 | Ts Cs Ts Gs Ts Cs As Cs Ts |
|
| | | Cs Cs Gs As Cs Gs Ts Gs Gs |
|
|
Table 4 is a .seq file for oligonucleotides having regions of 2′-O-(methoxyethyl) nucleosides and a region of 2′-deoxy nucleosides, each linked by phosphorothioate internucleotide linkages.
[0211]| TABLE 4 |
|
|
| Identity of columns: Syn #, Well, Scale, | |
| Nucleotide at particular position (identified |
| using base identifier followed by backbone |
| identifier where ‘s’ is phosphorothioate and |
| ‘moe’ indicates a 2′-O-(methoxyethyl) substituted |
| nucleoside). The columns wrap around to next |
| line when longer than one line. |
|
|
| 1 | A01 | 200 | moeAs moeCs moeCs moeAs Gs Gs As | |
|
| | | Cs Gs Gs Cs Gs Gs As |
|
|
| | | moeCs moeCs moeAs moeGs |
|
| 2 | A02 | 200 | moeAs moeCs moeGs moeGs Cs Gs Gs |
|
| | | As Cs Cs As Gs As Gs |
|
| | | moeTs moeGs moeGs moeAs |
|
| 3 | A03 | 200 | moeAs moeCs moeCs moeAs As Gs Cs |
|
| | | As Gs As Cs Gs Gs As |
|
| | | moeGs moeAs moeCs moeGs |
|
| 4 | A04 | 200 | moeAs moeGs moeGs moeAs Gs As Cs |
|
| | | Cs Cs Cs Gs As Cs Gs |
|
| | | moeAs moeAs moeCs moeGs |
|
| 5 | A05 | 200 | moeAs moeCs moeCs moeCs Cs Gs As |
|
| | | Cs Gs As As Cs Gs As |
|
| | | moeCs moeTs moeGs moeGs |
|
| 6 | A06 | 200 | moeAs moeCs moeGs moeAs As Cs Gs |
|
| | | As Cs Ts Gs Gs Cs Gs |
|
| | | moeAs moeCs moeAs moeGs |
|
| 7 | A07 | 200 | moeAs moeCs moeGs moeAs Cs Ts Gs |
|
| | | Gs Cs Gs As Cs As Gs |
|
| | | moeGs moeTs moeAs moeGs |
|
| 8 | A08 | 200 | moeAs moeCs moeAs moeGs Gs Ts As |
|
| | | Gs Gs Ts Cs Ts Ts Gs |
|
| | | moeGs moeTs moeGs moeGs |
|
| 9 | A09 | 200 | moeAs moeGs moeGs moeTs Cs Ts Ts |
|
| | | Gs Gs Ts Gs Gs Gs Ts |
|
| | | moeGs moeAs moeCs moeGs |
|
| 10 | A10 | 200 | moeAs moeGs moeTs moeCs As Cs Gs |
|
| | | As Cs As As Gs As As |
|
| | | moeAs moeCs moeAs moeCs |
|
| 11 | A11 | 200 | moeAs moeCs moeGs moeAs Cs As As |
|
| | | Gs As As As Cs As Cs |
|
| | | moeGs moeGs moeTs moeCs |
|
| 12 | A12 | 200 | moeAs moeGs moeAs moeAs As Cs As |
|
| | | Cs Gs Gs Ts Cs Gs Gs |
|
| | | moeTs moeCs moeCs moeTs |
|
| 13 | B01 | 200 | moeAs moeAs moeCs moeAs Cs Gs Gs |
|
| | | Ts Cs Gs Gs Ts Cs Cs |
|
| | | moeTs moeGs moeTs moeCs |
|
| 14 | B02 | 200 | moeAs moeCs moeTs moeCs As Cs Ts |
|
| | | Gs As Cs Gs Ts Gs Ts |
|
| | | moeCs moeTs moeCs moeAs |
|
| 15 | B03 | 200 | moeAs moeCs moeGs moeGs As As Gs |
|
| | | Gs As As Cs Gs Cs Cs |
|
| | | moeAs moeCs moeTs moeTs |
|
| 16 | B04 | 200 | moeAs moeTs moeCs moeTs Gs Ts Gs |
|
| | | Gs As Cs Cs Ts Ts Gs |
|
| | | moeTs moeCs moeTs moeCs |
|
| 17 | B05 | 200 | moeAs moeCs moeAs moeCs Ts Ts Cs |
|
| | | Ts Ts Cs Cs Gs As Cs |
|
| | | moeCs moeGs moeTs moeGs |
|
| 18 | B06 | 200 | moeAs moeCs moeTs moeCs Ts Cs Gs |
|
| | | As Cs As Cs As Gs Gs |
|
| | | moeAs moeCs moeGs moeTs |
|
| 19 | B07 | 200 | moeAs moeAs moeAs moeCs Cs Cs Cs |
|
| | | As Gs Ts Ts Cs Gs Ts |
|
| | | moeCs moeTs moeAs moeAs |
|
| 20 | B08 | 200 | moeAs moeTs moeGs moeTs Cs Cs Cs |
|
| | | Cs As As As Gs As Cs |
|
| | | moeTs moeAs moeTs moeCs |
|
| 21 | B09 | 200 | moeAs moeCs moeGs moeCs Ts Cs Gs |
|
| | | Gs Gs As Cs Gs Gs Gs |
|
| | | moeTs moeCs moeAs moeGs |
|
| 22 | B10 | 200 | moeAs moeGs moeCs moeCs Gs As As |
|
| | | Gs As As Gs As Gs Gs |
|
| | | moeTs moeTs moeAs moeCs |
|
| 23 | B11 | 200 | moeAs moeCs moeAs moeCs As Gs Ts |
|
| | | As Gs As Cs Gs As As |
|
| | | moeAs moeGs moeCs moeTs |
|
| 24 | B12 | 200 | moeAs moeCs moeAs moeCs Ts Cs Ts |
|
| | | Gs Gs Ts Ts Ts Cs Ts |
|
| | | moeGs moeGs moeAs moeCs |
|
| 25 | C01 | 200 | moeAs moeCs moeGs moeAs Cs Cs As |
|
| | | Gs As As As Ts As Gs |
|
| | | moeTs moeTs moeTs moeTs |
|
| 26 | C02 | 200 | moeAs moeGs moeTs moeTs As As As |
|
| | | As Gs Gs Gs Gs Ts Gs |
|
| | | moeCs moeTs moeAs moeGs |
|
| 27 | C03 | 200 | moeAs moeGs moeGs moeTs Ts Gs Ts |
|
| | | Gs As Cs Gs As Cs Gs |
|
| | | moeAs moeGs moeGs moeTs |
|
| 28 | C04 | 200 | moeAs moeAs moeTs moeGs Ts As Cs |
|
| | | Cs Ts As Cs Gs Gs Ts |
|
| | | moeTs moeGs moeGs moeCs |
|
| 29 | C05 | 200 | moeAs moeGs moeTs moeCs As Cs Gs |
|
| | | Ts Cs Cs Ts Cs Ts Cs |
|
| | | moeTs moeGs moeTs moeCs |
|
| 30 | C06 | 200 | moeCs moeTs moeGs moeGs Cs Gs As |
|
| | | Cs As Gs Gs Ts As Gs |
|
| | | moeGs moeTs moeCs moeTs |
|
| 31 | C07 | 200 | moeCs moeTs moeCs moeTs Gs Ts Gs |
|
| | | Ts Gs As Cs Gs Gs Ts |
|
| | | moeGs moeGs moeTs moeCs |
|
| 32 | C08 | 200 | moeCs moeAs moeGs moeGs Ts Cs Gs |
|
| | | Ts Cs Ts Ts Cs Cs Cs |
|
| | | moeGs moeTs moeGs moeGs |
|
| 33 | C09 | 200 | moeCs moeTs moeGs moeTs Gs Gs Ts |
|
| | | As Gs As Cs Gs Ts Gs |
|
| | | moeGs moeAs moeCs moeAs |
|
| 34 | C10 | 200 | moeCs moeTs moeAs moeAs Cs Gs As |
|
| | | Ts Gs Ts Cs Cs Cs Cs |
|
| | | moeAs moeAs moeAs moeGs |
|
| 35 | C11 | 200 | moeCs moeTs moeGs moeTs Ts Cs Gs |
|
| | | As Cs As Cs Ts Cs Ts |
|
| | | moeGs moeGs moeTs moeTs |
|
| 36 | C12 | 200 | moeCs moeTs moeGs moeGs As Cs Cs |
|
| | | As As Cs As Cs Gs Ts |
|
| | | moeTs moeGs moeTs moeCs |
|
| 37 | D01 | 200 | moeCs moeCs moeGs moeTs Cs Cs Gs |
|
| | | Ts Gs Ts Ts Ts Gs Ts |
|
| | | moeTs moeCs moeTs moeGs |
|
| 38 | D02 | 200 | moeCs moeTs moeGs moeAs Cs Ts As |
|
| | | Cs As As Cs As Gs As |
|
| | | moeCs moeAs moeCs moeCs |
|
| 39 | D03 | 200 | moeCs moeAs moeAs moeCs As Gs As |
|
| | | Cs As Cs Cs As Gs Gs |
|
| | | moeGs moeGs moeTs moeCs |
|
| 40 | D04 | 200 | moeCs moeAs moeGs moeGs Gs Gs Ts |
|
| | | Cs Cs Ts As Gs Cs Cs |
|
| | | moeGs moeAs moeCs moeTs |
|
| 41 | D05 | 200 | moeCs moeTs moeCs moeTs As Gs Ts |
|
| | | Ts As As As As Gs Gs |
|
| | | moeGs moeCs moeTs moeGs |
|
| 42 | D06 | 200 | moeCs moeTs moeGs moeCs Ts As Gs |
|
| | | As As Gs Gs As Cs Cs |
|
| | | moeGs moeAs moeGs moeGs |
|
| 43 | D07 | 200 | moeCs moeTs moeGs moeAs As As Ts |
|
| | | Gs Ts As Cs Cs Ts As |
|
| | | moeCs moeGs moeGs moeTs |
|
| 44 | D08 | 200 | moeCs moeAs moeCs moeCs Cs Gs Ts |
|
| | | Ts Ts Gs Ts Cs Cs Gs |
|
| | | moeTs moeCs moeAs moeAs |
|
| 45 | D09 | 200 | moeCs moeTs moeCs moeGs As Ts As |
|
| | | Cs Gs Gs Gs Ts Cs As |
|
| | | moeGs moeTs moeCs moeAs |
|
| 46 | D10 | 200 | moeGs moeGs moeTs moeAs Gs Gs Ts |
|
| | | Cs Ts Ts Gs Gs Ts Gs |
|
| | | moeGs moeGs moeTs moeGs |
|
| 47 | D11 | 200 | moeGs moeAs moeCs moeTs Ts Ts Gs |
|
| | | Cs Cs Ts Ts As Cs Gs |
|
| | | moeGs moeAs moeAs moeGs |
|
| 48 | D12 | 200 | moeGs moeTs moeGs moeGs As Gs Ts |
|
| | | Cs Ts Ts Ts Gs Ts Cs |
|
| | | moeTs moeGs moeTs moeGs |
|
| 49 | E01 | 200 | moeGs moeGs moeAs moeGs Ts Cs Ts |
|
| | | Ts Ts Gs Ts Cs Ts Gs |
|
| | | moeTs moeGs moeGs moeTs |
|
| 50 | E02 | 200 | moeGs moeGs moeAs moeCs As Cs Ts |
|
| | | Cs Ts Cs Gs As Cs As |
|
| | | moeCs moeAs moeGs moeGs |
|
| 51 | E03 | 200 | moeGs moeAs moeCs moeAs Cs As Gs |
|
| | | Gs As Cs Gs Ts Gs Gs |
|
| | | moeCs moeGs moeAs moeGs |
|
| 52 | E04 | 200 | moeGs moeAs moeGs moeTs As Cs Gs |
|
| | | As Gs Cs Gs Gs Gs Cs |
|
| | | moeCs moeGs moeAs moeAs |
|
| 53 | E05 | 200 | moeGs moeAs moeCs moeTs As Ts Gs |
|
| | | Gs Ts As Gs As Cs Gs |
|
| | | moeCs moeTs moeCs moeGs |
|
| 54 | E06 | 200 | moeGs moeAs moeAs moeGs As Gs Gs |
|
| | | Ts Ts As Cs As Cs As |
|
| | | moeGs moeTs moeAs moeGs |
|
| 55 | E07 | 200 | moeGs moeAs moeGs moeGs Ts Ts As |
|
| | | Cs As Cs As Gs Ts As |
|
| | | moeGs moeAs moeCs moeGs |
|
| 56 | E08 | 200 | moeGs moeTs moeTs moeGs Ts Cs Cs |
|
| | | Gs Ts Cs Cs Gs Ts Gs |
|
| | | moeTs moeTs moeTs moeGs |
|
| 57 | E09 | 200 | moeGs moeAs moeCs moeTs Cs Ts Cs |
|
| | | Gs Gs Gs As Cs Cs As |
|
| | | moeCs moeCs moeAs moeCs |
|
| 58 | E10 | 200 | moeGs moeTs moeAs moeGs Gs As Gs |
|
| | | As As Cs Cs As Cs Gs |
|
| | | moeAs moeCs moeCs moeAs |
|
| 59 | E11 | 200 | moeGs moeGs moeTs moeTs Cs Ts Ts |
|
| | | Cs Gs Gs Ts Ts Gs Gs |
|
| | | moeTs moeTs moeAs moeTs |
|
| 60 | E12 | 200 | moeGs moeTs moeGs moeGs Gs Gs Ts |
|
| | | Ts Cs Gs Ts Cs Cs Ts |
|
| | | moeTs moeGs moeGs moeGs |
|
| 61 | F01 | 200 | moeCs moeTs moeCs moeAs Cs Gs Ts |
|
| | | Cs Cs Ts Cs Ts Gs As |
|
| | | moeAs moeAs moeTs moeGs |
|
| 62 | F02 | 200 | moeGs moeTs moeCs moeCs Ts Cs Cs |
|
| | | Ts As Cs Cs Gs Ts Ts |
|
| | | moeTs moeCs moeTs moeCs |
|
| 63 | F03 | 200 | moeGs moeTs moeCs moeCs Cs Cs As |
|
| | | Cs Gs Ts Cs Cs Gs Ts |
|
| | | moeCs moeTs moeTs moeCs |
|
| 64 | F04 | 200 | moeTs moeCs moeAs moeCs Cs As Gs |
|
| | | Gs As Cs Gs Gs Cs Gs |
|
| | | moeGs moeAs moeCs moeCs |
|
| 65 | F05 | 200 | moeTs moeAs moeCs moeCs As As Gs |
|
| | | Cs As Gs As Cs Gs Gs |
|
| | | moeAs moeGs moeAs moeCs |
|
| 66 | F06 | 200 | moeTs moeCs moeCs moeTs Gs Ts Cs |
|
| | | Ts Ts Ts Gs As Cs Cs |
|
| | | moeAs moeCs moeTs moeCs |
|
| 67 | F07 | 200 | moeTs moeGs moeTs moeCs Ts Ts Ts |
|
| | | Gs As Cs Cs As Cs Ts |
|
| | | moeCs moeAs moeCs moeTs |
|
| 68 | F08 | 200 | moeTs moeGs moeAs moeCs Cs As Cs |
|
| | | Ts Cs As Cs Ts Gs As |
|
| | | moeCs moeGs moeTs moeGs |
|
| 69 | F09 | 200 | moeTs moeGs moeAs moeCs Gs Ts Gs |
|
| | | Ts Cs Ts Cs As As Gs |
|
| | | moeTs moeGs moeAs moeCs |
|
| 70 | F10 | 200 | moeTs moeCs moeAs moeAs Gs Ts Gs |
|
| | | As Cs Ts Ts Ts Gs Cs |
|
| | | moeCs moeTs moeTs moeAs |
|
| 71 | F11 | 200 | moeTs moeGs moeTs moeTs Ts As Ts |
|
| | | Gs As Cs Gs Cs Ts Gs |
|
| | | moeGs moeGs moeGs moeTs |
|
| 72 | F12 | 200 | moeTs moeTs moeAs moeTs Gs As Cs |
|
| | | Gs Cs Ts Gs Gs Gs Gs |
|
| | | moeTs moeTs moeGs moeGs |
|
| 73 | G01 | 200 | moeTs moeGs moeAs moeCs Gs Cs Ts |
|
| | | Gs Gs Gs Gs Ts Ts Gs |
|
| | | moeGs moeAs moeTs moeCs |
|
| 74 | G02 | 200 | moeTs moeCs moeGs moeTs Cs Ts Ts |
|
| | | Cs Cs Gs Gs Ts Gs Gs |
|
| | | moeAs moeGs moeTs moeCs |
|
| 75 | G03 | 200 | moeTs moeGs moeGs moeTs As Gs As |
|
| | | Cs Gs Ts Gs Gs As Cs |
|
| | | moeAs moeCs moeTs moeTs |
|
| 76 | G04 | 200 | moeTs moeTs moeCs moeTs Ts Cs Cs |
|
| | | Gs As Cs Cs Gs Ts Gs |
|
| | | moeAs moeCs moeAs moeTs |
|
| 77 | G05 | 200 | moeTs moeGs moeGs moeTs As Gs As |
|
| | | Cs Gs Cs Ts Cs Gs Gs |
|
| | | moeGs moeAs moeCs moeGs |
|
| 78 | G06 | 200 | moeTs moeAs moeGs moeAs Cs Gs Cs |
|
| | | Ts Cs Gs Gs Gs As Cs |
|
| | | moeGs moeGs moeGs moeTs |
|
| 79 | G07 | 200 | moeTs moeTs moeTs moeTs As Cs As |
|
| | | Gs Ts Gs Gs Gs As As |
|
| | | moeCs moeCs moeTs moeGs |
|
| 80 | G08 | 200 | moeTs moeGs moeGs moeGs As As Cs |
|
| | | Cs Ts Gs Ts Ts Cs Gs |
|
| | | moeAs moeCs moeAs moeCs |
|
| 81 | G09 | 200 | moeTs moeCs moeGs moeGs Gs As Cs |
|
| | | Cs As Cs Cs As Cs Ts |
|
| | | moeAs moeGs moeGs moeGs |
|
| 82 | G10 | 200 | moeTs moeAs moeGs moeGs As Cs As |
|
| | | As As Cs Gs Gs Ts As |
|
| | | moeGs moeGs moeAs moeGs |
|
| 83 | G11 | 200 | moeTs moeGs moeCs moeTs As Gs As |
|
| | | As Gs Gs As Cs Cs Gs |
|
| | | moeAs moeGs moeGs moeTs |
|
| 84 | G12 | 200 | moeTs moeCs moeTs moeGs Ts Cs As |
|
| | | Cs Ts Cs Cs Gs As Cs |
|
| | | moeGs moeTs moeGs moeGs |
|
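The relationship between a 5′ to 3′ nucleobase sequence and its row in a .seq file can be sketched as follows. This is a hypothetical helper, not the software used to produce Tables 3 and 4. The wing length of four positions at each end is inferred from the rows of Table 4 and is an assumption, as are the function names and the row layout produced by `seq_file_row`.

```python
from typing import List


def seq_file_tokens(oligo_5to3: str, wing: int = 4, gapmer: bool = True) -> List[str]:
    """Return per-position tokens in 3'->5' order, as the .seq rows list them.

    Each token is the base followed by 's' (phosphorothioate). With
    gapmer=True, the first and last `wing` positions get the 'moe' prefix
    (ASSUMPTION: wing length of four, inferred from Table 4).
    """
    bases = oligo_5to3.upper()[::-1]  # .seq files read 3'->5'
    n = len(bases)
    tokens = []
    for i, base in enumerate(bases):
        in_wing = gapmer and (i < wing or i >= n - wing)
        tokens.append(("moe" if in_wing else "") + base + "s")
    return tokens


def seq_file_row(syn_no: int, well: str, oligo_5to3: str, scale: int = 200,
                 wing: int = 4, gapmer: bool = True) -> str:
    """Format one row as: Syn #, Well, Scale, then the position tokens."""
    tokens = seq_file_tokens(oligo_5to3, wing=wing, gapmer=gapmer)
    return f"{syn_no} | {well} | {scale} | " + " ".join(tokens)
```

For example, `seq_file_row(1, "A01", "GACCAGGCGGCAGGACCA")` yields tokens that correspond to the first row of Table 4, and the same call with `gapmer=False` corresponds to the first row of Table 3.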
Reagent File (.tab File)[0212]
Table 5 is a .tab file listing the reagents necessary for synthesizing oligonucleotides having both 2′-O-(methoxyethyl) nucleosides and 2′-deoxy nucleosides located therein.
[0213]| TABLE 5 |
|
|
| Identity of columns: GroupName, Bottle ID, |
| ReagentName, FlowRate, Concentration. |
| Wherein reagent name is identified using base |
| identifier, ‘moe’ indicates a 2′-O-(methoxyethyl) |
| substituted nucleoside and ‘cpg’ indicates a |
| controlled pore glass solid support medium. The |
| columns wrap around to next line when |
| longer than one line. |
|
|
| SUPPORT | | | | |
| BEGIN |
| 0 | moeG | moeGcpg | | 100 | 1 |
| 0 | moe5meC | moe5meCcpg | | 100 | 1 |
| 0 | moeA | moeAcpg | | 100 | 1 |
| 0 | moeT | moeTcpg | | 100 | 1 |
| END |
| DEBLOCK |
| BEGIN |
| 70 | TCA | TCA | | 100 | 1 |
| END |
| WASH |
| BEGIN |
| 65 | ACN | ACN | 190 | 1 |
| END |
| OXIDIZERS |
| BEGIN |
| 68 | BEAU | BEAUCAGE | 320 | 1 |
| END |
| CAPPING |
| BEGIN |
| 66 | CAP_B | CAP_B | | 220 | 1 |
| 67 | CAP_A | CAP_A | | 230 | 1 |
| END |
| DEOXY_THIOATE |
| BEGIN |
| 31,32 | Gs | deoxyG | | 270 | 1 |
| 39,40 | 5meCs | 5methyldeoxyC | | 270 | 1 |
| 37,38 | As | deoxyA | 270 | 1 |
| 29,30 | Ts | deoxyT | | 270 | 1 |
| END |
| MOE_THIOATE |
| BEGIN |
| 15,16 | moeGs | methoxyethoxyG | | 240 | 1 |
| 23,24 | moe5meCs | methoxyethoxyC | | 240 | 1 |
| 21,22 | moeAs | methoxyethoxyA | | 240 | 1 |
| 13,14 | moeTs | methoxyethoxyT | | 240 | 1 |
| END |
| ACTIVATORS |
| BEGIN |
|
|
|
| 5,6,7,8 | SET | s-ethyl-tet | 280 |
| Activates |
| DEOXY_THIOATE |
| MOE_THIOATE |
| END |
|
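The reagent groups of Table 5 tie each position token of a .seq file to the class tested by the conditionals in the command file of Table 2. The sketch below shows one way that lookup could work. It is an assumption-laden illustration: the rule that a ‘moe’ prefix selects MOE_THIOATE, and the names `CYCLE_PARAMETERS`, `reagent_class` and `cycle_parameters`, are introduced here and are not taken from the synthesizer software. The numeric values are copied from the COUPLING and OXIDIZE sections of Table 2, which does not state their units.

```python
from typing import Dict, List

# Values copied from the COUPLING and OXIDIZE sections of Table 2.
# Table 2 gives no units, so none are assumed here.
CYCLE_PARAMETERS: Dict[str, Dict[str, int]] = {
    "DEOXY_THIOATE": {
        "activator": 70, "amidite": 70, "coupling_wait": 40,
        "beaucage": 180, "thiation_wait": 40,
    },
    "MOE_THIOATE": {
        "activator": 120, "amidite": 120, "coupling_wait": 230,
        "beaucage": 200, "thiation_wait": 120,
    },
}


def reagent_class(token: str) -> str:
    """Classify a .seq token (ASSUMPTION: a 'moe' prefix marks MOE_THIOATE)."""
    return "MOE_THIOATE" if token.startswith("moe") else "DEOXY_THIOATE"


def cycle_parameters(tokens: List[str]) -> List[Dict[str, int]]:
    """Look up the Table 2 parameters that apply to each position token."""
    return [CYCLE_PARAMETERS[reagent_class(token)] for token in tokens]
```

Keeping the parameters in one table keyed by class mirrors the way the command file branches on `class` rather than on individual bases.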
Example 4
Oligonucleotide Synthesis - 96 Well Plate Format
Oligonucleotides were synthesized via solid phase P(III) phosphoramidite chemistry using a multi-well automated synthesizer utilizing input files as described in Example 3 above. The oligonucleotides were synthesized by assembling 96 sequences simultaneously in a standard 96 well format. Phosphodiester internucleotide linkages were afforded by oxidation with aqueous iodine. Phosphorothioate internucleotide linkages were generated by sulfurization utilizing 3H-1,2-benzodithiole-3-one 1,1-dioxide (Beaucage Reagent) in anhydrous acetonitrile. Standard base-protected beta-cyanoethyldiisopropyl phosphoramidites were purchased from commercial vendors (e.g. PE/ABI, Pharmacia). Non-standard nucleosides were synthesized as per known literature or patented methods and were utilized as base-protected beta-cyanoethyldiisopropyl phosphoramidites.[0214]
Oligonucleotides were cleaved from support and deprotected with concentrated NH4OH at elevated temperature (55-60° C.) for 12-16 hours, and the released product was then dried in vacuo. The dried product was then re-suspended in sterile water to afford a master plate from which all analytical and test plate samples were then diluted utilizing robotic pipettors.[0215]
Example 5
Alternative Oligonucleotide Synthesis
Unsubstituted and substituted phosphodiester oligonucleotides are alternatively synthesized on an automated DNA synthesizer (Applied Biosystems model 380B) using standard phosphoramidite chemistry with oxidation by iodine.[0216]
Phosphorothioates are synthesized as per the phosphodiester oligonucleotides except that the standard oxidation bottle is replaced by a 0.2 M solution of 3H-1,2-benzodithiole-3-one 1,1-dioxide in acetonitrile for the stepwise thiation of the phosphite linkages. The thiation wait step is increased to 68 sec and is followed by the capping step. After cleavage from the CPG column and deblocking in concentrated ammonium hydroxide at 55° C. (18 hr), the oligonucleotides are purified by precipitating twice with 2.5 volumes of ethanol from a 0.5 M NaCl solution.[0217]
Phosphinate oligonucleotides are prepared as described in U.S. Pat. No. 5,508,270, herein incorporated by reference.[0218]
Alkyl phosphonate oligonucleotides are prepared as described in U.S. Pat. No. 4,469,863, herein incorporated by reference.[0219]
3′-Deoxy-3′-methylene phosphonate oligonucleotides are prepared as described in U.S. Pat. Nos. 5,610,289 or 5,625,050, herein incorporated by reference.[0220]
Phosphoramidite oligonucleotides are prepared as described in U.S. Pat. No. 5,256,775 or U.S. Pat. No. 5,366,878, hereby incorporated by reference.[0221]
Alkylphosphonothioate oligonucleotides are prepared as described in published PCT applications PCT/US94/00902 and PCT/US93/06976 (published as WO 94/17093 and WO 94/02499, respectively).[0222]
3′-Deoxy-3′-amino phosphoramidate oligonucleotides are prepared as described in U.S. Pat. No. 5,476,925, herein incorporated by reference.[0223]
Phosphotriester oligonucleotides are prepared as described in U.S. Pat. No. 5,023,243, herein incorporated by reference.[0224]
Boranophosphate oligonucleotides are prepared as described in U.S. Pat. Nos. 5,130,302 and 5,177,198, both herein incorporated by reference.[0225]
Methylenemethylimino linked oligonucleosides, also identified as MMI linked oligonucleosides, methylenedimethylhydrazo linked oligonucleosides, also identified as MDH linked oligonucleosides, methylenecarbonylamino linked oligonucleosides, also identified as amide-3 linked oligonucleosides, and methyleneaminocarbonyl linked oligonucleosides, also identified as amide-4 linked oligonucleosides, as well as mixed backbone compounds having, for instance, alternating MMI and PO or PS linkages, are prepared as described in U.S. Pat. Nos. 5,378,825; 5,386,023; 5,489,677; 5,602,240 and 5,610,289, all of which are herein incorporated by reference.[0226]
Formacetal and thioformacetal linked oligonucleosides are prepared as described in U.S. Pat. Nos. 5,264,562 and 5,264,564, herein incorporated by reference.[0227]
Ethylene oxide linked oligonucleosides are prepared as described in U.S. Pat. No. 5,223,618, herein incorporated by reference.[0228]
Example 6
PNA Synthesis
Peptide nucleic acids (PNAs) are prepared in accordance with any of the various procedures referred to in Peptide Nucleic Acids (PNA): Synthesis, Properties and Potential Applications, Bioorganic & Medicinal Chemistry, 1996, 4, 5. They may also be prepared in accordance with U.S. Pat. Nos. 5,539,082; 5,700,922, and 5,719,262, herein incorporated by reference.[0229]
Example 7
Chimeric Oligonucleotide Synthesis
Chimeric oligonucleotides, oligonucleosides or mixed oligonucleotides/oligonucleosides of the invention can be of several different types. These include a first type wherein the ‘gap’ segment of linked nucleosides is positioned between 5′ and 3′ ‘wing’ segments of linked nucleosides and a second ‘open end’ type wherein the ‘gap’ segment is located at either the 3′ or the 5′ terminus of the oligomeric compound. Oligonucleotides of the first type are also known in the art as ‘gapmers’ or gapped oligonucleotides. Oligonucleotides of the second type are also known in the art as ‘hemimers’ or ‘wingmers.’[0230]
A. [2′-O-Me]-[2′-deoxy]-[2′-O-Me] Chimeric Phosphorothioate Oligonucleotides[0231]
Chimeric oligonucleotides having 2′-O-alkyl phosphorothioate and 2′-deoxy phosphorothioate oligonucleotide segments are synthesized using 2′-deoxy-5′-dimethoxytrityl-3′-O-phosphoramidites for the DNA portion and 5′-dimethoxytrityl-2′-O-methyl-3′-O-phosphoramidites for the 5′ and 3′ wings. The standard synthesis cycle is modified by increasing the wait step after the delivery of tetrazole and base to 600 s, repeated four times for DNA and twice for 2′-O-methyl. The fully protected oligonucleotide is cleaved from the support and the phosphate group is deprotected in 3:1 ammonia/ethanol at room temperature overnight, then lyophilized to dryness. Treatment in methanolic ammonia for 24 hrs at room temperature is done to deprotect all bases, and the samples are again lyophilized to dryness.[0232]
B. [2′-O-(2-Methoxyethyl)]-[2′-deoxy]-[2′-O-(Methoxyethyl)] Chimeric Phosphorothioate Oligonucleotides[0233]
[2′-O-(2-methoxyethyl)]-[2′-deoxy]-[2′-O-(methoxyethyl)] chimeric phosphorothioate oligonucleotides are prepared as per the procedure above for the 2′-O-methyl chimeric oligonucleotide, with the substitution of 2′-O-(methoxyethyl) amidites for the 2′-O-methyl amidites.[0234]
C. [2′-O-(2-Methoxyethyl)Phosphodiester]-[2′-deoxy Phosphorothioate]-[2′-O-(2-Methoxyethyl)Phosphodiester] Chimeric Oligonucleotide[0235]
[2′-O-(2-methoxyethyl) phosphodiester]-[2′-deoxy phosphorothioate]-[2′-O-(methoxyethyl) phosphodiester] chimeric oligonucleotides are prepared as per the above procedure for the 2′-O-methyl chimeric oligonucleotide with the substitution of 2′-O-(methoxyethyl) amidites for the 2′-O-methyl amidites in the wing portions. Oxidization with iodine is used to generate the phosphodiester internucleotide linkages within the wing portions of the chimeric structures. Sulfurization utilizing 3H-1,2-benzodithiole-3-one 1,1-dioxide (Beaucage Reagent) is used to generate the phosphorothioate internucleotide linkages for the center gap.[0236]
Other chimeric oligonucleotides, chimeric oligonucleosides and mixed chimeric oligonucleotides/oligonucleosides are synthesized according to U.S. Pat. No. 5,623,065, herein incorporated by reference.[0237]
Example 8
Output Oligonucleotides from Automated Oligonucleotide Synthesis
Using the .seq files, the .cmd file and the .tab file of Example 3, oligonucleotides were prepared as per the protocol of the 96 well format of Example 4. The oligonucleotides were prepared utilizing phosphorothioate chemistry to give in one instance a first library of phosphorothioate oligodeoxynucleotides. The oligonucleotides were prepared in a second instance as a second library of hybrid oligonucleotides having phosphorothioate backbones with a first and third ‘wing’ region of 2′-O-(methoxyethyl) nucleotides on either side of a center gap region of 2′-deoxy nucleotides. The two libraries contained the same set of oligonucleotide sequences. Thus the two libraries are redundant with respect to sequence but are unique with respect to the combination of sequence and chemistry. Because the sequences of the second library of compounds are the same as those of the first (although the chemistry is different), the second library is not shown, for the sake of brevity.[0238]
For illustrative purposes, Tables 6-a and 6-b show the sequences of an initial first library, i.e., a library of phosphorothioate oligonucleotides targeted to a CD40 target. Table 6-a lists the members of this library in compliance with the established rule for listing SEQ ID NO:, i.e., in numerical SEQ ID NO: order.
[0239]| TABLE 6-a |
|
|
| Sequences of Oligonucleotides | |
| Targeted to CD40 by SEQ ID NO.: |
| | SEQ | |
| NUCLEOBASE SEQUENCE | ID NO. |
| |
| CCAGGCGGCAGGACCACT | 1 | |
| |
| GACCAGGCGGCAGGACCA | 2 |
| |
| AGGTGAGACCAGGCGGCA | 3 |
| |
| CAGAGGCAGACGAACCAT | 4 |
| |
| GCAGAGGCAGACGAACCA | 5 |
| |
| GCAAGCAGCCCCAGAGGA | 6 |
| |
| GGTCAGCAAGCAGCCCCA | 7 |
| |
| GACAGCGGTCAGCAAGCA | 8 |
| |
| GATGGACAGCGGTCAGCA | 9 |
| |
| TCTGGATGGACAGCGGTC | 10 |
| |
| GGTGGTTCTGGATGGACA | 11 |
| |
| GTGGGTGGTTCTGGATGG | 12 |
| |
| GCAGTGGGTGGTTCTGGA | 13 |
| |
| CACAAAGAACAGCACTGA | 14 |
| |
| CTGGCACAAAGAACAGCA | 15 |
| |
| TCCTGGCTGGCACAAAGA | 16 |
| |
| CTGTCCTGGCTGGCACAA | 17 |
| |
| CTCACCAGTTTCTGTCCT | 18 |
| |
| TCACTCACCAGTTTCTGT | 19 |
| |
| GTGCAGTCACTCACCAGT | 20 |
| |
| ACTCTGTGCAGTCACTCA | 21 |
| |
| CAGTGAACTCTGTGCAGT | 22 |
| |
| ATTCCGTTTCAGTGAACT | 23 |
| |
| GAAGGCATTCCGTTTCAG | 24 |
| |
| TTCACCGCAAGGAAGGCA | 25 |
| |
| CTCTGTTCCAGGTGTCTA | 26 |
| |
| CTGGTGGCAGTGTGTCTC | 27 |
| |
| TGGGGTCGCAGTATTTGT | 28 |
| |
| GGTTGGGGTCGCAGTATT | 29 |
| |
| CTAGGTTGGGGTCGCAGT | 30 |
| |
| GGTGCCCTTCTGCTGGAC | 31 |
| |
| CTGAGGTGCCCTTCTGCT | 32 |
| |
| GTGTCTGTTTCTGAGGTG | 33 |
| |
| TGGTGTCTGTTTCTGAGG | 34 |
| |
| ACAGGTGCAGATGGTGTC | 35 |
| |
| TTCACAGGTGCAGATGGT | 36 |
| |
| GTGCCAGCCTTCTTCACA | 37 |
| |
| TACAGTGCCAGCCTTCTT | 38 |
| |
| GGACACAGCTCTCACAGG | 39 |
| |
| TGCAGGACACAGCTCTCA | 40 |
| |
| GAGCGGTGCAGGACACAG | 41 |
| |
| AAGCCGGGCGAGCATGAG | 42 |
| |
| AATCTGCTTGACCCCAAA | 43 |
| |
| GAAACCCCTGTAGCAATC | 44 |
| |
| GTATCAGAAACCCCTGTA | 45 |
| |
| GCTCGCAGATGGTATCAG | 46 |
| |
| GCAGGGCTCGCAGATGGT | 47 |
| |
| TGGGCAGGGCTCGCAGAT | 48 |
| |
| GACTGGGCAGGGCTCGCA | 49 |
| |
| CATTGGAGAAGAAGCCGA | 50 |
| |
| GATGACACATTGGAGAAG | 51 |
| |
| GCAGATGACACATTGGAG | 52 |
| |
| TCGAAAGCAGATGACACA | 53 |
| |
| GTCCAAGGGTGACATTTT | 54 |
| |
| CACAGCTTGTCCAAGGGT | 55 |
| |
| TTGGTCTCACAGCTTGTC | 56 |
| |
| CAGGTCTTTGGTCTCACA | 57 |
| |
| CTGTTGCACAACCAGGTC | 58 |
| |
| GTTTGTGCCTGCCTGTTG | 59 |
| |
| GTCTTGTTTGTGCCTGCC | 60 |
| |
| CCACAGACAACATCAGTC | 61 |
| |
| CTGGGGACCACAGACAAC | 62 |
| |
| TCAGCCGATCCTGGGGAC | 63 |
| |
| CACCACCAGGGCTCTCAG | 64 |
| |
| GGGATCACCACCAGGGCT | 65 |
| |
| GAGGATGGCAAACAGGAT | 66 |
| |
| ACCAGCACCAAGAGGATG | 67 |
| |
| TTTTGATAAAGACCAGCA | 68 |
| |
| TATTGGTTGGCTTCTTGG | 69 |
| |
| GGGTTCCTGCTTGGGGTG | 70 |
| |
| GTCGGGAAAATTGATCTC | 71 |
| |
| GATCGTCGGGAAAATTGA | 72 |
| |
| GGAGCCAGGAAGATCGTC | 73 |
| |
| TGGAGCCAGGAAGATCGT | 74 |
| |
| TGGAGCAGCAGTGTTGGA | 75 |
| |
| GTAAAGTCTCCTGCACTG | 76 |
| |
| TGGCATCCATGTAAAGTC | 77 |
| |
| CGGTTGGCATCCATGTAA | 78 |
| |
| CTCTTTGCCATCCTCCTG | 79 |
| |
| CTGTCTCTCCTGCACTGA | 80 |
| |
| GGTGCAGCCTCACTGTCT | 81 |
| |
| AACTGCCTGTTTGCCCAC | 82 |
| |
| CTTCTGCCTGCACCCCTG | 83 |
| |
| ACTGACTGGGCATAGCTC | 84 |
| |
The sequences shown in Table 6-a above and Table 6-b below are in a 5′ to 3′ direction. This is reversed with respect to the 3′ to 5′ direction shown in the .seq files of Example 3. For synthesis purposes, the .seq files are generated reading from 3′ to 5′. This allows for aligning all of the 3′ most ‘A’ nucleosides together, all of the 3′ most ‘G’ nucleosides together, all of the 3′ most ‘C’ nucleosides together and all of the 3′ most ‘T’ nucleosides together. Thus, when the first nucleoside of each particular oligonucleotide (attached to the solid support) is added to the wells on the plates, machine movement is reduced since an automatic pipette can move in a linear manner down one row and up another on the 96 well plate.[0240]
The location of the well holding each particular oligonucleotide is indicated by row and column. There are eight rows designated A to H and twelve columns designated 1 to 12 in a typical 96 well format plate. Any particular well location is indicated by its ‘Well No.’ which is indicated by the combination of the row and the column, e.g. A08 is the well at row A, column 8.[0241]
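By way of illustration only, the grouping by 3′ most nucleoside and the row and column well naming described above might be sketched as follows. The function names, the A, C, G, T grouping order and the tie-break on SEQ ID NO. are assumptions of this sketch and are not taken from the synthesizer software.

```python
from string import ascii_uppercase

# Grouping order for the 3'-terminal nucleoside. A, C, G, T is an assumption
# of this sketch; any fixed order keeps like nucleosides in adjacent wells.
BASE_ORDER = "ACGT"


def assign_wells(oligos, columns=12):
    """Map (seq_id, 5'->3' sequence) pairs to well names such as 'A01'."""
    # Sort first by the 3'-most nucleoside, then by SEQ ID NO. (assumed tie-break).
    ranked = sorted(oligos, key=lambda o: (BASE_ORDER.index(o[1][-1]), o[0]))
    plate = {}
    for index, (seq_id, sequence) in enumerate(ranked):
        row, column = divmod(index, columns)
        plate[f"{ascii_uppercase[row]}{column + 1:02d}"] = (seq_id, sequence)
    return plate


def seq_file_order(sequence):
    """Return the sequence read 3'->5', the direction used in the .seq files."""
    return sequence[::-1]


# Five (SEQ ID NO., 5'->3' sequence) pairs from the CD40 oligonucleotide set.
oligos = [
    (1, "CCAGGCGGCAGGACCACT"),
    (2, "GACCAGGCGGCAGGACCA"),
    (3, "AGGTGAGACCAGGCGGCA"),
    (10, "TCTGGATGGACAGCGGTC"),
    (12, "GTGGGTGGTTCTGGATGG"),
]
plate = assign_wells(oligos)
for well, (seq_id, sequence) in plate.items():
    print(well, seq_id, seq_file_order(sequence))
```

With only five entries the wells run A01 to A05; a full set of 84 oligonucleotides would fill seven rows of twelve wells in the same manner.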
In Table 6-b below, the oligonucleotides of Table 6-a are shown reordered according to the Well No. on their synthesis plate. The order shown in Table 6-b is the actual order as synthesized on an automated synthesizer, taking advantage of the preferred placement of the first nucleoside according to the above alignment criteria.
[0242]| TABLE 6-b |
| Sequences of Oligonucleotides Targeted to CD40, Ordered by Synthesis Well No. |
| Well No. | Sequence | SEQ ID NO. |
| A01 | GACCAGGCGGCAGGACCA | 2 |
| |
| A02 | AGGTGAGACCAGGCGGCA | 3 |
| |
| A03 | GCAGAGGCAGACGAACCA | 5 |
| |
| A04 | GCAAGCAGCCCCAGAGGA | 6 |
| |
| A05 | GGTCAGCAAGCAGCCCCA | 7 |
| |
| A06 | GACAGCGGTCAGCAAGCA | 8 |
| |
| A07 | GATGGACAGCGGTCAGCA | 9 |
| |
| A08 | GGTGGTTCTGGATGGACA | 11 |
| |
| A09 | GCAGTGGGTGGTTCTGGA | 13 |
| |
| A10 | CACAAAGAACAGCACTGA | 14 |
| |
| A11 | CTGGCACAAAGAACAGCA | 15 |
| |
| A12 | TCCTGGCTGGCACAAAGA | 16 |
| |
| B01 | CTGTCCTGGCTGGCACAA | 17 |
| |
| B02 | ACTCTGTGCAGTCACTCA | 21 |
| |
| B03 | TTCACCGCAAGGAAGGCA | 25 |
| |
| B04 | CTCTGTTCCAGGTGTCTA | 26 |
| |
| B05 | GTGCCAGCCTTCTTCACA | 37 |
| |
| B06 | TGCAGGACACAGCTCTCA | 40 |
| |
| B07 | AATCTGCTTGACCCCAAA | 43 |
| |
| B08 | GTATCAGAAACCCCTGTA | 45 |
| |
| B09 | GACTGGGCAGGGCTCGCA | 49 |
| |
| B10 | CATTGGAGAAGAAGCCGA | 50 |
| |
| B11 | TCGAAAGCAGATGACACA | 53 |
| |
| B12 | CAGGTCTTTGGTCTCACA | 57 |
| |
| C01 | TTTTGATAAAGACCAGCA | 68 |
| |
| C02 | GATCGTCGGGAAAATTGA | 72 |
| |
| C03 | TGGAGCAGCAGTGTTGGA | 75 |
| |
| C04 | CGGTTGGCATCCATGTAA | 78 |
| |
| C05 | CTGTCTCTCCTGCACTGA | 80 |
| |
| C06 | TCTGGATGGACAGCGGTC | 10 |
| |
| C07 | CTGGTGGCAGTGTGTCTC | 27 |
| |
| C08 | GGTGCCCTTCTGCTGGAC | 31 |
| |
| C09 | ACAGGTGCAGATGGTGTC | 35 |
| |
| C10 | GAAACCCCTGTAGCAATC | 44 |
| |
| C11 | TTGGTCTCACAGCTTGTC | 56 |
| |
| C12 | CTGTTGCACAACCAGGTC | 58 |
| |
| D01 | GTCTTGTTTGTGCCTGCC | 60 |
| |
| D02 | CCACAGACAACATCAGTC | 61 |
| |
| D03 | CTGGGGACCACAGACAAC | 62 |
| |
| D04 | TCAGCCGATCCTGGGGAC | 63 |
| |
| D05 | GTCGGGAAAATTGATCTC | 71 |
| |
| D06 | GGAGCCAGGAAGATCGTC | 73 |
| |
| D07 | TGGCATCCATGTAAAGTC | 77 |
| |
| D08 | AACTGCCTGTTTGCCCAC | 82 |
| |
| D09 | ACTGACTGGGCATAGCTC | 84 |
| |
| D10 | GTGGGTGGTTCTGGATGG | 12 |
| |
| D11 | GAAGGCATTCCGTTTCAG | 24 |
| |
| D12 | GTGTCTGTTTCTGAGGTG | 33 |
| |
| E01 | TGGTGTCTGTTTCTGAGG | 34 |
| |
| E02 | GGACACAGCTCTCACAGG | 39 |
| |
| E03 | GAGCGGTGCAGGACACAG | 41 |
| |
| E04 | AAGCCGGGCGAGCATGAG | 42 |
| |
| E05 | GCTCGCAGATGGTATCAG | 46 |
| |
| E06 | GATGACACATTGGAGAAG | 51 |
| |
| E07 | GCAGATGACACATTGGAG | 52 |
| |
| E08 | GTTTGTGCCTGCCTGTTG | 59 |
| |
| E09 | CACCACCAGGGCTCTCAG | 64 |
| |
| E10 | ACCAGCACCAAGAGGATG | 67 |
| |
| E11 | TATTGGTTGGCTTCTTGG | 69 |
| |
| E12 | GGGTTCCTGCTTGGGGTG | 70 |
| |
| F01 | GTAAAGTCTCCTGCACTG | 76 |
| |
| F02 | CTCTTTGCCATCCTCCTG | 79 |
| |
| F03 | CTTCTGCCTGCACCCCTG | 83 |
| |
| F04 | CCAGGCGGCAGGACCACT | 1 |
| |
| F05 | CAGAGGCAGACGAACCAT | 4 |
| |
| F06 | CTCACCAGTTTCTGTCCT | 18 |
| |
| F07 | TCACTCACCAGTTTCTGT | 19 |
| |
| F08 | GTGCAGTCACTCACCAGT | 20 |
| |
| F09 | CAGTGAACTCTGTGCAGT | 22 |
| |
| F10 | ATTCCGTTTCAGTGAACT | 23 |
| |
| F11 | TGGGGTCGCAGTATTTGT | 28 |
| |
| F12 | GGTTGGGGTCGCAGTATT | 29 |
| |
| G01 | CTAGGTTGGGGTCGCAGT | 30 |
| |
| G02 | CTGAGGTGCCCTTCTGCT | 32 |
| |
| G03 | TTCACAGGTGCAGATGGT | 36 |
| |
| G04 | TACAGTGCCAGCCTTCTT | 38 |
| |
| G05 | GCAGGGCTCGCAGATGGT | 47 |
| |
| G06 | TGGGCAGGGCTCGCAGAT | 48 |
| |
| G07 | GTCCAAGGGTGACATTTT | 54 |
| |
| G08 | CACAGCTTGTCCAAGGGT | 55 |
| |
| G09 | GGGATCACCACCAGGGCT | 65 |
| |
| G10 | GAGGATGGCAAACAGGAT | 66 |
| |
| G11 | TGGAGCCAGGAAGATCGT | 74 |
| |
| G12 | GGTGCAGCCTCACTGTCT | 81 |
| |
Example 9
Oligonucleotide Analysis
A. Oligonucleotide Analysis-96 Well Plate Format[0243]
The concentration of oligonucleotide in each well was assessed by dilution of samples and UV absorption spectroscopy. The full-length integrity of the individual products was evaluated by capillary electrophoresis (CE) in either the 96 well format (Beckman MDQ) or, for individually prepared samples, on a commercial CE apparatus (e.g., Beckman 5000, ABI 270). Base and backbone composition was confirmed by mass analysis of the compounds utilizing electrospray-mass spectroscopy. All assay test plates were diluted from the master plate using single and multi-channel robotic pipettors.[0244]
B. Alternative Oligonucleotide Analysis[0245]
After cleavage from the controlled pore glass support (Applied Biosystems) and deblocking in concentrated ammonium hydroxide at 55° C. for 18 hours, the oligonucleotides or oligonucleosides are purified by precipitation twice out of 0.5 M NaCl with 2.5 volumes ethanol. Synthesized oligonucleotides are analyzed by polyacrylamide gel electrophoresis on denaturing gels. Oligonucleotide purity is checked by ³¹P nuclear magnetic resonance spectroscopy, and/or by HPLC, as described by Chiang et al., J. Biol. Chem., 1991, 266, 18162.[0246]
Example 10
Automated Assay of CD40 Oligonucleotides
A. Poly(A)+mRNA isolation[0247]
Poly(A)+mRNA was isolated according to Miura et al. (Clin. Chem., 1996, 42, 1758). Briefly, for cells grown on 96-well plates, growth medium was removed from the cells and each well was washed with 200 μl cold PBS. 60 μl lysis buffer (10 mM Tris-HCl, pH 7.6, 1 mM EDTA, 0.5 M NaCl, 0.5% NP-40, 20 mM vanadyl-ribonucleoside complex) was added to each well, the plate was gently agitated and then incubated at room temperature for five minutes. 55 μl of lysate was transferred to Oligo d(T) coated 96 well plates (AGCT Inc., Irvine, Calif.). Plates were incubated for 60 minutes at room temperature and washed 3 times with 200 μl of wash buffer (10 mM Tris-HCl pH 7.6, 1 mM EDTA, 0.3 M NaCl). After the final wash, the plate was blotted on paper towels to remove excess wash buffer and then air-dried for 5 minutes. 60 μl of elution buffer (5 mM Tris-HCl pH 7.6), preheated to 70° C., was added to each well, the plate was incubated on a 90° C. plate for 5 minutes, and the eluate was then transferred to a fresh 96-well plate. Cells grown on 100 mm or other standard plates may be treated similarly, using appropriate volumes of all solutions.[0248]
B. RT-PCR Analysis of CD40 mRNA Levels[0249]
Quantitation of CD40 mRNA levels was determined by reverse transcriptase polymerase chain reaction (RT-PCR) using the ABI PRISM™ 7700 Sequence Detection System (PE-Applied Biosystems, Foster City, Calif.) according to manufacturer's instructions. This is a closed-tube, non-gel-based, fluorescence detection system which allows high-throughput quantitation of polymerase chain reaction (PCR) products in real-time.[0250]
As opposed to standard PCR, in which amplification products are quantitated after the PCR is completed, products in RT-PCR are quantitated as they accumulate. This is accomplished by including in the PCR reaction an oligonucleotide probe that anneals specifically between the forward and reverse PCR primers, and contains two fluorescent dyes. A reporter dye (e.g., JOE or FAM, PE-Applied Biosystems, Foster City, Calif.) is attached to the 5′ end of the probe and a quencher dye (e.g., TAMRA, PE-Applied Biosystems, Foster City, Calif.) is attached to the 3′ end of the probe. When the probe and dyes are intact, reporter dye emission is quenched by the proximity of the 3′ quencher dye. During amplification, annealing of the probe to the target sequence creates a substrate that can be cleaved by the 5′-exonuclease activity of Taq polymerase. During the extension phase of the PCR amplification cycle, cleavage of the probe by Taq polymerase releases the reporter dye from the remainder of the probe (and hence from the quencher moiety) and a sequence-specific fluorescent signal is generated.[0251]
With each cycle, additional reporter dye molecules are cleaved from their respective probes, and the fluorescence intensity is monitored at regular (six-second) intervals by laser optics built into the ABI PRISM™ 7700 Sequence Detection System. In each assay, a series of parallel reactions containing serial dilutions of mRNA from untreated control samples generates a standard curve that is used to quantitate the percent inhibition after antisense oligonucleotide treatment of test samples.[0252]
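By way of illustration only, the use of a standard curve to quantitate percent inhibition might be sketched as follows. The source does not state which instrument readout is fitted; this sketch assumes a threshold cycle (Ct) value per well and a straight-line fit of Ct against the logarithm of the relative mRNA amount. The function names and all numerical values are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_inhibition(ct_sample, dilutions, ct_standards):
    """Read a treated sample off the untreated-control standard curve."""
    slope, intercept = fit_line([math.log10(d) for d in dilutions], ct_standards)
    # Relative mRNA amount of the sample, with undiluted control taken as 1.0.
    relative = 10 ** ((ct_sample - intercept) / slope)
    # Samples reading above the control are reported as no inhibition.
    return max(0.0, 100.0 * (1.0 - relative))


# Hypothetical two-fold serial dilutions of untreated control mRNA and their Ct values.
dilutions = [1.0, 0.5, 0.25, 0.125]
ct_standards = [20.0, 21.0, 22.0, 23.0]
print(round(percent_inhibition(22.0, dilutions, ct_standards), 2))
```

In this idealized data set each two-fold dilution costs exactly one cycle, so a treated sample read at two cycles later than the undiluted control corresponds to one quarter of the control mRNA level.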
RT-PCR reagents were obtained from PE-Applied Biosystems, Foster City, Calif. RT-PCR reactions were carried out by adding 25 μl PCR cocktail (1× TaqMan™ buffer A, 5.5 mM MgCl2, 300 μM each of dATP, dCTP and dGTP, 600 μM of dUTP, 100 nM each of forward primer, reverse primer, and probe, 20 U RNAse inhibitor, 1.25 units AmpliTaq Gold™, and 12.5 U MuLV reverse transcriptase) to 96 well plates containing 25 μl poly(A) mRNA solution. The RT reaction was carried out by incubation for 30 minutes at 48° C. Following a 10 minute incubation at 95° C. to activate the AmpliTaq Gold™, 40 cycles of a two-step PCR protocol were carried out: 95° C. for 15 seconds (denaturation) followed by 60° C. for 1.5 minutes (annealing/extension).[0253]
For CD40, the PCR primers were: forward primer:[0254]
(SEQ ID NO:86) reverse primer:[0255]
(SEQ ID NO:87), and the PCR probe was: FAM-TTCCTTGCGGTGAAAGCGAATTCCT-TAMRA[0256]
(SEQ ID NO:88) where FAM (PE-Applied Biosystems, Foster City, Calif.) is the fluorescent reporter dye and TAMRA (PE-Applied Biosystems, Foster City, Calif.) is the quencher dye.[0257]
For GAPDH the PCR primers were:[0258]
forward primer: GAAGGTGAAGGTCGGAGTC (SEQ ID NO:89)[0259]
reverse primer: GAAGATGGTGATGGGATTTC (SEQ ID NO:90), and the PCR probe was: 5′ JOE-CAAGCTTCCCGTTCTCAGCC-TAMRA 3′ (SEQ ID NO:91) where JOE (PE-Applied Biosystems, Foster City, Calif.) is the fluorescent reporter dye and TAMRA (PE-Applied Biosystems, Foster City, Calif.) is the quencher dye.[0260]
Example 11
Inhibition of CD40 Expression by Phosphorothioate Oligodeoxynucleotides
In accordance with the present invention, a series of oligonucleotides complementary to mRNA was designed to target different regions of the human CD40 mRNA, using published sequences (GenBank accession number X60592, incorporated herein as SEQ ID NO:85). The oligonucleotides are shown in Table 7. Target sites are indicated by the beginning nucleotide numbers, as given in the sequence source reference (X60592), to which the oligonucleotide binds. All compounds in Table 7 are oligodeoxynucleotides with phosphorothioate backbones (internucleoside linkages) throughout. Data are averages from three experiments.
[0261]| TABLE 7 |
| Inhibition of CD40 mRNA Levels by Phosphorothioate Oligodeoxynucleotides |
| ISIS# | TARGET SITE | SEQUENCE | % INHIB. | SEQ ID NO. |
| 18623 | 18 | CCAGGCGGCAGGACCACT | 30.71 | 1 |
|
| 18624 | 20 | GACCAGGCGGCAGGACCA | 28.09 | 2 |
|
| 18625 | 26 | AGGTGAGACCAGGCGGCA | 21.89 | 3 |
|
| 18626 | 48 | CAGAGGCAGACGAACCAT | 0.00 | 4 |
|
| 18627 | 49 | GCAGAGGCAGACGAACCA | 0.00 | 5 |
|
| 18628 | 73 | GCAAGCAGCCCCAGAGGA | 0.00 | 6 |
|
| 18629 | 78 | GGTCAGCAAGCAGCCCCA | 29.96 | 7 |
|
| 18630 | 84 | GACAGCGGTCAGCAAGCA | 0.00 | 8 |
|
| 18631 | 88 | GATGGACAGCGGTCAGCA | 0.00 | 9 |
|
| 18632 | 92 | TCTGGATGGACAGCGGTC | 0.00 | 10 |
|
| 18633 | 98 | GGTGGTTCTGGATGGACA | 0.00 | 11 |
|
| 18634 | 101 | GTGGGTGGTTCTGGATGG | 0.00 | 12 |
|
| 18635 | 104 | GCAGTGGGTGGTTCTGGA | 0.00 | 13 |
|
| 18636 | 152 | CACAAAGAACAGCACTGA | 0.00 | 14 |
|
| 18637 | 156 | CTGGCACAAAGAACAGCA | 0.00 | 15 |
|
| 18638 | 162 | TCCTGGCTGGCACAAAGA | 0.00 | 16 |
|
| 18639 | 165 | CTGTCCTGGCTGGCACAA | 4.99 | 17 |
|
| 18640 | 176 | CTCACCAGTTTCTGTCCT | 0.00 | 18 |
|
| 18641 | 179 | TCACTCACCAGTTTCTGT | 0.00 | 19 |
|
| 18642 | 185 | GTGCAGTCACTCACCAGT | 0.00 | 20 |
|
| 18643 | 190 | ACTCTGTGCAGTCACTCA | 0.00 | 21 |
|
| 18644 | 196 | CAGTGAACTCTGTGCAGT | 5.30 | 22 |
|
| 18645 | 205 | ATTCCGTTTCAGTGAACT | 0.00 | 23 |
|
| 18646 | 211 | GAAGGCATTCCGTTTCAG | 9.00 | 24 |
|
| 18647 | 222 | TTCACCGCAAGGAAGGCA | 0.00 | 25 |
|
| 18648 | 250 | CTCTGTTCCAGGTGTCTA | 0.00 | 26 |
|
| 18649 | 267 | CTGGTGGCAGTGTGTCTC | 0.00 | 27 |
|
| 18650 | 286 | TGGGGTCGCAGTATTTGT | 0.00 | 28 |
|
| 18651 | 289 | GGTTGGGGTCGCAGTATT | 0.00 | 29 |
|
| 18652 | 292 | CTAGGTTGGGGTCGCAGT | 0.00 | 30 |
|
| 18653 | 318 | GGTGCCCTTCTGCTGGAC | 19.67 | 31 |
|
| 18654 | 322 | CTGAGGTGCCCTTCTGCT | 15.63 | 32 |
|
| 18655 | 332 | GTGTCTGTTTCTGAGGTG | 0.00 | 33 |
|
| 18656 | 334 | TGGTGTCTGTTTCTGAGG | 0.00 | 34 |
|
| 18657 | 345 | ACAGGTGCAGATGGTGTC | 0.00 | 35 |
|
| 18658 | 348 | TTCACAGGTGCAGATGGT | 0.00 | 36 |
|
| 18659 | 360 | GTGCCAGCCTTCTTCACA | 5.67 | 37 |
|
| 18660 | 364 | TACAGTGCCAGCCTTCTT | 7.80 | 38 |
|
| 18661 | 391 | GGACACAGCTCTCACAGG | 0.00 | 39 |
|
| 18662 | 395 | TGCAGGACACAGCTCTCA | 0.00 | 40 |
|
| 18663 | 401 | GAGCGGTGCAGGACACAG | 0.00 | 41 |
|
| 18664 | 416 | AAGCCGGGCGAGCATGAG | 0.00 | 42 |
|
| 18665 | 432 | AATCTGCTTGACCCCAAA | 5.59 | 43 |
|
| 18666 | 446 | GAAACCCCTGTAGCAATC | 0.10 | 44 |
|
| 18667 | 452 | GTATCAGAAACCCCTGTA | 0.00 | 45 |
|
| 18668 | 463 | GCTCGCAGATGGTATCAG | 0.00 | 46 |
|
| 18669 | 468 | GCAGGGCTCGCAGATGGT | 34.05 | 47 |
|
| 18670 | 471 | TGGGCAGGGCTCGCAGAT | 0.00 | 48 |
|
| 18671 | 474 | GACTGGGCAGGGCTCGCA | 2.71 | 49 |
|
| 18672 | 490 | CATTGGAGAAGAAGCCGA | 0.00 | 50 |
|
| 18673 | 497 | GATGACACATTGGAGAAG | 0.00 | 51 |
|
| 18674 | 500 | GCAGATGACACATTGGAG | 0.00 | 52 |
|
| 18675 | 506 | TCGAAAGCAGATGACACA | 0.00 | 53 |
|
| 18676 | 524 | GTCCAAGGGTGACATTTT | 8.01 | 54 |
|
| 18677 | 532 | CACAGCTTGTCCAAGGGT | 0.00 | 55 |
|
| 18678 | 539 | TTGGTCTCACAGCTTGTC | 0.00 | 56 |
|
| 18679 | 546 | CAGGTCTTTGGTCTCACA | 6.98 | 57 |
|
| 18680 | 558 | CTGTTGCACAACCAGGTC | 18.76 | 58 |
|
| 18681 | 570 | GTTTGTGCCTGCCTGTTG | 2.43 | 59 |
|
| 18682 | 575 | GTCTTGTTTGTGCCTGCC | 0.00 | 60 |
|
| 18683 | 590 | CCACAGACAACATCAGTC | 0.00 | 61 |
|
| 18684 | 597 | CTGGGGACCACAGACAAC | 0.00 | 62 |
|
| 18685 | 607 | TCAGCCGATCCTGGGGAC | 0.00 | 63 |
|
| 18686 | 621 | CACCACCAGGGCTCTCAG | 23.31 | 64 |
|
| 18687 | 626 | GGGATCACCACCAGGGCT | 0.00 | 65 |
|
| 18688 | 657 | GAGGATGGCAAACAGGAT | 0.00 | 66 |
|
| 18689 | 668 | ACCAGCACCAAGAGGATG | 0.00 | 67 |
|
| 18690 | 679 | TTTTGATAAAGACCAGCA | 0.00 | 68 |
|
| 18691 | 703 | TATTGGTTGGCTTCTTGG | 0.00 | 69 |
|
| 18692 | 729 | GGGTTCCTGCTTGGGGTG | 0.00 | 70 |
|
| 18693 | 750 | GTCGGGAAAATTGATCTC | 0.00 | 71 |
|
| 18694 | 754 | GATCGTCGGGAAAATTGA | 0.00 | 72 |
|
| 18695 | 765 | GGAGCCAGGAAGATCGTC | 0.00 | 73 |
|
| 18696 | 766 | TGGAGCCAGGAAGATCGT | 0.00 | 74 |
|
| 18697 | 780 | TGGAGCAGCAGTGTTGGA | 0.00 | 75 |
|
| 18698 | 796 | GTAAAGTCTCCTGCACTG | 0.00 | 76 |
|
| 18699 | 806 | TGGCATCCATGTAAAGTC | 0.00 | 77 |
|
| 18700 | 810 | CGGTTGGCATCCATGTAA | 0.00 | 78 |
|
| 18701 | 834 | CTCTTTGCCATCCTCCTG | 4.38 | 79 |
|
| 18702 | 861 | CTGTCTCTCCTGCACTGA | 0.00 | 80 |
|
| 18703 | 873 | GGTGCAGCCTCACTGTCT | 0.00 | 81 |
|
| 18704 | 910 | AACTGCCTGTTTGCCCAC | 33.89 | 82 |
|
| 18705 | 954 | CTTCTGCCTGCACCCCTG | 0.00 | 83 |
|
| 18706 | 976 | ACTGACTGGGCATAGCTC | 0.00 | 84 |
|
As shown in Table 7, SEQ ID NOS:1, 2, 7, 47 and 82 demonstrated at least 25% inhibition of CD40 expression and are therefore preferred compounds of the invention.[0262]
Example 12
Inhibition of CD40 Expression by Phosphorothioate 2′-MOE Gapmer Oligonucleotides
In accordance with the present invention, a second series of oligonucleotides complementary to mRNA was designed to target different regions of the human CD40 mRNA, using published sequence X60592. The oligonucleotides are shown in Table 8. Target sites are indicated by the beginning or initial nucleotide numbers, as given in the sequence source reference (X60592), to which the oligonucleotide binds.[0263]
All compounds in Table 8 are chimeric oligonucleotides (‘gapmers’) 18 nucleotides in length, composed of a central ‘gap’ region consisting of ten 2′-deoxynucleotides, which is flanked on both sides (5′ and 3′ directions) by four-nucleotide ‘wings.’ The wings are composed of 2′-methoxyethyl (2′-MOE) nucleotides. The intersugar (backbone) linkages are phosphorothioate (P═S) throughout the oligonucleotide. Cytidine residues in the 2′-MOE wings are 5-methylcytidines. Data are averaged from three experiments.
[0264]| TABLE 8 |
| Inhibition of CD40 mRNA Levels by Chimeric Phosphorothioate Oligonucleotides |
| ISIS# | TARGET SITE | SEQUENCE | % Inhibition | SEQ ID NO. |
| 19211 | 18 | CCAGGCGGCAGGACCACT | 75.71 | 1 |
| 19212 | 20 | GACCAGGCGGCAGGACCA | 77.23 | 2 |
| 19213 | 26 | AGGTGAGACCAGGCGGCA | 80.82 | 3 |
| 19214 | 48 | CAGAGGCAGACGAACCAT | 23.68 | 4 |
| 19215 | 49 | GCAGAGGCAGACGAACCA | 45.97 | 5 |
| 19216 | 73 | GCAAGCAGCCCCAGAGGA | 65.80 | 6 |
| 19217 | 78 | GGTCAGCAAGCAGCCCCA | 74.73 | 7 |
| 19218 | 84 | GACAGCGGTCAGCAAGCA | 67.21 | 8 |
| 19219 | 88 | GATGGACAGCGGTCAGCA | 65.14 | 9 |
| 19220 | 92 | TCTGGATGGACAGCGGTC | 78.71 | 10 |
| 19221 | 98 | GGTGGTTCTGGATGGACA | 81.33 | 11 |
| 19222 | 101 | GTGGGTGGTTCTGGATGG | 57.79 | 12 |
| 19223 | 104 | GCAGTGGGTGGTTCTGGA | 73.70 | 13 |
| 19224 | 152 | CACAAAGAACAGCACTGA | 40.25 | 14 |
| 19225 | 156 | CTGGCACAAAGAACAGCA | 60.11 | 15 |
| 19226 | 162 | TCCTGGCTGGCACAAAGA | 10.18 | 16 |
| 19227 | 165 | CTGTCCTGGCTGGCACAA | 24.37 | 17 |
| 19228 | 176 | CTCACCAGTTTCTGTCCT | 22.30 | 18 |
| 19229 | 179 | TCACTCACCAGTTTCTGT | 40.64 | 19 |
| 19230 | 185 | GTGCAGTCACTCACCAGT | 82.04 | 20 |
| 19231 | 190 | ACTCTGTGCAGTCACTCA | 37.59 | 21 |
| 19232 | 196 | CAGTGAACTCTGTGCAGT | 40.26 | 22 |
| 19233 | 205 | ATTCCGTTTCAGTGAACT | 56.03 | 23 |
| 19234 | 211 | GAAGGCATTCCGTTTCAG | 32.21 | 24 |
| 19235 | 222 | TTCACCGCAAGGAAGGCA | 61.03 | 25 |
| 19236 | 250 | CTCTGTTCCAGGTGTCTA | 62.19 | 26 |
| 19237 | 267 | CTGGTGGCAGTGTGTCTC | 70.32 | 27 |
| 19238 | 286 | TGGGGTCGCAGTATTTGT | 0.00 | 28 |
| 19239 | 289 | GGTTGGGGTCGCAGTATT | 19.40 | 29 |
| 19240 | 292 | CTAGGTTGGGGTCGCAGT | 36.32 | 30 |
| 19241 | 318 | GGTGCCCTTCTGCTGGAC | 78.91 | 31 |
| 19242 | 322 | CTGAGGTGCCCTTCTGCT | 69.84 | 32 |
| 19243 | 332 | GTGTCTGTTTCTGAGGTG | 63.32 | 33 |
| 19244 | 334 | TGGTGTCTGTTTCTGAGG | 42.83 | 34 |
| 19245 | 345 | ACAGGTGCAGATGGTGTC | 73.31 | 35 |
| 19246 | 348 | TTCACAGGTGCAGATGGT | 47.72 | 36 |
| 19247 | 360 | GTGCCAGCCTTCTTCACA | 61.32 | 37 |
| 19248 | 364 | TACAGTGCCAGCCTTCTT | 46.82 | 38 |
| 19249 | 391 | GGACACAGCTCTCACAGG | 0.00 | 39 |
| 19250 | 395 | TGCAGGACACAGCTCTCA | 52.05 | 40 |
| 19251 | 401 | GAGCGGTGCAGGACACAG | 50.15 | 41 |
| 19252 | 416 | AAGCCGGGCGAGCATGAG | 32.36 | 42 |
| 19253 | 432 | AATCTGCTTGACCCCAAA | 0.00 | 43 |
| 19254 | 446 | GAAACCCCTGTAGCAATC | 0.00 | 44 |
| 19255 | 452 | GTATCAGAAACCCCTGTA | 36.13 | 45 |
| 19256 | 463 | GCTCGCAGATGGTATCAG | 64.65 | 46 |
| 19257 | 468 | GCAGGGCTCGCAGATGGT | 74.95 | 47 |
| 19258 | 471 | TGGGCAGGGCTCGCAGAT | 0.00 | 48 |
| 19259 | 474 | GACTGGGCAGGGCTCGCA | 82.00 | 49 |
| 19260 | 490 | CATTGGAGAAGAAGCCGA | 41.31 | 50 |
| 19261 | 497 | GATGACACATTGGAGAAG | 13.81 | 51 |
| 19262 | 500 | GCAGATGACACATTGGAG | 78.48 | 52 |
| 19263 | 506 | TCGAAAGCAGATGACACA | 59.28 | 53 |
| 19264 | 524 | GTCCAAGGGTGACATTTT | 70.99 | 54 |
| 19265 | 532 | CACAGCTTGTCCAAGGGT | 0.00 | 55 |
| 19266 | 539 | TTGGTCTCACAGCTTGTC | 45.92 | 56 |
| 19267 | 546 | CAGGTCTTTGGTCTCACA | 63.95 | 57 |
| 19268 | 558 | CTGTTGCACAACCAGGTC | 82.32 | 58 |
| 19269 | 570 | GTTTGTGCCTGCCTGTTG | 70.10 | 59 |
| 19270 | 575 | GTCTTGTTTGTGCCTGCC | 68.95 | 60 |
| 19271 | 590 | CCACAGACAACATCAGTC | 11.22 | 61 |
| 19272 | 597 | CTGGGGACCACAGACAAC | 9.04 | 62 |
| 19273 | 607 | TCAGCCGATCCTGGGGAC | 0.00 | 63 |
| 19274 | 621 | CACCACCAGGGCTCTCAG | 23.08 | 64 |
| 19275 | 626 | GGGATCACCACCAGGGCT | 57.94 | 65 |
| 19276 | 657 | GAGGATGGCAAACAGGAT | 49.14 | 66 |
| 19277 | 668 | ACCAGCACCAAGAGGATG | 3.48 | 67 |
| 19278 | 679 | TTTTGATAAAGACCAGCA | 30.58 | 68 |
| 19279 | 703 | TATTGGTTGGCTTCTTGG | 49.26 | 69 |
| 19280 | 729 | GGGTTCCTGCTTGGGGTG | 13.95 | 70 |
| 19281 | 750 | GTCGGGAAAATTGATCTC | 54.78 | 71 |
| 19282 | 754 | GATCGTCGGGAAAATTGA | 0.00 | 72 |
| 19283 | 765 | GGAGCCAGGAAGATCGTC | 69.47 | 73 |
| 19284 | 766 | TGGAGCCAGGAAGATCGT | 54.48 | 74 |
| 19285 | 780 | TGGAGCAGCAGTGTTGGA | 15.17 | 75 |
| 19286 | 796 | GTAAAGTCTCCTGCACTG | 30.62 | 76 |
| 19287 | 806 | TGGCATCCATGTAAAGTC | 65.03 | 77 |
| 19288 | 810 | CGGTTGGCATCCATGTAA | 34.49 | 78 |
| 19289 | 834 | CTCTTTGCCATCCTCCTG | 41.84 | 79 |
| 19290 | 861 | CTGTCTCTCCTGCACTGA | 25.68 | 80 |
| 19291 | 873 | GGTGCAGCCTCACTGTCT | 76.27 | 81 |
| 19292 | 910 | AACTGCCTGTTTGCCCAC | 63.34 | 82 |
| 19293 | 954 | CTTCTGCCTGCACCCCTG | 0.00 | 83 |
| 19294 | 976 | ACTGACTGGGCATAGCTC | 11.55 | 84 |
As shown in Table 8, SEQ ID NOS:1, 2, 3, 6, 7, 8, 9, 10, 11, 12, 13, 15, 20, 23, 25, 26, 27, 31, 32, 33, 35, 37, 40, 41, 46, 47, 49, 52, 53, 54, 57, 58, 59, 60, 65, 71, 73, 74, 77, 81 and 82 demonstrated at least 50% inhibition of CD40 expression and are therefore preferred compounds of the invention.[0265]
Example 13
Oligonucleotide-Sensitive Sites of the CD40 Target Nucleic Acid
As the data presented in the preceding two Examples show, several sequences were present in preferred compounds of two distinct oligonucleotide chemistries. Specifically, compounds having SEQ ID NOS:1, 2, 7, 47 and 82 are preferred in both instances. These compounds map to different regions of the CD40 transcript but nevertheless define accessible sites of the target nucleic acid.[0266]
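By way of illustration only, the selection of sequences preferred under both chemistries can be expressed as a threshold filter followed by a set intersection. The percent inhibition values below are copied from Tables 7 and 8 for a subset of the SEQ ID NOS; the function name and the dictionary layout are assumptions of this sketch.

```python
# SEQ ID NO. -> % inhibition, copied from Table 7 (phosphorothioate
# oligodeoxynucleotides) and Table 8 (2'-MOE gapmers); subset only.
table7 = {1: 30.71, 2: 28.09, 3: 21.89, 4: 0.00, 7: 29.96,
          47: 34.05, 58: 18.76, 82: 33.89}
table8 = {1: 75.71, 2: 77.23, 3: 80.82, 4: 23.68, 7: 74.73,
          47: 74.95, 58: 82.32, 82: 63.34}


def preferred(results, threshold):
    """Return the SEQ ID NOS whose percent inhibition meets the threshold."""
    return {seq_id for seq_id, inhibition in results.items() if inhibition >= threshold}


# Thresholds as stated in Examples 11 and 12: at least 25% and at least 50%.
preferred_in_both = sorted(preferred(table7, 25.0) & preferred(table8, 50.0))
print(preferred_in_both)
```

SEQ ID NOS:3 and 58 illustrate the filter: each exceeds 80% inhibition as a gapmer but falls below 25% as a phosphorothioate oligodeoxynucleotide, so neither appears in the intersection.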
For example, SEQ ID NOS:1 and 2 overlap each other and both map to the 5′-untranslated region (5′-UTR) of CD40. Accordingly, this region of CD40 is particularly preferred for modulation via sequence-based technologies. Similarly, SEQ ID NOS:7 and 47 map to the open reading frame of CD40, whereas SEQ ID NO:82 maps to the 3′-untranslated region (3′-UTR). Thus, the ORF and 3′-UTR of CD40 may be targeted by sequence-based technologies as well.[0267]
The reverse complements of the active CD40 compounds are easily determined by those skilled in the art and may be assembled to yield nucleotide sequences corresponding to accessible sites on the target nucleic acid. For example, the assembled reverse complement of SEQ ID NOS:1 and 2 is represented below as SEQ ID NO:92:
[0268]
5′-AGTGGTCCTGCCGCCTGGTC-3′   SEQ ID NO: 92
   ||||||||||||||||||||
3′-TCACCAGGACGGCGGACC-5′     SEQ ID NO: 1
  3′-ACCAGGACGGCGGACCAG-5′   SEQ ID NO: 2
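By way of illustration only, the assembly of reverse complements into a single accessible-site sequence might be sketched as follows. The function names are hypothetical, and the simple suffix and prefix overlap merge is an assumption of this sketch rather than a description of any particular assembly software.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a 5'->3' DNA sequence, also 5'->3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def merge_overlapping(first, second):
    """Join two sequences on the longest suffix of `first` that prefixes `second`."""
    for size in range(min(len(first), len(second)), 0, -1):
        if first.endswith(second[:size]):
            return first + second[size:]
    raise ValueError("sequences do not overlap")


seq_id_1 = "CCAGGCGGCAGGACCACT"  # SEQ ID NO:1, 5'->3'
seq_id_2 = "GACCAGGCGGCAGGACCA"  # SEQ ID NO:2, 5'->3'

footprint = merge_overlapping(reverse_complement(seq_id_1),
                              reverse_complement(seq_id_2))
print(footprint)
```

The two reverse complements share sixteen nucleotides, so the merged sequence is twenty nucleotides long.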
Through multiple iterations of the process of the invention, more extensive ‘footprints’ are generated. A library of this information is compiled and may be used by those skilled in the art in a variety of sequence-based technologies to study the molecular and biological functions of CD40 and to investigate or confirm its role in various diseases and disorders.[0269]
Example 14
Site Selection Program
In a preferred embodiment of the invention, illustrated in FIG. 20, an application is deployed which facilitates the selection process for determining the target positions of the oligos to be synthesized, or ‘sites.’ This program is written using a three-tiered object-oriented approach. All aspects of the software described, therefore, are tightly integrated with the relational database. For this reason, explicit database read and write steps are not shown. It should be assumed that each step described includes database access. The description below illustrates one way the program can be used. The actual interface allows users to skip from process to process at will, in any order.[0270]
Before running the site picking program, the target must have all relevant properties computed as described previously and indicated in process step 2204. When the site picking program is launched at process step 2206, the user is presented with a panel showing targets which have previously been selected and had their properties calculated. The user selects one target to work with at process step 2208 and proceeds to decide if any derived properties will be needed at process step 2210. Derived properties are calculated by performing mathematical operations on combinations of pre-calculated properties as defined by the user at process step 2212.[0271]
The derived properties are made available as peers with all the pre-calculated properties. The user selects one of the properties to view plotted versus target position at process step 2214. This graph is shown above a linear representation of the target. The horizontal or position axis of both the graph and target are linked and scalable by the user. The zoom range goes from showing the full target length to showing individual target bases as letters and individual property points. The user next selects a threshold value below or above which all sites will be eliminated from future consideration at process step 2216. The user decides whether to eliminate more sites based on any other properties at process step 2218. If they choose to eliminate more, they return to pick another property to display at process step 2214 and threshold at process step 2216.[0272]
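By way of illustration only, derived properties and threshold elimination might be sketched as follows. The property names (`gc_fraction`, `duplex_score`), their values and the function names are hypothetical placeholders; the source does not enumerate the properties or the mathematical operations available to the user.

```python
def derive(properties, name, func, *inputs):
    """Add a derived property computed position by position from existing ones."""
    columns = [properties[key] for key in inputs]
    properties[name] = [func(*values) for values in zip(*columns)]


def eliminate(candidates, values, threshold, drop_below=True):
    """Return the candidate positions that survive the threshold."""
    if drop_below:
        return [p for p in candidates if values[p] >= threshold]
    return [p for p in candidates if values[p] <= threshold]


# Hypothetical pre-calculated properties, one value per target position.
properties = {
    "gc_fraction": [0.30, 0.55, 0.60, 0.45, 0.70],
    "duplex_score": [1.0, 2.0, 4.0, 3.0, 1.0],
}

# A derived property becomes a peer of the pre-calculated ones.
derive(properties, "combined", lambda gc, score: gc * score,
       "gc_fraction", "duplex_score")

candidates = list(range(5))
candidates = eliminate(candidates, properties["gc_fraction"], 0.40)
candidates = eliminate(candidates, properties["combined"], 1.0)
print(candidates)
```

Each call to `eliminate` corresponds to one pass through the choose-property and choose-threshold steps; repeated passes only ever shrink the candidate list.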
After eliminating sites, the user selects from the remaining list by choosing any property at process step 2220 and then choosing a manual or automatic selection technique at process step 2222. In the automatic technique, the user decides whether they want to pick from maxima or minima and the number of maxima or minima to be selected as sites at process step 2224. The software automatically finds and picks the points. When picking manually, the user must decide if they wish to use automatic peak finding at process step 2226. If the user selects automatic peak finding, then the user must click on the graphed property with the mouse at process step 2236. The nearest maximum or minimum to the selected point, depending on the modifier key held down, will be picked as the site. Without the peak finding option, the user must pick a site at process step 2238 by clicking on its position on the linear representation of the target.[0273]
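By way of illustration only, the automatic selection of a given number of maxima or minima, and the peak finding nearest to a clicked position, might be sketched as follows. The definition of a peak used here (a value not exceeded by either immediate neighbor) and all names and values are assumptions of this sketch.

```python
def local_maxima(values, candidates):
    """Candidate positions whose value is not exceeded by either neighbor."""
    peaks = []
    for p in candidates:
        left = values[p - 1] if p > 0 else float("-inf")
        right = values[p + 1] if p < len(values) - 1 else float("-inf")
        if values[p] >= left and values[p] >= right:
            peaks.append(p)
    return peaks


def pick_automatic(values, candidates, count, minima=False):
    """Pick the `count` most extreme maxima (or minima) among the candidates."""
    sign = -1.0 if minima else 1.0
    signed = [sign * v for v in values]
    peaks = local_maxima(signed, candidates)
    best = sorted(peaks, key=lambda p: signed[p], reverse=True)[:count]
    return sorted(best)


def pick_nearest_peak(values, candidates, clicked, minima=False):
    """Pick the maximum (or minimum) closest to the clicked position."""
    sign = -1.0 if minima else 1.0
    peaks = local_maxima([sign * v for v in values], candidates)
    return min(peaks, key=lambda p: abs(p - clicked), default=None)


# Hypothetical property values along eight target positions.
values = [1.0, 3.0, 2.0, 5.0, 4.0, 0.5, 2.5, 1.5]
candidates = list(range(len(values)))

print(pick_automatic(values, candidates, count=2))
print(pick_nearest_peak(values, candidates, clicked=5))
print(pick_automatic(values, candidates, count=1, minima=True))
```

The `minima` flag plays the role of the modifier key described above: the same peak search is run on the negated values.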
Each time a site, or group of sites, is picked, a dynamic property is calculated for all possible sites (not yet eliminated) at process step 2230. This property indicates the nearness of the site to a picked site, allowing the user to pick sites in subsequent iterations based on target coverage. After new sites are picked, the user determines if the desired number of sites has been picked. If too few sites have been picked, the user returns to pick more at process step 2220. If too many sites have been picked, the user may eliminate them by selecting and deleting them on the target display at process step 2234. If the correct number of sites is picked, and the user is satisfied with the set of picked sites, the user registers these sites to the database along with their name, notebook number, and page number at process step 2238. The database time stamps this registration event.[0274]
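By way of illustration only, the dynamic nearness property might be sketched as the distance, in target positions, from each remaining site to the closest picked site. The source does not define the measure, so the absolute positional distance used here, together with the function names and positions, is an assumption.

```python
def nearness(candidates, picked):
    """Distance from each unpicked candidate to its closest picked site."""
    return {p: min(abs(p - q) for q in picked)
            for p in candidates if p not in picked}


def pick_for_coverage(candidates, picked):
    """Pick the candidate lying farthest from every site picked so far."""
    distances = nearness(candidates, picked)
    return max(distances, key=distances.get)


# Hypothetical target positions remaining after elimination.
candidates = [0, 5, 20, 25, 60]
picked = [5]

# The property is recalculated after every pick.
picked.append(pick_for_coverage(candidates, picked))
picked.append(pick_for_coverage(candidates, picked))
print(picked)
```

Picking the site with the largest distance at each iteration spreads the picked sites along the target, which is one way of using the property to improve target coverage.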
Example 15
Site Selection Program
In a preferred embodiment of the invention, illustrated in FIG. 21, an application is deployed which facilitates the assignment of specific chemical structure to the complement of the sequence of the sites previously picked and facilitates the registration and ordering of these now fully defined antisense compounds. This program is written using a three-tiered object-oriented approach. All aspects of the software described, therefore, are tightly integrated with the relational database. For this reason, explicit database read and write steps are not shown, it being understood that each step described also includes appropriate database read/write access.[0275]
To begin using the oligonucleotide chemistry assignment program, the user launches it at process step 2302. The user then selects from the previously selected sets of oligonucleotides at process step 2304, registered to the database in the site picker's process step 2238. Next, the user must decide whether to manually assign the chemistry a base at a time, or run the sites through a template at process step 2306. If the user chooses to use a template, they must determine if a desired template is available at process step 2308. If a template is not available with the desired chemistry modifications and the correct length, the user can define one at process step 2314.[0276]
To define a template, the user must select the length of the oligonucleotide the template is to define. This oligonucleotide is then represented as a bar with selectable regions. The user sets the number of regions on the oligonucleotide, and the positions and lengths of these regions by dragging them back and forth on the bar. Each region is represented by a different color.[0277]
For each region, the user defines the chemistry modifications for the sugars, the linkers, and the heterocycles at each base position in the region. At least four heterocycle chemistries must be given, one for each of the four possible base types (A, G, C or T or U) in the site sequence the template will be applied to. A user interface is provided to select these chemistries, which shows the molecular structure of each component selected and its modification name. By pushing on a pop-up list next to each of the pictures, the user may choose, from a list of structures and names, those that can be placed at that position. For example, the heterocycle that represents the base type G is shown as a two dimensional structure diagram. If the user clicks on the pop-up list, a row of other possible structures and names is shown. The user drags the mouse to the desired chemistry and releases the mouse. Now the newly selected molecule is displayed as the choice for G type heterocycle modifications.[0278]
Once the user has created a template, or selected an existing one, the software applies the template at process step 2312 to each of the complements of the sites in the list. When the templates are applied, it is possible that chemistries will be defined which are impossible to make with the chemical precursors presently used on the automatic synthesizer. To check this, a database is maintained of all precursors previously designed, and their availability for automated synthesis. When the templates are applied, the resulting molecules are tested at process step 2316 against this database to see if they are readily synthesized.[0279]
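By way of illustration only, applying a region-based template to the complement of a site and checking the result against a precursor list might be sketched as follows. The `Region` class, the 4-10-4 template patterned on the gapmers of Example 12, and the `AVAILABLE` set standing in for the precursor database are all hypothetical; for simplicity a linker is recorded at every position.

```python
from dataclasses import dataclass

COMPLEMENT = str.maketrans("ACGT", "TGCA")


@dataclass
class Region:
    length: int          # number of positions covered by the region
    sugar: str           # sugar modification name
    linker: str          # linkage modification name
    heterocycles: dict   # base letter -> heterocycle modification name


PLAIN = {"A": "A", "C": "C", "G": "G", "T": "T"}
METHYL_C = {**PLAIN, "C": "5-methyl-C"}

# Hypothetical 4-10-4 template: 2'-MOE wings with 5-methylcytidine, deoxy gap.
GAPMER = [
    Region(4, "2'-MOE", "P=S", METHYL_C),
    Region(10, "2'-deoxy", "P=S", PLAIN),
    Region(4, "2'-MOE", "P=S", METHYL_C),
]

# Hypothetical stand-in for the precursor database: (sugar, heterocycle) pairs.
AVAILABLE = {(s, h) for s in ("2'-MOE", "2'-deoxy") for h in ("A", "G", "T")}
AVAILABLE |= {("2'-MOE", "5-methyl-C"), ("2'-deoxy", "C")}


def apply_template(site, template):
    """Return one (sugar, heterocycle, linker) tuple per antisense position."""
    antisense = site.translate(COMPLEMENT)[::-1]
    if sum(region.length for region in template) != len(antisense):
        raise ValueError("template length does not match site length")
    units, position = [], 0
    for region in template:
        for base in antisense[position:position + region.length]:
            units.append((region.sugar, region.heterocycles[base], region.linker))
        position += region.length
    return units


def not_readily_synthesized(units):
    """List (position, unit) pairs whose precursor is not in AVAILABLE."""
    return [(i, u) for i, u in enumerate(units) if (u[0], u[1]) not in AVAILABLE]


site = "AGTGGTCCTGCCGCCTGG"  # target-strand site complementary to SEQ ID NO:1
units = apply_template(site, GAPMER)
print(units[0], units[4], not_readily_synthesized(units))
```

A compound for which `not_readily_synthesized` returns a non-empty list would be added to the list the user inspects before deciding whether to modify or drop it.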
If a molecule is not readily synthesized, it is added to a list that the user inspects. At process step 2318, the user decides whether to modify the chemistry to make it compatible with the currently recognized list of available chemistries or to ignore it. To modify a chemistry, the user must use the base at a time interface at process step 2322. The user can also choose to go directly to this step, bypassing templates altogether, at process step 2306.[0280]
The base at a time interface at process step 2322 is very similar to the template editor at process step 2314 except that instead of specifying chemistries for regions, they are defined one base at a time. This interface also differs in that it dynamically checks to see if the design is readily synthesized as the user makes selections. In other words, each choice made limits the choices the software makes available on the pop-up selection lists. To accommodate this function, an additional choice is made available on each pop-up of ‘not defined.’ For example, this allows the user to inhibit linker choice from restricting the sugar choices by first setting the linker to ‘not defined.’ The user would then pick the sugar, and then pick from the remaining linker choices available.[0281]
[0282] Once all of the sites on the list are assigned chemistries or dropped, they are registered at process step 2324 to the commercial chemical structure database. Registering to this database determines whether the structure is unique, assigns it a new identifier if it is, and allows future structure and substructure searching by creating various hash tables. The compound definition is also stored at process step 2326 to various hash tables referred to as chemistry/position tables. These allow antisense compound searching and categorization based on oligonucleotide chemistry modification sequences and equivalent base sequences.
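The registration behavior may be pictured with the following Python sketch, given by way of example only. It does not represent the commercial chemical structure database; the `CompoundRegistry` class, its integer identifiers, and the two dictionaries standing in for the chemistry/position tables are assumptions made for illustration.

```python
from collections import defaultdict


class CompoundRegistry:
    """Hypothetical in-memory stand-in for structure registration."""

    def __init__(self, first_id=1):
        self._ids = {}                 # design (tuple of units) -> identifier
        self._next_id = first_id
        # Assumed chemistry/position tables:
        self.by_position = defaultdict(set)       # (position, unit) -> ids
        self.by_base_sequence = defaultdict(set)  # base sequence -> ids

    def register(self, base_sequence, design):
        """Return (identifier, is_new) for a design given as a list of units."""
        key = tuple(design)
        if key in self._ids:
            # Previously registered: report the old identifier.
            return self._ids[key], False
        compound_id = self._next_id
        self._next_id += 1
        self._ids[key] = compound_id
        for position, unit in enumerate(design):
            self.by_position[(position, unit)].add(compound_id)
        self.by_base_sequence[base_sequence].add(compound_id)
        return compound_id, True


# Illustrative units as (heterocycle, sugar, linker) triples.
ps = ("guanine", "2'-deoxy", "phosphorothioate")
po = ("guanine", "2'-deoxy", "phosphodiester")

registry = CompoundRegistry()
first = registry.register("GG", [ps, ps])   # new compound
again = registry.register("GG", [ps, ps])   # same structure, old identifier
other = registry.register("GG", [ps, po])   # same bases, different chemistry
```

In this sketch, a lookup such as `registry.by_base_sequence["GG"]` returns every registered compound with that base sequence regardless of chemistry, while `registry.by_position[(1, po)]` returns only compounds carrying the given unit at the given position.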
[0283] The results of the registration are displayed at process step 2328, with new IDs for new compounds and with the old IDs for compounds that have been previously registered. The user next selects which of the processed compounds to order for synthesis at process step 2330 and registers an order list at process step 2332 by including scientist name, notebook number and page number. The database time-stamps this entry. The user may then choose at process step 2334 to quit the program at process step 2338, to go back to the beginning and choose a new site list to work with at process step 2304, or to start the oligonucleotide ordering interface at process step 2336.
Example 16
Gene Walk to Optimize Oligonucleotide Sequence
[0284] A gene walk is executed using a CD40 antisense oligonucleotide having SEQ ID NO:15 (5′-CTGGCACAAAGAACAGCA). In effecting this gene walk, the following parameters are used:
| Gene Walk Parameter | Entered value |
|---|---|
| Oligonucleotide Sequence ID: | 15 |
| Name of Gene Target: | CD40 |
| Scope of Gene Walk: | 20 |
| Sequence Shift Increment: | 1 |
[0285] Entering these values and effecting the gene walk centered on SEQ ID NO:15 automatically generates the following new oligonucleotides:
TABLE 8
Oligonucleotides Generated By Gene Walk
| SEQ ID NO: | Sequence |
|---|---|
| 93 | GAACAGCACTGACTGTTT |
| 94 | AGAACAGCACTGACTGTT |
| 95 | AAGAACAGCACTGACTGT |
| 96 | AAAGAACAGCACTGACTG |
| 97 | CAAAGAACAGCACTGACT |
| 98 | ACAAAGAACAGCACTGAC |
| 99 | CACAAAGAACAGCACTGA |
| 100 | GCACAAAGAACAGCACTG |
| 101 | GGCACAAAGAACAGCACT |
| 102 | TGGCACAAAGAACAGCAC |
| 15 | CTGGCACAAAGAACAGCA |
| 103 | GCTGGCACAAAGAACAGC |
| 104 | GGCTGGCACAAAGAACAG |
| 105 | TGGCTGGCACAAAGAACA |
| 106 | CTGGCTGGCACAAAGAAC |
| 107 | CCTGGCTGGCACAAAGAA |
| 108 | TCCTGGCTGGCACAAAGA |
| 109 | GTCCTGGCTGGCACAAAG |
| 110 | TGTCCTGGCTGGCACAAA |
| 111 | CTGTCCTGGCTGGCACAA |
| 112 | TCTGTCCTGGCTGGCACA |
[0286] The list shown above contains, in addition to SEQ ID NO:15 itself, 20 oligonucleotide sequences directed against the CD40 nucleic acid sequence. They are ordered by the position along the CD40 sequence at which the 5′ terminus of each oligonucleotide hybridizes. Thus, the first ten oligonucleotides are single-base frame shift sequences directed against the CD40 sequence upstream of compound SEQ ID NO:15 and the latter ten are single-base frame shift sequences directed against the CD40 sequence downstream of compound SEQ ID NO:15.
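The gene walk of this example may be sketched in Python as follows. The sketch is illustrative only and rests on stated assumptions: the function names `reverse_complement` and `gene_walk` are hypothetical; the "Scope of Gene Walk" value is interpreted as the number of new oligonucleotides, half on each side of the parent; and the target is supplied as a sense-strand fragment, here nucleotides 141 to 190 of SEQ ID NO:85 as they appear in the sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gene_walk(target, parent_oligo, scope=20, increment=1):
    """Return (shift, oligo) pairs around the site bound by `parent_oligo`.

    Assumption: `scope` new oligonucleotides are generated, scope // 2 on
    each side of the parent, each shifted by a multiple of `increment`.
    """
    site = reverse_complement(parent_oligo)
    start = target.find(site)
    if start < 0:
        raise ValueError("parent oligonucleotide does not hybridize to target")
    length = len(parent_oligo)
    half = scope // 2
    walk = []
    for shift in range(-half * increment, half * increment + 1, increment):
        lo = start + shift
        if lo < 0 or lo + length > len(target):
            continue  # skip windows that run off the supplied fragment
        walk.append((shift, reverse_complement(target[lo:lo + length])))
    return walk


# Nucleotides 141-190 of SEQ ID NO:85 (CD40), sense strand.
CD40_FRAGMENT = "CTAATAAACAGTCAGTGCTGTTCTTTGTGCCAGCCAGGACAGAAACTGGT"
SEQ_ID_15 = "CTGGCACAAAGAACAGCA"

walk = gene_walk(CD40_FRAGMENT, SEQ_ID_15, scope=20, increment=1)
for shift, oligo in walk:
    print(f"{shift:+d}\t{oligo}")
```

Under these assumptions the sketch reproduces the order of Table 8, running from SEQ ID NO:93 at a shift of -10, through SEQ ID NO:15 at a shift of 0, to SEQ ID NO:112 at a shift of +10.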
Number of sequences: 112
| SEQ ID NO: | Length | Type | Strandedness | Topology | Sequence |
|---|---|---|---|---|---|
| 1 | 18 | nucleic acid | single | linear | CCAGGCGGCA GGACCACT |
| 2 | 18 | nucleic acid | single | linear | GACCAGGCGG CAGGACCA |
| 3 | 18 | nucleic acid | single | linear | AGGTGAGACC AGGCGGCA |
| 4 | 18 | nucleic acid | single | linear | CAGAGGCAGA CGAACCAT |
| 5 | 18 | nucleic acid | single | linear | GCAGAGGCAG ACGAACCA |
| 6 | 18 | nucleic acid | single | linear | GCAAGCAGCC CCAGAGGA |
| 7 | 18 | nucleic acid | single | linear | GGTCAGCAAG CAGCCCCA |
| 8 | 18 | nucleic acid | single | linear | GACAGCGGTC AGCAAGCA |
| 9 | 18 | nucleic acid | single | linear | GATGGACAGC GGTCAGCA |
| 10 | 18 | nucleic acid | single | linear | TCTGGATGGA CAGCGGTC |
| 11 | 18 | nucleic acid | single | linear | GGTGGTTCTG GATGGACA |
| 12 | 18 | nucleic acid | single | linear | GTGGGTGGTT CTGGATGG |
| 13 | 18 | nucleic acid | single | linear | GCAGTGGGTG GTTCTGGA |
| 14 | 18 | nucleic acid | single | linear | CACAAAGAAC AGCACTGA |
| 15 | 18 | nucleic acid | single | linear | CTGGCACAAA GAACAGCA |
| 16 | 18 | nucleic acid | single | linear | TCCTGGCTGG CACAAAGA |
| 17 | 18 | nucleic acid | single | linear | CTGTCCTGGC TGGCACAA |
| 18 | 18 | nucleic acid | single | linear | CTCACCAGTT TCTGTCCT |
| 19 | 18 | nucleic acid | single | linear | TCACTCACCA GTTTCTGT |
| 20 | 18 | nucleic acid | single | linear | GTGCAGTCAC TCACCAGT |
| 21 | 18 | nucleic acid | single | linear | ACTCTGTGCA GTCACTCA |
| 22 | 18 | nucleic acid | single | linear | CAGTGAACTC TGTGCAGT |
| 23 | 18 | nucleic acid | single | linear | ATTCCGTTTC AGTGAACT |
| 24 | 18 | nucleic acid | single | linear | GAAGGCATTC CGTTTCAG |
| 25 | 18 | nucleic acid | single | linear | TTCACCGCAA GGAAGGCA |
| 26 | 18 | nucleic acid | single | linear | CTCTGTTCCA GGTGTCTA |
| 27 | 18 | nucleic acid | single | linear | CTGGTGGCAG TGTGTCTC |
| 28 | 18 | nucleic acid | single | linear | TGGGGTCGCA GTATTTGT |
| 29 | 18 | nucleic acid | single | linear | GGTTGGGGTC GCAGTATT |
| 30 | 18 | nucleic acid | single | linear | CTAGGTTGGG GTCGCAGT |
| 31 | 18 | nucleic acid | single | linear | GGTGCCCTTC TGCTGGAC |
| 32 | 18 | nucleic acid | single | linear | CTGAGGTGCC CTTCTGCT |
| 33 | 18 | nucleic acid | single | linear | GTGTCTGTTT CTGAGGTG |
| 34 | 18 | nucleic acid | single | linear | TGGTGTCTGT TTCTGAGG |
| 35 | 18 | nucleic acid | single | linear | ACAGGTGCAG ATGGTGTC |
| 36 | 18 | nucleic acid | single | linear | TTCACAGGTG CAGATGGT |
| 37 | 18 | nucleic acid | single | linear | GTGCCAGCCT TCTTCACA |
| 38 | 18 | nucleic acid | single | linear | TACAGTGCCA GCCTTCTT |
| 39 | 18 | nucleic acid | single | linear | GGACACAGCT CTCACAGG |
| 40 | 18 | nucleic acid | single | linear | TGCAGGACAC AGCTCTCA |
| 41 | 18 | nucleic acid | single | linear | GAGCGGTGCA GGACACAG |
| 42 | 18 | nucleic acid | single | linear | AAGCCGGGCG AGCATGAG |
| 43 | 18 | nucleic acid | single | linear | AATCTGCTTG ACCCCAAA |
| 44 | 18 | nucleic acid | single | linear | GAAACCCCTG TAGCAATC |
| 45 | 18 | nucleic acid | single | linear | GTATCAGAAA CCCCTGTA |
| 46 | 18 | nucleic acid | single | linear | GCTCGCAGAT GGTATCAG |
| 47 | 18 | nucleic acid | single | linear | GCAGGGCTCG CAGATGGT |
| 48 | 18 | nucleic acid | single | linear | TGGGCAGGGC TCGCAGAT |
| 49 | 18 | nucleic acid | single | linear | GACTGGGCAG GGCTCGCA |
| 50 | 18 | nucleic acid | single | linear | CATTGGAGAA GAAGCCGA |
| 51 | 18 | nucleic acid | single | linear | GATGACACAT TGGAGAAG |
| 52 | 18 | nucleic acid | single | linear | GCAGATGACA CATTGGAG |
| 53 | 18 | nucleic acid | single | linear | TCGAAAGCAG ATGACACA |
| 54 | 18 | nucleic acid | single | linear | GTCCAAGGGT GACATTTT |
| 55 | 18 | nucleic acid | single | linear | CACAGCTTGT CCAAGGGT |
| 56 | 18 | nucleic acid | single | linear | TTGGTCTCAC AGCTTGTC |
| 57 | 18 | nucleic acid | single | linear | CAGGTCTTTG GTCTCACA |
| 58 | 18 | nucleic acid | single | linear | CTGTTGCACA ACCAGGTC |
| 59 | 18 | nucleic acid | single | linear | GTTTGTGCCT GCCTGTTG |
| 60 | 18 | nucleic acid | single | linear | GTCTTGTTTG TGCCTGCC |
| 61 | 18 | nucleic acid | single | linear | CCACAGACAA CATCAGTC |
| 62 | 18 | nucleic acid | single | linear | CTGGGGACCA CAGACAAC |
| 63 | 18 | nucleic acid | single | linear | TCAGCCGATC CTGGGGAC |
| 64 | 18 | nucleic acid | single | linear | CACCACCAGG GCTCTCAG |
| 65 | 18 | nucleic acid | single | linear | GGGATCACCA CCAGGGCT |
| 66 | 18 | nucleic acid | single | linear | GAGGATGGCA AACAGGAT |
| 67 | 18 | nucleic acid | single | linear | ACCAGCACCA AGAGGATG |
| 68 | 18 | nucleic acid | single | linear | TTTTGATAAA GACCAGCA |
| 69 | 18 | nucleic acid | single | linear | TATTGGTTGG CTTCTTGG |
| 70 | 18 | nucleic acid | single | linear | GGGTTCCTGC TTGGGGTG |
| 71 | 18 | nucleic acid | single | linear | GTCGGGAAAA TTGATCTC |
| 72 | 18 | nucleic acid | single | linear | GATCGTCGGG AAAATTGA |
| 73 | 18 | nucleic acid | single | linear | GGAGCCAGGA AGATCGTC |
| 74 | 18 | nucleic acid | single | linear | TGGAGCCAGG AAGATCGT |
| 75 | 18 | nucleic acid | single | linear | TGGAGCAGCA GTGTTGGA |
| 76 | 18 | nucleic acid | single | linear | GTAAAGTCTC CTGCACTG |
| 77 | 18 | nucleic acid | single | linear | TGGCATCCAT GTAAAGTC |
| 78 | 18 | nucleic acid | single | linear | CGGTTGGCAT CCATGTAA |
| 79 | 18 | nucleic acid | single | linear | CTCTTTGCCA TCCTCCTG |
| 80 | 18 | nucleic acid | single | linear | CTGTCTCTCC TGCACTGA |
| 81 | 18 | nucleic acid | single | linear | GGTGCAGCCT CACTGTCT |
| 82 | 18 | nucleic acid | single | linear | AACTGCCTGT TTGCCCAC |
| 83 | 18 | nucleic acid | single | linear | CTTCTGCCTG CACCCCTG |
| 84 | 18 | nucleic acid | single | linear | ACTGACTGGG CATAGCTC |
SEQ ID NO:85 (length: 1004 base pairs; type: nucleic acid; strandedness: single; topology: linear)
GCCTCGCTCG GGCGCCCAGT GGTCCTGCCG CCTGGTCTCA CCTCGCCATG 50
GTTCGTCTGC CTCTGCAGTG CGTCCTCTGG GGCTGCTTGC TGACCGCTGT 100
CCATCCAGAA CCACCCACTG CATGCAGAGA AAAACAGTAC CTAATAAACA 150
GTCAGTGCTG TTCTTTGTGC CAGCCAGGAC AGAAACTGGT GAGTGACTGC 200
ACAGAGTTCA CTGAAACGGA ATGCCTTCCT TGCGGTGAAA GCGAATTCCT 250
AGACACCTGG AACAGAGAGA CACACTGCCA CCAGCACAAA TACTGCGACC 300
CCAACCTAGG GCTTCGGGTC CAGCAGAAGG GCACCTCAGA AACAGACACC 350
ATCTGCACCT GTGAAGAAGG CTGGCACTGT ACGAGTGAGG CCTGTGAGAG 400
CTGTGTCCTG CACCGCTCAT GCTCGCCCGG CTTTGGGGTC AAGCAGATTG 450
CTACAGGGGT TTCTGATACC ATCTGCGAGC CCTGCCCAGT CGGCTTCTTC 500
TCCAATGTGT CATCTGCTTT CGAAAAATGT CACCCTTGGA CAAGCTGTGA 550
GACCAAAGAC CTGGTTGTGC AACAGGCAGG CACAAACAAG ACTGATGTTG 600
TCTGTGGTCC CCAGGATCGG CTGAGAGCCC TGGTGGTGAT CCCCATCATC 650
TTCGGGATCC TGTTTGCCAT CCTCTTGGTG CTGGTCTTTA TCAAAAAGGT 700
GGCCAAGAAG CCAACCAATA AGGCCCCCCA CCCCAAGCAG GAACCCCAGG 750
AGATCAATTT TCCCGACGAT CTTCCTGGCT CCAACACTGC TGCTCCAGTG 800
CAGGAGACTT TACATGGATG CCAACCGGTC ACCCAGGAGG ATGGCAAAGA 850
GAGTCGCATC TCAGTGCAGG AGAGACAGTG AGGCTGCACC CACCCAGGAG 900
TGTGGCCACG TGGGCAAACA GGCAGTTGGC CAGAGAGCCT GGTGCTGCTG 950
CTGCAGGGGT GCAGGCAGAA GCGGGGAGCT ATGCCCAGTC AGTGCCAGCC 1000
CCTC 1004
| SEQ ID NO: | Length | Type | Strandedness | Topology | Sequence |
|---|---|---|---|---|---|
| 86 | 23 base pairs | nucleic acid | single | linear | CAGAGTTCAC TGAAACGGAA TGC |
| 87 | 23 base pairs | nucleic acid | single | linear | GGTGGCAGTG TGTCTCTCTG TTC |
| 88 | 25 base pairs | nucleic acid | single | linear | TTCCTTGCGG TGAAAGCGAA TTCCT |
| 89 | 19 base pairs | nucleic acid | single | linear | GAAGGTGAAG GTCGGAGTC |
| 90 | 20 base pairs | nucleic acid | single | linear | GAAGATGGTG ATGGGATTTC |
| 91 | 20 base pairs | nucleic acid | single | linear | CAAGCTTCCC GTTCTCAGCC |
| 92 | 20 base pairs | nucleic acid | single | linear | AGTGGTCCTG CCGCCTGGTC |
| 93 | 18 | nucleic acid | single | linear | GAACAGCACT GACTGTTT |
| 94 | 18 | nucleic acid | single | linear | AGAACAGCAC TGACTGTT |
| 95 | 18 | nucleic acid | single | linear | AAGAACAGCA CTGACTGT |
| 96 | 18 | nucleic acid | single | linear | AAAGAACAGC ACTGACTG |
| 97 | 18 | nucleic acid | single | linear | CAAAGAACAG CACTGACT |
| 98 | 18 | nucleic acid | single | linear | ACAAAGAACA GCACTGAC |
| 99 | 18 | nucleic acid | single | linear | CACAAAGAAC AGCACTGA |
| 100 | 18 | nucleic acid | single | linear | GCACAAAGAA CAGCACTG |
| 101 | 18 | nucleic acid | single | linear | GGCACAAAGA ACAGCACT |
| 102 | 18 | nucleic acid | single | linear | TGGCACAAAG AACAGCAC |
| 103 | 18 | nucleic acid | single | linear | GCTGGCACAA AGAACAGC |
| 104 | 18 | nucleic acid | single | linear | GGCTGGCACA AAGAACAG |
| 105 | 18 | nucleic acid | single | linear | TGGCTGGCAC AAAGAACA |
| 106 | 18 | nucleic acid | single | linear | CTGGCTGGCA CAAAGAAC |
| 107 | 18 | nucleic acid | single | linear | CCTGGCTGGC ACAAAGAA |
| 108 | 18 | nucleic acid | single | linear | TCCTGGCTGG CACAAAGA |
| 109 | 18 | nucleic acid | single | linear | GTCCTGGCTG GCACAAAG |
| 110 | 18 | nucleic acid | single | linear | TGTCCTGGCT GGCACAAA |
| 111 | 18 | nucleic acid | single | linear | CTGTCCTGGC TGGCACAA |
| 112 | 18 | nucleic acid | single | linear | TCTGTCCTGG CTGGCACA |