Open Access
Peer-reviewed
Research Article
Principles of Genome Evolution in the Drosophila melanogaster Species Group
- José M Ranz,
*To whom correspondence should be addressed. E-mail: jmr68@mole.bio.cam.ac.uk
Affiliation: Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Damien Maurin,
Current address: Laboratory of Enzymology at Interfaces and Physiology of Lipolysis, CNRS, UPR 9025, Marseille, France
Affiliation: Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Yuk S Chan,
Affiliation: Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Marcin von Grotthuss,
Affiliation: Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- LaDeana W Hillier,
Affiliation: Genome Sequencing Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
- John Roote,
Affiliation: Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Michael Ashburner,
Affiliation: Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Casey M Bergman
Current address: Faculty of Life Sciences, University of Manchester, Manchester, United Kingdom
Affiliation: Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Published: June 5, 2007
- https://doi.org/10.1371/journal.pbio.0050152
Abstract
That closely related species often differ by chromosomal inversions was discovered by Sturtevant and Plunkett in 1926. Our knowledge of how these inversions originate is still very limited, although a prevailing view is that they are facilitated by ectopic recombination events between inverted repetitive sequences. The availability of genome sequences of related species now allows us to study in detail the mechanisms that generate interspecific inversions. We have analyzed the breakpoint regions of the 29 inversions that differentiate the chromosomes of Drosophila melanogaster and two closely related species, D. simulans and D. yakuba, and reconstructed the molecular events that underlie their origin. Experimental and computational analysis revealed that the breakpoint regions of 59% of the inversions (17/29) are associated with inverted duplications of genes or other nonrepetitive sequences. In only two cases do we find evidence for inverted repetitive sequences in inversion breakpoints. We propose that the presence of inverted duplications associated with inversion breakpoint regions is the result of staggered breaks, either isochromatid or chromatid, and that this, rather than ectopic exchange between inverted repetitive sequences, is the prevalent mechanism for the generation of inversions in the melanogaster species group. Outgroup analysis also revealed evidence for widespread breakpoint recycling. Lastly, we have found that expression domains in D. melanogaster may be disrupted in D. yakuba, bringing into question their potential adaptive significance.
Author Summary
The organization of genes on chromosomes changes over evolutionary time. In some organisms, such as fruit flies and mosquitoes, inversions of chromosome regions are widespread. This has been associated with adaptation to environmental pressures and speciation. However, the mechanisms by which inversions are generated at the molecular level are poorly understood. The prevailing view involves the interactions of sequences that are moderately repeated in the genome. Here, we use molecular and computational methods to study 29 inversions that differentiate the chromosomes of three closely related fruit fly species. We find little support for a causal role of repetitive sequences in the origin of inversions and, instead, detect the presence of inverted duplications of ancestrally unique sequences (generally protein-coding genes) in the breakpoint regions of many inversions. This leads us to propose an alternative model in which the generation of inversions is coupled with the generation of duplications of flanking sequences. Additionally, we find evidence for genomic regions that are prone to breakage, being associated with inversions generated independently during the evolution of the ancestors of existing species.
Citation: Ranz JM, Maurin D, Chan YS, von Grotthuss M, Hillier LW, Roote J, et al. (2007) Principles of Genome Evolution in the Drosophila melanogaster Species Group. PLoS Biol 5(6): e152. https://doi.org/10.1371/journal.pbio.0050152
Academic Editor: Mohamed A. F. Noor, Duke University, United States of America
Received: October 20, 2006; Accepted: April 2, 2007; Published: June 5, 2007
Copyright: © 2007 Ranz et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by a Biotechnology and Biological Sciences Research Council (BBSRC) grant (BBS/B/07705) to MA and JMR, and a Medical Research Council (MRC) Program Grant (G8225539) to MA and Steve Russell.
Competing interests: The authors have declared that no competing interests exist.
Abbreviations: BAC, bacterial artificial chromosome; bp, base pair; kb, kilobase; Mb, megabase; Myr, million years; TE, transposable element; UCSC, the University of California, Santa Cruz
Introduction
“Eventually the story of the chromosomal mechanisms and its evolution will have to be entirely rewritten in molecular terms” [1].
Over the last century, very detailed studies have been made by cytogeneticists of the intra- and interchromosomal changes that characterize genome evolution in groups as different as mammals (e.g., [2]) and flies (e.g., [3]; see [1,4] for reviews). Chromosome rearrangements are thought to play an important role in reproductive isolation between species [5–7] and in the adaptation of species to their environments [8–10]. These rearrangements may affect fitness by effectively reducing recombination in heterozygotes, thereby preserving co-adapted gene complexes [11,12], or by exerting position effects on loci neighboring breakpoints by modifying gene expression [13]. Only now, with the availability of “complete” genome sequences, can these structural changes in genomes be studied in the molecular detail, as foreseen by Michael White [1] over 30 years ago (e.g., [14–16]).
Genomic sequence data are beginning to reveal a remarkable diversity of patterns of genome rearrangement in different taxa ([17–21]; reviewed in [22]). For example, we see evidence for the recurrent presence of repetitive sequences near breakpoints [23–25] and evidence for the nonrandom distribution of genome breakpoints [16,26,27]. Moreover, there is evidence that large-scale gene expression domains are maintained as syntenic regions, perhaps because of a functional co-dependency of the genes that reside in these domains [20,28,29]. Comparative genomic data allow us to reconstruct the state of ancestral genome arrangements at key phylogenetic nodes [17,30] and to identify genomic regions conserved during the process of adaptation and divergence [31,32].
The genus Drosophila has long been a model for cytogenetic studies of genome evolution. Charles Metz's pioneering comparative studies of metaphase karyotypes in the genus [33], combined with subsequent comparative genetic studies, led Muller [34] to conclude that the integrity of chromosome arms is largely preserved in the genus Drosophila, despite a 2-fold variation in haploid chromosome number (see also [35]). The maintenance of the gene content of chromosomal arms is due to the paucity of inter-arm rearrangements (i.e., pericentric inversions and translocations) ([36,37]; see [38] for why this is so). Sturtevant and Dobzhansky [39] first showed how chromosome inversions can be used to study the evolutionary history of a species group, such as has been shown subsequently in the case of the endemic Hawaiian picture-winged group [3] or in the cactophilic repleta species group of the Americas [40]. Drosophila is a species-rich genus (about 1,500 species have been described [41]) and has an evolutionary history of perhaps over 120 million years (Myr; Figure S1; [42]). The wealth of information on genome rearrangement in the genus Drosophila can now be studied at the molecular level, using the genome sequences of 12 different species of Drosophila that are available (http://rana.lbl.gov/drosophila/). Hitherto, the breakpoint regions of ten well-defined inversions have been characterized in Diptera: eight in Drosophila [25,43–49], and two in Anopheles [50,51]. Here we investigate the genome-wide patterns of rearrangement among three closely related species: D. melanogaster, D. simulans, and D. yakuba.
D. melanogaster, D. simulans, and D. yakuba are all members of the melanogaster species subgroup, a collection of nine species of Afrotropical origin [52]. D. melanogaster and D. simulans are cosmopolitan sibling species that split from a common ancestor about 5.4 Myr ago [42] and can form (normally infertile) hybrids. Their polytene chromosome banding patterns are very similar, differing by only one large, and four small, paracentric inversions [53,54]. By contrast, D. yakuba, a species of the African savanna, is completely isolated reproductively from D. melanogaster and D. simulans. These three species shared a common ancestor about 12.8 Myr ago [42]. The polytene chromosomes of D. yakuba differ from those of D. melanogaster by at least 28 fixed inversions [54]. The combination of prior cytological knowledge of inversion history and the close evolutionary distance among species in this group provides an unparalleled opportunity to reconstruct the detailed molecular events underlying genome rearrangements between animal genomes.
We studied the first interspecific inversion ever to be documented, In(3R)84F1;93F6–7, which differentiates chromosome 3 of D. melanogaster and the species of the simulans clade [55,56]. We characterized its breakpoint regions at the molecular level, i.e., the genomic regions that encompass both the sites of chromosome breakage and adjacent sequences. We detected inverted duplications of sequences present in the breakpoint regions, a pattern also shown by the breakpoint regions of other chromosomal rearrangements recently characterized [49,51,57]. One of the breakpoint regions associated with this inversion overlaps that of another inversion that took place on the lineage to D. yakuba, suggesting that some genomic regions are repeatedly broken over time. By a large-scale comparison of the molecular organization of the genomes of D. melanogaster and D. yakuba, we asked if the features associated with inversion In(3R)84F1;93F6–7 reflect a recurrent pattern of genome rearrangement in the melanogaster species subgroup.
We found that approximately 59% (17/29) of the inversions fixed between D. melanogaster and D. yakuba show evidence of inverted duplication of protein-coding genes or other nonrepetitive sequences present at the breakpoint regions. The prevalence of inverted duplications at inversion breakpoint regions suggests a mechanism of staggered breaks, either isochromatid or chromatid, as the most parsimonious explanation for their origin. Computational analyses failed to find support for the generalized presence of dispersed, repetitive sequences in co-occurrent breakpoint regions, i.e., those that set the limits of a particular inversion. We conclude that the generation of chromosomal rearrangements in the lineages studied is not necessarily linked to ectopic recombination events between repetitive sequences. We also find evidence for the independent breakage of the same genomic region in different lineages, i.e., fragile regions [16,25–27], and in one case, we are able, for the first time in Diptera, to reconstruct the reuse of a breakpoint region.
Results/Discussion
Experimental and Computational Analysis of Inversion In(3R)84F1;93F6–7 Fixed between D. melanogaster and Species of the simulans Clade
In a remarkable study, Sturtevant and Plunkett [56] deduced from genetic evidence that the chromosomes of D. melanogaster and D. simulans differed by an inversion on the right arm of chromosome 3. This inversion was later confirmed by an analysis of the polytene chromosomes of the interspecific hybrids ([58]; see also [53]). We have directly cloned the breakpoints of this inversion from the genome of D. simulans and, by a combination of experimental and computational methods, characterized the breakpoint regions in the genome sequences of D. melanogaster, D. simulans, and D. yakuba. The structure of the two breakpoint regions of this inversion is illustrated in Figure 1.
These three genomic regions harbor the breakpoints of the paracentric inversions 3R(7) and 3R(8), also known as In(3R)84F1;93F6–7, and have been reconstructed by BLAST analysis, in situ hybridization, resequencing, and whole-genome alignments at UCSC (http://genome.ucsc.edu/). According to the information in D. erecta and different outgroup species (Table 1), D. simulans (S) is the species that best represents the ancestral (A) configuration for all three regions. Reference genes at the different breakpoint regions have been colored red, blue, and orange. Between some of the reference genes, putatively expressed genes (green and yellow; [60]) and repetitive sequences (pink) are also present. Other surrounding genes are indicated in brown. Top, cytological coordinates of the regions in D. melanogaster (M). Long horizontal lines indicate chromosomes; solid pattern indicates key region; and dashed pattern indicates chromosomal stretch separating key regions. Cen, centromere; Tel, telomere. The head of each colored horizontal arrow represents the 3′ end of each gene or putative gene. Chromosomal segments included in the inversion 3R(7) and 3R(8) are indicated by dotted lines. For both inversions, the sequences between paired staggered breakpoints are indicated by short horizontal solid lines. Roman numerals indicate different chromosomal stretches spanning inversion breakpoints that were sequenced as a control. Vertical arrows indicate the localization in the ancestor (D. simulans) of the four breakpoints (a, b, c, and d) that are necessary to explain the inversion 3R(8) and the duplication of HDC14862, pfd800, and HDC12400 at 84E9 (3R:3862326–3867817; 3R:3874931–3876653) and 93F6–7 (3R:17554739–17562483) of D. melanogaster (see Figure 2). The gene configuration CG7918-CG34034-CG5849 has been disrupted independently in the lineages of D. melanogaster and D. yakuba (Y) by the inversions 3R(8) and 3R(7), respectively. In D. melanogaster, the gene pair CG2708 (Tom34)-CG31176 is also disrupted, whereas in D. yakuba, CG31286-CG1315 is disrupted. Inversion 3R(8) and its associated duplication event generate an apparently full copy of the putative expressed gene HDC14862 in 3R:93E10-F2 of D. melanogaster. This contains 56–59 bp from the 3′ UTR of the gene CG2708 (blue triangle) within one of its putative introns. HDC14862 is present as two different fragments both in D. simulans and D. yakuba (see main text for details). Further, the inversion 3R(7) has disrupted the antisense overlap of CG31286 and CG1315 in D. yakuba: the antisense configuration is conserved at 84A1 of D. melanogaster and D. simulans, as well as in other species (Table 1). Inversion 3R(7) was accompanied by a duplication of CG34034 and a complex pattern of rearrangement that also involved a fragment of the 5′ region of HDC14862. The two open reading frames (ORFs) of CG34034 are functional according to GENSCAN (http://genes.mit.edu/GENSCAN.html), although the putative protein sequences they encode differ substantially from that of their orthologs in D. melanogaster and D. erecta. Some stretches with significant homology with CG31286 are also detected adjacent to CG1315 in D. yakuba. The reference gene CG31286 is also tandemly duplicated and adjacent to CG34034. In D. yakuba, there are three copies of CG31286, two of them being pseudogenes (denoted as a red gradient). Only the copy immediately distal to HDC12143 is functional, although it apparently codes for only one of the two isoforms of its D. melanogaster ortholog. Genes and distances between them are not represented proportionally.
Co-Dependent Breakpoint Regions in D. melanogaster versus D. yakuba and Inferences on the Phylogenetic Occurrences of Inversions
The mechanism is illustrated in relation to the inversion 3R(8), which is fixed in the lineage to D. melanogaster. (A) Ancestral state in D. simulans (Figure 1). (B) Two pairs of staggered single-strand breaks (a-b and c-d) result in long 5′-overhangs (C), which can then be filled in (grey dashed arrow); when followed by nonhomologous end joining, this may result in an inversion flanked by inverted duplications of the sequences between the paired single-strand breaks (D). Landmarks: A, CG2708; B, HDC14862 (3′); C, pfd800; D, HDC12400; E, HDC14861; F, HDC14861; G, CG31176; H, CG7918; I, HDC14862 (5′); J, CG34034; and K, CG5849. Color code as in Figure 1. Figure S2 illustrates the model for the formation of the inversion 3R(7).
To clone the In(3R)84F1;93F6–7 breakpoints, we performed in situ hybridizations to polytene chromosomes of D. simulans (and to those of D. melanogaster OR-R as a control), using five D. melanogaster bacterial artificial chromosomes (BACs) that we expected to cross the breakpoints of the major D. simulans inversion at 84F1 (BACR07M14 and BACR45A07) and at 93F6–7 (BACR16N15, BACR42I20, and BACR08K01) [54]. A BAC that includes an inversion breakpoint must necessarily yield two hybridization signals on chromosome arm 3R of D. simulans, but only one on that of D. melanogaster. We determined that BACR07M14 contains the proximal breakpoint and that BACR16N15 contains the distal breakpoint of this inversion. The breakpoints within these BACs were narrowed down by in situ hybridization with probes of genes selected from the predicted cytological coordinates of the breakpoints [54]. We determined that the limits of this inversion were between the protein-coding genes CG2708 and CG7918, proximally, and CG31176 and CG34034, distally.
The gene pairs CG2708-CG7918 and CG31176-CG34034 delimit two breakpoint regions in D. melanogaster of 22.6 and 17.8 kilobases (kb) long at 84E9–10 and 93E10-F2, respectively (Figure 1). Neither region contains any annotated protein-coding genes in the Drosophila genome Release 4.3 annotation (http://chervil.bio.indiana.edu:7092/annot/), with only the non-LTR retrotransposons BS and Cr1a in the region at 84E9–10 as identifiable features [59]. We further characterized the inversion breakpoint regions in D. melanogaster by BLAST analysis and found the presence of four putatively expressed sequences [60] and a sequence said to be related to the mammalian proto-oncogene c-fos (pfd800). The order of the sequences at these breakpoint regions is, from centromere to telomere: HDC14862-pfd800-HDC12400-Cr1a-BS-HDC14862 at 84E9–10, and HDC14860-HDC14861-HDC12400-pfd800-HDC14862 at 93E10-F2 (Figure 1).
Notably, three of these sequences (HDC14862, pfd800, and HDC12400) are present at both breakpoint regions in an inverted orientation with respect to each other (Figure 1). The nucleotide identity between duplicated stretches is about 95% across approximately 6.3 kb of aligned sequence. Their divergence (about 5%) is greater than the divergence of the Cr1a and BS sequences from the consensus sequences of these elements, 3.2% and 0.5%, respectively. This suggests that the transposable elements (TEs) inserted more recently than the duplication event.
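The comparison of divergences in this paragraph can be sketched in a few lines of Python. This is only an illustration: the function name `percent_divergence` and the toy alignment are hypothetical, the percentages are the ones quoted in the text, and reading a lower divergence as a more recent event assumes roughly comparable substitution rates across the sequences compared.

```python
def percent_divergence(aligned_a: str, aligned_b: str) -> float:
    """Percent of mismatched positions between two aligned sequences.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a != base_b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


# Toy alignment (hypothetical): one mismatch in ten ungapped columns.
toy = percent_divergence("ACGTACGTAC", "ACGTTCGTAC")

# Values quoted in the text: about 95% identity between the duplicated
# stretches, and 3.2% / 0.5% divergence of Cr1a / BS from their consensus.
duplication_divergence = 100.0 - 95.0
te_divergence = {"Cr1a": 3.2, "BS": 0.5}
younger_than_duplication = [
    name for name, div in te_divergence.items() if div < duplication_divergence
]
```

Under these assumptions both elements fall below the divergence of the duplicated stretches, which is the comparison the text relies on.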
The location of the inverted duplicated sequences at both breakpoint regions was confirmed by in situ hybridization. Sequences in this duplicated interval are not found elsewhere in the genome of D. melanogaster, as shown both computationally by BLAST analysis and experimentally by in situ hybridization with appropriate probes. Using probes for the HDC14862, pfd800, and HDC12400 sequences, we found that the duplication is also present in the Zimbabwe 2 strain of D. melanogaster, which is from an ancestral population relative to cosmopolitan and laboratory strains [61], suggesting the duplication is widespread or fixed in D. melanogaster. Furthermore, BLAST analysis against the D. simulans and D. yakuba genomes suggested (see Materials and Methods), and interspecific in situ hybridization confirmed, that the region duplicated in D. melanogaster is present as a single copy in both the D. simulans and D. yakuba genomes. This analysis indicates that the duplication of sequences associated with the breakpoint regions in D. melanogaster represents the derived state relative to that of D. simulans. A similar pattern of inverted duplicated sequences at breakpoint regions has been reported for the polymorphic inversion In(3R)P in D. melanogaster [49], the polymorphic inversion In(2L)a in Anopheles gambiae [51], and for the pericentric inversion fixed between Pan troglodytes chromosome 10 and the homologous Homo sapiens chromosome 12 [57].
The comparison of the molecular organization of the breakpoint regions of In(3R)84F1;93F6–7 between D. melanogaster, D. simulans, and the outgroup species D. yakuba revealed that a second inversion fixed in the lineage that leads to D. yakuba reused one of the In(3R)84F1;93F6–7 breakpoint regions. In D. yakuba, the CG2708-CG31176 breakpoint region is identical in molecular organization to that of D. simulans, further supporting the hypothesis that In(3R)84F1;93F6–7 is derived, occurring on the D. melanogaster lineage. In contrast, the gene CG7918 remains adjacent to CG34034, but in a different chromosomal location from that of CG5849, which is in turn adjacent to a second copy of CG34034. In D. simulans, D. erecta, and other distantly related species (Table 1), the genes CG7918, CG34034, and CG5849 are collinear and CG34034 is present in a single copy. In D. yakuba, the gene pairs CG7918-CG34034 and CG34034-CG5849 are found close to the genes CG1315 and CG31286, respectively. CG1315 and CG31286 are adjacent in D. melanogaster, D. simulans, and other Drosophila species (Table 1), indicating this to be the ancestral organization for this region. Therefore, the CG7918-CG34034-CG5849 interval has been independently disrupted by another inversion on the D. yakuba lineage, although the precise breakpoints differ from those associated with In(3R)84F1;93F6–7. This inversion on the D. yakuba lineage is associated with inverted duplications of CG34034 and CG31286 (Figure 1; see below). The reuse of the breakpoint region CG7918-CG34034 is the second example in Drosophila of recurrent breakage, demonstrated at the molecular level [25], and is the first in which the associated inversion events can be unambiguously deciphered.
The association of inverted duplications with these breakpoint regions is not consistent with a model of inversion origin by recombination between two copies of the same TE [62]. We propose a model of staggered breaks. These breaks may either be isochromatid (Figures 2 and S2, see also [57]), occurring during premeiotic mitosis, or chromatid, occurring during meiotic prophase (Figure S3). A potential difficulty of the isochromatid model is the length of DNA that would need to be unwound, presumably by helicase activity. Alternative mechanisms, such as multiple rearrangements or recombination between two independent, but similar, inversions [38], cannot be ruled out, but they are less parsimonious. In either case, the frequent presence of duplications at co-occurrent breakpoint regions argues against a simple “cut-and-paste” mechanism of inversion formation [44]. An important implication of our model is that the presence of inverted duplications at co-occurrent breakpoint regions allows the unambiguous determination of the polarity of chromosome change [49,51]. Traditionally, phylogenetic trees of Drosophila based on inversion analysis have been unrooted (e.g., [3,54]). Outgroup analysis can allow the determination of ancestral and derived states, as realized for polytene chromosome inversion phylogenies ([63]; see also [64]), but the widespread signature of inverted duplications provides another independent source of data for polarizing inversion history (see below).
In the case of In(3R)84F1;93F6–7, four breaks (a, b, c, and d in Figure 1) would have occurred in an ancestral chromosomal arrangement that is now best represented in the D. simulans genome. The breakpoint pairs a-c and b-d (which have been confirmed by resequencing; Figure 1) would each represent staggered breaks within a single chromatid in Figure 2. CG2708 and HDC14862 overlap by 56–59 base pairs (bp) in D. simulans. Breakpoint a occurred at the 5′ end of this overlap, duplicating this region in D. melanogaster. Breakpoint b occurred in the region between HDC12400 and HDC14861. Breakpoint c occurred downstream of the “exon” 2 of the distal, partial copy of HDC14862 in D. simulans, which roughly corresponds to the intron between “exons” 2 and 3 of the “complete” copy of HDC14862 of D. melanogaster (roughly upstream of the start of the overlapping region with CG2708). The fourth breakpoint, d, is found 1,760–1,764 bp downstream of breakpoint c in D. simulans, at 25 bp from the start of the “exon” 1 of HDC14862. End-filling followed by nonhomologous end joining in the inverted orientation (Figure 2) would result in the inversion In(3R)84F1;93F6–7, the duplication of the region including HDC14862, pfd800, and HDC12400, and the fortuitous formation of what is considered a “complete” copy of the putatively expressed sequence HDC14862.
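The outcome of the staggered-break model can be illustrated with a toy simulation. The sketch below is a deliberate simplification and not the authors' procedure: a chromosome is treated as a list of labeled blocks, the four cut positions `a`, `b`, `c`, and `d` are list indices, the intervals a-b and c-d are taken as the paired staggered breaks described for Figure 2, and end-filling is modeled by letting each fragment keep a full copy of the interval between its paired breaks. All block labels and the function name `staggered_inversion` are hypothetical.

```python
from typing import List, Tuple

Block = Tuple[str, int]  # (label, orientation): +1 forward, -1 inverted


def invert(segment: List[Block]) -> List[Block]:
    """Reverse the block order and flip each block's orientation."""
    return [(label, -strand) for label, strand in reversed(segment)]


def staggered_inversion(chrom: List[Block], a: int, b: int, c: int, d: int) -> List[Block]:
    """Toy model of an inversion generated by two pairs of staggered breaks.

    Cut positions are indices between blocks, with a <= b <= c <= d.
    After end-filling, the left fragment keeps the a-b interval, the
    right fragment keeps the c-d interval, and the middle fragment
    carries a copy of both; the middle fragment is rejoined inverted.
    """
    if not (0 <= a <= b <= c <= d <= len(chrom)):
        raise ValueError("cut positions must satisfy 0 <= a <= b <= c <= d <= len(chrom)")
    left = chrom[:b]
    middle = chrom[a:d]
    right = chrom[c:]
    return left + invert(middle) + right


# Hypothetical ancestral arrangement: L | X | M1 M2 | Y | R
ancestral = [("L", 1), ("X", 1), ("M1", 1), ("M2", 1), ("Y", 1), ("R", 1)]
derived = staggered_inversion(ancestral, a=1, b=2, c=4, d=5)
# X and Y now each occur twice, once in each orientation, at the two
# ends of the inverted segment.
```

Setting `a == b` and `c == d` reduces the model to a plain inversion with no duplicated flanking blocks, which corresponds to the simple cut-and-paste scenario that the duplications argue against.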
Comparative Analysis of Genome Organization between D. melanogaster and D. yakuba
We used a computational approach to identify genome-wide disruptions in gene order between the chromosomes of D. melanogaster and D. yakuba. Each D. melanogaster transcript was used as a query in a high stringency (E < 10^-30) BLASTN search against the genomic sequence of D. yakuba. This allowed us to map unambiguously 12,690 genes (94.4% of those of Release 4.1) of D. melanogaster on the genome sequence of D. yakuba. A comparison of the gene orders of the two species identified 55 gene-order disruptions between them, which appear as discontinuities in the coordinates of neighboring genes in one species relative to the other (Tables 1 and S1). All predicted gene-order disruptions identified using this gene-based BLAST approach are also identified as termini of whole-genome global alignments at the University of California, Santa Cruz (UCSC) [65]. These 55 gene-order disruptions define 59 syntenic blocks between these species (since both species have four chromosomes) (Table S2). The location and relative orientation of the syntenic blocks for chromosome 2 of D. melanogaster and D. yakuba are shown in Figure 3; similar data are shown for chromosomes X and 3 in Figure S4. We do not show the small chromosome 4 (syntenic block 59), since our results indicate that this chromosome is wholly collinear in the two species over the sequenced region [66]. Syntenic blocks 13, 26, and 46 include the centromeric heterochromatic regions for chromosomes X, 2, and 3, respectively. We are unable, given the present sequence data, to detect any chromosome rearrangements within these heterochromatic regions or those on chromosome 4.
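The step from mapped genes to gene-order disruptions and syntenic blocks can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: it assumes every gene has already been mapped to a unique rank on a chromosome of the second species, it ignores strand, and the gene names, chromosome names, and the function `find_disruptions` are hypothetical.

```python
from typing import Dict, List, Tuple


def find_disruptions(
    ref_order: Dict[str, List[str]],
    other_rank: Dict[str, Tuple[str, int]],
) -> List[Tuple[str, str]]:
    """Return neighboring reference genes that are not neighbors in the other genome.

    ref_order maps each reference chromosome to its genes in order.
    other_rank maps each gene to (chromosome, rank) in the other species.
    """
    disruptions = []
    for genes in ref_order.values():
        for left, right in zip(genes, genes[1:]):
            chrom_l, rank_l = other_rank[left]
            chrom_r, rank_r = other_rank[right]
            if chrom_l != chrom_r or abs(rank_l - rank_r) != 1:
                disruptions.append((left, right))
    return disruptions


# Hypothetical toy genomes: the segment g3-g4 is inverted in the second species.
ref_order = {"chrA": ["g1", "g2", "g3", "g4", "g5", "g6"], "chrB": ["h1", "h2", "h3"]}
other_order = {"chrA": ["g1", "g2", "g4", "g3", "g5", "g6"], "chrB": ["h1", "h2", "h3"]}
other_rank = {
    gene: (chrom, rank)
    for chrom, genes in other_order.items()
    for rank, gene in enumerate(genes)
}

disruptions = find_disruptions(ref_order, other_rank)
# Each chromosome contributes one block more than its number of disruptions.
n_blocks = len(disruptions) + len(ref_order)
```

The same bookkeeping gives the count reported in the text: 55 disruptions spread over four chromosomes yield 55 + 4 = 59 syntenic blocks.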
Similar plots are shown for Muller's element A (chromosome X) and Muller's elements D and E (chromosome 3) in Figure S4A and S4B, respectively. The outermost protein-coding genes of consecutive syntenic blocks are indicated. Following Bridges [116], syntenic blocks (defined as regions in which the relative gene order is globally conserved between D. melanogaster and D. yakuba) are numbered taking D. melanogaster as a reference and in an increasing order from the telomere of chromosome X (number 1) to the telomere of the right arm of chromosome 3 (number 58); an arrowhead indicates the orientation of the segments. Lines between chromosomes match homologous syntenic blocks between species. The pericentric inversion between Muller's elements B and C during the divergence of D. melanogaster and D. yakuba is shown by a color code, whereas syntenic blocks on the left and right arms of D. melanogaster appear in orange and green, respectively; the syntenic block that contains the centromere, number 26, is not colored. The solid triangles denote the gene CG6081, whose duplication accompanied the origin of the inversion 2L(2) (Figures 2, S2, and S3).
To rule out possible artifacts of the assembly process (see Materials and Methods) in our results, and to directly confirm our predictions of the gene order around the D. yakuba breakpoint regions relative to those of D. melanogaster, we cloned and sequence-verified a sample of 27 of the predicted breakpoint regions from D. yakuba, each containing the transition between adjacent syntenic blocks (see Materials and Methods). In every case, our predictions were directly confirmed (Table S1). This result is consistent with the fact that all predicted gene-order disruptions are found in high-quality, contiguous (i.e., ungapped) regions of the D. yakuba assembly. In fact, breakpoint regions in D. yakuba are sequenced to an average depth of 8× and are supported by an average of 14 clone pairs. These results demonstrate that the gene-order disruptions inferred between the D. yakuba and D. melanogaster genomes are not assembly artifacts.
Approximately 117.8 megabases (Mb) of the D. melanogaster genome and about 118.9 Mb of the D. yakuba genome are included in the 59 syntenic blocks as defined by their outermost markers or reference genes. The amount of nonheterochromatic DNA not included in these syntenic blocks is 542 kb of the D. melanogaster genome and 674 kb of the D. yakuba genome. This is an upper estimate because in some cases, there is noncoding homology between the reference genes that define two consecutive syntenic blocks (see below). The median size of syntenic blocks is 1.66 Mb in D. melanogaster, and 1.61 Mb in D. yakuba. Excluding the syntenic blocks that contain centromeric heterochromatin (blocks 13, 26, and 46), the largest (syntenic block 57) is just over 6 Mb (~5.2% of the genome in both species), and the smallest is 161 kb (syntenic block 22, 0.08% of the D. melanogaster genome; and syntenic block 25, 0.08% of the D. yakuba genome). The length of genomic regions in each syntenic block is highly correlated across species (Spearman ρ = 0.997, p = 3.78 × 10^-61; blocks 13, 26, and 46 not included), and in only two cases (blocks 26 and 43) do they differ by more than 10%. The DNA content per syntenic block does not differ significantly between D. melanogaster and D. yakuba (Wilcoxon signed rank test, Z = −1.273, p = not significant [n.s.]; blocks 13, 26, and 46 not included). A departure of the observed distribution of the lengths of syntenic blocks from that expected if the breakpoints were randomly distributed across the genome (a truncated negative exponential distribution) would allow us to discard the random breakage model of chromosome evolution [26,67]. Based on the comparison of the empirical and theoretical distributions, we cannot reject the random breakage model (Kolmogorov-Smirnov test, D = 0.2, p = n.s.; blocks 13, 26, and 46 not included).
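The comparison with the random breakage model can be sketched as a one-sample Kolmogorov-Smirnov statistic against a truncated exponential distribution. The sketch below rests on several assumptions that the text does not specify: the block lengths are invented, the truncation point is taken as the largest observed block, the scale is set to the sample mean as a rough stand-in for a fitted parameter, and no p-value is computed. The function names are hypothetical.

```python
import math
from typing import Sequence


def truncated_exp_cdf(x: float, scale: float, upper: float) -> float:
    """CDF of an exponential distribution with the given scale, truncated to [0, upper]."""
    if x <= 0:
        return 0.0
    if x >= upper:
        return 1.0
    return (1.0 - math.exp(-x / scale)) / (1.0 - math.exp(-upper / scale))


def ks_statistic(lengths: Sequence[float], scale: float, upper: float) -> float:
    """Largest gap between the empirical CDF and the truncated exponential CDF."""
    xs = sorted(lengths)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = truncated_exp_cdf(x, scale, upper)
        # Compare the model CDF with the empirical step just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical syntenic block lengths in Mb (not the values from the study).
block_lengths_mb = [0.2, 0.5, 0.9, 1.4, 1.7, 2.3, 3.1, 4.8, 6.0]
scale = sum(block_lengths_mb) / len(block_lengths_mb)  # assumed: sample mean as scale
d_stat = ks_statistic(block_lengths_mb, scale, upper=max(block_lengths_mb))
```

A small statistic means the observed lengths track the expected distribution closely; turning it into a significance statement would additionally require the null distribution of D for the given sample size.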
Despite the conservative criteria used in our BLAST analysis, its resolution is sufficient to detect gene sequences that may have “escaped” synteny by transposition, as has been observed in Drosophila both experimentally, e.g., [68], and by genomic analyses [69–71]. We detected 22 potential transposition events between D. melanogaster and D. yakuba, with 12 occurring unambiguously between chromosome arms and eight events within chromosome arms (Tables S3 and S4). This number is likely to be an underestimate because we used stringent criteria for paralogy. Of the 22 events that we detected, 20 are duplicative transpositions and two are conservative transpositions.
Reconstruction of the Inversion History betweenD. melanogaster andD. yakuba
Muller [34] defined the six fundamental elements of the karyotype of the genusDrosophila (now referred to as Muller's elements A–F, each corresponding to a chromosome arm ofD. melanogaster). The overall gene content of these elements has been conserved during the evolution of the genus as witnessed by the very few inter-element rearrangements (i.e., pericentric inversions and translocations) that have been reported. Previous analysis of inversion differences betweenD. melanogaster andD. yakuba based on polytene chromosome revealed 28 inversions, of which only one, on chromosome 2 was pericentric [54] (Table 2).
Magnitude of Chromosomal Change betweenD. melanogaster andD. yakuba for the Different Muller's Chromosomal Elements
We established which pairs of breakpoint regions define particular inversions by taking into account the contiguity relationships in both species of the outermost genes of syntenic blocks betweenD. melanogaster andD. yakuba (Figures 3 and S4;Table S1). In general, our computational analysis of the genome sequences of these two species is broadly compatible with previous results based on polytene chromosomes [54]. We inferred that 29 inversions distinguish the chromosomes ofD. melanogaster andD. yakuba, of which 28 are paracentric and one corresponds to the pericentric inversion on chromosome 2 (Table 2). The total number of inversions inferred computationally is just one more than that suggested by polytene chromosome analysis [54], although the greater resolution of the sequence analysis increases the number of breakpoints from 48 to 55 and refines their positions (Tables 1 and2).
Our analysis shows many discrepancies in detail when compared to previous work ([54];Tables 1 and2). This is especially true on the X chromosome, where the banding pattern has diverged greatly in themelanogaster species group. On chromosome 2, there is what Lemeunier and Ashburner [54] interpreted as a single pericentric inversion, which distinguishesD. yakuba and its relatives,D. teissieri, D. erecta, andD. orena, fromD. melanogaster and the three species of theD. simulans clade. As shown inFigure 3, there is a complex mosaic of syntenic blocks between the two arms of chromosome 2. In good agreement with the previous work [54], a single pericentric inversion, 2LR(5), is sufficient to explain this pattern. This inversion has identical limits in bothD. yakuba andD. erecta. Inverted duplications at the breakpoint regions in both species (Table S5, see below) and information on gene order in other outgroup species (Table 1) strongly suggest that this inversion occurred in the common ancestor ofD. yakuba andD. erecta after this lineage split from that leading to themelanogaster-simulans complex.Figure S6 illustrates one of the most parsimonious scenarios that explains the evolution of chromosome 2.
Inversion-Mediated Duplication Is Frequent at Breakpoint Regions
We characterized in detail the sequences of the 55 breakpoint regions of D. yakuba because genomic and phylogenetic evidence suggested that virtually all inversion events between D. melanogaster and D. yakuba occurred on the D. yakuba lineage (Table 1; see below). Remarkably, in 34 of 55 (approximately 62%) breakpoint regions, we detected the presence of duplications of sequences that are only present once in the genome of D. melanogaster. In each case, these duplications are specifically associated with the pair of breakpoint regions that limit a particular inversion (Table S5; see below). These duplications are not repetitive in the D. yakuba genome (by BLAST analysis), nor do they match any identifiable Drosophila TE. In a control experiment, the genomic regions of D. melanogaster that correspond to the co-occurrent breakpoint regions of D. yakuba were compared to each other. Repetitive sequences were found in six cases; in no case other than that of In(3R)84F1;93F6–7 (see Figure 1) were duplications of unique sequences found.
In total, 18 of 29 inversions (approximately 62%) fixed between D. melanogaster and D. yakuba are associated with duplications of sequences included at co-occurrent breakpoint regions. These duplicated sequences are in opposite orientations in the co-occurrent breakpoints of 17 inversions; 3R(6) is the only exception, potentially as a result of a subsequent microinversion [72]. These sequence duplications include 22 full or partial duplications of protein coding genes. Most of these (exceptions are CG14817 at Xy(1) and Xy(4), CG6081 at 2y(15) and 2y(18), and CG34034 at 3y(46) and 3y(53)) have accumulated many point and indel mutations, and are presumed to be nonfunctional. The average nucleotide identity (± the standard deviation [SD]) between duplicates is approximately 88% ± 5.4%. For six of the inversions, sequences from both breakpoint regions are present as inverted duplications at each breakpoint. For the remaining 12 inversions, sequences from only one of the two breakpoint regions are duplicated. This may be due either to the evolutionary loss, by sequence change, of one of the copies of an original duplication, or to the fact that only one of the pair of single-stranded breaks was significantly staggered (Figure S5A and S5B, respectively). The size of the duplications varies significantly in D. yakuba (median = 321 bp, coefficient of variation [CV] = 81% counting only one of the copies when in tandem; Table S5), but in no case do they involve more than about 1.9 kb of aligned sequence (the shortest duplication is 46 bp long).
In many taxa, repeated sequences have been found to be associated with rearrangement breakpoints and have been implicated in mediating chromosomal rearrangements by a process of ectopic exchange. This has been the case for tRNAs and ribosomal protein genes in yeasts [73,74], segmental duplications in the human-mouse [24] and human-primate lineages [75–78], and TEs in many organisms [46,79–81]. InD. melanogaster, there is abundant experimental evidence that exchange between TEs can result in chromosome rearrangement (e.g., [82]). Comparative sequence data also indicate that TEs are abundant at interspecific breakpoint regions between Diptera species [25,69], and there is strong evidence implicating TE-mediated ectopic exchange events in four [25,46,47,51] of the ten well-defined inversions whose breakpoint regions have been characterized at the molecular level (Table 3).
Presence of Duplications and Repetitive Sequences at Breakpoint Regions of Characterized Dipteran Inversions
We analyzed the breakpoint regions of D. yakuba for TE sequences using RepeatMasker with the Release 4.2 TE annotation of the D. melanogaster genome [83] and by BLAST2 analysis using as queries TE sequences from species other than D. melanogaster. Over 45% of breakpoint regions (25/55) include repetitive sequences in D. yakuba (Table S6), but only five co-occurrent pairs of breakpoint regions (involving inversions 2LR(5), 2L(6), 2LR(8), 3L(3)/3L(4), and 3R(6)) include a similar repetitive sequence (Table S6). These analyses would fail to detect any repetitive sequence absent from the RepeatMasker library (as would be those exclusive to D. yakuba) or not yet characterized in D. yakuba. For this reason, we manually extracted from the D. yakuba breakpoint regions a set of sequences, each corresponding to the precise transition region between syntenic blocks, and used them as BLAST queries to the entire D. yakuba genome. Similar repetitive sequences were found at the co-occurrent breakpoints of the inversions X(1), 2L(6), 3L(5), and 3R(7), although only in the case of 2L(6) and 3R(7) are the copies of the repetitive sequence inverted with respect to each other. The average length of these sequences was 685 bp and the range 49–3,037 bp. Unfortunately, we can neither date the insertion of these repetitive sequences (with respect to the time of occurrence of the inversion), nor can we assert that the absence of repetitive sequences at other pairs of co-occurrent breakpoint regions is not due to their decay or loss subsequent to the occurrence of an inversion. Nevertheless, these data provide little direct evidence for a role of TEs in generating fixed inversions between D. melanogaster and D. yakuba and, combined with the recurrent presence of inverted duplications of nonrepetitive sequences, suggest that ectopic recombination between TEs has not been the dominant mechanism of generating inversions in this lineage. These results contrast with the presence of inverted TEs at co-occurrent breakpoints of well-defined inversions (Table 3).
Lineage-Specific Rates of Chromosomal Evolution
We mapped the derived state of the 29 inversions between the two genomes to the D. melanogaster or D. yakuba lineages, using several independent criteria (Table 1): (1) by determining the arrangement of each gene pair disrupted by an inversion in D. melanogaster versus D. yakuba in five other sequenced Drosophila species; (2) by the presence of inverted duplications associated with co-occurrent breakpoints, as discussed above; and (3) by the disruption of a tandem array of related genes, or of a pair of genes whose transcripts show 3′-overlap (see below), which we also consider to be a derived state. In all cases in which we can use more than one of these criteria, all are consistent. Our analyses show that of 29 inversions, 28 have been fixed in the lineage leading to D. yakuba, and only one (3R(8), also known as In(3R)84F1;93F6–7) on the lineage leading to D. melanogaster (eight of the former inversions occurred before the D. erecta/D. yakuba split). This difference is highly significant (one-tailed binomial p = 5.59 × 10⁻⁸) and agrees well with previous interpretations [64], demonstrating that rates of chromosomal evolution can vary by over an order of magnitude even among closely related species. The origin of this very asymmetric rate of fixation cannot stem from differences in the degree of intraspecific polymorphism, as has been proposed for D. pseudoobscura and D. subobscura [84], because D. melanogaster is substantially more polymorphic for inversions than D. yakuba [54]. Rather, it might reflect different effective population sizes between the African populations of the immediate ancestors of D. melanogaster and D. yakuba [85,86].
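The binomial probability quoted above can be reproduced with a short calculation. The sketch assumes the null hypothesis that each inversion is equally likely to have been fixed on either lineage (probability 0.5) and asks how likely it is that 28 or more of 29 fall on one pre-specified lineage; the function name is illustrative.

```python
from math import comb

def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# 28 of the 29 inversions were assigned to the D. yakuba lineage.
p_value = binomial_upper_tail(28, 29)
print(f"{p_value:.3g}")  # 5.59e-08
```

The tail contains only two terms (28 and 29 successes), so the result is exactly 30/2²⁹.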
We used the number of breakpoints per Mb per Myr to correct for differences in chromosomal size in a comparison of rates of chromosomal evolution between species pairs of different Drosophila groups (Table 4) in which we assumed a constant rate of evolution as a null hypothesis. In view of the pericentric changes in chromosome 2 (Muller's elements B+C), we combined the data for these elements. The overall rate of breakage in the D. melanogaster/D. yakuba lineage is 0.0183/Mb/Myr. This is slower than that seen in the D. pseudoobscura/D. miranda (Gadj = 38.9; d.f. = 1; p < 4.4 × 10⁻¹⁰) and D. pseudoobscura/D. subobscura (Gadj = 48.5; d.f. = 1; p < 3.4 × 10⁻¹²) comparisons, comparable with the rate seen in the comparison D. virilis/D. montana (Gadj = 0.5; d.f. = 1; p = n.s.) and accelerated with respect to that in the repleta species group (Gadj = 4.3; d.f. = 1; p < 4.3 × 10⁻⁹). Across Muller's elements, the rank order of the rate of chromosome evolution is A > (B+C) > E > D, which agrees well with the genus-wide pattern of rates of evolution A > E > D proposed by [87], based on the comparisons of D. melanogaster and D. repleta [21,87] and of D. virilis, D. montana, and D. novamexicana [88]. Nevertheless, Muller's elements B+C appear to have evolved faster in the D. melanogaster/D. yakuba lineage than in D. melanogaster/D. repleta, in which element B was the slowest evolving [87]. Thus, in addition to rate variation among lineages, rates of chromosomal evolution may vary across Muller's elements in different groups of Drosophila, in good agreement with, for example, the fast evolution of the Muller's element E across the repleta species group [40].
Rates of Chromosomal Evolution (Breakpoints/Mb/Myr) between Different Species Pairs of the Genus Drosophila
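A G statistic with one degree of freedom can be converted to a probability with the chi-square upper tail, which for 1 d.f. reduces to a complementary error function. The sketch below assumes that the adjusted G values are referred to a chi-square distribution in the usual way; the function name is illustrative and the G values are those quoted in the text.

```python
import math

def chi2_sf_df1(g):
    """Upper-tail probability of a chi-square variable with 1 d.f. at value g."""
    return math.erfc(math.sqrt(g / 2.0))

comparisons = [
    ("D. pseudoobscura/D. miranda", 38.9),
    ("D. pseudoobscura/D. subobscura", 48.5),
    ("D. virilis/D. montana", 0.5),
]
for label, g in comparisons:
    print(label, f"{chi2_sf_df1(g):.2g}")
```

Under this assumption the first two comparisons give probabilities on the order of 10⁻¹⁰ and 10⁻¹², in line with the values quoted. For G = 4.3 the same formula gives roughly 0.04, so the value printed for the repleta comparison is worth checking against the original table.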
Breakpoint Clustering
Breakpoint reuse has been reported at the cytological [54,89–91] and the molecular level [16,25–27,92]. Based on our phylogenetic reconstruction of the chromosomal rearrangements of the species considered here (Table 1), it is clear that some ancestral gene configurations have been disrupted independently more than once during the evolution of the subgenusSophophora. Using sequences fromD. ananassae, D. persimilis, andD. pseudoobscura as outgroups to theD. melanogaster species subgroup, we found evidence for breakage in 17 out of the 55 (~31%) regions disrupted in theD. melanogaster/D. yakuba lineage. We also see evidence for nonrandom breakage in theD. melanogaster/D. yakuba complex, i.e., at a relatively short phylogenetic distance. For each of the three pairs of inversions 3L(3)/3L(4), 3R(7)/3R(8), and 3R(10)/3R(11), three, instead of four, breakpoint regions are involved. This recurrent breakage might denote structural instability of particular genomic regions. For example,CG9579, one of the genes adjacent to the breakpoints of the inversion X(5), is also linked to a remarkable set of molecular reorganizations associated with the birth of a multigene family of a chimeric gene,Sdic, on theD. melanogaster lineage [93]. Additional support for structural instability of inversion breakpoint regions comes from the fact that one breakpoint region of inversion 2LR(4), which occurs on theD. yakuba lineage, uses the same genomic interval that has independently permitted the recent evolution of an unusually high TE density in theD. melanogaster lineage (HDR13 in [94]).
An issue related to breakpoint reuse is the possibility that the same inversion can arise twice. The unique origin of inversions has been challenged (see [37] for discussion), but in the two cases considered to be the most convincing, experimental evidence has not supported a polyphyletic origin of inversions [51,92]. Fourteen breakpoint regions are associated with inversions shared between D. yakuba and D. erecta (Table 1), which indicates that the same gene pairs have been disrupted and reorganized in the same way, suggesting a common origin in the ancestor of D. erecta and D. yakuba. Comparative sequence analysis at the nucleotide level for those 14 junctions failed to find evidence of an independent origin of these inversions in the lineages that lead to D. yakuba and D. erecta, although it must be noted that our power of detection can be compromised by the time elapsed since D. yakuba and D. erecta shared an ancestor.
Inversion Breakpoints Can Disrupt Large- and Small-Scale Gene Domains
Expression profiling of the genomes of several species has shown that co-expressed genes tend to co-locate in the genome (for review, see [28]). The biological significance of co-expression clustering is still poorly understood, but if these “transcriptional territories” represent functional associations among neighboring genes, natural selection should prevent their disruption. Conservation of clusters across lineages differentiated by the accumulation of multiple chromosomal rearrangements has been interpreted as support for the functional association of clusters of co-expressed genes in mammals [95] and flies [29].
InD. melanogaster, the preferential clustering of genes, by the time or place of their expression, has been reported based on both expressed sequence tag (EST) and microarray data [96–99]. In a study of the distribution of sex-biased gene expression [97], 75% of the genes on Release 3.1 of theD. melanogaster genome were assayed. Fifteen gene clusters that are expressed either in testis, in ovary, or in the soma were found. Despite the relatively small number of gene-order interruptions betweenD. melanogaster andD. yakuba, one of the clusters identified by Parisi et al. [97], containing theTry multigene family, is broken in the lineage ofD. yakuba by inversion 2LR(8). At least eight out of ten members of the disrupted gene cluster are highly expressed in the soma. The disruption of this transcriptional territory may be related to the fact that the chromosomal breakage occurred between a member of the cluster,CG12388 (kappaTry), which is soma-biased in expression, andCG12387 (zetaTry), which is not.
Transcriptional territories have been found to be correlated with the DNA replication program inD. melanogaster [100]. Specifically, 7.5% of theD. melanogaster genome, distributed in 52 well-defined regions, is under-replicated in polytene chromosomes, and 50 of these regions also replicate late during the S period in cultured Kc cells; other regions present a non-delayed replication status in at least one of the two tissues. Sixty percent (30/50) of these late or under-replicating regions are associated with previously defined transcriptional territories; these domains account for 20% of theD. melanogaster genome [98]. Globally, transcriptional territories with a delayed pattern of DNA replication seem to be enriched for genes expressed in the testis and during pupal development, and depleted of genes expressed in the ovary and embryonic development [100]. Are the 55 gene pairs disrupted by inversion breakpoints in theD. melanogaster/D. yakuba lineages randomly distributed across the genome with regard to their replication status? We did not find a significant deviation from the random expectation (Gadj= 5.29;d.f. = 3;p = 0.15); however, we did find that three out of the 53 ancestral gene pairs disrupted inD. yakuba (Xm(8), 2m(19), and 3m(45)) are embedded in regions that are under-replicated in salivary glands and late replicated in Kc cells. These results show that at least some of the regions of theD. melanogaster genome, within which genes have a similar expression profile and/or replication program, are not necessarily conserved between this species andD. yakuba. This suggests that either those domains have little adaptive value, supporting the idea of accidental co-expression, or that their adaptive value has evolved recently, relative to the time of the divergence betweenD. melanogaster andD. yakuba.
Some 1,027 pairs of genes in D. melanogaster have overlapping transcripts in opposite strands [101]. Antisense overlap can play an important role in regulating gene expression at the post-transcriptional level [102,103]. Five of these gene pairs are disjunct in D. yakuba, as a consequence of an inversion breakpoint. Comparison across lineages (Table 1) indicates that the disruption in D. yakuba represents the derived state. The five inversions that disrupt antisense pairs are all associated with inverted duplications (Table S5). Our model for the origin of inversions (Figure 2) can account for the conservation of sequences of decoupled antisense pairs of genes. In at least two of these cases (CG9578-CG9579 and CG31142-CG5289), the 3′ UTR sequences of the independent gene pairs of D. yakuba are very similar in sequence and in length to their corresponding 3′ UTRs in D. melanogaster. In the other three cases, the D. yakuba 3′ UTR of one of the members of each pair is truncated.
Conclusion
This work unveils novel aspects of the evolution of the molecular organization of the Drosophila genome in particular and of the genomes of insects in general. The use of genome sequence data of D. melanogaster and D. yakuba has proven to be useful in reconstructing the history of genome rearrangements in these species. The lineage that leads to D. yakuba is evolving substantially faster at the chromosomal level than D. melanogaster (28:1); nevertheless, the mechanism that underlies the generation of many inversions (~59%) in both lineages is the same, and it seems to be initiated by the presence of staggered breaks, which in turn enables the generation of duplications in inverted orientation of sequences at co-occurrent breakpoint regions. These duplications diverge mainly by nucleotide substitutions and small deletions [104,105], and can contribute, as do segmental duplications in mammals, to the diversification of gene function [106]. A model of inversion generation based on staggered breaks, either isochromatid or chromatid, contrasts with a model of ectopic recombination between repetitive sequences [46,75,76]. Our data also give clear evidence, at the molecular level, of the reuse of the same breakpoint region and show that expression domains in D. melanogaster may be disrupted in other species, bringing into question their potential adaptive significance.
The availability of complete sequences from 12 Drosophila species now offers the opportunity to extend the analysis of chromosome evolution at a molecular level. Several fundamental questions remain: whether or not mechanisms of inversion formation are general across taxa; and whether there are functional constraints on chromosomal evolution, and, if so, at what level these operate.
Materials and Methods
Flies.
The following species and strains were used:D. melanogaster (OR-R from the Department of Genetics, University of Cambridge, and Zimbabwe 2 from D. L. Hartl's laboratory);D. simulans (Sim-1 from Chapel Hill, North Carolina); andD. yakuba (Tai18E2 from the Tucson Stock Center). In the case of Zimbabwe 2 and Tai18E2, we checked whether they were homokaryotypic by visually examining salivary gland polytene chromosome preparations stained with orcein. In the case of Zimbabwe 2, we detected two paracentric inversions in a sample of 20 autosomal genomes and 16 X chromosome genomes. No gross chromosomal polymorphisms were detected in a sample of 20 autosomal genomes and 16 X chromosome genomes of Tai18E2.
In situ hybridization of molecular probes onto polytene chromosomes.
Five BACs and 11 genomic clones were used as molecular probes. The BAC clones (BACR07M14, BACR45A07, BACR16N15, BACR42I20, and BACR08K01) were obtained from the Children's Hospital Oakland Research Institute. Genomic clones were PCR amplified using the primers described inTable S7. The genomic DNA used for the PCR amplifications was from the sequenced strain ofD. melanogaster: y; cn bw sp [107]. The genomic fragments generated correspond to the protein-coding genesCG2708 (Tom34), CG7918, CG31176, CG34034, CG5289, andCG6576 (Glec); the putatively transcribed genesHDC14860, HDC14861, HDC14862, andHDC12400 [60]; and the sequence ofpdf800, which is said to be related to the mammalian proto-oncogene c-fos. Cloning of PCR products and preparation of DNA from recombinant clones was performed using conventional methods. In the case of BAC clones, we used the methods described athttp://bacpac.chori.org/bacpacmini.htm. In situ hybridization of probes to polytene chromosomes was done as in [108]. Detection of the hybridization signals was done by phase contrast with a Zeiss Axioskop 2 (Carl Zeiss,http://www.zeiss.com). Chromosomal localization was determined using the photographic polytene chromosome maps ofD. melanogaster [109]. All the probes yielded one or two hybridization signals with the exception of those forHDC14860 andHDC14861, which failed to generate a detectable hybridization signal inD. yakuba under the experimental conditions used.
Assembly of D. yakuba supercontigs into chromosomal sequences.
The sequencing and assembly of theD. yakuba genome will be described elsewhere (D. J. Begun, A. K. Holloway, K. Stevens, L. W. Hillier, Y.-P. Poh, M. W. Hahn, P. M. Nista, C. D. Jones, A. D. Kern, C. Dewey, L. Pachter, E. Myers, and C. H. Langley, unpublished data). To create chromosomal assignments and ordering of “supercontigs” (gapped scaffolds of ungapped contigs as defined by mate pairs) along the chromosomes for theD. yakuba genome assembly, contigs from theD. yakuba assembly that uniquely aligned with theD. melanogaster genome were identified and then ordered by their positions along the assignedD. melanogaster chromosomes. This process resulted in someD. yakuba supercontigs with contigs that aligned to different regions of aD. melanogaster chromosome. To assemble supercontigs into chromosome arms inD. yakuba, reversals of the tiling path of mapped contigs were introduced to “rejoin” those supercontigs that had been split by the alignments toD. melanogaster. The overall goal was to minimize the total number of reversals required to rejoin allD. yakuba supercontigs previously assigned to disjoint chromosomal regions based onD. melanogaster alignments. We note that reversals were introduced only between contigs (not within contigs) and the process was not gene based.
Gene-order reconstruction in D. yakuba.
The complete set of transcripts of theD. melanogaster Release 4.1 annotation was downloaded from UCSC Genome Browser (http://genome.ucsc.edu/). This set represents 13,449 annotated genes. EachD. melanogaster transcript was used as a query against the assembly of theD. yakuba genome release 2.0 (WUSTL November 2005, the droYak2 assembly) using BLASTN 2.2.2 with default settings and then filtered for the top hit for each transcript with a cutoff E-value of 10−30; the nonfiltered output can be found asTable S8. This approach localized 12,690 genes on the genome sequence ofD. yakuba with a best hit on the same chromosome arm (with exceptions made for genes inside the pericentric inversion on chromosome 2); 320 genes had no BLASTN hit higher than 10−30, and 429 genes hit unmapped scaffolds or gave multiple hits with equal E-value in more than one chromosome arm. Genes unambiguously localized were sorted into chromosome order (centromere to telomere) for the six Muller's elements ofD. yakuba. The gene order inD. yakuba was compared with that ofD. melanogaster, and gene-order interruptions between the two species were inferred; the two genes flanking each gene-order interruption were taken as the limits of different syntenic blocks. This method will not reliably detect very small rearrangements, although we know that these occur (e.g.,Figure S7; see also [72]). For calculating the minimum number of inversions necessary to transform the gene order ofD. melanogaster into that ofD. yakuba, we used GRIMM [110]. Estimates on the size of syntenic blocks and regions between them inD. yakuba were obtained by taking into account the coordinates of the BLASTN hits of the outermost markers of each syntenic block. In the case of transposition events, we examined the nonfiltered output for genes whose BLAST hits were surrounded by different pairs of flanking genes inD. melanogaster andD. yakuba, especially those with unambiguous hits in different Muller's elements.
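The inference of gene-order interruptions can be illustrated with a small sketch. It assumes that each gene localized in the target species has already been given its rank along the reference chromosome, and it splits the target order into maximal runs whose ranks step by +1 (same orientation) or −1 (inverted). The function name and the toy order are hypothetical; this is not the BLASTN pipeline or GRIMM, only the block-splitting idea.

```python
def syntenic_blocks(ranks):
    """Split a gene order into maximal runs collinear with the reference.

    `ranks` lists, in the order the genes occur along the target chromosome,
    each gene's rank in the reference species.
    """
    if not ranks:
        return []
    blocks = []
    start = 0
    step = None  # direction of the current run; unknown until it has two genes
    for i in range(1, len(ranks)):
        diff = ranks[i] - ranks[i - 1]
        if diff in (1, -1) and (step is None or diff == step):
            step = diff
            continue
        # Adjacency broken: close the current block and start a new one.
        blocks.append(ranks[start:i])
        start, step = i, None
    blocks.append(ranks[start:])
    return blocks

# Toy example: genes 4-7 of the reference are inverted in the target.
order = [1, 2, 3, 7, 6, 5, 4, 8, 9]
print(syntenic_blocks(order))  # [[1, 2, 3], [7, 6, 5, 4], [8, 9]]
```

The first and last entries of each block play the role of the outermost reference genes, and the number of interruptions is the number of blocks minus one.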
One complicating factor in our analysis is that BLASTN of a region including 3R:3862326–3867817 was highly similar to two different regions of theD. yakuba assembly: one on Contig690 (currently assembled into chromosome arm 3R), and one, with a slightly lower match, on Contig706 (currently assigned to the “random” bin of chromosome arm 3R because it seemed to overlap Contig690). Contig690 has a sequence coverage of 5.8–8.3×, Contig706 of 3–4.7×. The overall coverage of the genome is 9.4×, but the supercontigs of chromosome arms 2R and 3R have approximately 12× coverage. Were this region to be truly duplicated in the genome ofD. yakuba, we would expect the sum of the coverage of Contigs 690 and 706 to be at the very least 18×, rather than (at most) 13×. In situ hybridization to polytene chromosomes of probes from this region shows only a single site, that expected on chromosome arm 3R. Residual heterozygosity for other regions of theD. yakuba sequence has been experimentally verified (J. Comeron and C. Langley, personal communication), and we interpret these two hits as being the consequence of heterozygosity in the genome.
Experimental verification of the molecular organization at breakpoint regions in D. yakuba.
To confirm the predicted gene-order interruptions between D. melanogaster and D. yakuba, we cloned and sequence verified the transition between adjacent syntenic blocks of 27 (49%) of the breakpoint regions in D. yakuba, namely Xy(9), Xy(10), 2y(19–24), 2y(26–28), 3y(35–43), 3y(46–51), and 3y(53) (Table S1). We extracted genomic DNA from the sequenced strain Tai18E2 by conventional methods. We designed primers to amplify the sequence that spans the transition between syntenic blocks. In a few cases, either because of the size of the region between the neighboring reference genes or because of technical difficulties, we amplified sets of overlapping segments that ensured coverage of the transition between adjacent syntenic blocks. PCR products were cloned into a pCR2.1 Topo Vector (Invitrogen, http://www.invitrogen.com). Sequencing reactions of the two ends of each clone were done, and the reads were aligned by BLAST against the D. melanogaster genome. Primers used are listed in Table S7.
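The count of verified breakpoint regions can be checked by expanding the label ranges listed above. The parsing helper is illustrative and assumes labels of the form `prefix(n)` or `prefix(n–m)`.

```python
import re

def expand_labels(spec):
    """Expand entries such as '2y(19–24)' into individual breakpoint labels."""
    labels = []
    # Accept either a hyphen or an en dash between the two numbers of a range.
    for prefix, first, last in re.findall(r"(\w+)\((\d+)(?:[-–](\d+))?\)", spec):
        end = int(last) if last else int(first)
        labels.extend(f"{prefix}({n})" for n in range(int(first), end + 1))
    return labels

verified = expand_labels(
    "Xy(9), Xy(10), 2y(19–24), 2y(26–28), 3y(35–43), 3y(46–51), 3y(53)"
)
print(len(verified), f"{len(verified) / 55:.0%}")  # 27 49%
```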
Sequence analysis of breakpoint regions in D. yakuba and D. melanogaster.
Because not all the genes of D. melanogaster were mapped to the D. yakuba assembly, and because there may have been transpositions of regions during the evolution of these genomes, we extracted the sequences of the 55 genomic discontinuities of D. yakuba, relative to D. melanogaster, and aligned these by BLASTN against the D. melanogaster genome. This refined the limits of the syntenic blocks and allowed their ends to be precisely mapped. To identify duplicates at co-occurrent breakpoint regions, we used PipMaker [111] and BLAST2 [112] with their default parameters. Sequences from all local alignments spanning more than 40 bp from PipMaker were used as queries in a BLASTN analysis against the D. melanogaster genome, thereby verifying their identities and genomic locations. We did the same with the BLAST2 output for those sequences with hits whose E-values were lower than 10⁻⁸ and that were at least 40 bp long. Both approaches provided essentially the same results. Nucleotide identities between particular duplicates and their reference sequences were derived from the BLAST2 analysis. For genes that are adjacent to breakpoints and/or are affected by them, we did an additional BLAST2 analysis, using as queries the D. melanogaster sequences of their transcripts. Sequences that are now found as inverted duplications at co-occurrent breakpoint regions may not necessarily have been in this orientation immediately after the occurrence of the inversion, because subsequent events may have taken place. For this reason, we reconstructed the most parsimonious history of each inversion in an attempt to establish the sequence immediately after each had occurred. We analyzed the presence of TE sequences using the RepeatMasker track from UCSC (RepBase libraries: RepBase Update 9.11 and RM database version 20050112) and subsequently by BLAST2 analysis using a collection of TE sequences that includes those in different Drosophila species other than D. melanogaster. All the significant hits found by our BLAST2 analysis correspond to footprints of TEs of D. melanogaster previously detected with RepeatMasker. For duplications that spanned noncoding regions, we did a BLASTN analysis against the D. yakuba genome, in order to determine that they did not include repetitive sequences. When necessary, we proceeded in an identical manner with breakpoint regions of D. melanogaster, D. simulans, and D. erecta.
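The thresholds applied to the BLAST2 output (E-value below 10⁻⁸ and at least 40 bp aligned) amount to a simple filter over hit records. The record layout, field names, and example hits below are hypothetical and stand in for whatever parsed alignment output is available.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    query: str      # breakpoint region used as query (hypothetical label)
    subject: str    # co-occurrent breakpoint region (hypothetical label)
    length: int     # aligned length in bp
    evalue: float

def keep_candidate_duplicates(hits, max_evalue=1e-8, min_length=40):
    """Keep hits with E-value below max_evalue and at least min_length bp aligned."""
    return [h for h in hits if h.evalue < max_evalue and h.length >= min_length]

# Made-up hits for illustration only.
hits = [
    Hit("bp_A", "bp_B", 321, 1e-40),
    Hit("bp_A", "bp_B", 35, 1e-12),   # rejected: shorter than 40 bp
    Hit("bp_A", "bp_C", 120, 1e-3),   # rejected: E-value too high
]
print([h.length for h in keep_candidate_duplicates(hits)])  # [321]
```

Hits that pass such a filter would still need to be checked against the reference genome to confirm their identity and location, as described above.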
Phylogenetic status of the gene configurations at breakpoint regions of D. melanogaster and D. yakuba.
In order to determine whether the gene configuration in the breakpoint regions in D. melanogaster or in D. yakuba is ancestral or derived, i.e., the result of a chromosomal rearrangement, we took D. melanogaster as a reference, and we determined whether or not the reference genes within a particular breakpoint region were adjacent in a set of species selected on the basis of their phylogenetic relationships with D. melanogaster and D. yakuba. Specifically, we used: D. melanogaster (Release 4.1; FlyBase); D. simulans (release 1.0 Apr. 2005; UCSC); D. yakuba (droYak2 Nov. 2005); D. erecta (droEre1 Aug. 2005; UCSC); D. ananassae (droAna2 Aug. 2005; UCSC); D. persimilis (droPer1 Oct. 2005; UCSC); and D. pseudoobscura (Release 1.0; S. W. Schaeffer, personal communication). We used PipMaker to analyze the breakpoint regions apparently shared between D. yakuba and D. erecta. If these breakpoint regions were of independent origin, then we would expect to see discontinuities and indels between them. In fact, in all cases, the evidence suggests that these “shared” breakpoints were the consequence of a single ancestral event.
Supporting Information
Figure S1. Phylogenetic Relationships in the Genus Drosophila and Its Subgenera: Drosophila and Sophophora
The phylogenetic relationships among the species used in the present study are shown in detail. All belong to the subgenus Sophophora. The melanogaster species subgroup comprises nine species, which have been commonly clustered into two complexes by the criteria of gene sequences, polytene chromosome banding pattern, and the structures of the male genitalia [54,113–115]. One of the complexes includes D. melanogaster and the trio D. mauritiana, D. sechellia, and D. simulans, and the second includes D. erecta, D. orena, D. santomea, D. teissieri, and D. yakuba. All divergence times are according to [42].
https://doi.org/10.1371/journal.pbio.0050152.sg001
(12 KB PDF)
Figure S2. The Isochromatid Model with Staggered Single-Strand Breaks in the Case of the Inversion 3R(7)
(A) Relative to the gene order of D. simulans, the region from CG15179 to CG17603 is inverted, due to a prior event (dotted line).
(B and C) Inversion 3R(7) originates from two pairs of staggered single-strand breaks (short horizontal solid lines), proximally on either side of CG31286, and distally on either side of CG34034. The resulting 5′-overhangs are filled in (grey dashed arrow) and followed by a nonhomologous end joining.
(D) As a consequence, both CG34034 and CG31286 were duplicated at both breakpoints.
(E) Subsequently, both CG34034 and CG31286 tandemly duplicated, before other mutations affected both copies of CG31286, one copy of CG34034, and the HDC14862 (5′) sequence.
These events illustrate the complexity of some inversion breakpoint regions as a consequence of events that occur subsequent to the original inversion. Color code as in Figure 1. For the sake of simplicity, two putatively expressed genes (HDC12142 and HDC12143) and insertions of repetitive sequences have not been included.
https://doi.org/10.1371/journal.pbio.0050152.sg002
(28 KB PDF)
Figure S3. A Chromatid Model with Staggered Double-Strand Breaks Can Also Give Rise to an Inversion Accompanied by Inverted Duplications of Sequences Included in the Breakpoint Regions
The mechanism is illustrated by the inversion 3R(8), which is fixed in the lineage to D. melanogaster.
(A) Sister chromatids in meiotic prophase showing the gene order and orientation assumed to be ancestral, which is currently best represented by D. simulans (Figure 1).
(B) Two pairs of staggered double-strand breaks (a-b and c-d) are indicated.
(C) Nonhomologous end joining results in two chromatids: one carrying an inversion flanked by inverted duplications of the sequences between the paired double-strand breaks, and a second with reciprocal deletions.
Landmarks: A, CG2708; B, HDC14862 (3′); C, pfd800; D, HDC12400; E, HDC14861; F, HDC14861; G, CG31176; H, CG7918; I, HDC14862 (5′); J, CG34034; and K, CG5849. Color code as in Figure 1. Black circle indicates the centromere.
https://doi.org/10.1371/journal.pbio.0050152.sg003
(35 KB PDF)
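The outcome described in the legend of Figure S3 can be mimicked with a toy model. The sketch below is one reading of the legend, not the authors' formal model: it treats a chromatid as a list of (landmark, orientation) pairs, takes the four break positions a ≤ b ≤ c ≤ d as list indices, and assumes that one product joins the flanks cut at b and c to the inverted a-d segment while the reciprocal product joins the flanks cut at a and d to the inverted b-c segment. The landmark names L, X, M, Y, and R are placeholders, not genes from the figure.

```python
def invert(segment):
    """Reverse a segment and flip the orientation of each landmark."""
    return [(name, -strand) for name, strand in reversed(segment)]


def staggered_break_products(chromatid, a, b, c, d):
    """Return (inverted_with_duplications, inverted_with_deletions).

    chromatid  : list of (landmark, strand) tuples, strand is +1 or -1
    a, b, c, d : cut positions (list indices) with a <= b <= c <= d;
                 a-b and c-d are the two pairs of staggered breaks
    """
    if not 0 <= a <= b <= c <= d <= len(chromatid):
        raise ValueError("break positions must satisfy 0 <= a <= b <= c <= d")
    left, x, middle = chromatid[:a], chromatid[a:b], chromatid[b:c]
    y, right = chromatid[c:d], chromatid[d:]
    # Flanks retain x and y; the inverted a-d segment carries them as well,
    # so x and y end up as inverted duplicates at both breakpoints.
    duplicated = left + x + invert(x + middle + y) + y + right
    # Reciprocal product: x and y are lost from the inverted chromatid.
    deleted = left + invert(middle) + right
    return duplicated, deleted


ancestral = [("L", 1), ("X", 1), ("M", 1), ("Y", 1), ("R", 1)]
dup, dele = staggered_break_products(ancestral, 1, 2, 3, 4)
print(dup)   # L+ X+ Y- M- X- Y+ R+
print(dele)  # L+ M- R+
```

In the first product each breakpoint region contains one copy of X and one copy of Y in opposite orientations to the copies at the other breakpoint, which is the inverted-duplication signature discussed in the text.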
Figure S4. Large-Scale Comparison of the Muller's Elements A, D, and E between D. melanogaster and D. yakuba
(A) Muller's element A (chromosome X); (B) Muller's elements D and E (chromosome 3). The outermost protein-coding genes of consecutive syntenic blocks are indicated. Following [116], syntenic blocks (defined as regions in which the relative gene order is globally conserved between D. melanogaster and D. yakuba) are numbered taking D. melanogaster as a reference and in an increasing order from the telomere of chromosome X (number 1) to the telomere of the right arm of chromosome 3 (number 58); an arrowhead indicates the orientation of the segments. Lines between chromosomes match homologous syntenic blocks between species. Solid triangles correspond to genes that were duplicated during the generation of inversions in the lineage that leads to D. yakuba following a model of staggered strand breaks (Figures 2, S2, and S3). Those genes are CG14187, which was generated by the inversion X(1) in (A), and CG34034, which was generated by the inversion 3R(7) in (B). Open triangle denotes gene CG9925, whose relocation can be explained by a conservative transposition event or, alternatively, by two paracentric inversions that overlap by one gene, CG9925. The fact that CG9925 is flanked both in D. melanogaster and D. yakuba by genes that, in their turn, are the outermost markers of different syntenic blocks strongly supports the second explanation.
https://doi.org/10.1371/journal.pbio.0050152.sg004
(34 KB PDF)
Figure S5. Different Evolutionary Scenarios That Can Lead to the Presence of Inversion-Mediated Duplications at Only One of the Two Co-occurrent Breakpoint Regions
The inversion X(1) is used as an example. D. melanogaster (top gene configuration) and D. yakuba (bottom gene configuration).
(A) Scenario involving four staggered breakpoints (arrows). In this case, the duplication of CG14817 and HDC18578 is coupled with the generation of the inversion. Subsequently, one of the copies of HDC18578 degenerates by accumulating nucleotide substitutions and indels so that it is no longer recognizable.
(B) Scenario involving staggered breakpoints at one genomic region and a single-strand break at the other. In this case, only CG14817 becomes duplicated as a result of the inversion.
The outcome of both scenarios is identical. Coding sequences that have undergone an inversion-mediated duplication in the lineage that leads to D. yakuba, CG14817 (in green) and HDC18578 (in pink), are indicated by a gradient.
C, centromere; T, telomere.
https://doi.org/10.1371/journal.pbio.0050152.sg005
(25 KB PDF)
Figure S6. Inversions Required to Transform the Gene Arrangement of Chromosome 2 between D. melanogaster and D. yakuba
The diagram shows 11 inversions, one pericentric and ten paracentric. Other scenarios obtained with GRIMM involve the same number of reversals of gene order [110]. Duplications at breakpoint regions, disruption of multigene families and antisense overlapping, and gene organization in outgroup species are the criteria used to infer the polarization (Table 1). Based on this information, the inversions 2L(3), 2R(11), and 2LR(5) are inferred to have occurred first because all are shared between D. yakuba and D. erecta. Note that the order of these inversions is arbitrary. The other inversions took place after the split of the lineages that lead to D. yakuba and D. erecta. The numbering of the syntenic blocks follows that of Figure 3; the blocks of D. yakuba appear with a minus sign if inverted in relation to D. melanogaster.
https://doi.org/10.1371/journal.pbio.0050152.sg006
(52 KB PDF)
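The signed-block notation used in Figure S6 lends itself to a short worked example. The sketch below is not GRIMM and does not search for a minimal scenario; it only shows, under simple assumptions, how a single inversion acts on a signed block order (the blocks in the inverted interval are reversed and their signs flipped) and how departures from the reference order can be counted. The block numbers are made up for illustration, and blocks are assumed to be numbered 1 to n in the reference species.

```python
def apply_inversion(order, start, end):
    """Invert the blocks at positions start..end (inclusive, 0-based).

    Blocks inside the interval are reversed and change sign, mirroring the
    convention that inverted blocks carry a minus sign.
    """
    if not 0 <= start <= end < len(order):
        raise ValueError("invalid interval")
    inverted = [-block for block in reversed(order[start:end + 1])]
    return order[:start] + inverted + order[end + 1:]


def count_breakpoints(order):
    """Count adjacencies that differ from the reference order 1, 2, ..., n.

    The order is framed with 0 and n + 1 so that changes at the chromosome
    ends are counted as well.
    """
    framed = [0] + list(order) + [len(order) + 1]
    return sum(1 for left, right in zip(framed, framed[1:]) if right - left != 1)


# Hypothetical arrangement: blocks 2 and 3 are inverted relative to the reference.
arrangement = [1, -3, -2, 4, 5]
print(count_breakpoints(arrangement))        # 2
restored = apply_inversion(arrangement, 1, 2)
print(restored)                              # [1, 2, 3, 4, 5]
print(count_breakpoints(restored))           # 0
```

A single inversion removes at most two breakpoints, which is why breakpoint counts give a lower bound on the number of inversions needed; finding an actual shortest series of inversions is the harder problem that tools such as GRIMM address.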
Figure S7. Dot Plot for the Genomic Sequence of the Syntenic Block 42 between D. melanogaster and D. yakuba
A few departures from perfect collinearity are observed, denoting small rearrangements. The one in the upper right corner is an inversion involving at least four genes: CG12284, CG5895, CG13076, and CG5830. The dot plot was generated with PipMaker [111]. The genome sequences spanning from the gene CG6749 to the gene CG32147, both in D. melanogaster and in D. yakuba, were extracted from UCSC. The sizes of block 42 in each species are indicated on the corresponding axes.
https://doi.org/10.1371/journal.pbio.0050152.sg007
(311 KB PDF)
Table S1. Co-occurrent Breakpoint Regions of the Inversions between D. melanogaster and D. yakuba
https://doi.org/10.1371/journal.pbio.0050152.st001
(200 KB RTF)
Table S2. Size of Syntenic Blocks between D. melanogaster and D. yakuba with the Number of Genes That Have, or Have Not, Been Mapped between Them
https://doi.org/10.1371/journal.pbio.0050152.st002
(132 KB RTF)
Table S3. Conservative Transposition Events Detected between D. melanogaster and D. yakuba
https://doi.org/10.1371/journal.pbio.0050152.st003
(17 KB RTF)
Table S4. Duplicative Transposition Events Detected between D. melanogaster and D. yakuba
https://doi.org/10.1371/journal.pbio.0050152.st004
(52 KB RTF)
Table S5. List of Duplications of Nonrepetitive DNA Sequences Present at Breakpoint Regions of Inversions between D. melanogaster and D. yakuba
https://doi.org/10.1371/journal.pbio.0050152.st005
(47 KB XLS)
Table S6. Repeat Composition Characterization by RepeatMasker in Co-occurrent Breakpoint Regions of D. melanogaster and D. yakuba
https://doi.org/10.1371/journal.pbio.0050152.st006
(211 KB RTF)
Table S7. Primers Used in This Work for Cloning and/or Sequencing
https://doi.org/10.1371/journal.pbio.0050152.st007
(135 KB RTF)
Table S8. Nonfiltered Output of the BLASTN of the D. melanogaster Transcripts against the D. yakuba Assembly
https://doi.org/10.1371/journal.pbio.0050152.st008
(17.5 MB XLS)
Accession Numbers
The GenBank (http://www.ncbi.nlm.nih.gov/Genbank) accession number for the D. melanogaster DNA sequence pfd800 discussed in this paper is Z16407. The accession numbers for the sequences generated in this paper are EF569486–EF569554.
Acknowledgments
We thank the following centers for providing genomic data: Genome Sequencing Center at the Washington University School of Medicine in St. Louis (D. simulans and D. yakuba); Agencourt (D. ananassae and D. erecta); the Broad Institute (D. persimilis); and the Baylor Genome Sequencing Center (D. pseudoobscura). We are very grateful for the initiative of Charles Langley and David Begun in writing the White Paper that led to the funding of the sequencing of D. simulans and D. yakuba. We also thank: Cahir O'Kane for his help in microscopy; Steve Russell for providing the genomic DNA of D. melanogaster; Rosa Bautista-Llacer, Theresa Heffernan, and Edward Ryder for technical assistance; Françoise Balloux, Rhona Borts, Kevin Hiom, Steve Jackson, John Parsch, and Sebastian Ramos-Onsins for advice on different aspects of the analyses; and Stepan Belyakin, Craig Nelson, Michael Parisi, and Stephen Schaeffer for providing unpublished datasets. Finally, we are indebted to Walter Eanes, Evan Eichler, Jeffrey Powell, Stephen Schaeffer, the Academic Editor, and three anonymous reviewers for helpful comments on the manuscript, and especially to Igor Sharakhov for pointing out that staggered double-strand breaks of paired chromatids can generate a pattern of inverted duplications indistinguishable from those that would result from a model of staggered single-strand breaks of an isochromatid. JMR was supported by a European Molecular Biology Organization (EMBO) long-term fellowship, and CMB was supported by a USA Research Fellowship from the Royal Society.
Author Contributions
JMR and JR devised the characterization of the breakpoint regions of the inversion In(3R)84F1;93F6–7 in D. melanogaster, D. simulans, and D. yakuba. LWH was responsible for curation and generation of the D. yakuba and D. simulans assemblies and chromosomal assignments, ordering, and orientation (including developing methods for comparative alignments and introducing appropriate inversions for creation of the chromosomal files). CMB conceived of and performed the genome-wide mapping of D. melanogaster genes against D. yakuba to detect breakpoint regions and verified breakpoint regions using UCSC whole-genome alignment. JMR participated in all the in silico and in vivo comparative analyses with the support of DM and with specific contributions by YSC, MvG, LWH, MA, and CMB. JMR, MA, and CMB wrote the paper.
References
- 1.White M (1973) Animal cytology and evolution. 3rd edition. Cambridge: University Press. 961 p.
- 2.Murphy WJ, Pevzner PA, O'Brien SJ (2004) Mammalian phylogenomics comes of age. Trends Genet 20: 631–639.
- 3.Carson HL (1992) Inversions in HawaiianDrosophila. In: Krimbas CB, Powell JR, editors. Drosophila inversion polymorphism. Boca Raton (Florida): CRC Press. pp. 407–439.
- 4.Levin DA (2002) The role of chromosomal change in plant evolution. Oxford (United Kingdom): Oxford University Press. 230 p.
- 5.Delneri D, Colson I, Grammenoudi S, Roberts IN, Louis EJ, et al. (2003) Engineering evolution to study speciation in yeasts. Nature 422: 68–72.
- 6.Noor MA, Grams KL, Bertucci LA, Reiland J (2001) Chromosomal inversions and the reproductive isolation of species. Proc Natl Acad Sci U S A 98: 12084–12088.
- 7.Rieseberg LH (2001) Chromosomal rearrangements and speciation. Trends Ecol Evol 16: 351–358.
- 8.Colson I, Delneri D, Oliver SG (2004) Effects of reciprocal chromosomal translocations on the fitness ofSaccharomyces cerevisiae. EMBO Rep 5: 392–398.
- 9.Powell JR, Petrarca V, della Torre A, Caccone A, Coluzzi M (1999) Population structure, speciation, and introgression in theAnopheles gambiae complex. Parassitologia 41: 101–113.
- 10.Weeks AR, McKechnie SW, Hoffmann AA (2002) Dissecting adaptive clinal variation: Markers, inversions and size/stress associations inDrosophila melanogaster from a central field population. Ecol Lett 5: 756–763.
- 11.Dobzhansky T (1970) Genetics of the evolutionary process. New York: Columbia University Press. 505 p.
- 12.Schaeffer SW, Goetting-Minesky MP, Kovacevic M, Peoples JR, Graybill JL, et al. (2003) Evolutionary genomics of inversions inDrosophila pseudoobscura: Evidence for epistasis. Proc Natl Acad Sci U S A 100: 8319–8324.
- 13.Perez-Ortin JE, Querol A, Puig S, Barrio E (2002) Molecular characterization of a chromosomal rearrangement involved in the adaptive evolution of yeast strains. Genome Res 12: 1533–1539.
- 14.Feuk L, MacDonald JR, Tang T, Carson AR, Li M, et al. (2005) Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies. PLoS Genet 1: e56..
- 15.Belda E, Moya A, Silva FJ (2005) Genome rearrangement distances and gene order phylogeny in gamma-Proteobacteria. Mol Biol Evol 22: 1456–1467.
- 16.Murphy WJ, Larkin DM, Everts-van der Wind A, Bourque G, Tesler G, et al. (2005) Dynamics of mammalian chromosome evolution inferred from multispecies comparative maps. Science 309: 613–617.
- 17.Bourque G, Zdobnov EM, Bork P, Pevzner PA, Tesler G (2005) Comparative architectures of mammalian and chicken genomes reveal highly variable rates of genomic rearrangements across different lineages. Genome Res 15: 98–110.
- 18.Bowers JE, Chapman BA, Rong J, Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433–438.
- 19.Coghlan A, Wolfe KH (2002) Fourfold faster rate of genome rearrangement in nematodes than inDrosophila. Genome Res 12: 857–867.
- 20.Fischer G, Rocha EP, Brunet F, Vergassola M, Dujon B (2006) Highly variable rates of genome rearrangements between hemiascomycetous yeast lineages. PLoS Genet 2: e32..
- 21.Ranz JM, Casals F, Ruiz A (2001) How malleable is the eukaryotic genome? Extreme rate of chromosomal rearrangement in the genusDrosophila. Genome Res 11: 230–239.
- 22.Coghlan A, Eichler EE, Oliver SG, Paterson AH, Stein L (2005) Chromosome evolution in eukaryotes: A multi-kingdom perspective. Trends Genet 21: 673–682.
- 23.Armengol L, Pujana MA, Cheung J, Scherer SW, Estivill X (2003) Enrichment of segmental duplications in regions of breaks of synteny between the human and mouse genomes suggest their involvement in evolutionary rearrangements. Hum Mol Genet 12: 2201–2208.
- 24.Bailey JA, Baertsch R, Kent WJ, Haussler D, Eichler EE (2004) Hotspots of mammalian chromosomal evolution. Genome Biol 5: R23.
- 25.Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, et al. (2005) Comparative genome sequencing ofDrosophila pseudoobscura: Chromosomal, gene, and cis-element evolution. Genome Res 15: 1–18.
- 26.Pevzner P, Tesler G (2003) Human and mouse genomic sequences reveal extensive breakpoint reuse in mammalian evolution. Proc Natl Acad Sci U S A 100: 7672–7677.
- 27.Zhao S, Shetty J, Hou L, Delcher A, Zhu B, et al. (2004) Human, mouse, and rat genome large-scale rearrangements: Stability versus speciation. Genome Res 14: 1851–1860.
- 28.Hurst LD, Pal C, Lercher MJ (2004) The evolutionary dynamics of eukaryotic gene order. Nat Rev Genet 5: 299–310.
- 29.Stolc V, Gauhar Z, Mason C, Halasz G, van Batenburg MF, et al. (2004) A gene expression map for the euchromatic genome ofDrosophila melanogaster. Science 306: 655–660.
- 30.Bourque G, Pevzner PA, Tesler G (2004) Reconstructing the genomic architecture of ancestral mammals: Lessons from human, mouse, and rat genomes. Genome Res 14: 507–516.
- 31.della Torre A, Costantini C, Besansky NJ, Caccone A, Petrarca V, et al. (2002) Speciation withinAnopheles gambiae—The glass is half full. Science 298: 115–117.
- 32.Turner TL, Hahn MW, Nuzhdin SV (2005) Genomic islands of speciation inAnopheles gambiae.. PLoS Biol 3: e285..
- 33.Metz CW (1914) Chromosome studies in the Diptera I. A preliminary survey of five different types of chromosome groups in the genusDrosophila. J exp Zool 17: 45–59.
- 34.Muller HJ (1940) Bearings of theDrosophila work on systematics. In: Huxley J, editor. The new systematics. Oxford (United Kingdom): Clarendon Press. pp. 185–268.
- 35.Sturtevant AH, Novitski E (1941) The homologies of the chromosome elements in the genusDrosophila. Genetics 26: 517–541.
- 36.Clayton FE, Guest WC (1986) Overview of chromosomal evolution in the family Drosophilidae. In: Ashburner MA, Carson HL, Thompson JN, editors. The genetics and biology of Drosophila. New York: Academic Press. pp. 1–38.
- 37.Powel JR (1997) Progress and prospects in evolutionary biology: TheDrosophila model. Oxford (United Kingdom): Oxford University Press. 576 p.
- 38.Sturtevant AH, Beadle GW (1936) The relations of inversions in the X chromosome ofDrosophila melanogaster to crossing over and disjunction. Genetics 21: 554–604.
- 39.Sturtevant AH, Dobzhansky T (1936) Inversions in the third chromosome of wild races ofDrosophila pseudoobscura, and their use in the study of the history of the species. Proc Natl Acad Sci USA 22: 448–450.
- 40.Wasserman M (1992) Cytological evolution of theDrosophila repleta species group. In: Krimbas CB, Powell JR, editors. Drosophila inversion polymorphism. Boca Raton (Florida): CRC Press. pp. 455–552.
- 41.Ashburner MA, Golic KG, Hawley RS (2005)Drosophila: A laboratory manual. 2nd edition. Cold Spring Harbor (New York): Cold Spring Harbor Laboratory Press. 1409 p.
- 42.Tamura K, Subramanian S, Kumar S (2004) Temporal patterns of fruit fly(Drosophila) evolution revealed by mutation clocks. Mol Biol Evol 21: 36–44.
- 43.Andolfatto P, Wall JD, Kreitman M (1999) Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population ofDrosophila melanogaster. Genetics 153: 1297–1311.
- 44.Cirera S, Martin-Campos JM, Segarra C, Aguade M (1995) Molecular characterization of the breakpoints of an inversion fixed betweenDrosophila melanogaster andD. subobscura. Genetics 139: 321–326.
- 45.Cirulli ET, Noor MA (2006) Localization and characterization of X chromosome inversion breakpoints separatingDrosophila mojavensis andDrosophila arizonae. J Hered.https://doi.org/10.1093/jhered/esl065
- 46.Caceres M, Ranz JM, Barbadilla A, Long M, Ruiz A (1999) Generation of a widespreadDrosophila inversion by a transposable element. Science 285: 415–418.
- 47.Casals F, Caceres M, Ruiz A (2003) The foldback-like transposon Galileo is involved in the generation of two different natural chromosomal inversions ofDrosophila buzzatii. Mol Biol Evol 20: 674–685.
- 48.Wesley CS, Eanes WF (1994) Isolation and analysis of the breakpoint sequences of chromosome inversion In(3L)Payne inDrosophila melanogaster. Proc Natl Acad Sci USA 91: 3132–3136.
- 49.Matzkin LM, Merritt TJ, Zhu CT, Eanes WF (2005) The structure and population genetics of the breakpoints associated with the cosmopolitan chromosomal inversion In(3R)Payne inDrosophila melanogaster. Genetics 170: 1143–1152.
- 50.Mathiopoulos KD, della Torre A, Predazzi V, Petrarca V, Coluzzi M (1998) Cloning of inversion breakpoints in theAnopheles gambiae complex traces a transposable element at the inversion junction. Proc Natl Acad Sci USA 95: 12444–12449.
- 51.Sharakhov IV, White BJ, Sharakhova MV, Kayondo J, Lobo NF, et al. (2006) Breakpoint structure reveals the unique origin of an interspecific chromosomal inversion (2La) in theAnopheles gambiae complex. Proc Natl Acad Sci U S A 103: 6258–6262.
- 52.Lachaise D, Cariou ML, David JR, Lemeunier F, Tsacas L (1988) Biogeographie historique des especes deDrosophila du sous-groupemelanogaster. Rapp d'Activ, LGBE, Gif 1985–1987: 47–51.
- 53.Horton IH (1939) A comparison of the salivary gland chromosomes ofDrosophila melanogaster andDrosophila simulans. Genetics 24: 234–243.
- 54.Lemeunier F, Ashburner MA (1976) Relationships within the melanogaster species subgroup of the genusDrosophila (Sophophora). II. Phylogenetic relationships between six species based upon polytene chromosome banding sequences. Proc R Soc Lond B Biol Sci 193: 275–294.
- 55.Sturtevant AH (1921) A case of rearrangement of genes inDrosophila. Proc Natl Acad Sci U S A 7: 235–237.
- 56.Sturtevant AH, Plunkett CR (1926) Sequence of corresponding third chromosome genes inDrosophila melanogaster andDrosophila simulans. Biol Bull 50: 56–60.
- 57.Kehrer-Sawatzki H, Sandig CA, Goidts V, Hameister H (2005) Breakpoint analysis of the pericentric inversion between chimpanzee chromosome 10 and the homologous chromosome 12 in humans. Cytogenet Genome Res 108: 91–97.
- 58.Patau K (1935) Chromosomenmorphologie beiDrosophila melanogaster undDrosophila simulans und ihre genetische Bedeutung. Naturwissenschaften 23: 537–543.
- 59.Kaminker JS, Bergman CM, Kronmiller B, Carlson J, Svirskas R, et al. (2002) The transposable elements of theDrosophila melanogaster euchromatin: A genomics perspective. Genome Biol. 3. RESEARCH0084.
- 60.Hild M, Beckmann B, Haas SA, Koch B, Solovyev V, et al. (2003) An integrated gene annotation and transcriptional profiling approach towards the full gene content of theDrosophila genome. Genome Biol 5: R3.
- 61.Begun DJ, Aquadro CF (1993) African and North American populations ofDrosophila melanogaster are very different at the DNA level. Nature 365: 548–550.
- 62.Finnegan DJ (1989) Eukaryotic transposable elements and genome evolution. Trends Genet 5: 103–107.
- 63.Green CA (1982) Cladistic analysis of mosquito chromosome data (Anopheles (Cellia) Myzomyia). J Hered 73: 2–11.
- 64.Lemeunier F, Ashburner M (1984) Relationships within themelanogaster species subgroup of the genusDrosophila (Sophophora). IV. The chromosomes of two new species. Chromosoma 89: 343–351.
- 65.Kent WJ, Baertsch R, Hinrichs A, Miller W, Haussler D (2003) Evolution's cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A 100: 11484–11489.
- 66.Podemski L, Ferrer C, Locke J (2001) Whole arm inversions of chromosome 4 inDrosophila species. Chromosoma 110: 305–312.
- 67.Nadeau JH, Taylor BA (1984) Lengths of chromosomal segments conserved since divergence of man and mouse. Proc Natl Acad Sci U S A 81: 814–818.
- 68.Chia W, McGill S, Karp R, Gubb D, Ashburner M (1985) Spontaneous excision of a large composite transposable element ofDrosophila melanogaster. Nature 316: 81–83.
- 69.Bergman CM, Pfeiffer BD, Rincon-Limas DE, Hoskins RA, Gnirke A, et al. (2002) Assessing the impact of comparative genomic sequence data on the functional annotation of theDrosophila genome. Genome Biol. 3. RESEARCH0086.
- 70.Ranz JM, Gonzalez J, Casals F, Ruiz A (2003) Low occurrence of gene transposition events during the evolution of the genusDrosophila. Evolution Int J Org Evolution 57: 1325–1335.
- 71.Betran E, Thornton K, Long M (2002) Retroposed new genes out of the X inDrosophila. Genome Res 12: 1854–1859.
- 72.Macdonald SJ, Long AD (2006) Fine scale structural variants distinguish the genomes ofDrosophila melanogaster andD. pseudoobscura. Genome Biol 7: R67.
- 73.Szankasi P, Gysler C, Zehntner U, Leupold U, Kohli J, et al. (1986) Mitotic recombination between dispersed but related rRNA genes ofSchizosaccharomyces pombe generates a reciprocal translocation. Mol Gen Genet 202: 394–402.
- 74.Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES (2003) Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423: 241–254.
- 75.Goidts V, Szamalek JM, Hameister H, Kehrer-Sawatzki H (2004) Segmental duplication associated with the human-specific inversion of chromosome 18: A further example of the impact of segmental duplications on karyotype and genome evolution in primates. Hum Genet 115: 116–122.
- 76.Locke DP, Archidiacono N, Misceo D, Cardone MF, Deschamps S, et al. (2003) Refinement of a chimpanzee pericentric inversion breakpoint to a segmental duplication cluster. Genome Biol 4: R50.
- 77.Mortlock DP, Portnoy ME, Chandler RL, Green ED (2004) Comparative sequence analysis of the Gdf6 locus reveals a duplicon-mediated chromosomal rearrangement in rodents and rapidly diverging coding and regulatory sequences. Genomics 84: 814–823.
- 78.Stankiewicz P, Park SS, Inoue K, Lupski JR (2001) The evolutionary chromosome translocation 4;19 inGorilla gorilla is associated with microduplication of the chromosome fragment syntenic to sequences surrounding the human proximal CMT1A-REP. Genome Res 11: 1205–1210.
- 79.Daveran-Mingot ML, Campo N, Ritzenthaler P, Le Bourgeois P (1998) A natural large chromosomal inversion inLactococcus lactis is mediated by homologous recombination between two insertion sequences. J Bacteriol 180: 4834–4842.
- 80.Daviere JM, Langin T, Daboussi MJ (2001) Potential role of transposable elements in the rapid reorganization of theFusarium oxysporum genome. Fungal Genet Biol 34: 177–192.
- 81.Fischer G, James SA, Roberts IN, Oliver SG, Louis EJ (2000) Chromosomal evolution inSaccharomyces. Nature 405: 451–454.
- 82.Lim JK, Simmons MJ (1994) Gross chromosome rearrangements mediated by transposable elements inDrosophila melanogaster. Bioessays 16: 269–275.
- 83.Quesneville H, Bergman CM, Andrieu O, Autard D, Nouaud D, et al. (2005) Combined evidence annotation of transposable elements in genome sequences. PLoS Comput Biol 1: e22..
- 84.Papaceit M, Aguade M, Segarra C (2006) Chromosomal evolution of elements B and C in theSophophora subgenus ofDrosophila: evolutionary rate and polymorphism. Evolution Int J Org Evolution 60: 768–781.
- 85.Ometto L, Glinka S, De Lorenzo D, Stephan W (2005) Inferring the effects of demography and selection onDrosophila melanogaster populations from a chromosome-wide scan of DNA variation. Mol Biol Evol 22: 2119–2130.
- 86.Llopart A, Lachaise D, Coyne JA (2005) Multilocus analysis of introgression between two sympatric sister species ofDrosophila: Drosophila yakuba andD. santomea. Genetics 171: 197–210.
- 87.Gonzalez J, Betran E, Ashburner M, Ruiz A (2000) Molecular organization of theDrosophila melanogaster Adh chromosomal region inD. repleta andD. buzzatii, two distantly related species of theDrosophila subgenus. Chromosome Res 8: 375–385.
- 88.Vieira J, Vieira CP, Hartl DL, Lozovskaya ER (1997) Discordant rates of chromosome evolution in theDrosophila virilis species group. Genetics 147: 223–230.
- 89.Krivshenko JD (1963) The chromosomal polymorphism ofDrosophila busckii in natural populations. Genetics 48: 1239–1258.
- 90.Dobzhansky T, Socolov D (1939) Structure and variation of of the chromosomes inDrosophila azteca. J Hered 16: 291–304.
- 91.Coluzzi M, Sabatini A, Petrarca V, Di Deco MA (1979) Chromosomal differentiation and adaptation to human environments in theAnopheles gambiae complex. Trans R Soc Trop Med Hyg 73: 483–497.
- 92.Goidts V, Szamalek JM, de Jong PJ, Cooper DN, Chuzhanova N, et al. (2005) Independent intrachromosomal recombination events underlie the pericentric inversions of chimpanzee and gorilla chromosomes homologous to human chromosome 16. Genome Res 15: 1232–1242.
- 93.Nurminsky DI, Nurminskaya MV, De Aguiar D, Hartl DL (1998) Selective sweep of a newly evolved sperm-specific gene inDrosophila. Nature 396: 572–575.
- 94.Bergman CM, Quesneville H, Anxolabehere D, Ashburner M (2006) Recurrent insertion and duplication generate networks of transposable element sequences in theDrosophila melanogaster genome. Genome Biol 7: R112.
- 95.Singer GA, Lloyd AT, Huminiecki LB, Wolfe KH (2005) Clusters of co-expressed genes in mammalian genomes are conserved by natural selection. Mol Biol Evol 22: 767–775.
- 96.Boutanaev AM, Kalmykova AI, Shevelyov YY, Nurminsky DI (2002) Large clusters of co-expressed genes in theDrosophila genome. Nature 420: 666–669.
- 97.Parisi M, Nuttall R, Edwards P, Minor J, Naiman D, et al. (2004) A survey of ovary-, testis-, and soma-biased gene expression inDrosophila melanogaster adults. Genome Biol 5: R40.
- 98.Spellman PT, Rubin GM (2002) Evidence for large domains of similarly expressed genes in theDrosophila genome. J Biol 1: 5.
- 99.Thygesen HH, Zwinderman AH (2005) Modelling the correlation between the activities of adjacent genes inDrosophila. BMC Bioinformatics 6: 10.
- 100.Belyakin SN, Christophides GK, Alekseyenko AA, Kriventseva EV, Belyaeva ES, et al. (2005) Genomic analysis ofDrosophila chromosome underreplication reveals a link between replication control and transcriptional territories. Proc Natl Acad Sci U S A 102: 8269–8274.
- 101.Misra S, Crosby MA, Mungall CJ, Matthews BB, Campbell KS, et al. (2002) Annotation of theDrosophila melanogaster euchromatic genome: A systematic review. Genome Biol. 3. RESEARCH0083.
- 102.Vanhee-Brossollet C, Vaquero C (1998) Do natural antisense transcripts make sense in eukaryotes? Gene 211: 1–9.
- 103.Hastings ML, Ingle HA, Lazar MA, Munroe SH (2000) Post-transcriptional regulation of thyroid hormone receptor expression by cis-acting sequences and a naturally occurring antisense RNA. J Biol Chem 275: 11507–11513.
- 104.Petrov DA, Chao Y-C, Stephenson EC, Hartl DL (1998) Pseudogene evolution inDrosophila suggests a high rate of DNA loss. Mol Biol Evol 15: 1562–1567.
- 105.Petrov DA, Lozovskaya ER, Hartl DL (1996) High intrinsic rate of DNA loss inDrosophila. Nature 384: 346–349.
- 106.Bailey JA, Eichler EE (2006) Primate segmental duplications: Crucibles of evolution, diversity and disease. Nat Rev Genet 7: 552–564.
- 107.Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, et al. (2000) The genome sequence ofDrosophila melanogaster. Science 287: 2185–2195.
- 108.Ranz JM, Segarra C, Ruiz A (1997) Chromosomal homology and molecular organization of Muller's elements D and E in theDrosophila repleta species group. Genetics 145: 281–295.
- 109.Lefevre G (1976) A photographic representation and interpretation of the polytene chromosomes ofDrosophila melanogaster salivary glands. In: Ashburner MA, Novitski E, editors. The genetics and biology of Drosophila. London: Academic Press. pp. 31–66.
- 110.Tesler G (2002) GRIMM: Genome rearrangements web server. Bioinformatics 18: 492–493.
- 111.Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, et al. (2000) PipMaker—A web server for aligning two genomic DNA sequences. Genome Res 10: 577–586.
- 112.Tatusova TA, Madden TL (1999) BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett 174: 247–250.
- 113.Ko WY, David RM, Akashi H (2003) Molecular phylogeny of theDrosophila melanogaster species subgroup. J Mol Evol 57: 562–573.
- 114.Lachaise D, Harry M, Solignac M, Lemeunier F, Benassi V, et al. (2000) Evolutionary novelties in islands:Drosophila santomea, a new melanogaster sister species from Sao Tome. Proc Biol Sci 267: 1487–1495.
- 115.Tsacas L, Bocquet C (1976) L'espece chez les Drosophilidae. In: Bocquet C, Genermont J, Lamotte M, editors. Les problemes de l'espece dans le regne animal. pp. 203–247.
- 116.Bridges CB (1935) Salivary chromosome maps with a key to banding of the chromosomes of Drosophila melanogaster. J Hered 26: 60–64.
- 117.Bartolome C, Charlesworth B (2006) Rates and patterns of chromosomal evolution in Drosophila pseudoobscura and D. miranda. Genetics 173: 779–791.
- 118.Patterson JT, Stone WS (1952) Evolution in the genus Drosophila. New York: Macmillan. 610 p.
- 119.Vieira J, Vieira CP, Hartl DL, Lozovskaya ER (1997) A framework physical map of Drosophila virilis based on P1 clones: Applications in genome evolution. Chromosoma 106: 99–107.
- 120.Gonzalez J, Ranz JM, Ruiz A (2002) Chromosomal elements evolve at different rates in the Drosophila genome. Genetics 161: 1137–1154.
- 121.Russo CA, Takezaki N, Nei M (1995) Molecular phylogeny and divergence times of drosophilid species. Mol Biol Evol 12: 391–404.
- 122.Spicer GS (1988) Molecular evolution among some Drosophila species groups as indicated by two-dimensional electrophoresis. J Mol Evol 27: 250–260.
- 123.Barrio E, Latorre A, Moya A, Ayala FJ (1992) Phylogenetic reconstruction of the Drosophila obscura group, on the basis of mitochondrial DNA. Mol Biol Evol 9: 621–635.
- 124.Ramos-Onsins S, Segarra C, Rozas J, Aguade M (1998) Molecular and chromosomal phylogeny in the obscura group of Drosophila inferred from sequences of the rp49 gene region. Mol Phylogenet Evol 9: 33–41.
- 125.Nurminsky DI, Moriyama EN, Lozovskaya ER, Hartl DL (1996) Molecular phylogeny and genome evolution in the Drosophila virilis species group: Duplications of the alcohol dehydrogenase gene. Mol Biol Evol 13: 132–149.
- 126.Schulze DH, Lee CS (1986) DNA sequence comparison among closely related Drosophila species in the mulleri complex. Genetics 113: 287–303.
- 127.Hartl DL, Lozovskaya ER (1995) The Drosophila genome map: A practical guide. New York: Springer-Verlag. 240 p.