Movatterモバイル変換


[0]ホーム

URL:


Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
Thehttps:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

NIH NLM Logo
Log inShow account info
Access keysNCBI HomepageMyNCBI HomepageMain ContentMain Navigation
pubmed logo
Advanced Clipboard
User Guide

Full text links

eLife Sciences Publications, Ltd full text link eLife Sciences Publications, Ltd Free PMC article
Full text links

Actions

Share

.2023 Dec 18:12:RP90656.
doi: 10.7554/eLife.90656.

Major patterns in the introgression history ofHeliconius butterflies

Affiliations

Major patterns in the introgression history ofHeliconius butterflies

Yuttapong Thawornwattana et al. Elife..

Abstract

Gene flow between species, although usually deleterious, is an important evolutionary process that can facilitate adaptation and lead to species diversification. It also makes estimation of species relationships difficult. Here, we use the full-likelihood multispecies coalescent (MSC) approach to estimate species phylogeny and major introgression events inHeliconius butterflies from whole-genome sequence data. We obtain a robust estimate of species branching order among major clades in the genus, including the 'melpomene-silvaniform' group, which shows extensive historical and ongoing gene flow. We obtain chromosome-level estimates of key parameters in the species phylogeny, including species divergence times, present-day and ancestral population sizes, as well as the direction, timing, and intensity of gene flow. Our analysis leads to a phylogeny with introgression events that differ from those obtained in previous studies. We find thatHeliconius aoede most likely represents the earliest-branching lineage of the genus and that 'silvaniform' species are paraphyletic within the melpomene-silvaniform group. Our phylogeny provides new, parsimonious histories for the origins of key traits inHeliconius, including pollen feeding and an inversion involved in wing pattern mimicry. Our results demonstrate the power and feasibility of the full-likelihood MSC approach for estimating species phylogeny and key population parameters despite extensive gene flow. The methods used here should be useful for analysis of other difficult species groups with high rates of introgression.

Keywords: BPP; Heliconius; chromosome inversion; evolutionary biology; gene flow; introgression; multispecies coalescent.

© 2023, Thawornwattana et al.

PubMed Disclaimer

Conflict of interest statement

YT, FS, ZY, JM No competing interests declared

Figures

Figure 1.
Figure 1.. Ancestral gene flow at the base ofHeliconius.
(A) Posterior probabilities of species trees for majorHeliconius clades inferred frombpp analysis of 200-locus blocks (each spanning 1–3 Mb) across the genome under the multispecies coalescent (MSC) model with no gene flow. The y-axis shows the posterior probability of species trees and ranges from 0 to 1. Colors correspond to the nine most common maximum a posteriori (MAP) trees, summarized by lumping species in the melpomene-silvaniform clade (Bes, Num, Mel) into a single tip (BNM); see Figure 1—figure supplement 1 for full trees. Proportions of coding and noncoding blocks with each tree as a MAP tree are shown in parentheses. (B, C) Two scenarios of early divergence ofHeliconius: (B) erato-early versus (C) aoede-early. Each tree is a MAP tree from a block having the MAP tree supporting one of these scenarios, with estimates of branch lengths (posterior means). (D, E) Phylogenetic and introgression histories estimated under an MSC model with introgression (MSC-I) corresponding to the two scenarios based on coding loci in chromosome 1 (3517 coding loci; see Figure 1—figure supplements 5 and 6). (F) Proportions of trees of scenarios 1 or 2 in each chromosome in order of increasing number of loci (used as a proxy for chromosome length). The Z chromosome (chr. 21) is placed at the right end. Number of blocks is shown on top of each bar. Tal:Eueides tales; Mel:H. melpomene; Bes:H. besckei; Num:H. numata; Bur:H. burneyi; Dor:H. doris; Aoe:H. aoede; Era:H. erato; Sar:H. sara.
Figure 1—figure supplement 1.
Figure 1—figure supplement 1.. Posterior estimates of species trees from four versions of the 'etales-9spp' dataset (see ‘Methods’), using eitherH. erato (A,C) orH. melpomene (B,D) as a reference, and a read depth cutoff either 12 (A,B) or 20 (C,D).
See legend to Figure 1.
Figure 1—figure supplement 2.
Figure 1—figure supplement 2.. Posterior estimates of species trees from four versions of the 'etales-9spp' dataset (see ‘Methods’), using eitherH. erato (A,C) orH. melpomene (B,D) as a reference, and a read depth cutoff either 12 (A,B) or 20 (C,D), with the clade containing three species (Bes, Num, Mel) in the melpomene-silvaniform clade collapsed as a single species.
See legend to Figure 1.
Figure 1—figure supplement 3.
Figure 1—figure supplement 3.. Posterior means and 95% highest posterior density (HPD) intervals of introgression probabilities (φ), population sizes (θ), and divergence or introgression times (τ) under scenario 1 and scenario 2 multispecies coalescent model introgression (MSC-I) model in Figure 1D and E obtained frombpp analysis of the 'etales-8spp' dataset (see Supplementary file 1c for the number of loci).
A few chromosomal regions (2a, 2b, 6b, 6c, 6d, 13b, 13c) yielded too few loci (<100) and were excluded. Chromosome 21 is the sex (Z) chromosome.
Figure 1—figure supplement 4.
Figure 1—figure supplement 4.. Posterior means and 95% HPD intervals of divergence times estimated under the multispecies coalescent with introgression (MSC-I) model of Figure 1D (erato-early, x-axis) versus estimates under the MSC-I model of Figure 1E (aoede-early, y-axis) using coding (A) or noncoding loci (B) of the 'etales-8spp' dataset.
Red points are divergence times unique to each model.
Figure 1—figure supplement 5.
Figure 1—figure supplement 5.. Posterior estimates of the species tree under the introgression model in Figure 1D (scenario 1: erato-early) obtained frombpp analysis of the 'etales-8spp' dataset for each chromosomal region.
Figure 1—figure supplement 6.
Figure 1—figure supplement 6.. Posterior estimates of the species tree under the introgression model in Figure 1E (scenario 2: aoede-early) obtained frombpp analysis of the 'etales-8spp' dataset for each chromosomal region.
Figure 1—figure supplement 7.
Figure 1—figure supplement 7.. Maximum likelihood estimates (MLEs) of all parameters in the MSC-M model as well as internal branch lengths (Δτ =τ0τ1) by chromosomal region for all pairs ofHeliconius species in the 'etales-9spp' dataset obtained from3s analysis.
Error bars indicate two standard errors (not shown if standard errors were not reliably estimated). Columns are parameters in the model; rows are species pairs.
Figure 2.
Figure 2.. Major introgression events in the melpomene-silvaniform clade.
(A) Blockwise estimates of species trees of the melpomene-silvaniform clade inferred from 200-locus blocks across the genome under the multispecies coalescent (MSC) model with no gene flow usingbpp (see Supplementary file 1l for data, Supplementary file 1m and n and Figure 2—figure supplement 1 for full results). Maximum a posteriori (MAP) trees are labeled in decreasing order of frequency among blocks. Proportions of coding and noncoding blocks with each tree as a MAP tree are shown in parentheses. (B) The multispecies coalescent with introgression (MSC-I) model events that can explain the three major groups of trees in (A). Branch lengths are based on posterior means of divergence/introgression times estimated from 5341 noncoding loci on chromosome 1 (Supplementary file 1q). Each internal node is given a label, which is used to refer to a population above the node, for example, the population between nodes r and bn is referred to as branch bn. Each horizontal arrow represents a unidirectional introgression event, for example, the arrow from tcmeph1 to n3 represents tcmeph→Num introgression at timeτtcmeph1 =τn3 with probabilityφtcmeph1→n3. (C) Continuous migration (IM) model for the pardalinus-hecale clade, allowing bidirectional gene flow among the three species. (D) Multispecies coalescent with migration (MSC-M) model for the cydno-melpomene clade. For (C) and (D), branch lengths are based on estimates from noncoding loci in chromosome 1 (Figure 2—figure supplement 6, Supplementary file 1s–u), and arrow sizes are proportional to estimated migration rate (M =Nm). Bes:H. besckei; Num:H. numata; Par:H. pardalinus; Ele:H. elevatus; Hec:H. hecale; Tim:H. timareta; Cyd:H. cydno; Mel:H. melpomene.
Figure 2—figure supplement 1.
Figure 2—figure supplement 1.. Posterior estimates of species trees of the melpomene-silvaniform clade inferred from 200-locus blocks across the genome from thebpp multispecies coalescent (MSC) analysis with no gene flow of the 'hmelv25-res' dataset.
Figure 2A shows a simplified version with fewer trees. See Supplementary file 1m and n.
Figure 2—figure supplement 2.
Figure 2—figure supplement 2.. Maximum likelihood estimates (MLEs) of all parameters in the multispecies coalescent with migration (MSC-M) model as well as internal branch lengths (Δτ =τ0τ1) by chromosomal region from species pairs in the 'hmelv25-all' dataset obtained from3s analysis using (A)H.aoede or (B)H. erato as an outgroup.
Error bars indicate two standard errors (not shown if standard errors were not reliably estimated). Columns are parameters in the model; rows are species pairs. Full results are in Supplementary file 1o and p.
Figure 2—figure supplement 3.
Figure 2—figure supplement 3.. Estimated introgression history.
Estimated introgression history for each chromosomal region obtained under the model of Figure 2B and Figure 2—figure supplement 4D using (A) noncoding and (B) coding loci. Intensity of the horizontal edges is proportional to posterior mean of introgression probability, while the y-axis represents the divergence times in the units of the expected number of mutations per site. For posterior estimates of all parameters, see Supplementary file 1q and r.
Figure 2—figure supplement 4.
Figure 2—figure supplement 4.. Posterior means and 95% HPD intervals of (A) introgression probabilities, (B) divergence or introgression times (τ), and (C) population sizes (θ) under the multispecies coalescent with introgression (MSC-I) model of Figure 2B using coding (blue) and noncoding data (yellow).
'main' (circle) is the main posterior peak; 'alt' (triangle) is an alternative posterior peak.
Figure 2—figure supplement 5.
Figure 2—figure supplement 5.. Posterior means and 95% HPD intervals of species divergence and introgression times (τ) obtained from coding loci (y-axis) versus noncoding loci (x-axis) under the multispecies coalescent with introgression (MSC-I) models of Figure 2B (for chromosomal regions other than 15b) and of (D) (for 15b).
We used the divergence times (black points) only to fit a regression liney =c x. For the number of loci used, see Supplementary file 1l or Figure 2—figure supplement 4.
Figure 2—figure supplement 6.
Figure 2—figure supplement 6.. Posterior means and 95% HPD intervals of migration rates (M), divergence times (τ), and population sizes (θ) under the multispecies coalescent with migration (MSC-M) model in three species obtained usingbpp.
(A) Pardalinus-hecale clade: ((Par, Ele), Hec). (B) Pardalinus-elevatus clade: (Par, Ele), assuming all three populations have the same sizeθ. (C) Cydno-melpomene clade: ((Tim, Cyd), Mel). Full results for these three models are given in Supplementary file 1s–u.
Figure 3.
Figure 3.. Introgression history of the chromosome 15 inversion region (15b).
(A) Blockwise estimates of species trees for the inversion (15b) and the remnant flanking regions (15a and 15c) of chromosome 15. Trees were inferred from 200-locus blocks and 100-locus blocks under the multispecies coalescent (MSC) model without gene flow usingbpp (Supplementary file 1v). Tree legends are grouped by whetherH. numata clusters withH. ismenius (blue), orH. numata with P1 inversion (Num11) clusters withH. pardalinus (red), or other relationships (green). (B–E) Four possible scenarios of the origin and introgression route of the P1 inversion. Star indicates the origin of the P1 inversion. Green lineages have the inversion, and green arrows indicates introgression of the inversion. Ism:H. ismenius; Num00:H. numata homozygous uninverted 15b (H. n. laura andH. n. silvana); Num11:H. numata homozygous for inversion P1 (H. n. bicoloratus); Eth:H. ethilla. See Figure 2 legend for codes for other species. For the direction of melpomene-cydno clade→H. elevatus, see Figure 3—figure supplement 3.
Figure 3—figure supplement 1.
Figure 3—figure supplement 1.. Mean heterozygosity per site of the 15b inversion region and flanking regions (15a and 15c) ofH.numata (Num) andH. pardalinus (Par) individuals.
Both individuals are homozygotes for the P1 inversion.
Figure 3—figure supplement 2.
Figure 3—figure supplement 2.. Posterior estimates of divergence time for (Ism, Num00) and (Par, Num11) in the 15b region.
Only estimates from trees with posterior probabilities >0.1 are shown. Trees are in order of decreasing frequency among coding or noncoding data. See legend to Figure 3 for full species names.
Figure 3—figure supplement 3.
Figure 3—figure supplement 3.. Estimated introgression history under five introgression models (m1–m5) of the 15b region.
Posterior estimates of all parameters under each model are in Supplementary file 1w.
Figure 4.
Figure 4.. Revised phylogeny and introgression history ofHeliconius.
Phylogeny of the erato-sara clade was previously estimated using a similar approach (Thawornwattana et al., 2022). Arrows represent introgression events. Introgression of chromosomal inversions are region-specific, indicated by a label (13b or 15b) above the arrow. Status of inversions 13b and 15b in each species is indicated in the two rows at the bottom, with + indicating inversion, – indicating standard orientation, and +/– inversion polymorphism. Gray shading represents a period of continuous gene flow. Triangle represents the origin of pollen feeding near the base ofHeliconius. Star indicates a possible origin of the P1 inversion on chromosome 15, and green branches indicate lineages with the inversion (polymorphic in Num, fixed in Par). Him:H. himera; Sia:H. hecalesia; Tel:H. telesiphe; Dem:H. demeter; see Figures 1—3 legends for other species codes.
Appendix 1—figure 1.
Appendix 1—figure 1.. Isolation-with-migration (IM) analysis of the 'etales-9spp' dataset using 3s.
(A) Model specification in 3s. There are three types of parameters: divergence times (τ1,τ0), effective population sizes (θ1,θ2,θ4, andθ5) and migration rates (M12,M21). (B) Maximum-likelihood estimates (MLEs) of the pairwise migration rates (M12,M21), with donor species on the y-axis and recipient species on the x-axis. We usedEueides tales as the outgroup (S3 inA). Top and bottom quadrants are results from noncoding and coding loci, respectively. Left quadrants show results from all autosomal loci. Right quadrants show results from all Z chromosome loci (21a + 21b + 21c). For the number of loci, see Supplementary file 1c (H. erato reference, minDP (d) = 12). For pairs without displayed numbers, the likelihood ratio test (LRT) was not significant at 0.1%, thereby failing to reject a null model of zero gene flow. (C) Mutation-scaled migration rates (w12 = 4M12/θ2 and w21 = 4M21/θ1) as a measure of expected admixture fraction in the recipient genome. (D) Root age (τ0). Estimates from coding loci are shown in the upper triangle while estimates from noncoding loci are in the lower triangle. (E) Divergence time of the ingroup species pair (τ1). (F) Internal branch length (Δτ =τ0τ1). Full results are in Supplementary file 1k and Figure 1—figure supplement 7.
Appendix 1—figure 2.
Appendix 1—figure 2.. Maximum likelihood estimates (MLEs) of migration rates (M), divergence times (τ0,τ1), and the internal branch length (Δτ =τ0τ1) from 3s analysis of the 'hmelv25-all' dataset, using (A)H. aoede or (B)H. erato as an outgroup.
See legend to Appendix 1—figure 1.
Appendix 1—figure 3.
Appendix 1—figure 3.. Generation of multilocus data.
Coordinates of coding and noncoding loci were obtained from a reference genome with an annotation of coding sequences (CDS).
Appendix 1—figure 4.
Appendix 1—figure 4.. Raw (black) and adjusted (red) estimates of the difference in the mean log likelihood from the two models (y-axis).
Raw and adjusted estimates of the Bayes factor are in Supplementary file 1h.
See this image and copyright information in PMC

Update of

  • doi: 10.1101/2023.06.21.545923
  • doi: 10.7554/eLife.90656.1
  • doi: 10.7554/eLife.90656.2

Similar articles

See all similar articles

Cited by

  • Hybrid speciation driven by multilocus introgression of ecological traits.
    Rosser N, Seixas F, Queste LM, Cama B, Mori-Pezo R, Kryvokhyzha D, Nelson M, Waite-Hudson R, Goringe M, Costa M, Elias M, Mendes Eleres de Figueiredo C, Freitas AVL, Joron M, Kozak K, Lamas G, Martins ARP, McMillan WO, Ready J, Rueda-Muñoz N, Salazar C, Salazar P, Schulz S, Shirai LT, Silva-Brandão KL, Mallet J, Dasmahapatra KK.Rosser N, et al.Nature. 2024 Apr;628(8009):811-817. doi: 10.1038/s41586-024-07263-w. Epub 2024 Apr 17.Nature. 2024.PMID:38632397Free PMC article.
  • Evolutionary dynamics of genome size and content during the adaptive radiation of Heliconiini butterflies.
    Cicconardi F, Milanetti E, Pinheiro de Castro EC, Mazo-Vargas A, Van Belleghem SM, Ruggieri AA, Rastas P, Hanly J, Evans E, Jiggins CD, Owen McMillan W, Papa R, Di Marino D, Martin A, Montgomery SH.Cicconardi F, et al.Nat Commun. 2023 Sep 12;14(1):5620. doi: 10.1038/s41467-023-41412-5.Nat Commun. 2023.PMID:37699868Free PMC article.

References

    1. Barton N, Bengtsson BO. The barrier to genetic exchange between hybridising populations. Heredity. 1986;57 (Pt 3):357–376. doi: 10.1038/hdy.1986.135. - DOI - PubMed
    1. Beltrán M, Jiggins CD, Brower AVZ, Bermingham E, Mallet J. Do pollen feeding, pupal-mating and larval gregariousness have a single origin in Heliconius butterflies? Inferences from multilocus DNA sequence data. Biological Journal of the Linnean Society. 2007;92:221–239. doi: 10.1111/j.1095-8312.2007.00830.x. - DOI
    1. Benson WW, Brown KS, Gilbert LE. Coevolution of plants and herbivores: Passion flower butterflies. Evolution; International Journal of Organic Evolution. 1975;29:659–680. doi: 10.1111/j.1558-5646.1975.tb00861.x. - DOI - PubMed
    1. Brower AVZ, Egan MG. Cladistic analysis of Heliconius butterflies and relatives (Nymphalidae: Heliconiiti): a revised phylogenetic position for Eueides based on sequences from mtDNA and a nuclear gene. Proceedings of the Royal Society of London. Series B. 1997;264:969–977. doi: 10.1098/rspb.1997.0134. - DOI
    1. Brown KS. The heliconians of Brazil (Lepidoptera: Nymphalidae). Part III. Ecology and biology of Heliconius nattereri, a key primitive species neat extinction, and comments on the evolutionary development of Heloconius and Eueides. Zoologica. 1972;57:41–69. doi: 10.5962/p.203235. - DOI

MeSH terms

Related information

Grants and funding

LinkOut - more resources

Full text links
eLife Sciences Publications, Ltd full text link eLife Sciences Publications, Ltd Free PMC article
Cite
Send To

NCBI Literature Resources

MeSHPMCBookshelfDisclaimer

The PubMed wordmark and PubMed logo are registered trademarks of the U.S. Department of Health and Human Services (HHS). Unauthorized use of these marks is strictly prohibited.


[8]ページ先頭

©2009-2025 Movatter.jp