Movatterモバイル変換


[0]ホーム

URL:


Skip to main content
                                  NCBI home page
Search in PMCSearch
As a library, NLM provides access to scientific literature. Inclusion in an NLM database does not imply endorsement of, or agreement with, the contents by NLM or the National Institutes of Health.
Learn more:PMC Disclaimer | PMC Copyright Notice
Genetics logo

Evolution of Recombination Due to Random Drift

N H Barton*,Sarah P Otto†,1
*School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom
Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4
1

Corresponding author: Department of Zoology, University of British Columbia, 6270 University Blvd., Vancouver, BC, Canada V6T 1Z4. E-mail:otto@zoology.ubc.ca

Communicating editor: M. K. Uyenoyama

Received 2004 Jun 24; Accepted 2005 Jan 10.

Copyright © 2005, Genetics Society of America
PMCID: PMC1449609  PMID:15687279

Abstract

In finite populations subject to selection, genetic drift generates negative linkage disequilibrium, on average, even if selection acts independently (i.e., multiplicatively) upon all loci. Negative disequilibrium reduces the variance in fitness and hence, by Fisher's (1930) fundamental theorem, slows the rate of increase in mean fitness. Modifiers that increase recombination eliminate the negative disequilibria that impede selection and consequently increase in frequency by “hitchhiking.” Thus, stochastic fluctuations in linkage disequilibrium in finite populations favor the evolution of increased rates of recombination, even in the absence of epistatic interactions among loci and even when disequilibrium is initially absent. The method developed within this article allows us to quantify the strength of selection acting on a modifier allele that increases recombination in a finite population. The analysis indicates that stochastically generated linkage disequilibria do select for increased recombination, a result that is confirmed by Monte Carlo simulations. Selection for a modifier that increases recombination is highest when linkage among loci is tight, when beneficial alleles rise from low to high frequency, and when the population size is small.


RECOMBINATION breaks down the linkage disequilibrium generated between genes. In the absence of mutation and in a constant environment, the linkage disequilibrium present in a population will largely be a product of selection favoring specific gene combinations. In this case, one would expect that recombination would be detrimental and that genes that modify recombination rates would evolve to reduce recombination. This intuition is confirmed by the “reduction principle”: in models at equilibrium under viability selection, modifier alleles that reduce recombination are favored in the absence of mutation (Feldman 1972;Feldmanet al. 1980;Altenberg and Feldman 1987). How then is recombination maintained at appreciable levels in most higher organisms? In this article, we explore one possibility: that recombination is favored because it reduces the linkage disequilibrium generated stochastically in a population by random genetic drift.

For recombination to have an influence on the process of evolution, linkage disequilibria must exist within the genome. Although disequilibria can be generated by stochastic as well as deterministic forces (Felsenstein 1988;Kondrashov 1993), most work on the evolution of recombination has focused on deterministic models that assume an infinite population size. In a single unstructured population of infinite size, the spread of modifier alleles that increase recombination has been observed under three selective regimes. When epistatic interactions fluctuate over time, recombination can be favored, but only if fluctuations occur over a timescale of a few generations, such that linkage disequilibria that were initially advantageous soon become disadvantageous (Charlesworth 1976,1990;Maynard Smith 1978;Barton 1995a). When there is directional selection, increased recombination rates can evolve under a wider range of conditions: essentially, there must be weak negative epistasis among favorable alleles (Maynard Smith 1980,1988;Charlesworth 1993;Barton 1995a). Weak negative epistasis implies that the fitness of a genotype with several advantageous alleles is slightly less than would be expected from the product of the fitnesses of each allele measured separately. In this case, selection generates negative linkage disequilibrium between favorable alleles, which reduces the genetic variance present in the population. Recombination destroys this disequilibrium, increasing the variance in the population and improving the response of the population to selection. Finally, purifying selection against recurrent deleterious mutations also favors the evolution of recombination (Feldmanet al. 1980;Kondrashov 1982,1984,1988;Barton 1995a;Otto and Feldman 1997). Again, increased recombination is favored if there are weak negative epistatic interactions among mutations, implying that the mutations interact synergistically to cause a further than expected reduction in fitness.

While negative epistasis can provide the raw material (negative linkage disequilibrium) for recombination to be favored at a modifier locus, it is not clear why epistasis should generally be negative—that is, why fitness interactions should impede selection. Experimental evidence for negative epistasis among random mutations is limited (Rice 2002). Although negative epistasis has been observed in some cases (e.g.,Mukai 1969 inDrosophila melanogaster;de Visseret al. 1996 inChlamydomonas moewusii; but seeWestet al. 1998), positive epistasis has been observed in other cases (e.g.,Seager and Ayala 1982,Seageret al. 1982 inD. melanogaster; but seeCharlesworth 1990). Further studies inEscherichia coli (Elena and Lenski 1997) andAspergillus niger (de Visseret al. 1997) have found significant evidence for both positive and negative epistasis but without a preponderance of either form. On theoretical grounds, negative epistasis has been invoked to account for the survival of populations in the face of a high rate of deleterious mutations. In the absence of epistasis, mean fitness would be exp(−U), whereU is the genome-wide rate of deleterious mutations. Thus, mean fitness would be quite low in organisms with a high genomic mutation rate. The mean fitness of a population can be increased, however, in sexual populations if deleterious mutations interact synergistically (i.e., exhibit negative epistasis;Kimura and Maruyama 1966;Kondrashov 1982,1988). Yet synergistic interactions need not be invoked to explain the persistence of populations if the rate of deleterious mutation is low (Keightley 1997;Keightley and Caballero 1997;Keightley and Eyre-Walker 2000), if compensatory mutations are common, or if selection is “soft” such that the mean fitness of a population has little impact on the number of surviving individuals (Turner and Williamson 1968). Another approach has been to try to predict the form of epistasis,a priori, from models of the underlying biological processes affecting fitness. For example, using metabolic control theory,Szathmary (1993) found that negative epistasis among mutations predominates when flux through a metabolic pathway is under stabilizing selection but that positive epistasis predominates when maximal flux is favored. At this stage, both the experimental evidence and theoretical arguments concerning the nature of interactions among mutations remain equivocal and form a weak basis for explaining the predominance of sex and recombination.

If negative epistasis fails to explain the evolution of sex and recombination, what other process might be responsible? For insight into this question, it is worth returning to some of the earliest arguments for the evolution of sex.Morgan (1913),Fisher (1930), andMuller (1932) argued that an important advantage of sex is that beneficial mutations that arise in separate individuals can be brought together within the same individual. In the absence of sex, only one lineage will leave descendants in the long term, and beneficial mutations in all other asexual lineages will be lost. Thus, the probability of fixation of a new beneficial mutation is severely limited in the absence of sex by the fate of the rest of the genome in which it arises.

The source of disequilibrium implicit in the words of Morgan, Fisher, and Wright is not epistasis but random sampling. When a beneficial allele first arises, it occurs on (“samples”) one particular genetic background and, unless recombination occurs, the fate of the beneficial allele is tied to the evolutionary fate of that background. If the population were infinitely large, this sampling problem would disappear because all possible combinations of alleles would be generated instantaneously by mutation. Simulations of finite populations byHill and Robertson (1966) demonstrated that the probability of fixation of beneficial alleles is limited by linkage among selected loci, a phenomenon now known as the Hill-Robertson effect (see alsoFelsenstein 1974). Furthermore, bothHill and Robertson (1966) andFelsenstein (1974) demonstrated that, even when epistasis is absent, the disequilibria that develop in a finite population tend to slow the spread of beneficial alleles, especially when recombination is rare. That is, even when fitness is multiplicative across loci, drift generates negative disequilibria among selected loci such that beneficial alleles at one locus become associated with deleterious alleles at other loci. In subsequent simulations,Felsenstein and Yokoyama (1976) found that modifiers that increase the rate of recombination spread in response to the negative disequilibria created by drift in the presence of selection.

The Hill-Robertson effect provides an alternative source for the negative linkage disequilibria that cause indirect selection for modifiers that increase recombination. The way in which random drift generates negative associations is most easily seen by considering two beneficial mutations that arise at different loci in the same generation. LetD measure the gametic linkage disequilibrium between the two loci in a diploid population of sizeN. Initially, the expected linkage disequilibrium is zero, because the very high chance that the alleles arise on different chromosomes [D = −1/(2N)2 with probability 1 − 1/(2N)] is counterbalanced by the extremely small chance of a strong positive association if the mutations arise in coupling [D = (1 − 1/(2N))1/(2N) with probability 1/(2N)]. Consider what would happen to the disequilibrium in the absence of recombination assuming that the mutations rise deterministically in frequency after their appearance. At first, both the negative and the positive disequilibria grow exponentially as the beneficial alleles increase within the population. Eventually, however, the growth of the positive disequilibrium slows down as the coupled beneficial alleles reach fixation. That is, beneficial alleles that arise in coupling rise rapidly and fix within the population, leading to the disappearance of the positive disequilibrium. The negative disequilibrium persists for a much longer period of time, until one or the other allele becomes fixed. Therefore, with multiplicative selection, the variance in disequilibrium present in the first generation ultimately leads to negative disequilibrium, on average.

This heuristic argument applies not only when disequilibrium is generated by the random appearance of mutations on particular genetic backgrounds, but also more generally when disequilibrium is generated by random genetic drift (Qureshi 1963;Latter 1965;Hill and Robertson 1966;Felsenstein 1974). As shown by Hill and Robertson, negative disequilibrium accumulates, on average, whenever genetic drift occurs in the presence of multiplicative selection. Although drift can generate both positive and negative disequilibria, which cancel out on average, selection does not act symmetrically upon the disequilibria thus created. Negative associations persist for longer simply because such associations (“good alleles on bad genetic backgrounds”) impede selection. Conversely, positive associations are relatively transient, as selection fixes the best combinations (good with good) and eliminates the worst ones (bad with bad). When averaged across replicates or loci or time, the overall level of disequilibrium expected between selected loci becomes more negative than expected in an infinite population. As we shall see, this process can be significant even in fairly large populations as long as the allele frequencies are initially low and subject to drift.

Analyzing the Hill-Robertson effect is made difficult by the fact that it requires a stochastic model tracking the frequencies of several chromosomes, each of which changes over time in a nonlinear fashion. While difficult, incorporating stochasticity into models of the evolution of sex and recombination is essential, because the fate of a modifier allele depends critically on whether or not drift is included (Felsenstein and Yokoyama 1976;Otto and Barton 1997,2001). Previous analytical results have used branching processes with multiplicative selection to examine the probability that beneficial alleles are lost or fixed from very large populations.Barton (1995b) calculated the amount by which recombination increases the probability of fixation of beneficial alleles. We extended this analysis to determine the extent to which increased recombination would be favored at a modifier locus (Otto and Barton 1997). Because beneficial mutations are more likely to fix when associated with modifier alleles that increase recombination, these modifier alleles get dragged along as the beneficial alleles spread through the population. This process of genetic hitchhiking causes the modifier alleles, and hence the recombination rate, to increase within the population.

These analyses focused only on the linkage disequilibrium generated by the random occurrence of mutations on specific genetic backgrounds. Even when beneficial alleles are sufficiently common that their fixation is assured, however, random fluctuations in genotype frequencies generate linkage disequilibria among selected loci. Indeed, in our computer simulations, linkage disequilibria generated among beneficial mutations after they first arose affected the dynamics of a modifier of recombination (Otto and Barton 1997). Thus, we must consider the dynamics of disequilibrium generated by random genetic drift over the entire time course as beneficial alleles spread through a population. Because drift tends to generate negative linkage disequilibria, on average (with good alleles residing on bad genetic backgrounds), modifier alleles that increase the rate of recombination bring together good alleles from different individuals, generating indirect selection for recombination. We showed through simulation that this indirect selection is often stronger than the indirect selection for recombination generated by epistasis in populations of small (2N = 100) to intermediate (2N = 10,000) size (Otto and Barton 2001). More importantly, modifiers that increased the frequency of sex and recombination spread, on average, across a broad range of epistasis, from weak negative epistasis to positive epistasis, when drift was incorporated.

While previous analyses have shown that fluctuations in linkage disequilibrium can substantially increase the effects of a favorable allele on linked neutral loci (Stephanet al. 1992;Barton 1998), we lack an analytical framework to predict the dynamics of disequilibrium between selected loci in the presence of drift and selection. This article develops such a framework, allowing us to track fluctuations in linkage disequilibrium during the spread of beneficial alleles and to measure the impact on a modifier allele of recombination.

Our analysis makes two crucial simplifications: that the population size (N) is large enough that only leading terms (O(1/N)) need be included and that no allele is very rare. Furthermore, because we wish to concentrate on stochastic effects, we assume multiplicative fitnesses throughout. Including epistasis would induce linkage disequilibria and selection on recombination even in an infinite population. These simplifications allow us to approximate the full stochastic dynamics with recursion equations giving the mean and variance for the allele frequencies and disequilibria over time. Approximate solutions to these recursions are then obtained under the assumption of weak selection. This analysis shows that genetic drift creates variance in linkage disequilibrium, which in turn generates negative linkage disequilibrium, on average. The amount of linkage disequilibrium is larger when the beneficial alleles start at low frequency, when selection favoring the alleles is strong, and when linkage is tight. Under these circumstances, a small amount of linkage disequilibrium generated by random genetic drift while the alleles are rare is amplified by selection and only slowly destroyed by recombination. Finally, we determine the amount of indirect selection for recombination generated by drift in the presence of selection by examining evolutionary changes at a third locus that modifies recombination rates. By destroying the negative disequilibrium built up by drift in the presence of directional selection, a modifier allele that increases recombination rates improves the response of its carriers to selection and rises in frequency as a consequence.

TWO-LOCUS MODEL

We begin by examining the linkage disequilibrium generated in a model with two loci,j andk, separated by a recombination rate ofrjk. We count genotypic frequencies immediately after meiosis and assume that multiplicative viability selection acts within the population. At locusj, allelesQj,Pj segregate at frequenciesqj andpj and are assumed to have relative fitnesses (1 −sj/2):(1 +sj/2), respectively. Similarly, allelePk has a selective advantage ofsk over alleleQk at locusk. This assumes that either the population is haploid or fitnesses in a diploid population are multiplicative within as well as between loci [i.e.,QQ:PQ:PP have fitnesses (1 −s/2)2:(1 −s2/4):(1 +s/2)2]. This notation is consistent with that ofBarton and Turelli (1991) andBarton (1995a) and simplifies the algebra.

In an infinite population, if gametic linkage disequilibrium (Djk = freq(PjPk)freq(QjQk) − freq(PjQk)freq(QjPk)) is initially absent, it remains zero under multiplicative selection (Maynard Smith 1968). In this case, the allele frequencies at each locus increase logistically. Specifically, the ratiop/q increases by a factor (1 +s/2)/(1 −s/2) every generation or by approximately exp(st) aftert generations. The technique that we develop in this article assumes that the trajectory of allele frequencies closely follows the deterministic equations for this model but that small perturbations from this trajectory are caused by random drift. We develop recursion equations for the first and second moments of these perturbations to determine the combined effects of drift and selection over time.

Fluctuations around the deterministic trajectory:

We assume a discrete-generation model in which a population is chosen by random sampling from the propagules produced by the previous generation. The frequencies of the chromosomes among the propagules are given by the deterministic recursions for either a haploid population (with selection on haploids, followed by random mating and meiosis, to produce haploid propagules) or a diploid population (with selection on diploids, followed by meiosis and random mating to produce diploid propagules). From these propagules, either 2N haploid orN diploid juveniles are sampled with replacement, implying that the chromosome frequencies follow a multinomial distribution. We census the population immediately after sampling occurs.

Random sampling in any particular generation causes genotype frequencies to differ from that expected on the basis of the propagule frequencies. We measure the perturbations generated by a single round of sampling by the random variables ζj, ζk (for perturbations in the allele frequencies at locij,k) and ζjk (for perturbations in the linkage disequilibrium). The exact equations for change in the allele frequencies and linkage disequilibrium due to selection, recombination, and random genetic drift are given inappendix A.

The moments of the multinomial distribution can be used to determine the expected values of the perturbations generated by a single round of random sampling, as well as their variances and covariances. For instance, because random genetic drift does not cause any directional change in allele frequency,Ej] = 0, but the allele frequency will vary around its deterministic trajectory by an amount that is inversely proportional to the population size with Var[ζj] =pjqj/(2N). The first and second moments for the perturbations are given inappendix A. Higher-order moments are of the order 1/N2 or smaller and are ignored in our analysis.

To describe the dynamics over multiple generations, we must keep track of the cumulative deviations from the deterministic trajectory, which we do using the vectorz = (δpj, δpk, δDjk). The random variable δpj measures the difference between the actual allele frequency at any point in time and the allele frequency predicted in the absence of random genetic drift. Similarly, δDjk measures the difference between the actual linkage disequilibrium and the disequilibrium predicted deterministically. Throughout, we assume that the perturbations inz remain small and keep only the leading-order terms.

In the text, we assume that there is no epistasis and that the initial level of disequilibrium, if present, is of the orderN−1. Under these assumptions, the disequilibrium predicted deterministically can be set to zero, so that δDjk measures the actual amount of disequilibrium in the presence of drift. The method can be extended, however, to account for epistasis or for strong initial disequilibrium by keeping track of perturbations from the deterministic trajectory forDjk (seeappendix A).

Recursions for cumulative effects of drift in the presence of selection can be written asz* =f(z) and can be determined from (A2). Assuming that the perturbations are small,z* can be expanded in a Taylor series,

graphic file with name M1.gif1

where the subscripts (a,b,c) are set toj ork when referring to deviations in the allele frequencies and tojk when referring to the disequilibrium. The partial derivatives in (1) measure the sensitivity of each recursion to the perturbations caused by random sampling. They were derived from the recursions (A2) using the computer algebra packageMathematica (Wolfram 1991) and are given in supplementary information (S1;http://www.genetics.org/supplemental/). Note that all partial derivatives are evaluated along the deterministic trajectory (z =0) and are therefore independent of the perturbations. Taking the expectation of (1) gives

graphic file with name M2.gif2

Similarly, second moments are given by the expectation of the product of (1) with itself:

graphic file with name M3.gif3

Contributions to the first- and second-order moments arise by drift and are ofO(N−1) as long as the allele frequencies are not too small (appendix A). In contrast, starting from a population with a specific genotypic composition, higher moments are initially absent, are generated by drift only toO(N−2), and are henceforth ignored. Note that the recursionequation (3) forEz*az*b also describes the change in the covariance,Inline graphic, becauseEz*aEz*b isO(N−2) and can be ignored. Similarly,Ez*a2 describes the change in the variance, Varz*a, of the perturbations around the deterministic trajectory. Finally, theE[zcζa] in (3) are alsoO(N−2) and are dropped.

Having dropped terms ofO(N−2) in (2) and (3) and using the fact that many of the partial derivatives in supplementary information (S1) equal zero, fluctuations in the allele frequencypj around the deterministic trajectory satisfy the following recursions:

graphic file with name M5.gif4a
graphic file with name M6.gif4b

Fluctuations in the allele frequencypk are described by analogous equations, with subscriptsj andk interchanged. Of greater relevance to the evolution of recombination are fluctuations that occur in the disequilibrium, δDjk, around the deterministic equilibrium ofDjk = 0:

graphic file with name M7.gif5a
graphic file with name M8.gif5b
graphic file with name M9.gif5c

Eδp*kδD*jk is given by (5c), with subscriptsj andk interchanged. These equations apply even if selection coefficients vary through time, provided that there is no frequency dependence. Frequency-dependent selection could be modeled in a similar fashion, taking into account that the partial derivatives will contain additional components depending on the form of selection. With either constant or variable selection coefficients, recursions (4) and (5) can be evaluated numerically (Mathematica package available upon request) to predict the means, variances, and covariances of the perturbations over time.

The recursions for the perturbations can be further analyzed under the assumption that selection is weak (appendix B). Solution (B3) for the perturbations involving the disequilibrium is derived assuming that linkage is sufficiently loose relative to selection that the terms involving disequilibria reach a steady-state level over a faster timescale than the changes in allele frequencies; this steady-state level is known as the quasi-linkage equilibrium or QLE (Kimura 1965;Nagylaki 1993;Barton 1995a). Solution (B5) is derived assuming that both selection coefficients and recombination rates are small enough that the recursions can be well approximated by differential equations, which are then solved.

Development of disequilibria under directional selection:

We now consider how the above equations can be used to infer how disequilibrium develops in finite populations. Variance in linkage disequilibrium is generated directly by random drift, contributing thepjqjpkqk/2N term to (5b). As long as the direction of selection remains the same, all of the terms in (5c) are positive ∂δp*j/∂δDjk, ∂δp*k/∂δDjk, ∂δD*jk/∂δDjk > 0, indicating that a positive covariance between the frequency of each beneficial allele and linkage disequilibrium will develop as a result of this variance in linkage disequilibrium. This covariance arises because allele frequencies increase more rapidly in populations that happen to have positive linkage disequilibrium. The presence of both variance in disequilibrium and covariance between allele frequencies and linkage disequilibrium then causes the expected linkage disequilibrium (5a) to depart from zero. Because ∂2δD*jk/∂δD2jk is negative (appendix A), variance in disequilibrium across replicate populations leads, on average, to negative disequilibrium. Mathematically, this occurs because the mean fitness,Inline graphic, given by (A1) is higher with positive disequilibrium than with negative disequilibrium for any given set of allele frequencies. Consequently, the amount of positive disequilibrium gets divided by a larger quantity (Inline graphic2) in (A2b) and becomes relatively smaller in the next generation. Another, perhaps more intuitive, explanation, is that the expected disequilibrium across replicates becomes negative because selection is more efficient when good alleles are found on good genetic backgrounds (i.e., when disequilibrium is positive by chance), causing positive disequilibrium to decay more rapidly than negative disequilibrium whenever it occurs. The positive covariance generated between each allele frequency and the linkage disequilibrium leads the expected disequilibrium to become even more negative (∂2δD*jk/∂δpj∂δDjk, ∂2D*jk/∂δpk∂δDjk < 0). This occurs because those populations that happen to have positive disequilibrium and, consequently, faster rising allele frequencies (generating the positive covariance) also have higher mean fitness values in future generations. The linkage disequilibrium in future generations will then continue to be divided by a larger quantity (Inline graphic2) in (A2b), compounding the reduction in positive disequilibrium relative to negative disequilibrium. As a consequence of all of these effects, the expected value of linkage disequilibrium becomes negative even though it was initially zero and even though selection acts independently on the two loci.

Provided that allele frequencies are intermediate [i.e.,p,q areO(1)], the average amount of negative disequilibrium that develops is always proportional to 1/N: if the population size is doubled, the amount of disequilibrium generated by drift and selection is halved. When linkage is loose, disequilibrium does not accumulate appreciably over time. With tight linkage, however, substantial amounts of disequilibrium can accumulate within a population, especially when allele frequencies are initially low. This is a consequence of the fact that a small amount of disequilibrium generated by drift between rare alleles becomes a large amount of disequilibrium as the alleles approach a frequency of1/2. If the initial allele frequency is too small [O(1/N)], as is the case with a single favorable mutation, then higher-order terms in the Taylor series approximation (1) become appreciable and can no longer be ignored.

Figure 1 shows the distribution of allele frequencies over time in a population of size 2N = 10,000 when allelesPj andPk are initially at frequencyp0 = 0.01 (sj =sk =rjk = 0.1). The thick curve shows the mean trajectory (which is nearly equal to the deterministic trajectory), while the thin curves indicate the range within which allele frequencies lie 95% of the time as calculated from (4). These predictions were compared to simulation results based on a Monte Carlo simulation that sampled 2N chromosomes each generation using a multinomial distribution, with parameters equal to the frequency of each chromosome given by the deterministic recursions. Simulations (dots) indicate that (4) accurately captures the effects of drift on allele frequencies. Note that drift causes substantial variation in the trajectory of allele frequencies even in a population of size 10,000. This variation arises primarily during the early stages when only a small number of beneficial alleles are present (2Np0 = 100) and represents random accelerations or decelerations in the spread of favorable alleles. On average, beneficial alleles spread slightly less rapidly than expected in an infinite population, indicating that the Hill-Robertson effect is present, if small, in populations of size 10,000. The effects of drift become even more important when there are initially fewer alleles within the population (e.g., 2Np0 ≤ 10). In this case, however, iterating (4) and (5) overestimates the Hill-Robertson effect (results not shown), because higher-order moments in the perturbations are ignored and yet play an important role whenever allele frequencies are near zero.

Figure 1.—

Figure 1.—

Allele frequency change in a finite population. The thick curve shows the average frequency over time of the allelePj, with selective advantagesj = 0.1 and initial frequencypj0 = 0.01, in a population of size 2N = 10,000, usingEquation 4a to determine the departure from the deterministic trajectory (pj/qj =esjtpj0/qj0). Although drift causesPj to rise less rapidly on average than expected in an infinitely large population, the effect is small, shifting the trajectory to the right by an amount approximately equal to the thickness of the curve. We therefore use the deterministic trajectory to approximatepj throughout this article, which is a reasonable assumption provided thatNsj ≫ 1. The thin curves show the expected allele frequency ±2 standard deviations based on (4b) (a weak selection approximation generates indistinguishable results and is not shown). Dots show simulation results for the mean allele frequency ±2 standard deviations, based on 1,000,000 replicates.

Figure 2 shows how drift affects (a) the variance in disequilibrium scaled relative to the allele frequencies,EδD2jk/pjqjpkqk, (b) the correlation between δpj and disequilibrium,EδpjδDjk/Inline graphic, and (c) the average disequilibrium. The average magnitude of the linkage disequilibrium is greater when the initial frequency of the beneficial alleles is lower, because random associations generated in the early stages persist and are amplified as the beneficial alleles rise in frequency (compare dotted, dashed, and solid curves, which correspond to initial frequencies 0.001, 0.01, and 0.1). In this example, wheresj =sk =rjk = 0.1, the QLE approximation (B3) fails to capture the influence of the initial allele frequency, which is pronounced when selection is strong relative to recombination [e.g., the QLE prediction (B3a) for the scaled variance is constant at 0.00053]. On the other hand, the weak selection approximation (B5) overestimates the impact of initial fluctuations (compare thin to thick curves), although calculations with weaker selection (0.01) show that the approximation then agrees closely.

Figure 2.—

Figure 2.—

Effects of drift and selection on disequilibrium in a finite population under directional selection. (a) Variance in disequilibrium scaled topjqjpkqk; (b) correlation between effects of drift on allele frequency and on disequilibrium (Corr[δpjδDjk]); (c) average disequilibrium over time. In each case, the initial allele frequencies arepj0 =pk0 = 0.001 (dotted curves), 0.01 (dashed curves), or 0.1 (solid curves), where the latter curves are shifted to the right by 23.1 and 47.1 generations, respectively, so that allele frequencies are equivalent for all curves at any point in time. The thick curves give the exact calculations from (5), while the thin curves show the weak selection approximation (B5). Results are shown forsj =sk = 0.1,rjk = 0.1, and 2N = 10,000.

Figure 3 shows that the expected disequilibrium scales with 1/(2N) and matches the disequilibrium observed in simulations. The only exception is when the initial number of copies is very small (in this example, for 2Np0 = 10). With few initial copies of the beneficial alleles, three issues arise that are not accounted for by our analysis. First, the fluctuations around the deterministic trajectory become so large that the Taylor approximation toEquation 1 is no longer adequate. Second, with few initial copies, beneficial alleles do not necessarily fix within the population. Third, those beneficial alleles that do fix are more likely to be positively associated with other beneficial alleles. Therefore, during the first few generations, rare beneficial alleles are more likely to be lost when negative disequilibrium develops (thereby destroying the disequilibrium) and more likely to persist when positive disequilibrium develops. Therefore, the average disequilibrium among beneficial alleles that have survived loss while rare might very well become positive, at least during the initial spread of the beneficial alleles. There is some evidence for this effect within the simulations: in the fifth generation, significantly positive disequilibrium was found when 2N = 10,000 and initial frequencies were 0.001 but in no other case. These positive associations will counteract the accumulation of negative disequilibrium to some extent, explaining, in part, the smaller amount of disequilibrium observed for this case (Figure 3,p0 = 0.001).

Figure 3.—

Figure 3.—

Observed and expected disequilibrium under directional selection. The disequilibria expected from (5) (curves) and observed in simulations with 1,000,000 replicates (dots, with bars indicating ±2 standard errors) are shown for (a) 2N = 10,000 and (b) 2N = 100,000. Equation 5 accurately estimates the amount of disequilibrium generated except when there are only 10 initial copies of each allele (pj0 =pk0 = 0.001 and 2N = 10,000). Otherwise, the disequilibrium scales directly with 1/(2N). With only 10 copies initially, both beneficial alleles did not always fix, and we report only the disequilibrium observed in those cases in which both beneficial alleles remain. Remaining parameters and symbols are as inFigure 2.

Figure 4 shows the most extreme value of the expected linkage disequilibrium observed during the spread of allelesPj andPk as a function ofrjk. Results from Monte Carlo simulations are well approximated by iterating Equations 4 and 5 (compare dots to thick dashed curve forp0 = 0.01 and thick solid curve forp0 = 0.1). The QLE approximation (B3c; thin curve) is accurate only for loose linkage (rjk >sj,sk = 0.1) and fails to capture the dependence of disequilibrium on initial allele frequencies. With tight linkage, random fluctuations produced when small numbers of the favorable alleles are present persist to produce large fluctuations when the alleles become common. Consequently, the most extreme disequilibrium observed is very sensitive to starting allele frequencies when linkage is tight. Whenp0 = 0.01 andrjk < 0.01, drift and selection are substantial enough to drive the expected linkage disequilibrium to near its minimum value (−0.25), even though disequilibrium is initially absent and even though selection acts independently on all loci (sj =sk = 0.1).

Figure 4.—

Figure 4.—

The most extreme value of the average disequilibrium under directional selection, plotted on a log-log scale. During the spread of beneficial alleles, the expected disequilibrium from (5) (thick curves) becomes negative, taking on the minimum values shown. More extreme disequilibria are observed when the initial allele frequencies are low [comparepj0 =pk0 = 0.01 (dashed curves) topj0 =pk0 = 0.1 (solid curves)] and when recombination is rare, in which case the expected disequilibrium approaches the minimum value thatDjk may take (−0.25, corresponding to the bottom axis of the graph). The thin curve gives the most extreme value of the disequilibrium from the QLE approximation (B3c), which is accurate only whenrjk > 0.1. Dots show the most extreme value of the mean disequilibrium observed in 1,000,000 replicate simulations forrjk = 0.001, 0.01, and 0.1. X's represent a numerical evaluation of the weak selection approximation, (B5c). Remaining parameters and symbols are as inFigure 2.

Including a modifier of recombination:

The negative disequilibrium generated by drift in the presence of directional selection reduces the genotypic variability within a population and slows the response of that population to selection. A modifier allele that increases recombination among the selected genes will regenerate the genetic variation in fitness hidden in linkage disequilibrium. In particular, individuals carrying such a modifier allele are more likely to produce offspring of the fittest genotype. This effect can select for increased recombination at modifier loci, an effect examined in this section. Suppose that a modifier of recombination segregates at locusi. We assume that the modifier has no direct effect on fitness and alters recombination rates by a small amount δr. If terms ofOr2) are neglected, then the only significant terms involve the change in recombination between the selected loci (denoted δrj,k|i). The modifier may also alter its own rates of recombination with other loci (δri,j|i, etc.) and may show dominance (δrj,k|i,i). However, these effects are negligible for weak modifiers (Barton 1995a).

The equations for the effects of drift on the modifier frequency and its associationsDij,Dik,Dijk are given by (A2c–e). Disequilibria are measured as central moments of effects (Barton 1995a). For two loci, this definition coincides with the standard two-locus measure of gametic disequilibrium. For three loci,Dijk is equal to ∑g[X](Xipi)(Xjpj)(Xkpk), where the sum is taken over all chromosomes, each at frequencyg[X], whereX is a vector indicating whether or not an allele is present at each locus (i.e.,Xi is 0 if allelePi is absent and 1 if it is present). A modifier allele that increases recombination between locij andk becomes associated with the fittest genotype,PjPk, if that genotype is underrepresented within the population (i.e., ifDjk is negative). This association is represented by the first term in (A2e), which generatesDijk in proportion to −δrj,k|iDjk. This three-way association then generates positive two-way associations,Dij andDik, between a modifier allele that increases recombination and the beneficial alleles at locij andk (A2d). All three associations,Dij,Dik,Dijk, cause changes in the frequency of the modifier allele (A2c). Note that because the modifier is assumed to have a small effect, these equations are linear inDij,Dik,Dijk, and all will change in proportion to δrj,k|i.

Equations 2 and3 can be used to find the mean and variance of the three-locus system, as before. The first step is to find the first and second differentials of (A2c–e), such as ∂δp*i/∂δDjk. These differentials are calculated as in supplementary information (S1) and are available upon request. The second step is to calculate the expected values and the covariances for the perturbations involving the modifier by incorporating these differentials intoEquations 2 and3. Following this method and dropping terms ofO(N−2), the recursion describing the cumulative effects of drift and selection at locij andk on the frequency of a modifier allele at locusi is

graphic file with name M14.gif6

where φj = 1 +sj(pjqj)/2 and φk = 1 +sk(pkqk)/2. The expected change in the modifier is driven both by expectation termsEDij] and by covariances such asEpjδDij], each of which is proportional to δrj,k|i/(2N). The recursions for these expectation terms are complicated but can be calculated symbolically in a straightforward way from (2) and (3). These recursions may be iterated numerically or may be approximated by assuming that selection is weak as described in supplementary information (S2;http://www.genetics.org/supplemental/).

Assuming weak selection and loose linkage, the system approaches quasi-linkage equilibrium, and the per generation change in the frequency of a modifier at QLE can be found (supplementary information, S2). This gives a complicated function of the recombination rates (S2.4), which is proportional to δrj,k|ipiqiVjVk/2N, whereInline graphic andInline graphic are the additive genetic variances in diploid fitness at the two selected loci. Modifier alleles that increase recombination (δrj,k|i > 0) thus increase in frequency even in the absence of epistasis in finite populations.

Expression (S2.4) takes a simple form if there is no genetic interference and if recombination rates are small in absolute terms but large relative to the selection coefficients, as required for QLE (i.e.,sj,skrjk,rij,rik,rijk ≪ 1). Assuming, without loss of generality, that locusk is the right-most locus, we can describe the relative position of the modifier by defining α =rik/rjk; α varies between zero and one when the gene order isjik and is greater than one when the gene order isijk. Then, keeping only the leading-order term in (S2.4), the per generation change in the modifier becomes

graphic file with name M17.gif7a

where

graphic file with name M18.gif7b
graphic file with name M19.gif7c

Figure 5 illustratesg(α) and shows how evolution of a modifier of recombination depends on the relative positions of the loci, α. The functiong(α) rises rapidly near α = 0 and α = 1, where it is approximatelyg(α) ≈ 2/α2 and 2/(1 − α)2, respectively, although the QLE approximation breaks down if recombination rates become too small relative to selection. For gene orderjik,g(α) reaches a minimum value of 371/3, when the modifier is halfway between the selected loci. For gene orderijk,g(α) lies between 2/(1 − α)2 and 4/(1 − α)2 and decreases rapidly for relatively distant modifier loci.

Figure 5.—

Figure 5.—

The influence of the relative position of the modifier, α =rik/rjk, ong(α). According to the QLE approximation (7), the change in frequency of a modifier is proportional tog(α) (solid curve) when recombination rates are small but larger than the selection coefficients. The dashed curve compares this approximation to the full QLE result [the complicated functionf(r) in (S2.4) with no interference], which applies even when recombination rates are high.

Equation 7a demonstrates that the effect that drift and selection have on a modifier of recombination falls precipitously as recombination becomes more frequent within the genome (∼r−5jk). If selected loci are scattered randomly over a genome, then the effect of a modifier on tightly linked loci will dominate its dynamics. As recombination rates become comparable with the selection coefficients, however, the QLE assumptions break down, making it impossible to determine the average force on the modifier when selected loci are scattered randomly across a genome. Note that as selection and recombination decrease together toward zero in (7a), selection on the modifier approaches a positive limit, inversely proportional to population size (i.e., ifs,r, and δr are all small and of the same order, ΔEpi] ∼ 1/N). This limit appears puzzling, because the average frequency of the modifier should not change in the absence of selection. There is no contradiction here, because we have assumed throughout thats ≫ 1/N when dropping terms ofO(N−2). Thus, our approximations leading to (7) require thatN increases toward infinity ass decreases toward zero, causing ΔEpi] to decrease implicitly as selection weakens.

InFigures 69, we describe changes in the frequency of the modifier using the standardized function, βi = Δpi/(δrj,k|ipiqi), which we call theselection gradient acting on the modifier allele. When multiplied by the genetic variance in recombination within the population,Inline graphic, the selection gradient describes the expected increase in recombination (= 2Δpiδrj,k|i). The selection gradient can also be related to the effective selection coefficient,se, defined as the function that causes Δpi to equalsepiqi. For a modifier of effect δrj,k|i, the effective selection coefficient isse = βiδrj,k|i. It is important to recall that the modifier does not directly experience selection. Rather, the selection gradient measures the indirect selection arising from genetic associations between the modifier locus and the directly selected locij andk. To describe the total effect of selection at linked loci on recombination, we use thecumulative selection gradient, ∑βi, which represents the sum of βi from the initial generation up until a given generation, or thenet selection gradient, βi,net, which represents the sum of βi over the course of selection at locij andk until fixation of the beneficial alleles.

Figure 6.—

Figure 6.—

The cumulative selection gradient, ∑βi, on a modifier of recombination as a function of time under directional selection for (a) 2N = 10,000 and (b) 2N = 100,000. The selection gradient per generation, βi, is given by the rise in these curves over a single generation.Equations 2 and3 are used to generate the curves forpj0 =pk0 = 0.001 (dotted curves), 0.01 (dashed curves), and 0.1 (thick solid curves), where the latter two are shifted to the right as inFigure 2. Simulations (dots) are based on 10,000,000 replicates for a modifier that changes recombination by δrj,k|i = 0.05 and that starts at frequencypi0 = 0.5. Withpj0 =pk0 = 0.001 and 2N = 10,000, both alleles did not always fix, and we report changes in the modifier frequency conditional on fixation. The QLE prediction (S2.4) is shown as a thin solid curve and fails to account for the sensitivity to initial allele frequencies when recombination is not large relative to selection. Remaining parameters aresj =sk = 0.1 andrij =rjk = 0.1, with gene orderijk and no crossover interference.

Figure 9.—

Figure 9.—

The selection gradient on a modifier of recombination multiplied by the population size, 2Nβi, at steady state under sinusoidal selection, plotted on a log-log scale. Thex-axis is the recombination rate between adjacent loci, assumed to lie in the orderijk, withr =rij =rjk. The thick curve illustrates the predicted change in the modifier from iteration of (2) and (3). The thin solid curve illustrates (8), which is a low-recombination approximation to the recursion equations. The long dashed curve illustrates the QLE solution (S2.4). The dot marks the simulation result reported inFigure 8a. Parameter values were set to αj = αk = 0.1,pj0 =pk0 = 0.5, with (a) period τ = 120 across a range of recombination rates and (b) a range of periods, τ, withrτ = α = 0.1. The per generation change in the modifier (thick curves) is much higher when the period of fluctuating selection is longer; in this case, alleles pass through both low and high frequency within each period, increasing the fluctuations in disequilibrium that drive selection for the modifier.

As a consequence of the fact that disequilibrium is more likely to develop by drift when allele frequencies are initially low at selected loci (Figure 3), the selection gradient acting on a modifier is larger under these conditions, as shown inFigure 6. Simulation results for the change in frequency of the modifier are in general agreement with iteration of (2) and (3), even though the assumptions of weak selection and a weak modifier are violated in the simulations (wheresj =sk = 0.1 and δrj,k|i =rjk/2). The largest discrepancies are observed when the initial number of alleles is low (e.g., when 2N = 10,000 andp0 = 0.001), such that the perturbations due to drift become larger than allowed by our analysis. It is worth emphasizing that the simulation results presented inFigure 6 represent the mean ± standard errors based on 106 replicates. Consequently, the standard deviations are enormous (multiply the length of the bars by 1000). Thus, while our analysis accurately predicts the mean change in frequency of a modifier, the effect of one bout of selection on a modifier is highly variable with only two selected loci and one modifier locus. Only when many loci affect fitness and recombination does the frequency of a modifier of recombination increase consistently (Otto and Barton 2001;Ileset al. 2003).

The net selection gradient, βi,net, acting on a modifier over the entire time course of substitution at two linked loci is shown inFigure 7. The net increase of the modifier can be large with tight linkage: for example, withrij =rjk = 0.01, Δpi = 38piqiδrj,k|i. For very tight linkage, the change in the modifier is underestimated by iteration of (2) and (3) because the disequilibria caused by drift become substantial and higher-order effects begin to contribute [e.g., forrjk = 0.001 andrij = 0.01, Δpi = 334piqiδrj,k|i from simulations but only 75piqiδrj,k|i from (2) and (3)]. The QLE approximation (S2.4) is fairly accurate for loose linkage (rij,rjk >sj,sk = 0.1) but substantially overestimates the change in modifier frequency for tight linkage, as shown by the thin lines inFigure 7.

Figure 7.—

Figure 7.—

The net selection gradient, βi,net, on a modifier of recombination as a function of the recombination rates under directional selection, plotted on a log-log scale.Equations 2 and3 are used to generate the thick curves forrij = 0.01, 0.1, and 0.5 (from top to bottom). The thin curves give the net change in modifier frequency using the QLE approximation (S2.4) for the same range ofrij. Dots show simulation results based on 1,000,000 replicates forrjk = 0.001, 0.01, and 0.1 andrij = 0.01 and 0.1 (from top to bottom; standard errors were too large forrij = 0.5 to assess the effect). In the simulations, the modifier changes recombination by an amount δrj,k|i =rjk/2 and starts at frequencypi0 = 0.5. The gene order isijk, and there is no interference. Remaining parameters aresj =sk = 0.1,pj0 =pk0 = 0.01, and 2N = 10,000, with gene orderijk and no crossover interference.

Fluctuating polymorphisms:

The analysis described in this article can be used whether selection coefficients are constant or vary over time, as long as the population size is reasonably large and the numbers of each allele are never very small. With directional selection, indirect selection on a modifier ceases with the fixation of beneficial alleles. With fluctuating selection, on the other hand, polymorphism at the selected loci can be maintained over longer periods of time, prolonging selection for increased recombination. Even in the absence of epistatic interactions among loci and even when a population is initially in linkage equilibrium, drift in a finite population subject to fluctuating selection will lead to the accumulation of disequilibrium that reduces genetic variability in fitness. We have usedEquations 2 and3 to track the change in modifier frequency in a population subject to sinusoidal fluctuations in selection at two loci withsj[t] = αjCos(2π ·t/τ) andsk[t] = αkCos(2π ·t/τ + Φ), where τ is the period (assumed to be the same for both loci) and Φ measures the shift in phase of selection at the two loci (Figure 8). Note that when the direction of selection changes, the selection gradient on the modifier weakens (Figure 8a) or becomes negative (Figure 8b). Nevertheless, recombination rates rise over successive cycles.

Figure 8.—

Figure 8.—

The cumulative selection gradient, ∑βi, on a modifier of recombination under fluctuating selection. The strength of selection on locij andk varies sinusoidally over time, with the maximum strength of selection set to αj = αk = 0.1 and with both loci in phase. The gene order isijk, there is no interference, and δrj,k|i =rjk/2.Equations 2 and3 are used to generate the expected change in the modifier (solid curves) starting withpi0 =pj0 =pk0 = 0.5 in a population of size 2N = 100,000. Dots show simulation results based on 20,000,000 replicates (±2 standard errors). (a) Period (τ) is 120 generations,rjk = 0.1, andrij = 0.1. (b) Period (τ) is 60 generations,rjk = 0.02, andrij = 0.01.

The QLE result (S2.4) continues to apply when selection fluctuates over time, as long as recombination is frequent relative to the strength of selection and to the changes over time in selection (sj[t],sk[t], 1/τ ≪rij,rjk,rik,rijk). Under sinusoidal selection, the product of the additive genetic variances in fitness at the two loci (VjVk) is ∼Inline graphicα2jpjqjα2kpkqk if the allele frequencies do not change substantially over a cycle. This result indicates that, when linkage is loose, indirect selection on a modifier will be stronger when the allele frequencies remain near1/2, such that the genetic variance in fitness remains high.

For weak selection and very tight linkage (rτ ≪ 1), we can obtain an approximate solution to the recursion equations (S2.2) and (S2.3). Assuming that several cycles of fluctuating selection have passed such that the dynamical system has reached a steady state, the solution to these recursions may be approximated as described in supplementary information (S3;http://www.genetics.org/supplemental/). The change in the modifier per generation averaged over one cycle is, to leading order in the recombination rates,

graphic file with name M22.gif8

whereHjk is the harmonic mean value ofpjqjpkqk, Varpj and Varpk measure the variance in the rate of allele frequency change over the cycle, and Covpj,pk measures the covariance in these rates at the two loci:

graphic file with name M23.gif9a
graphic file with name M24.gif9b

Figure 9 comparesEquation 8 to the iterated solution of (2) and (3) as well as to the QLE solution.Equation 8 is accurate only when selection is approximately the same strength as the product of the recombination rates and the period (α ≈rτ as inFigure 9b), which reflects the order assumptions made in supplementary information (S3). In other cases, the iterated solution of (2) and (3) should be used.

As with the QLE solution (S2.4),Equation 8 indicates that the change in allele frequency at a locus that modifies recombination is proportional to the effect of the modifier on recombination (δrj,k|ipiqi), the inverse of the population size (2N), and the square of the selection coefficients at each selected locus (through Varpj Varpk). Unlike the QLE solution for loose linkage,Equation 8 is not maximized when the allele frequencies remain near1/2, because the harmonic mean of the allele frequencies in the denominator is much smaller when the allele frequencies approach zero or one. Consequently, with tighter linkage, selection on a modifier is maximized when there are high-amplitude fluctuations in allele frequencies, such that beneficial alleles reach low frequencies (where drift is stronger) as well as intermediate frequencies (where selection on a modifier is more effective). Selection on the modifier is also strongest when changes in allele frequency at the two loci occur at similar times, maximizing Covpj,pk2.

Our analysis indicates that there are two qualitatively different regimes with respect to selection on a modifier of recombination, depending on whether recombination rates are high relative to selection [the QLE regime, (S2.4)] or not [the tight linkage regime, (8)]. In the former case, the disequilibrium generated by drift dissipates rapidly, so that disequilibrium can be approximated as the consequence of recent selection (within a time frame proportional to 1/r), which is well captured by the current additive genetic variance in fitness. When recombination rates are weak relative to selection, however, disequilibrium generated while alleles are at low frequency persists as the alleles rise to higher frequency, and the current additive genetic variance in fitness no longer determines the change in frequency of the modifier. Instead, the force on the modifier is strongest whenever the harmonic mean allele frequency at the two loci,Hjk, is small yet the selected alleles spend time at intermediate frequency (with high Varpj, Varpk, and Covpj,pk). We also see a qualitatively different relationship between the recombination rates and the change in the modifier frequency in these two regimes.Equation 8 shows that a modifier of recombination changes in frequency by an amount proportional tor−3 for tight linkage, while the QLE analysis (see Equations 7) predicts a change that is proportional tor−5. Thus, while both analyses predict that selection for recombination is strongest under tight linkage, the selection gradient is overestimated by the QLE analysis (Figure 9).

DISCUSSION

By breaking down linkage disequilibria that limit the response of a population to selection, sex and recombination can hasten the rate of adaptive evolution of a population. Consequently, genotypes that are best adapted to a novel environment are more likely to be recombinant and more likely to carry modifiers that increase the recombination rate. As these adaptive genotypes sweep through a population, modifier alleles that increase recombination also rise in frequency. Disequilibria do not, however, always limit the response of a population to selection. In fact, the genetic variance for a trait and consequently the response to selection are increased by positive disequilibria among alleles contributing to that trait. If positive disequilibria predominate, recombination is selected against, because the genetic variance and the rate of adaptation are lower among recombinant individuals. In short, a necessary condition for the evolution of increased recombination in a single unstructured population is that negative disequilibria predominate among favored alleles.

Within an unstructured population, negative disequilibrium is expected to arise under two different scenarios. First, negative disequilibrium develops when there is negative epistasis among selected loci, implying that a genotype carrying multiple beneficial alleles has a lower fitness than the product of the fitnesses of the genotypes carrying only one of the beneficial alleles. Second, negative disequilibrium arises by the combined action of random genetic drift and selection, even in the absence of epistasis. Sampling error during reproduction of a finite population causes some genotypes to become more or less common than expected, generating variance in disequilibria. Selection eliminates positive disequilibria more efficiently than it eliminates negative disequilibria, because negative disequilibrium reduces the genetic variance upon which selection acts. Consequently, the average disequilibrium over loci or over replicate populations becomes negative over time.

In this article, we have demonstrated that the random genetic drift caused by sampling in finite populations selects for increased recombination even in populations that are initially in linkage equilibrium and that are subject to selection acting independently upon all loci. Our method keeps track of the perturbations in genetic associations caused by genetic drift and acted upon by selection, using recursion equations for the first and second moments of these perturbations. The main assumptions of the method are that (a) recombination is altered by only a small amount by a modifier gene, (b) the population is large, and (c) each allele is initially present in several copies. Although we have simplified the analysis by ignoring epistasis, the method can be extended to include epistasis by explicitly tracking the deterministic trajectory for the disequilibriumDjk. Using recursions for the moments describing the distribution of a population around its expected trajectory, the effects of drift and selection can be easily and efficiently studied. Simulations of selection in finite populations require large amounts of computer time, particularly since millions of replicates are needed to obtain accurate measurements of the frequency of a modifier. The simulations that we performed (see figures) indicate thatEquations 2 and3 provide accurate estimates of the average change that occurs at a modifier locus, as long as alleles are always present in multiple copies. It should be kept in mind that we used strong selection (s = 0.1) and a strong modifier (δrj,k|i =rjk/2) in the simulations to magnify the amount of indirect selection acting on the modifier locus. The approximations should be more accurate with weaker selection and weaker modifiers, although it then becomes exceedingly difficult to detect an effect on the modifier with only two selected loci and one modifier locus. Nevertheless, the method provides a promising route to predicting the evolution of recombination when multiple loci of weak effect underlie the selected trait, using the approximation that each pair of selected loci independently asserts an influence on the modifier(s) (seeBarton 1995a).

We have calculated the expected change in frequency of the modifier caused by random linkage disequilibria with selected loci, presented in the figures using the selection gradient, βi = Δpi/(δrj,k|ipiqi). For a modifier of given effect, the effective selection acting on a modifier of recombination isse = βiδrj,k|i. Although absolute values ofse are small, they can still be much stronger than the effect of random drift, since 2Nse can be large. For example, fors = 0.1,rij =rjk = 0.1, 2N = 10,000,p0 = 0.01, the effective selection is at mostse = 0.008 δrj,k|i (on the basis of the simulations reported inFigure 6a), but 2Nse is 80 δrj,k|i. Even stronger selection for recombination is expected when beneficial alleles start at lower frequency and/or when recombination rates are lower than the selection coefficients (seeFigures 7 and9 andEquation 8). This argument is complicated by the extra stochasticity introduced by the hitchhiking process itself, which reduces the effective population size witnessed by the modifier locus; this effect can be substantial when the rate of selective sweeps is comparable to the recombination rate (Barton 2000, Equation 4).

The QLE approximation for the change in the modifier (7, or more accurately S2.4) shows that when selection is weaker than recombination, the advantage of recombination is proportional to the product of the additive variance in fitness at the directly selected loci and decreases steeply with recombination. This implies that when we average over selective sweeps that are scattered across the genome, the main contribution will come from selection on loci that are most closely linked to the modifier. We cannot use the QLE results to calculate this average, however, because the QLE approximation breaks down for tight linkage. However, it is possible to calculate this average assuming weak selection using integral solutions similar to (A2.4) and (A2.5); this leads to an expression proportional to the square of the total variance in fitness (N. H. Barton, unpublished results).

By comparingFigure 6a and 6b, it can be seen that a modifier allele that increases recombination rates is most strongly selected in smaller populations. Although our approach breaks down when the population size becomes so small that some alleles are present in only a few copies (specifically, approximatingEquation 1 to order 1/N becomes inadequate), we have shown through simulation that randomly generated disequilibria generate strong selection for increased recombination in populations of small to intermediate size (Otto and Barton 2001). In fact, we found that modifiers of recombination were more influenced by disequilibria generated by drift than by disequilibria generated by epistasis unless the population size was fairly large. Directional selection experiments, such as the one performed byKorol and Iliadi (1994) inD. melanogaster (seeOtto and Barton 2001 for further review), have demonstrated that recombination rates can increase substantially in small populations subject to strong selection. Our work has shown that the evolution of recombination in such cases can be explained by stochastically generated disequilibria that, over time, accumulate to oppose the action of selection. Furthermore, the stochastic advantage of recombination can even favor the evolution of sex in the face of a twofold cost of sex as long as modifiers increase the amount of sex by small amounts (Otto and Lenormand 2002).

For large populations, our results indicate that selection on recombination should be inversely proportional to the population size. Thus, while drift in the presence of selection can account for increased recombination rates in populations of small size, it would appear to provide a weak basis for explaining the maintenance of sex and recombination in natural populations whose effective sizes are tens of thousands or more. There are three reasons, however, to believe that drift might drive the evolution of sex and recombination even in large populations. First, if population sizes tend to be small when conditions are harsh and selection is strong, the advantage of breaking down randomly generated disequilibria could drive the evolution of sex and recombination during those rare periods in time that selection is intense. Second, the cumulative selective force on a modifier of recombination might be substantial if there are many loci responding to selection within the genome. Indeed, simulation results reported byIleset al. (2003) demonstrate that selection acting on a modifier of recombination increases as a function of the number of selected loci, even when the initial additive genetic variance in fitness is held constant. Third, and most importantly, the forces described in this article can be substantial even in very large populations as long as those populations are spatially structured (seeMartinet al. 2005). That is, the disequilibrium generated by drift in the presence of selection depends strongly on the local (deme) size and not just the total size of a population. Because every population is finite, spatial structure is common and selection acts at multiple loci throughout a genome, and because drift in the presence of selection causes an accumulation of good alleles on bad genetic backgrounds, the ubiquity of sex and recombination might well be explained by the fact that genetic mixing allows beneficial alleles in different individuals to be brought together, as initially proposed byMorgan (1913),Fisher (1930), andMuller (1932).

Acknowledgments

We are grateful to Aneil Agrawal, Mark Kirkpatrick, Thomas Lenormand, Guillaume Martin, Marcy Uyenoyama, and three anonymous reviewers for comments on the manuscript. Financial support for this research was provided by the Biotechnology and Biological Science Research Council, the Natural Environment Research Council, and the Natural Sciences and Engineering Research Council of Canada.

APPENDIX A:

EXACT EQUATIONS FOR TWO AND THREE LOCI

We use the methods ofBarton and Turelli (1991) to derive exact recursions for two selected loci (labeledj,k), and a modifier of small effect at locusi. We census immediately after juveniles have been randomly sampled from the propagules produced by the previous generation. After the census, haploid populations undergo selection, random mating, and meiosis to produce the next generation of haploid propagules, where we assume that the number of potential propagules is so large that these processes can be treated deterministically. The model also applies to a diploid population, as long as fitness is the product of the fitness contribution of each haplotype. For diploids, the life cycle involves a census of diploid juveniles, selection, meiosis, syngamy of gametes to produce diploid propagules, and random sampling ofN juveniles. Equations A1.2 ofBarton (1995a) differ slightly, because diploid fitness was assumed instead to be the sum of the fitness contribution of each haplotype. We assume that the modifier allelePi increases recombination between locij andk by a small amount δrj,k|i. For ease of analysis, we followBarton (1995a) and set the recombination rate between locij andk torjk + 2qiδrj,k|i inPiPi individuals,rjk + (qipirj,k|i inPiQi individuals, andrjk − 2piδrj,k|i inQiQi individuals. (Note that theeffect of the modifier, measured by the difference between genotypes, is not frequency dependent.) Letrij andrik be the recombination rates between the modifier locus and the selected loci,j andk, respectively, and letrijk be the probability of recombination between any of the locii,j,k; these terms are used only to determine the offspring of heterozygousPiQi individuals and so their values in other modifier genotypes are immaterial. Disequilibrium terms involving the modifier locus (Dij,Dik,Dijk) will be dominated by the indirect effects of changing recombination between the selected loci and will be proportional to δrj,k|i. In the following, we assume that the modifier is weak and keep only linear-order terms in δrj,k|i.

Let φj = 1 +sj(pjqj)/2 and φk = 1 +sk(pkqk)/2 measure the mean fitness of each locus considered separately. Accounting also for the disequilibrium within a population, the mean fitness is equal to

graphic file with name M25.gifA1

The three-locus recursions are

graphic file with name M26.gifA2a
graphic file with name M27.gifA2b
graphic file with name M28.gifA2c
graphic file with name M29.gifA2d
graphic file with name M30.gifA2e

whererijk = (rij +rik +rjk)/2 for any gene order with and without interference.p*k andD*ik can be obtained from (A2a) and (A2d), respectively, by interchanging subscriptsj andk. These equations assume only that δrj,k|i is small and are valid for strong selection (sj,sk) and strong linkage disequilibrium between the selected loci (Djk). (A2) can be derived from the three-locus recursion equations presented inFeldman (1972). A supplementary Mathematica package is available that derives (A2) as well as the other main results given inappendixesa andb.

We assume that 2N haploid (orN diploid) juveniles are sampled from the propagules produced by the parental generation according to a multinomial distribution as in the standard Wright-Fisher model (Ewens 1979). The moments of the multinomial distribution can therefore be used to determine the expected values of the perturbations in (A2) and the covariances between them. Although there is no expected change in allele frequencies due to drift,Ei] =Ej] =Ek] = 0, the expected change in linkage disequilibrium isEjk] = −Djk/(2N). Because disequilibrium terms involving the modifier locus are proportional to δrj,k|i, the expected amount of drift in these terms will be of the order δrj,k|i/N, which is assumed to be very small and is ignored. Thus, we need to quantify the effects of drift only on the following variances and covariances, which are written to order (1/N):

graphic file with name M31.gif

BecauseEa] is either 0 or order 1/N, theEaζb] required inEquation 3 are, to order 1/N, equal to the cov(ζa, ζb) given here. Note that the allele frequencies and disequilibrium in these terms are evaluated after the parental generation produces propagules. When selection is weak, however, the allele frequencies and disequilibrium from the previous census can be used to obtain leading-order approximations.

These derivations apply for any amount of disequilibrium, allowing the method to be extended to models where there is a deterministic source of disequilibrium (e.g., epistasis or an initially high level of disequilibrium). In the text andappendix B, we assume that the disequilibria remainO(1/N) throughout the process. In this case,Ejk] and the covariance terms become negligible. In the numerical iterations of (2) and (3), however, no assumptions are made about the order of the disequilibrium, so that the Mathematica package (available upon request) can be used regardless of the initial level of disequilibrium.

APPENDIX B:

WEAK SELECTION APPROXIMATIONS FOR TWO SELECTED LOCI

When selection is weak (sj andsk of orders, wheres ≪ 1), the recursions for the perturbations given by (1) can be further approximated by

graphic file with name M32.gifB1a
graphic file with name M33.gifB1b

Throughout thisappendix, values for allelePk can be obtained from values given for allelePj by interchanging subscriptsj andk. In (B1b), we have included the leading-order terms for each of the perturbations with respect tos even when these areO(s2). These lower-order terms are critical to the initial development of disequilibria when δDjk is zero.

Taking expectations as in (4) and (5), assuming that the perturbations including the disequilibria areO(N−1), ignoring terms that areO(N−2), and using the results ofappendix A for the contributions due to drift gives

graphic file with name M34.gifB2a
graphic file with name M35.gifB2b
graphic file with name M36.gifB2c
graphic file with name M37.gifB2d
graphic file with name M38.gifB2e

Supplementary information (S2) describes similar weak selection recursions for the three-locus case.

Equations B2 can be solved explicitly. First, consider loose linkage (rjksj,sk). In this case, the disequilibrium rapidly approaches a QLE distribution, withEδD2jk andEDjk] changing slowly relative to the action of recombination. One can therefore setEδD*jk toEDjk], etc., in (B2) and solve for the QLE values. For the disequilibrium:

graphic file with name M39.gifB3a
graphic file with name M40.gifB3b
graphic file with name M41.gifB3c, B3c

shows that negative disequilibrium is generated on average in a finite population subject to multiplicative selection. The magnitude of this disequilibrium is quite small with loose linkage but increases rapidly asrjk becomes small. It also increases with the strength of selection acting on locij andk and with the amount of genetic variability at these loci. Note that the covariance between allele frequencies and linkage disequilibrium (B3b) always contributes more than the variance in linkage disequilibrium to the accumulation of negative linkage disequilibrium by a factor 2(1 −rjk)/rjk.

With tight linkage (1/Nsrjk ≪ 1), disequilibrium decays slowly over time, and we can no longer assume quasi-linkage equilibrium. We can, however, move from the discrete recursion equations to continuous-time approximations, which can be more readily solved. We introduce the method in general and then apply the solution to (B2).

In the ODE approach, the discrete recursion equations (B2) all have the form

graphic file with name M42.gifB4a

whereC is a constant involving recombination rates,F[t] is a function of time involving terms such as −s(pq), andG[t] is a function of time involving the expectations,E[ ]. When selection and recombination are rare, the change inx[t] over a short time period can be approximated byx[t + Δt] −x[t], wherex[t + Δt] is obtained from (B4a) under the assumption that the amount of selection and recombination in a shorter interval of time, Δt, scales with Δt assΔt andrΔt. Taking the limit of (x[t + Δt] −x[t])/Δt as Δt goes to zero yields an ordinary differential equation:

graphic file with name M43.gifB4b

wherec,f, andg are used to denoteC,F, andG to linear order ins andr, using a Taylor series that ignores any cross products or higher-order terms. The solution to differential equations of the form (B4b) is

graphic file with name M44.gifB4c

The integrals involvingf[t] in (B4c) can be solved by noting that, when selection is weak and frequency independent, selection can be defined according to the change in allele frequency that it causes:p′ =dp/dt =spq. Whenf[t] = −s(pq), for example, applying this definition and integrating yields

graphic file with name M45.gif

Lettingv[t] be the product of the appropriate allele frequencies (depending on which terms are included inf[t]), (B4c) becomes

graphic file with name M46.gifB4d

The solution (B4d) applies regardless of how selection varies over time. Barton (2000) used a similar method to find the net hitchhiking effect of fluctuating selection on linked neutral loci and gave an approximation valid for tight linkage (rs; his Equation 4).

Following this procedure and assuming that terms of orderrjksj are negligible relative torjk andsj, we obtain the following weak selection approximations:

graphic file with name M47.gifB5a
graphic file with name M48.gifB5b
graphic file with name M49.gifB5c

We have focused on the effects of drift on the disequilibrium; analogous equations can be obtained for the effects on allele frequencies. Note that if allele frequencies increase from some low value faster than linkage disequilibrium is dissipated by recombination (sj,sk >rjk), the initial distribution (E0[ ]) will make a substantial contribution. Even when disequilibrium is initially absent, substantial disequilibrium can build up if the beneficial alleles start at low frequency, because periods with low values ofpjqjpkqk contribute disproportionately to the integrals in (B5).

References

  1. Altenberg, L., and M. W. Feldman, 1987. Selection, generalized transmission and the evolution of modifier genes. I. The reduction principle. Genetics 117: 559–572. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Barton, N. H., 1995. a A general model for the evolution of recombination. Genet. Res. 65: 123–144. [DOI] [PubMed] [Google Scholar]
  3. Barton, N. H., 1995. b Linkage and the limits to natural selection. Genetics 140: 821–841. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Barton, N. H., 1998. The effect of hitch-hiking on neutral genealogies. Genet. Res. 72: 123–133. [Google Scholar]
  5. Barton, N. H., 2000. Genetic hitchhiking. Philos. Trans. R. Soc. Lond. B 355: 1553–1562. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Barton, N. H., and M. Turelli, 1991. Natural and sexual selection on many loci. Genetics 127: 229–255. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Charlesworth, B., 1976. Recombination modification in a fluctuating environment. Genetics 83: 181–195. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Charlesworth, B., 1990. Mutation-selection balance and the evolutionary advantage of sex and recombination. Genet. Res. 55: 199–221. [DOI] [PubMed] [Google Scholar]
  9. Charlesworth, B., 1993. Directional selection and the evolution of sex and recombination. Genet. Res. 61: 205–224. [DOI] [PubMed] [Google Scholar]
  10. de Visser, J. A. G. M., R. F. Hoekstra and H. van den Ende, 1996. The effect of sex and deleterious mutations on fitness inChlamydomonas. Proc. R. Soc. Lond. Ser. B 263: 193–200. [Google Scholar]
  11. de Visser, J. A. G. M., R. F. Hoekstra and H. van den Ende, 1997. Test of interaction between genetic markers that affect fitness inAspergillus niger. Evolution 51: 1499–1505. [DOI] [PubMed] [Google Scholar]
  12. Elena, S. F., and R. E. Lenski, 1997. Test of synergistic interactions among deleterious mutations in bacteria. Nature 390: 395–398. [DOI] [PubMed] [Google Scholar]
  13. Ewens, W. J., 1979Mathematical Population Genetics. Springer-Verlag, Berlin.
  14. Feldman, M. W., 1972. Selection for linkage modification. I. Random mating populations. Theor. Popul. Biol. 21: 430–439. [DOI] [PubMed] [Google Scholar]
  15. Feldman, M. W., F. B. Christiansen and L. D. Brooks, 1980. Evolution of recombination in a constant environment. Proc. Natl. Acad. Sci. USA 77: 4838–4841. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. Felsenstein, J., 1974. The evolutionary advantage of recombination. Genetics 78: 737–756. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Felsenstein, J., 1988 Sex and the evolution of recombination, pp. 74–86 inThe Evolution of Sex, edited by R. E. Michod and B. R. Levin. Sinauer Press, Sunderland, MA.
  18. Felsenstein, J., and S. Yokoyama, 1976. The evolutionary advantage of recombination. II. Individual selection for recombination. Genetics 83: 845–859. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Fisher, R. A., 1930The Genetical Theory of Natural Selection. Oxford University Press, Oxford.
  20. Hill, W. G., and A. Robertson, 1966. The effect of linkage on the limits to artificial selection. Genet. Res. 8: 269–294. [PubMed] [Google Scholar]
  21. Iles, M. M., K. Walters and C. Cannings, 2003. Recombination can evolve in large finite populations given selection on sufficient loci. Genetics 165: 333–337. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Keightley, P. D., 1996. Nature of deleterious mutation load inDrosophila. Genetics 144: 1993–1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Keightley, P. D., and A. Caballero, 1997. Genomic mutation rates for lifetime reproductive output and lifespan inCaenorhabditis elegans. Proc. Natl. Acad. Sci. USA 94: 3823–3827. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Keightley, P. D., and A. Eyre-Walker, 2000. Deleterious mutations and the evolution of sex. Science 290: 331–333. [DOI] [PubMed] [Google Scholar]
  25. Kimura, M., 1965. Attainment of quasi-linkage equilibrium when gene frequencies are changing by natural selection. Genetics 52: 875–890. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Kimura, M., and T. Maruyama, 1966. The mutational load with epistatic gene interactions in fitness. Genetics 54: 1303–1312. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Kondrashov, A. S., 1982. Selection against harmful mutations in large sexual and asexual populations. Genet. Res. 40: 325–332. [DOI] [PubMed] [Google Scholar]
  28. Kondrashov, A. S., 1984. Deleterious mutations as an evolutionary factor. I. The advantage of recombination. Genet. Res. 44: 199–217. [DOI] [PubMed] [Google Scholar]
  29. Kondrashov, A. S., 1988. Deleterious mutations and the evolution of sexual reproduction. Nature 336: 435–441. [DOI] [PubMed] [Google Scholar]
  30. Kondrashov, A. S., 1993. Classification of hypotheses on the advantage of amphimixis. J. Hered. 84: 372–387. [DOI] [PubMed] [Google Scholar]
  31. Korol, A. B., and K. G. Iliadi, 1994. Recombination increase resulting from directional selection for geotaxis in Drosophila. Heredity 72: 64–68. [DOI] [PubMed] [Google Scholar]
  32. Latter, B. D. H., 1965. The response to artificial selection due to autosomal genes of large effect. II. The effects of linkage on limits to selection in finite populations. Aust. J. Biol. Sci. 18: 1009–1023. [DOI] [PubMed] [Google Scholar]
  33. Martin, G., S. P. Otto and T. Lenormand, 2005. Selection for recombination in structured populations. Genetics (in press). [DOI] [PMC free article] [PubMed]
  34. Maynard Smith, J., 1968. Evolution in sexual and asexual populations. Am. Nat. 102: 469–473. [Google Scholar]
  35. Maynard Smith, J., 1978The Evolution of Sex. Cambridge University Press, Cambridge, UK.
  36. Maynard Smith, J., 1980. Selection for recombination in a polygenic model. Genet. Res. 35: 269–277. [DOI] [PubMed] [Google Scholar]
  37. Maynard Smith, J., 1988. Selection for recombination in a polygenic model: the mechanism. Genet. Res. 51: 59–63. [DOI] [PubMed] [Google Scholar]
  38. Morgan, T. H., 1913Heredity and Sex. Columbia University Press, New York.
  39. Mukai, T., 1969. The genetic structure of natural populations ofDrosophila melanogaster. VII. Synergistic interactions of spontaneous mutant polygenes controlling viability. Genetics 61: 749–761. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Muller, H. J., 1932. Some genetic aspects of sex. Am. Nat. 66: 118–138. [Google Scholar]
  41. Nagylaki, T., 1993. The evolution of multilocus systems under weak selection. Genetics 134: 627–647. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Otto, S. P., and N. H. Barton, 1997. The evolution of recombination: removing the limits to natural selection. Genetics 147: 879–906. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Otto, S. P., and N. H. Barton, 2001. Selection for recombination in small populations. Evolution 55: 1921–1931. [DOI] [PubMed] [Google Scholar]
  44. Otto, S. P., and M. W. Feldman, 1997. Deleterious mutations, variable epistatic interactions, and the evolution of recombination. Theor. Popul. Biol. 51: 134–147. [DOI] [PubMed] [Google Scholar]
  45. Otto, S. P., and T. Lenormand, 2002. Resolving the paradox of sex and recombination. Nat. Rev. Genet. 2: 252–261. [DOI] [PubMed] [Google Scholar]
  46. Qureshi, A. W., 1963A Monte Carlo Evaluation of the Role of Finite Population Size and Linkage in Response to Continuous Mass Selection. Statistical Laboratory, Iowa State University, Ames, IA.
  47. Rice, W. R., 2002. Experimental tests of the adaptive significance of sexual recombination. Nat. Rev. Genet. 3: 241–251. [DOI] [PubMed] [Google Scholar]
  48. Seager, R. D., and F. J. Ayala, 1982. Chromosome interactions inDrosophila melanogaster. I. Viability studies. Genetics 102: 467–483. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Seager, R. D., F. J. Ayala and R. W. Marks, 1982. Chromosome interactions inDrosophila melanogaster. II. Total fitness. Genetics 102: 485–502. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Stephan, W., T. H. E. Wiehe and M. W. Lenz, 1992. The effect of strongly selected substitutions on neutral polymorphism: analytical results based on diffusion theory. Theor. Popul. Biol. 41: 237–254. [Google Scholar]
  51. Szathmary, E., 1993. Do deleterious mutations act synergistically? Metabolic control theory provides a partial answer. Genetics 133: 127–132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Turner, J. R. G., and M. H. Williamson, 1968. Population size, natural selection, and the genetic load. Nature 218: 700. [DOI] [PubMed] [Google Scholar]
  53. West, S. A., A. D. Peters and N. H. Barton, 1998. Testing for epistasis between deleterious mutations. Genetics 149: 435–444. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Wolfram, S., 1991Mathematica. Addison Wesley, New York.

Articles from Genetics are provided here courtesy ofOxford University Press

ACTIONS

RESOURCES


[8]ページ先頭

©2009-2025 Movatter.jp