Epistasis is a phenomenon in genetics in which the effect of a gene mutation depends on the presence or absence of mutations in one or more other genes, termed modifier genes. In other words, the effect of the mutation depends on the genetic background in which it appears.[2] Epistatic mutations therefore have different effects on their own than when they occur together. Originally, the term epistasis specifically meant that the effect of a gene variant is masked by that of a different gene.[3]
The concept of epistasis originated in genetics in 1907[4] but is now used in biochemistry, computational biology and evolutionary biology. The phenomenon arises due to interactions, either between genes (such as mutations also being needed in regulators of gene expression) or within them (multiple mutations being needed before the gene loses function), leading to non-linear effects. Epistasis has a great influence on the shape of evolutionary landscapes, which leads to profound consequences for evolution and for the evolvability of phenotypic traits.
Understanding of epistasis has changed considerably through the history of genetics, and so too has the use of the term. The term was first used in 1907 by William Bateson and his collaborators Florence Durham and Muriel Wheldale Onslow.[5][4] In early models of natural selection devised in the early 20th century, each gene was considered to make its own characteristic contribution to fitness, against an average background of other genes. Some introductory courses still teach population genetics this way. Because of the way that the science of population genetics was developed, evolutionary geneticists have tended to think of epistasis as the exception. However, in general, the expression of any one allele depends in a complicated way on many other alleles.
In classical genetics, if genes A and B are mutated, and each mutation by itself produces a unique phenotype but the two mutations together show the same phenotype as the gene A mutation, then gene A is epistatic and gene B is hypostatic. For example, the gene for total baldness is epistatic to the gene for brown hair. In this sense, epistasis can be contrasted with genetic dominance, which is an interaction between alleles at the same gene locus. As the study of genetics developed, and with the advent of molecular biology, epistasis started to be studied in relation to quantitative trait loci (QTL) and polygenic inheritance.
The effects of genes are now commonly quantifiable by assaying the magnitude of a phenotype (e.g. height, pigmentation or growth rate) or by biochemically assaying protein activity (e.g. binding or catalysis). Increasingly sophisticated computational and evolutionary biology models aim to describe the effects of epistasis on a genome-wide scale and the consequences of this for evolution.[6][7][8] Since identification of epistatic pairs is challenging both computationally and statistically, some studies try to prioritize epistatic pairs.[9][10]
Terminology about epistasis can vary between scientific fields. Geneticists often refer to wild type and mutant alleles where the mutation is implicitly deleterious and may talk in terms of genetic enhancement, synthetic lethality and genetic suppressors. Conversely, a biochemist may more frequently focus on beneficial mutations and so explicitly state the effect of a mutation and use terms such as reciprocal sign epistasis and compensatory mutation.[17] Additionally, there are differences when looking at epistasis within a single gene (biochemistry) and epistasis within a haploid or diploid genome (genetics). In general, epistasis is used to denote the departure from 'independence' of the effects of different genetic loci. Confusion often arises due to the varied interpretation of 'independence' among different branches of biology.[18] The classifications below attempt to cover the various terms and how they relate to one another.
Two mutations are considered to be purely additive if the effect of the double mutation is the sum of the effects of the single mutations. This occurs when genes do not interact with each other, for example by acting through different metabolic pathways. Purely additive traits were studied early in the history of genetics; however, they are relatively rare, with most genes exhibiting at least some level of epistatic interaction.[19][20]
When the double mutation has a fitter phenotype than expected from the effects of the two single mutations, it is referred to as positive epistasis. Positive epistasis between beneficial mutations generates greater improvements in function than expected.[11][12] Positive epistasis between deleterious mutations protects against the negative effects to cause a less severe fitness drop.[14]
Conversely, when two mutations together lead to a less fit phenotype than expected from their effects when alone, it is called negative epistasis.[21][22] Negative epistasis between beneficial mutations causes smaller than expected fitness improvements, whereas negative epistasis between deleterious mutations causes greater-than-additive fitness drops.[13]
Independently, when the effect on fitness of two mutations is more radical than expected from their effects when alone, it is referred to as synergistic epistasis. The opposite situation, in which the fitness difference of the double mutant from the wild type is smaller than expected from the effects of the two single mutations, is called antagonistic epistasis.[16] Therefore, for deleterious mutations, negative epistasis is also synergistic, while positive epistasis is antagonistic; conversely, for advantageous mutations, positive epistasis is synergistic, while negative epistasis is antagonistic.
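The two naming schemes above can be made concrete with a short sketch. It assumes phenotypes are measured on an additive scale (for example log fitness) and that both single mutations shift the phenotype in the same direction; the function names and the tolerance parameter are illustrative, not taken from any cited work.

```python
def epistasis_deviation(ab, Ab, aB, AB):
    """Observed double-mutant value minus the additive expectation."""
    expected = Ab + aB - ab  # wild type plus the two single-mutant effects
    return AB - expected


def classify_magnitude(ab, Ab, aB, AB, tol=1e-9):
    """Label a two-mutation interaction by direction and by strength.

    Assumes both single mutations push the phenotype the same way
    (both beneficial or both deleterious).
    """
    eps = epistasis_deviation(ab, Ab, aB, AB)
    if abs(eps) <= tol:
        return "additive"
    # Positive/negative: is the double mutant fitter or less fit than expected?
    direction = "positive" if eps > 0 else "negative"
    # Synergistic/antagonistic: is it further from or closer to the wild type
    # than the summed single-mutant effects predict?
    expected_change = (Ab - ab) + (aB - ab)
    observed_change = AB - ab
    if abs(observed_change) > abs(expected_change):
        strength = "synergistic"
    else:
        strength = "antagonistic"
    return f"{direction}, {strength}"


# Two deleterious mutations whose combined drop is larger than the sum
print(classify_magnitude(10, 9, 9, 7))
# Two beneficial mutations whose combined gain is smaller than the sum
print(classify_magnitude(10, 11, 11, 11))
```

Keeping the two labels separate mirrors the text: the sign of the deviation gives positive or negative epistasis, while the distance from the wild type gives synergistic or antagonistic epistasis.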
The term genetic enhancement is sometimes used when a double (deleterious) mutant has a more severe phenotype than the additive effects of the single mutants. Strong positive epistasis is sometimes referred to by creationists as irreducible complexity (although most examples are misidentified).
Sign epistasis[23] occurs when one mutation has the opposite effect when in the presence of another mutation. This occurs when a mutation that is deleterious on its own can enhance the effect of a particular beneficial mutation.[18] For example, a large and complex brain is a waste of energy without a range of sense organs, but sense organs are made more useful by a large and complex brain that can better process the information. If a fitness landscape has no sign epistasis then it is called smooth.
At its most extreme, reciprocal sign epistasis[24] occurs when two deleterious genes are beneficial when together. For example, producing a toxin alone can kill a bacterium, and producing a toxin exporter alone can waste energy, but producing both can improve fitness by killing competing organisms. If a fitness landscape has sign epistasis but no reciprocal sign epistasis then it is called semismooth.[25]
Reciprocal sign epistasis also leads to genetic suppression whereby two deleterious mutations are less harmful together than either one on its own, i.e. one compensates for the other. A clear example of genetic suppression was the demonstration that in the assembly of bacteriophage T4 two deleterious mutations, each causing a deficiency in the level of a different morphogenetic protein, could interact positively.[26] If a mutation causes a reduction in a particular structural component, this can bring about an imbalance in morphogenesis and loss of viable virus progeny, but production of viable progeny can be restored by a second (suppressor) mutation in another morphogenetic component that restores the balance of protein components.
The term genetic suppression can also apply to sign epistasis where the double mutant has a phenotype intermediate between those of the single mutants, in which case the more severe single mutant phenotype is suppressed by the other mutation or genetic condition. For example, in a diploid organism, a hypomorphic (or partial loss-of-function) mutant phenotype can be suppressed by knocking out one copy of a gene that acts oppositely in the same pathway. In this case, the second gene is described as a "dominant suppressor" of the hypomorphic mutant; "dominant" because the effect is seen when one wild-type copy of the suppressor gene is present (i.e. even in a heterozygote). For most genes, the phenotype of the heterozygous suppressor mutation by itself would be wild type (because most genes are not haplo-insufficient), so that the double mutant (suppressed) phenotype is intermediate between those of the single mutants.
In non-reciprocal sign epistasis, the fitness of the mutant lies between the extreme effects seen in reciprocal sign epistasis.
When two mutations are viable alone but lethal in combination, it is called synthetic lethality or unlinked non-complementation.[27]
In a haploid organism with genotypes (at two loci) ab, Ab, aB or AB, we can think of different forms of epistasis as affecting the magnitude of a phenotype upon mutation individually (Ab and aB) or in combination (AB).
Interaction type | ab | Ab | aB | AB | Relationship
No epistasis (additive) | 0 | 1 | 1 | 2 | AB = Ab + aB - ab
Positive (synergistic) epistasis | 0 | 1 | 1 | 3 | AB > Ab + aB - ab
Negative (antagonistic) epistasis | 0 | 1 | 1 | 1 | AB < Ab + aB - ab
Sign epistasis | 0 | 1 | -1 | 2 | AB has opposite sign to Ab or aB
Reciprocal sign epistasis | 0 | -1 | -1 | 2 | AB has opposite sign to Ab and aB
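The rows of the table can be told apart mechanically by asking whether the effect of each mutation changes sign depending on whether the other mutation is present. The following is a minimal sketch under that reading; the function name and the returned labels are illustrative, and the argument order follows the table columns.

```python
def classify_interaction(ab, Ab, aB, AB):
    """Classify a two-locus haploid interaction from four phenotype values."""
    # Effect of mutation A without, and with, mutation B in the background
    a_alone, a_with_b = Ab - ab, AB - aB
    # Effect of mutation B without, and with, mutation A in the background
    b_alone, b_with_a = aB - ab, AB - Ab

    a_flips = a_alone * a_with_b < 0  # A changes sign depending on B
    b_flips = b_alone * b_with_a < 0  # B changes sign depending on A

    if a_flips and b_flips:
        return "reciprocal sign epistasis"
    if a_flips or b_flips:
        return "sign epistasis"
    if AB - Ab - aB + ab == 0:
        return "no epistasis"
    return "magnitude epistasis"


# Rows of the table, in column order ab, Ab, aB, AB
for row in [(0, 1, 1, 2), (0, 1, 1, 3), (0, 1, 1, 1), (0, 1, -1, 2), (0, -1, -1, 2)]:
    print(row, classify_interaction(*row))
```

Positive and negative epistasis both come out as "magnitude epistasis" here, because the sign of each mutation's effect is preserved and only its size changes.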
Epistasis in diploid organisms is further complicated by the presence of two copies of each gene. Epistasis can occur between loci, but additionally, interactions can occur between the two copies of each locus in heterozygotes. For a two-locus, two-allele system, there are eight independent types of gene interaction.[28]
Additive A locus
Genotype | aa | aA | AA
bb | 1 | 0 | -1
bB | 1 | 0 | -1
BB | 1 | 0 | -1

Additive B locus
Genotype | aa | aA | AA
bb | 1 | 1 | 1
bB | 0 | 0 | 0
BB | -1 | -1 | -1

Dominance A locus
Genotype | aa | aA | AA
bb | -1 | 1 | -1
bB | -1 | 1 | -1
BB | -1 | 1 | -1

Dominance B locus
Genotype | aa | aA | AA
bb | -1 | -1 | -1
bB | 1 | 1 | 1
BB | -1 | -1 | -1

Additive by Additive Epistasis
Genotype | aa | aA | AA
bb | 1 | 0 | -1
bB | 0 | 0 | 0
BB | -1 | 0 | 1

Additive by Dominance Epistasis
Genotype | aa | aA | AA
bb | 1 | 0 | -1
bB | -1 | 0 | 1
BB | 1 | 0 | -1

Dominance by Additive Epistasis
Genotype | aa | aA | AA
bb | 1 | -1 | 1
bB | 0 | 0 | 0
BB | -1 | 1 | -1

Dominance by Dominance Epistasis
Genotype | aa | aA | AA
bb | -1 | 1 | -1
bB | 1 | -1 | 1
BB | -1 | 1 | -1
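One way to see why these eight patterns are described as independent is that, together with a constant (mean) term, they suffice to reproduce any table of nine two-locus genotype values exactly. The sketch below illustrates this by solving the resulting 9-by-9 linear system with exact fractions. The constant term, the helper names and the example table are assumptions made for illustration, and the unweighted solve is not the frequency-weighted regression used in quantitative genetics.

```python
from fractions import Fraction

# Rows are bb, bB, BB; columns are aa, aA, AA (values copied from the tables).
PATTERNS = {
    "mean": [[1, 1, 1], [1, 1, 1], [1, 1, 1]],  # constant term (added here)
    "additive A": [[1, 0, -1], [1, 0, -1], [1, 0, -1]],
    "additive B": [[1, 1, 1], [0, 0, 0], [-1, -1, -1]],
    "dominance A": [[-1, 1, -1], [-1, 1, -1], [-1, 1, -1]],
    "dominance B": [[-1, -1, -1], [1, 1, 1], [-1, -1, -1]],
    "additive x additive": [[1, 0, -1], [0, 0, 0], [-1, 0, 1]],
    "additive x dominance": [[1, 0, -1], [-1, 0, 1], [1, 0, -1]],
    "dominance x additive": [[1, -1, 1], [0, 0, 0], [-1, 1, -1]],
    "dominance x dominance": [[-1, 1, -1], [1, -1, 1], [-1, 1, -1]],
}


def flatten(table):
    """Turn a 3x3 table into a list of nine values, row by row."""
    return [value for row in table for value in row]


def solve(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [[Fraction(x) for x in row] + [Fraction(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def decompose(table):
    """Express a 3x3 genotype-value table as a mix of the nine patterns."""
    names = list(PATTERNS)
    columns = [flatten(PATTERNS[name]) for name in names]
    design = [[column[i] for column in columns] for i in range(9)]
    return dict(zip(names, solve(design, flatten(table))))


# Hypothetical table built as 2*mean + 3*(additive A) - 1*(additive x additive)
example = [[4, 2, 0], [5, 2, -1], [6, 2, -2]]
for name, coefficient in decompose(example).items():
    print(f"{name}: {coefficient}")
```

Because the nine patterns are linearly independent, the solution is unique, so the coefficients recovered for the example table are exactly the ones used to build it.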
Additivity can arise when multiple genes act in parallel to achieve the same effect. For example, when an organism is in need of phosphorus, multiple enzymes that break down different phosphorylated components from the environment may act additively to increase the amount of phosphorus available to the organism. However, there inevitably comes a point where phosphorus is no longer the limiting factor for growth and reproduction, and so further improvements in phosphorus metabolism have smaller or no effect (negative epistasis). Some sets of mutations within genes have also been specifically found to be additive.[29] It is now considered that strict additivity is the exception, rather than the rule, since most genes interact with hundreds or thousands of other genes.[19][20]
Epistasis within the genomes of organisms occurs due to interactions between the genes within the genome. This interaction may be direct if the genes encode proteins that, for example, are separate components of a multi-component protein (such as the ribosome), inhibit each other's activity, or if the protein encoded by one gene modifies the other (such as by phosphorylation). Alternatively, the interaction may be indirect, where the genes encode components of a metabolic pathway or network, developmental pathway, signalling pathway or transcription factor network. For example, the gene encoding the enzyme that synthesizes penicillin is of no use to a fungus without the enzymes that synthesize the necessary precursors in the metabolic pathway.
Just as mutations in two separate genes can be non-additive if those genes interact, mutations in two codons within a gene can be non-additive. In genetics this is sometimes called intragenic suppression when one deleterious mutation can be compensated for by a second mutation within that gene. Analysis of bacteriophage T4 mutants that were altered in the rIIB cistron (gene) revealed that certain pairwise combinations of mutations could mutually suppress each other; that is, the double mutants had a more nearly wild-type phenotype than either mutant alone.[30] The linear map order of the mutants was established using genetic recombination data. From these sources of information, the triplet nature of the genetic code was logically deduced for the first time in 1961, and other key features of the code were also inferred.[30]
Intragenic suppression can also occur when the amino acids within a protein interact. Due to the complexity of protein folding and activity, additive mutations are rare.
Proteins are held in their tertiary structure by a distributed, internal network of cooperative interactions (hydrophobic, polar and covalent).[31] Epistatic interactions occur whenever one mutation alters the local environment of another residue (either by directly contacting it, or by inducing changes in the protein structure).[32] For example, in a disulphide bridge, a single cysteine has no effect on protein stability until a second is present at the correct location, at which point the two cysteines form a chemical bond which enhances the stability of the protein.[33] This would be observed as positive epistasis where the double-cysteine variant had a much higher stability than either of the single-cysteine variants. Conversely, when deleterious mutations are introduced, proteins often exhibit mutational robustness whereby, as stabilising interactions are destroyed, the protein still functions until it reaches some stability threshold, at which point further destabilising mutations have large, detrimental effects as the protein can no longer fold. This leads to negative epistasis whereby mutations that have little effect alone have a large, deleterious effect together.[34][35]
In enzymes, the protein structure orients a few key amino acids into precise geometries to form an active site to perform chemistry.[36] Since these active site networks frequently require the cooperation of multiple components, mutating any one of these components massively compromises activity, and so mutating a second component has a relatively minor effect on the already inactivated enzyme. For example, removing any member of the catalytic triad of many enzymes will reduce activity to levels low enough that the organism is no longer viable.[37][38][39]
Diploid organisms contain two copies of each gene. If these are different (heterozygous / heteroallelic), the two different copies of the allele may interact with each other to cause epistasis. This is sometimes called allelic complementation, or interallelic complementation. It may be caused by several mechanisms, for example transvection, where an enhancer from one allele acts in trans to activate transcription from the promoter of the second allele. Alternatively, trans-splicing of two non-functional RNA molecules may produce a single, functional RNA.
Similarly, at the protein level, proteins that function as dimers may form a heterodimer composed of one protein from each alternate gene and may display different properties to the homodimer of one or both variants. Two bacteriophage T4 mutants defective at different locations in the same gene can undergo allelic complementation during a mixed infection.[40] That is, each mutant alone upon infection cannot produce viable progeny, but upon mixed infection with two complementing mutants, viable phage are formed. Intragenic complementation was demonstrated for several genes that encode structural proteins of the bacteriophage,[40] indicating that such proteins function as dimers or even higher order multimers.[41]
In evolutionary genetics, the sign of epistasis is usually more significant than the magnitude of epistasis. This is because magnitude epistasis (positive and negative) simply affects how beneficial mutations are together, whereas sign epistasis affects whether mutation combinations are beneficial or deleterious.[11]
A fitness landscape is a representation of fitness in which all genotypes are arranged in 2D space and the fitness of each genotype is represented by height on a surface. It is frequently used as a visual metaphor for understanding evolution as the process of moving uphill from one genotype to the next, nearby, fitter genotype.[19]
If all mutations are additive, they can be acquired in any order and still give a continuous uphill trajectory. The landscape is perfectly smooth, with only one peak (global maximum), and all sequences can evolve uphill to it by the accumulation of beneficial mutations in any order. Conversely, if mutations interact with one another by epistasis, the fitness landscape becomes rugged, as the effect of a mutation depends on the genetic background of other mutations.[42] At its most extreme, interactions are so complex that the fitness is 'uncorrelated' with gene sequence and the topology of the landscape is random. This is referred to as a rugged fitness landscape and has profound implications for the evolutionary optimisation of organisms. If mutations are deleterious in one combination but beneficial in another, the fittest genotypes can only be accessed by accumulating mutations in one specific order. This makes it more likely that organisms will get stuck at local maxima in the fitness landscape, having acquired mutations in the 'wrong' order.[35][43] For example, a variant of TEM1 β-lactamase with 5 mutations is able to cleave cefotaxime (a third-generation antibiotic).[44] However, of the 120 possible pathways to this 5-mutant variant, only 7% are accessible to evolution, as the remainder pass through fitness valleys where the combination of mutations reduces activity. In contrast, changes in environment (and therefore the shape of the fitness landscape) have been shown to provide escape from local maxima.[35] In this example, selection in changing antibiotic environments resulted in a "gateway mutation" which epistatically interacted in a positive manner with other mutations along an evolutionary pathway, effectively crossing a fitness valley. This gateway mutation alleviated the negative epistatic interactions of other individually beneficial mutations, allowing them to better function in concert.
Complex environments or selections may therefore bypass local maxima found in models assuming simple positive selection.
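The idea of counting accessible pathways can be sketched by enumerating every order in which a set of mutations could be acquired and keeping only the orders in which each step increases fitness. The landscapes below are invented toy functions, not the measured β-lactamase data, and the helper names are illustrative.

```python
from itertools import permutations


def count_accessible_paths(fitness, n_mutations):
    """Count mutation orders in which every single step raises fitness.

    `fitness` maps a frozenset of mutation indices to a number.
    Returns (accessible, total).
    """
    accessible = 0
    total = 0
    for order in permutations(range(n_mutations)):
        total += 1
        genotype = frozenset()
        uphill = True
        for mutation in order:
            next_genotype = genotype | {mutation}
            if fitness(next_genotype) <= fitness(genotype):
                uphill = False  # this step enters a fitness valley
                break
            genotype = next_genotype
        if uphill:
            accessible += 1
    return accessible, total


def additive_landscape(genotype):
    """Every mutation adds one unit of fitness regardless of background."""
    return len(genotype)


def sign_epistatic_landscape(genotype):
    """Toy case: mutation 0 is deleterious unless mutation 1 is present."""
    penalty = 2 if (0 in genotype and 1 not in genotype) else 0
    return len(genotype) - penalty


print(count_accessible_paths(additive_landscape, 5))        # (120, 120)
print(count_accessible_paths(sign_epistatic_landscape, 5))  # (60, 120)
```

With five mutations there are 5! = 120 possible orders. In the additive toy landscape all of them are uphill, whereas a single sign-epistatic pair already removes every order in which mutation 0 arrives before mutation 1.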
High epistasis is usually considered a constraining factor on evolution, and improvements in a highly epistatic trait are considered to have lower evolvability. This is because, in any given genetic background, very few mutations will be beneficial, even though many mutations may need to occur to eventually improve the trait. The lack of a smooth landscape makes it harder for evolution to access fitness peaks. In highly rugged landscapes, fitness valleys block access to some genes, and even if ridges exist that allow access, these may be rare or prohibitively long.[45] Moreover, adaptation can move proteins into more precarious or rugged regions of the fitness landscape.[46] These shifting "fitness territories" may act to decelerate evolution and could represent tradeoffs for adaptive traits.
The frustration of adaptive evolution by rugged fitness landscapes was recognized as a potential force for the evolution of evolvability. Michael Conrad in 1972 was the first to propose a mechanism for the evolution of evolvability by noting that a mutation which smoothed the fitness landscape at other loci could facilitate the production of advantageous mutations and hitchhike along with them.[47][48] Rupert Riedl in 1975 proposed that new genes which produced the same phenotypic effects with a single mutation as other loci with reciprocal sign epistasis would be a new means to attain a phenotype otherwise too unlikely to occur by mutation.[49][50]
Rugged, epistatic fitness landscapes also affect the trajectories of evolution. When a mutation has a large number of epistatic effects, each accumulated mutation drastically changes the set of available beneficial mutations. Therefore, the evolutionary trajectory followed depends highly on which early mutations were accepted. Thus, repeats of evolution from the same starting point tend to diverge to different local maxima rather than converge on a single global maximum as they would in a smooth, additive landscape.[51][52]
Negative epistasis and sex are thought to be intimately correlated. Experimentally, this idea has been tested using digital simulations of asexual and sexual populations. Over time, sexual populations move towards more negative epistasis, or the lowering of fitness by two interacting alleles. It is thought that negative epistasis allows individuals carrying the interacting deleterious mutations to be removed from the population efficiently. This removes those alleles from the population, resulting in an overall fitter population. This hypothesis was proposed by Alexey Kondrashov, is sometimes known as the deterministic mutation hypothesis,[53] and has also been tested using artificial gene networks.[21]
However, the evidence for this hypothesis has not always been straightforward and the model proposed by Kondrashov has been criticized for assuming mutation parameters far from real world observations.[54] In addition, in those tests which used artificial gene networks, negative epistasis is only found in more densely connected networks,[21] whereas empirical evidence indicates that natural gene networks are sparsely connected,[55] and theory shows that selection for robustness will favor more sparsely connected and minimally complex networks.[55]
Quantitative genetics focuses on genetic variance due to genetic interactions. Any two-locus interaction at a particular gene frequency can be decomposed into eight independent genetic effects using a weighted regression. In this regression, the observed two-locus genetic effects are treated as dependent variables and the "pure" genetic effects are used as the independent variables. Because the regression is weighted, the partitioning among the variance components will change as a function of gene frequency. By analogy, it is possible to expand this system to three or more loci, or to cytonuclear interactions.[56]
When assaying epistasis within a gene, site-directed mutagenesis can be used to generate the different genes, and their protein products can be assayed (e.g. for stability or catalytic activity). This is sometimes called a double mutant cycle and involves producing and assaying the wild type protein, the two single mutants and the double mutant. Epistasis is measured as the difference between the effects of the mutations together versus the sum of their individual effects.[57] This can be expressed as a free energy of interaction. The same methodology can be used to investigate the interactions between larger sets of mutations, but all combinations have to be produced and assayed. For example, there are 32 different combinations of 5 mutations (2^5, counting the wild type), some or all of which may show epistasis...
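The arithmetic of a double mutant cycle can be written out in a few lines. The sketch below assumes that each variant has been characterised by a single free energy value (for example a folding or binding free energy in kcal/mol); the function name and the numbers in the example are hypothetical.

```python
def interaction_free_energy(dg_wt, dg_a, dg_b, dg_ab):
    """Free energy of interaction from a double mutant cycle.

    Each argument is the measured free energy of one variant:
    wild type, single mutant A, single mutant B, double mutant AB.
    A result of zero means the two mutations behave additively.
    """
    ddg_a = dg_a - dg_wt    # effect of mutation A alone
    ddg_b = dg_b - dg_wt    # effect of mutation B alone
    ddg_ab = dg_ab - dg_wt  # effect of both mutations together
    return ddg_ab - (ddg_a + ddg_b)


# Hypothetical values: each mutation alone is destabilising, but the
# double mutant is far less destabilised than the sum would predict.
print(interaction_free_energy(-10.0, -9.0, -8.5, -9.5))  # -2.0
```

Whether a negative value counts as favourable depends on the sign convention of the assay, so the sign should be interpreted alongside how the free energies were measured.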
Numerous computational methods have been developed for the detection and characterization of epistasis. Many of these rely on machine learning to detect non-additive effects that might be missed by statistical approaches such as linear regression.[58] For example, multifactor dimensionality reduction (MDR) was designed specifically for nonparametric and model-free detection of combinations of genetic variants that are predictive of a phenotype such as disease status in human populations.[59][60] Several of these approaches have been broadly reviewed in the literature.[61] Even more recently, methods that utilize insights from theoretical computer science (the Hadamard transform[62] and compressed sensing[63][64]) or maximum-likelihood inference[65] were shown to distinguish epistatic effects from overall non-linearity in genotype–phenotype map structure,[66] while others used patient survival analysis to identify non-linearity.[67]
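As an illustration of how a Hadamard-type transform exposes epistasis, the sketch below applies an unnormalised fast Walsh-Hadamard transform to a complete table of phenotypes. It is a generic textbook transform, not the specific procedure of the cited studies, and it assumes that genotypes are indexed in binary so that bit k of the index records whether mutation k is present. Normalisation and sign conventions differ between authors.

```python
def walsh_hadamard(values):
    """Unnormalised fast Walsh-Hadamard transform of a list of numbers.

    The length must be a power of two (one entry per genotype).
    The coefficient at index k involves exactly the mutations whose bits
    are set in k, so indices with two or more bits set measure epistasis.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                x, y = out[i], out[i + half]
                out[i], out[i + half] = x + y, x - y
        half *= 2
    return out


# Two loci, phenotypes ordered ab, Ab, aB, AB
print(walsh_hadamard([0, 1, 1, 2]))  # additive: pairwise term is 0
print(walsh_hadamard([0, 1, 1, 3]))  # positive epistasis: pairwise term is 1

# Three loci with purely additive effects: all higher-order terms vanish
additive = [bin(i).count("1") for i in range(8)]
coefficients = walsh_hadamard(additive)
print([coefficients[k] for k in range(8) if bin(k).count("1") >= 2])
```

For two loci the last coefficient equals ab - Ab - aB + AB, the same deviation from additivity used in a double mutant cycle, which is why the transform generalises that comparison to larger sets of mutations.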