
Heritability is a statistic used in the fields of breeding and genetics that estimates the degree of variation in a phenotypic trait in a population that is due to genetic variation between individuals in that population.[1] The concept of heritability can be expressed in the form of the following question: "What is the proportion of the variation in a given trait within a population that is not explained by the environment or random chance?"[2]
Other causes of measured variation in a trait are characterized as environmental factors, including observational error. In human studies of heritability these are often apportioned into factors from "shared environment" and "non-shared environment" based on whether they tend to result in persons brought up in the same household being more or less similar to persons who were not.
Heritability is estimated by comparing individual phenotypic variation among related individuals in a population, by examining the association between individual phenotype and genotype data,[3][4] or even by modeling summary-level data from genome-wide association studies (GWAS).[5] Heritability is an important concept in quantitative genetics, particularly in selective breeding and behavior genetics (for instance, twin studies). It is the source of much confusion because its technical definition is different from its commonly-understood folk definition. Therefore, its use conveys the incorrect impression that behavioral traits are "inherited" or specifically passed down through the genes.[6] Behavioral geneticists also conduct heritability analyses based on the assumption that genes and environments contribute in a separate, additive manner to behavioral traits.[7]
Heritability measures the fraction of phenotype variability that can be attributed to genetic variation. This is not the same as saying that this fraction of an individual phenotype is caused by genetics. For example, it is incorrect to say that since the heritability of personality traits is about 0.6, that means that 60% of your personality is inherited from your parents and 40% comes from the environment. In addition, heritability can change without any genetic change occurring, such as when the environment starts contributing to more variation. As a case in point, consider that both genes and environment have the potential to influence intelligence. Heritability could increase if genetic variation increases, causing individuals to show more phenotypic variation, like showing different levels of intelligence. On the other hand, heritability might also increase if the environmental variation decreases, causing individuals to show less phenotypic variation, like showing more similar levels of intelligence. Heritability increases when genetics are contributing more variation or because non-genetic factors are contributing less variation; what matters is the relative contribution. Heritability is specific to a particular population in a particular environment. High heritability of a trait, consequently, does not necessarily mean that the trait is not very susceptible to environmental influences.[8] Heritability can also change as a result of changes in the environment, migration, inbreeding, or how heritability itself is measured in the population under study.[9] The heritability of a trait should not be interpreted as a measure of the extent to which said trait is genetically determined in an individual.[10][11]
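The dependence of heritability on the relative size of the variance contributions can be illustrated numerically. The following is a minimal sketch that assumes no covariance between genetic and environmental effects; the function name `heritability` and all variance values are hypothetical and chosen only for illustration.

```python
def heritability(var_genetic: float, var_environment: float) -> float:
    """Fraction of phenotypic variance attributable to genetic variance.

    Assumes genetic and environmental effects are uncorrelated, so the
    phenotypic variance is simply the sum of the two components.
    """
    var_phenotype = var_genetic + var_environment
    if var_phenotype <= 0:
        raise ValueError("phenotypic variance must be positive")
    return var_genetic / var_phenotype


# The same (hypothetical) genetic variance in two different environments.
variable_env = heritability(var_genetic=4.0, var_environment=6.0)
uniform_env = heritability(var_genetic=4.0, var_environment=1.0)

print(f"variable environment: {variable_env:.2f}")
print(f"uniform environment:  {uniform_env:.2f}")
```

In this sketch the genetic variance is identical in both cases, yet the estimate is higher in the more uniform environment, because only the relative contribution matters.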
The extent of dependence of phenotype on environment can also be a function of the genes involved. Matters of heritability are complicated because genes may canalize a phenotype, making its expression almost inevitable in all occurring environments. Individuals with the same genotype can also exhibit different phenotypes through a mechanism called phenotypic plasticity, which makes heritability difficult to measure in some cases. Recent insights in molecular biology have identified changes in transcriptional activity of individual genes associated with environmental changes. However, there are a large number of genes whose transcription is not affected by the environment.[12]
Estimates of heritability use statistical analyses to help to identify the causes of differences between individuals. Since heritability is concerned with variance, it is necessarily an account of the differences between individuals in a population. Heritability can be univariate (examining a single trait) or multivariate (examining the genetic and environmental associations between multiple traits at once). This allows a test of the genetic overlap between different phenotypes: for instance hair color and eye color. Environment and genetics may also interact, and heritability analyses can test for and examine these interactions (GxE models).
A prerequisite for heritability analyses is that there is some population variation to account for. This last point highlights the fact that heritability cannot take into account the effect of factors which are invariant in the population. Factors may be invariant if they are absent and do not exist in the population, such as no one having access to a particular antibiotic, or because they are omnipresent, like if everyone is drinking coffee. In practice, all human behavioral traits vary and almost all traits show some heritability.[13]
Any particular phenotype can be modeled as the sum of genetic and environmental effects:[14]
$$P = G + E$$
Likewise the phenotypic variance in the trait, Var(P), is the sum of effects as follows:
$$\mathrm{Var}(P) = \mathrm{Var}(G) + \mathrm{Var}(E) + 2\,\mathrm{Cov}(G,E)$$
In a planned experiment Cov(G,E) can be controlled and held at 0. In this case, heritability is defined as[15]
$$H^2 = \frac{\mathrm{Var}(G)}{\mathrm{Var}(P)}$$
$H^2$ is the broad-sense heritability. This reflects all the genetic contributions to a population's phenotypic variance including additive, dominant, and epistatic (multi-genic interactions), as well as maternal and paternal effects, where individuals are directly affected by their parents' phenotype, such as with milk production in mammals.
A particularly important component of the genetic variance is the additive variance, Var(A), which is the variance due to the average effects (additive effects) of the alleles. Since each parent passes a single allele per locus to each offspring, parent-offspring resemblance depends upon the average effect of single alleles. Additive variance represents, therefore, the genetic component of variance responsible for parent-offspring resemblance. The additive genetic portion of the phenotypic variance is known as narrow-sense heritability and is defined as
$$h^2 = \frac{\mathrm{Var}(A)}{\mathrm{Var}(P)}$$
An upper case $H^2$ is used to denote broad sense, and lower case $h^2$ for narrow sense.
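The difference between the two definitions can be made concrete with a small calculation. The sketch below assumes that the genetic variance consists only of additive, dominance, and epistatic components (parental effects are ignored) and that Cov(G,E) = 0; the function name and the component values are hypothetical.

```python
def broad_and_narrow_heritability(var_additive, var_dominance,
                                  var_epistatic, var_environment):
    """Return (broad-sense, narrow-sense) heritability from variance components.

    Assumes Var(G) = Var(A) + Var(D) + Var(I) and no G-E covariance,
    so Var(P) = Var(G) + Var(E).
    """
    var_genetic = var_additive + var_dominance + var_epistatic
    var_phenotype = var_genetic + var_environment
    if var_phenotype <= 0:
        raise ValueError("phenotypic variance must be positive")
    broad = var_genetic / var_phenotype      # all genetic variance
    narrow = var_additive / var_phenotype    # additive variance only
    return broad, narrow


# Hypothetical variance components for one trait in one population.
broad, narrow = broad_and_narrow_heritability(
    var_additive=3.0, var_dominance=1.0,
    var_epistatic=0.5, var_environment=5.5,
)
print(f"broad-sense:  {broad:.2f}")
print(f"narrow-sense: {narrow:.2f}")
```

Because the additive variance is one part of the genetic variance, the narrow-sense value can never exceed the broad-sense value under these assumptions.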
For traits which are not continuous but dichotomous such as an additional toe or certain diseases, the contribution of the various alleles can be considered to be a sum, which past a threshold, manifests itself as the trait, giving the liability threshold model in which heritability can be estimated and selection modeled.
Additive variance is important for selection. If a selective pressure such as improving livestock is exerted, the response of the trait is directly related to narrow-sense heritability. The mean of the trait will increase in the next generation as a function of how much the mean of the selected parents differs from the mean of the population from which the selected parents were chosen. The observed response to selection leads to an estimate of the narrow-sense heritability (called realized heritability). This is the principle underlying artificial selection or breeding.

The simplest genetic model involves a single locus with two alleles (b and B) affecting one quantitative phenotype.
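The number of B alleles can be 0, 1, or 2. For any genotype, ($B_i$, $B_j$), where $B_i$ and $B_j$ are either 0 or 1, the expected phenotype can then be written as the sum of the overall mean, a linear effect, and a dominance deviation (one can think of the dominance term as an interaction between $B_i$ and $B_j$):
$$y_{ij} = \mu + \alpha\,(B_i + B_j) + \delta\,(B_i B_j)$$
The additive genetic variance at this locus is the weighted average of the squares of the additive effects:
$$\mathrm{Var}(A) = f(bb)\,a_{bb}^2 + f(Bb)\,a_{Bb}^2 + f(BB)\,a_{BB}^2$$
where
$$f(bb)\,a_{bb} + f(Bb)\,a_{Bb} + f(BB)\,a_{BB} = 0.$$
There is a similar relationship for the variance of dominance deviations:
$$\mathrm{Var}(D) = f(bb)\,d_{bb}^2 + f(Bb)\,d_{Bb}^2 + f(BB)\,d_{BB}^2$$
where
$$f(bb)\,d_{bb} + f(Bb)\,d_{Bb} + f(BB)\,d_{BB} = 0.$$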
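The single-locus decomposition can be sketched numerically. The example below assumes Hardy-Weinberg genotype frequencies (an assumption not stated in the text) and obtains the additive effects by a frequency-weighted regression of genotypic value on the number of B alleles, with the dominance deviations as the residuals. The function name and genotypic values are hypothetical.

```python
def single_locus_variances(p_B, y_bb, y_Bb, y_BB):
    """Return (Var(A), Var(D)) for one locus with alleles b and B.

    p_B is the frequency of allele B; y_* are the genotypic values.
    Genotype frequencies are assumed to follow Hardy-Weinberg proportions.
    """
    if not 0.0 < p_B < 1.0:
        raise ValueError("allele frequency must be strictly between 0 and 1")
    q = 1.0 - p_B
    freqs = [q * q, 2.0 * p_B * q, p_B * p_B]   # f(bb), f(Bb), f(BB)
    counts = [0, 1, 2]                          # number of B alleles
    values = [y_bb, y_Bb, y_BB]

    mean_x = sum(f * x for f, x in zip(freqs, counts))
    mean_y = sum(f * y for f, y in zip(freqs, values))
    var_x = sum(f * (x - mean_x) ** 2 for f, x in zip(freqs, counts))
    cov_xy = sum(f * (x - mean_x) * (y - mean_y)
                 for f, x, y in zip(freqs, counts, values))
    slope = cov_xy / var_x                      # average effect of substitution

    # Additive effects are the fitted deviations; dominance deviations are residuals.
    additive = [slope * (x - mean_x) for x in counts]
    dominance = [y - mean_y - a for y, a in zip(values, additive)]

    var_a = sum(f * a * a for f, a in zip(freqs, additive))
    var_d = sum(f * d * d for f, d in zip(freqs, dominance))
    return var_a, var_d


var_a, var_d = single_locus_variances(p_B=0.5, y_bb=0.0, y_Bb=1.5, y_BB=2.0)
print(var_a, var_d)
```

Both sets of effects have a frequency-weighted mean of zero by construction, matching the constraints given with the variance formulas above.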
The linear regression of phenotype on genotype is shown in Figure 1.
Estimates of the total heritability of human traits assume the absence of epistasis, which has been called the "assumption of additivity". Although some researchers have cited such estimates in support of the existence of "missing heritability" unaccounted for by known genetic loci, the assumption of additivity may render these estimates invalid.[16] There is also some empirical evidence that the additivity assumption is frequently violated in behavior genetic studies of adolescent intelligence and academic achievement.[17]
Since only P can be observed or measured directly, heritability must be estimated from the similarities observed in subjects varying in their level of genetic or environmental similarity. The statistical analyses required to estimate the genetic and environmental components of variance depend on the sample characteristics. Briefly, better estimates are obtained using data from individuals with widely varying levels of genetic relationship, such as twins, siblings, parents and offspring, rather than from more distantly related (and therefore less similar) subjects. The standard error for heritability estimates is improved with large sample sizes.
In non-human populations it is often possible to collect information in a controlled way. For example, among farm animals it is easy to arrange for a bull to produce offspring from a large number of cows and to control environments. Such experimental control is generally not possible when gathering human data, which relies on naturally occurring relationships and environments.
In classical quantitative genetics, there were two schools of thought regarding estimation of heritability.
One school of thought was developed by Sewall Wright at The University of Chicago, and further popularized by C. C. Li (University of Chicago) and J. L. Lush (Iowa State University). It is based on the analysis of correlations and, by extension, regression. Path Analysis was developed by Sewall Wright as a way of estimating heritability.
The second was originally developed by R. A. Fisher and expanded at The University of Edinburgh, Iowa State University, and North Carolina State University, as well as other schools. It is based on the analysis of variance of breeding studies, using the intraclass correlation of relatives. Various methods of estimating components of variance (and, hence, heritability) from ANOVA are used in these analyses.
Today, heritability can be estimated from general pedigrees using linear mixed models and from genomic relatedness estimated from genetic markers.
Studies of human heritability often utilize adoption study designs, often with identical twins who have been separated early in life and raised in different environments. Such individuals have identical genotypes and can be used to separate the effects of genotype and environment. A limit of this design is the common prenatal environment and the relatively low numbers of twins reared apart. A second and more common design is the twin study in which the similarity of identical and fraternal twins is used to estimate heritability. These studies can be limited by the fact that identical twins are not completely genetically identical, potentially resulting in an underestimation of heritability.
In observational studies, or because of evocative effects (where a genome evokes environments by its effect on them), G and E may covary: gene environment correlation. Depending on the methods used to estimate heritability, correlations between genetic factors and shared or non-shared environments may or may not be confounded with heritability.[18]
The first school of estimation uses regression and correlation to estimate heritability.
In the comparison of relatives, we find that in general,
$$h^2 = \frac{b}{r} = \frac{t}{r}$$
where $r$ can be thought of as the coefficient of relatedness, $b$ is the coefficient of regression and $t$ is the coefficient of correlation.

Heritability may be estimated by comparing parent and offspring traits (as in Fig. 2). The slope of the line (0.57) approximates the heritability of the trait when offspring values are regressed against the average trait in the parents. If only one parent's value is used then heritability is twice the slope. (This is the source of the term "regression," since the offspring values always tend to regress to the mean value for the population, i.e., the slope is always less than one). This regression effect also underlies the DeFries–Fulker method for analyzing twins selected for one member being affected.[19]
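The parent-offspring regression can be sketched with an ordinary least-squares slope. The data below are hypothetical and are not the values plotted in the figure; the function names are likewise illustrative only.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def h2_from_midparent(midparent_values, offspring_values):
    """Heritability estimate: slope of offspring on the mean of both parents."""
    return regression_slope(midparent_values, offspring_values)


def h2_from_single_parent(parent_values, offspring_values):
    """Heritability estimate: twice the slope of offspring on one parent."""
    return 2.0 * regression_slope(parent_values, offspring_values)


# Hypothetical trait values for four families.
midparent = [10.0, 12.0, 14.0, 16.0]
offspring = [11.0, 12.0, 13.0, 14.0]
estimate = h2_from_midparent(midparent, offspring)
print(f"midparent-offspring estimate: {estimate:.2f}")
```

With real data the points scatter around the line, and the precision of the slope depends on the number of families sampled.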
A basic approach to heritability can be taken using full-sib designs: comparing similarity between siblings who share both a biological mother and a father.[20] When there is only additive gene action, this sibling phenotypic correlation is an index of familiarity: the sum of half the additive genetic variance plus the full effect of the common environment. It thus places an upper limit on additive heritability of twice the full-sib phenotypic correlation. Half-sib designs compare phenotypic traits of siblings that share one parent with other sibling groups.

Heritability for traits in humans is most frequently estimated by comparing resemblances between twins. "The advantage of twin studies, is that the total variance can be split up into genetic, shared or common environmental, and unique environmental components, enabling an accurate estimation of heritability".[21] Fraternal or dizygotic (DZ) twins on average share half their genes (assuming there is no assortative mating for the trait), and so identical or monozygotic (MZ) twins on average are twice as genetically similar as DZ twins. A crude estimate of heritability, then, is approximately twice the difference in correlation between MZ and DZ twins, i.e. Falconer's formula $H^2 = 2\,(r(MZ) - r(DZ))$.
The effect of shared environment, $c^2$, contributes to similarity between siblings due to the commonality of the environment they are raised in. Shared environment is approximated by the DZ correlation minus half heritability, which is the degree to which DZ twins share the same genes, $c^2 = r(DZ) - \tfrac{1}{2}h^2$. Unique environmental variance, $e^2$, reflects the degree to which identical twins raised together are dissimilar, $e^2 = 1 - r(MZ)$.
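The three twin-based quantities can be computed directly from the two correlations. The following is a minimal sketch of Falconer's formula and the accompanying shared and unique environment estimates as given above; the function name and the example correlations are hypothetical.

```python
def falconer_estimates(r_mz: float, r_dz: float):
    """Crude twin-study estimates from MZ and DZ twin correlations.

    Returns (heritability, shared environment, unique environment).
    """
    heritability = 2.0 * (r_mz - r_dz)           # Falconer's formula
    shared_env = r_dz - 0.5 * heritability       # c^2 = r(DZ) - h^2 / 2
    unique_env = 1.0 - r_mz                      # e^2 = 1 - r(MZ)
    return heritability, shared_env, unique_env


# Hypothetical twin correlations for one trait.
h2, c2, e2 = falconer_estimates(r_mz=0.8, r_dz=0.5)
print(f"h2={h2:.2f}  c2={c2:.2f}  e2={e2:.2f}")
```

By construction the three components sum to one, since $c^2$ simplifies to $2\,r(DZ) - r(MZ)$. Nothing in the formulas prevents a component from falling outside the range 0 to 1 when the observed correlations are noisy, which is one reason the estimate is described as crude.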
The second set of methods of estimation of heritability involves ANOVA and estimation of variance components.
We use the basic discussion of Kempthorne.[14] Considering only the most basic of genetic models, we can look at the quantitative contribution of a single locus with genotype $G_i$ as
$$z_i = \mu + g_i + e$$
where $g_i$ is the effect of genotype $G_i$ and $e$ is the environmental effect.
Consider an experiment with a group of sires and their progeny from random dams. Since the progeny get half of their genes from the father and half from their (random) mother, the progeny equation is
$$z_i = \mu + \tfrac{1}{2}g_i + e$$
Consider the experiment above. We have two groups of progeny we can compare. The first is comparing the various progeny for an individual sire (called within sire group). The variance will include terms for genetic variance (since they did not all get the same genotype) and environmental variance. This is thought of as an error term.
The second group of progeny are comparisons of means of half sibs with each other (called among sire group). In addition to the error term as in the within sire groups, we have an additional term due to the differences among different means of half sibs. The covariance between half sibs, which determines the intraclass correlation, is
$$\mathrm{Cov}(z, z') = \mathrm{Cov}\left(\mu + \tfrac{1}{2}g + e,\ \mu + \tfrac{1}{2}g + e'\right) = \tfrac{1}{4}V_g$$
since environmental effects are independent of each other.
In an experiment with $n$ sires and $r$ progeny per sire, we can calculate the following ANOVA, using $V_g$ as the genetic variance and $V_e$ as the environmental variance:
| Source | d.f. | Mean Square | Expected Mean Square |
|---|---|---|---|
| Between sire groups | $n - 1$ | $S$ | $\tfrac{3}{4}V_g + V_e + r\left(\tfrac{1}{4}V_g\right)$ |
| Within sire groups | $n(r - 1)$ | $W$ | $\tfrac{3}{4}V_g + V_e$ |
The $\tfrac{1}{4}V_g$ term is the intraclass correlation between half sibs. We can easily calculate
$$H^2 = \frac{V_g}{V_g + V_e} = \frac{4(S - W)}{S + (r - 1)W}.$$
The expected mean square is calculated from the relationship of the individuals (progeny within a sire are all half-sibs, for example), and an understanding of intraclass correlations.
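The ANOVA calculation can be sketched for a balanced half-sib design. In the example below, S and W denote the between-sire and within-sire mean squares, n the number of sires and r the number of progeny per sire, and heritability is taken as 4(S − W) / (S + (r − 1)W), which follows from the expected mean squares ¾V_g + V_e + r(¼V_g) and ¾V_g + V_e. The function name and the data are hypothetical.

```python
def half_sib_heritability(sire_groups):
    """Estimate heritability from a balanced sire (half-sib) design.

    sire_groups: list of lists, one inner list of progeny trait values per sire.
    Returns (S, W, H2), where S and W are the between- and within-sire
    mean squares and H2 = 4 (S - W) / (S + (r - 1) W).
    """
    n = len(sire_groups)
    r = len(sire_groups[0]) if n else 0
    if n < 2 or r < 2 or any(len(g) != r for g in sire_groups):
        raise ValueError("need a balanced design with >= 2 sires and >= 2 progeny each")

    group_means = [sum(g) / r for g in sire_groups]
    grand_mean = sum(group_means) / n

    ss_between = r * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((x - m) ** 2
                    for g, m in zip(sire_groups, group_means) for x in g)

    S = ss_between / (n - 1)          # between-sire mean square
    W = ss_within / (n * (r - 1))     # within-sire mean square
    H2 = 4.0 * (S - W) / (S + (r - 1) * W)
    return S, W, H2


# Hypothetical trait values: three sires, three progeny each.
progeny = [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0], [4.0, 6.0, 8.0]]
S, W, H2 = half_sib_heritability(progeny)
print(f"S={S:.2f}  W={W:.2f}  H2={H2:.2f}")
```

With so few sires the estimate is very imprecise, and sampling error can even push it below 0 or above 1; the sketch makes no attempt to constrain it.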
The use of ANOVA to calculate heritability often fails to account for the presence of gene–environment interactions, because ANOVA has a much lower statistical power for testing for interaction effects than for direct effects.[22]
For a model with additive and dominance terms, but not others, the equation for a single locus is
$$z_{ij} = \mu + \alpha_i + \alpha_j + d_{ij} + e,$$
where
$\alpha_i$ is the additive effect of the ith allele, $\alpha_j$ is the additive effect of the jth allele, $d_{ij}$ is the dominance deviation for the ijth genotype, and $e$ is the environment.
Experiments can be run with a similar setup to the one given in Table 1. Using different relationship groups, we can evaluate different intraclass correlations. Using $V_a$ as the additive genetic variance and $V_d$ as the dominance deviation variance, intraclass correlations become linear functions of these parameters. In general,
$$\text{Intraclass correlation} = r\,V_a + \theta\,V_d,$$
where $r$ and $\theta$ are found as
$r$ = P[alleles drawn at random from the relationship pair are identical by descent], and
$\theta$ = P[genotypes drawn at random from the relationship pair are identical by descent].
Some common relationships and their coefficients are given in Table 2.
| Relationship | $r$ | $\theta$ |
|---|---|---|
| Identical Twins | 1 | 1 |
| Parent-Offspring | 1/2 | 0 |
| Half Siblings | 1/4 | 0 |
| Full Siblings | 1/2 | 1/4 |
| First Cousins | 1/8 | 0 |
| Double First Cousins | 1/4 | 1/16 |
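The linear relationship between relatives' resemblance and the two variance components can be sketched as a lookup. The coefficients below are the standard identity-by-descent probabilities for each relationship in a non-inbred population; the function name and the variance values are hypothetical.

```python
# (r, theta): probability that alleles, respectively genotypes, drawn at
# random from the relationship pair are identical by descent.
RELATIONSHIP_COEFFICIENTS = {
    "identical twins": (1.0, 1.0),
    "parent-offspring": (0.5, 0.0),
    "half siblings": (0.25, 0.0),
    "full siblings": (0.5, 0.25),
    "first cousins": (0.125, 0.0),
    "double first cousins": (0.25, 0.0625),
}


def expected_resemblance(relationship: str, var_additive: float,
                         var_dominance: float) -> float:
    """Return r * V_a + theta * V_d for the named relationship."""
    r, theta = RELATIONSHIP_COEFFICIENTS[relationship]
    return r * var_additive + theta * var_dominance


# Hypothetical variance components.
for name in RELATIONSHIP_COEFFICIENTS:
    value = expected_resemblance(name, var_additive=4.0, var_dominance=2.0)
    print(f"{name:22s} {value:.3f}")
```

Because only full siblings, double first cousins, and identical twins have a non-zero θ, comparing these groups with relatives for whom θ is zero is what allows dominance variance to be separated from additive variance.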
A wide variety of approaches using linear mixed models have been reported in literature. Via these methods, phenotypic variance is partitioned into genetic, environmental and experimental design variances to estimate heritability. Environmental variance can be explicitly modeled by studying individuals across a broad range of environments, although inference of genetic variance from phenotypic and environmental variance may lead to underestimation of heritability due to the challenge of capturing the full range of environmental influence affecting a trait. Other methods for calculating heritability use data from genome-wide association studies to estimate the influence on a trait by genetic factors, which is reflected by the rate and influence of putatively associated genetic loci (usually single-nucleotide polymorphisms) on the trait. This can lead to underestimation of heritability, however. This discrepancy is referred to as "missing heritability" and reflects the challenge of accurately modeling both genetic and environmental variance in heritability models.[23]
When a large, complex pedigree or another aforementioned type of data is available, heritability and other quantitative genetic parameters can be estimated by restricted maximum likelihood (REML) or Bayesian methods. The raw data will usually have three or more data points for each individual: a code for the sire, a code for the dam and one or several trait values. Different trait values may be for different traits or for different time points of measurement.
The currently popular methodology relies on high degrees of certainty over the identities of the sire and dam; it is not common to treat the sire identity probabilistically. This is not usually a problem, since the methodology is rarely applied to wild populations (although it has been used for several wild ungulate and bird populations), and sires are invariably known with a very high degree of certainty in breeding programmes. There are also algorithms that account for uncertain paternity.
The pedigrees can be viewed using programs such as Pedigree Viewer, and analyzed with programs such as ASReml, VCE, WOMBAT, MCMCglmm within the R environment or the BLUPF90 family of programs.
Pedigree models are helpful for untangling confounds such as reverse causality, maternal effects such as the prenatal environment, and confounding of genetic dominance, shared environment, and maternal gene effects.[24][9]
When genome-wide genotype data and phenotypes from large population samples are available, one can estimate the relationships between individuals based on their genotypes and use a linear mixed model to estimate the variance explained by the genetic markers. This gives a genomic heritability estimate based on the variance captured by common genetic variants.[4] There are multiple methods that make different adjustments for allele frequency and linkage disequilibrium. In particular, the method called High-Definition Likelihood (HDL) can estimate genomic heritability using only GWAS summary statistics,[5] making it easier to incorporate the large sample sizes available in various GWAS meta-analyses.

In selective breeding of plants and animals, the expected response to selection of a trait with known narrow-sense heritability can be estimated using the breeder's equation:[25]
$$R = h^2 S$$
In this equation, the Response to Selection (R) is defined as the realized average difference between the parent generation and the next generation, and the Selection Differential (S) is defined as the average difference between the parent generation and the selected parents.[14]: 1957 [26]
For example, imagine that a plant breeder is involved in a selective breeding project with the aim of increasing the number of kernels per ear of corn. For the sake of argument, let us assume that the average ear of corn in the parent generation has 100 kernels. Let us also assume that the selected parents produce corn with an average of 120 kernels per ear. If $h^2$ equals 0.5, then the next generation will produce corn with an average of 0.5(120-100) = 10 additional kernels per ear. Therefore, the total number of kernels per ear of corn will equal, on average, 110.
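The corn example can be reproduced with a few lines of code. The sketch below takes the breeder's equation as response = narrow-sense heritability × selection differential, and also inverts it to give a realized heritability from an observed response; the function names are hypothetical.

```python
def response_to_selection(h2: float, population_mean: float,
                          selected_parent_mean: float) -> float:
    """Expected change in the trait mean after one generation of selection."""
    selection_differential = selected_parent_mean - population_mean
    return h2 * selection_differential


def realized_heritability(response: float, selection_differential: float) -> float:
    """Heritability implied by an observed response to a known selection differential."""
    if selection_differential == 0:
        raise ValueError("selection differential must be non-zero")
    return response / selection_differential


# Values from the corn example: 100 kernels on average in the parent
# generation, 120 in the selected parents, narrow-sense heritability 0.5.
gain = response_to_selection(h2=0.5, population_mean=100.0,
                             selected_parent_mean=120.0)
next_generation_mean = 100.0 + gain
print(gain, next_generation_mean)  # 10.0 110.0
```

Running the calculation in reverse, `realized_heritability(10.0, 20.0)` recovers the value 0.5 used in the example.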
Observing the response to selection in an artificial selection experiment will allow calculation of realized heritability as in Fig. 4.
Heritability in the above equation is equal to the ratio $\mathrm{Var}(A)/\mathrm{Var}(P)$ only if the genotype and the environmental noise follow Gaussian distributions.
Prominent critics of heritability estimates, such as Steven Rose,[27] Jay Joseph,[28] and Richard Bentall, focus largely on heritability estimates in behavioral sciences and social sciences. Bentall has claimed that such heritability scores are typically calculated counterintuitively to derive numerically high scores, that heritability is misinterpreted as genetic determination, and that this alleged bias distracts from other factors that researchers have found more causally important, such as childhood abuse causing later psychosis.[29][30] Heritability estimates are also inherently limited because they do not convey any information regarding whether genes or environment play a larger role in the development of the trait under study. For this reason, David Moore and David Shenk describe the term "heritability" in the context of behavior genetics as "...one of the most misleading in the history of science" and argue that it has no value except in very rare cases.[31] When studying complex human traits, it is impossible to use heritability analysis to determine the relative contributions of genes and environment, as such traits result from multiple causes interacting.[32] In particular, Feldman and Lewontin emphasize that heritability is itself a function of environmental variation.[33] However, some researchers argue that it is possible to disentangle the two.[34]
The controversy over heritability estimates stems largely from their basis in twin studies. The limited success of molecular-genetic studies in corroborating the conclusions of such population-genetic studies is known as the missing heritability problem.[35] Eric Turkheimer has argued that newer molecular methods have vindicated the conventional interpretation of twin studies,[35] although it remains mostly unclear how to explain the relations between genes and behaviors.[36] According to Turkheimer, both genes and environment are heritable, genetic contribution varies by environment, and a focus on heritability distracts from other important factors.[37] Overall, however, heritability is a widely applicable concept.[9]
"...all complex human traits result from a combination of causes. If these causes interact, it is impossible to assign quantitative values to the fraction of a trait due to each, just as we cannot say how much of the area of a rectangle is due, separately, to each of its two dimensions. Thus, in the analyses of complex human phenotypes...we cannot actually find 'the relative importance of genes and environment in the determination of phenotype'."