Haplotype

From Wikipedia, the free encyclopedia

Group of genes from one parent

DNA molecule 1 differs from DNA molecule 2 at a single base-pair location (a C/A polymorphism).

A haplotype (haploid genotype) is a group of alleles in an organism that are inherited together from a single parent.^[1]^[2]

Many organisms contain genetic material (DNA) which is inherited from two parents. Normally these organisms have their DNA organized in two sets of pairwise similar chromosomes. The offspring gets one chromosome in each pair from each parent. A set of pairs of chromosomes is called diploid and a set of only one half of each pair is called haploid. The haploid genotype (haplotype) is a genotype that considers the singular chromosomes rather than the pairs of chromosomes. It can be all the chromosomes from one of the parents or a minor part of a chromosome, for example a sequence of 9000 base pairs or a small set of alleles.

Specific contiguous parts of the chromosome are likely to be inherited together and not be split by chromosomal crossover, a phenomenon called genetic linkage.^[3]^[4] As a result, identifying these statistical associations and a few alleles of a specific haplotype sequence can facilitate identifying all other such polymorphic sites that are nearby on the chromosome (imputation).^[5] Such information is critical for investigating the genetics of common diseases; in humans, these associations have been investigated by the International HapMap Project.^[6]^[7]

Other parts of the genome are almost always haploid and do not undergo crossover: for example, human mitochondrial DNA is passed down through the maternal line and the Y chromosome is passed down the paternal line. In these cases, the entire sequence can be grouped into a simple evolutionary tree, with each branch founded by a unique-event polymorphism mutation (often, but not always, a single-nucleotide polymorphism (SNP)). Each clade under a branch, containing haplotypes with a single shared ancestor, is called a haplogroup.^[8]^[9]^[10]

Haplotype resolution

An organism's genotype may not define its haplotype uniquely. For example, consider a diploid organism and two bi-allelic loci (such as SNPs) on the same chromosome. Assume the first locus has alleles A or T and the second locus G or C. Each locus then has three possible genotypes: (AA, AT, and TT) and (GG, GC, and CC), respectively. For a given individual, there are nine possible configurations (pairs of haplotypes) at these two loci (shown in the Punnett square below). For individuals who are homozygous at one or both loci, the haplotypes are unambiguous: swapping the two identical alleles at a homozygous locus (say T1 and T2, labeled only to distinguish the two copies) yields the same pair of haplotypes, so the order in which they are considered does not matter. For individuals heterozygous at both loci, the gametic phase is ambiguous: an observer cannot tell from the genotype alone which pair of haplotypes the individual carries, e.g., AG and TC versus AC and TG.

Locus 2 \ Locus 1	AA	AT	TT
GG	AG AG	AG TG	TG TG
GC	AG AC	AG TC or AC TG	TG TC
CC	AC AC	AC TC	TC TC
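
The table above can be reproduced programmatically. The following is a minimal sketch, assuming genotypes are written as two-character strings such as "AT" and "GC"; the function name `compatible_haplotype_pairs` is a hypothetical helper introduced here for illustration only.

```python
from itertools import product


def compatible_haplotype_pairs(genotype1, genotype2):
    """Return the unordered haplotype pairs consistent with two unphased genotypes.

    genotype1 is the genotype at locus 1 (e.g. "AT"), genotype2 at locus 2 (e.g. "GC").
    """
    pairs = set()
    # Try both ways of assigning each locus's alleles to the two chromosome copies.
    orderings1 = [(genotype1[0], genotype1[1]), (genotype1[1], genotype1[0])]
    orderings2 = [(genotype2[0], genotype2[1]), (genotype2[1], genotype2[0])]
    for (a1, a2), (b1, b2) in product(orderings1, orderings2):
        # Sorting makes the pair unordered, so (AG, TC) and (TC, AG) count once.
        pairs.add(tuple(sorted((a1 + b1, a2 + b2))))
    return pairs


# Homozygous at one locus: a single, unambiguous pair of haplotypes.
print(compatible_haplotype_pairs("AT", "GG"))
# Heterozygous at both loci: two possible pairs, so the phase is ambiguous.
print(compatible_haplotype_pairs("AT", "GC"))
```

Only the double heterozygote yields more than one pair, which matches the single ambiguous cell in the table.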

The only unequivocal method of resolving phase ambiguity is by sequencing. However, it is possible to estimate the probability of a particular haplotype when phase is ambiguous using a sample of individuals.

Given the genotypes for a number of individuals, the haplotypes can be inferred by haplotype resolution or haplotype phasing techniques. These methods work by applying the observation that certain haplotypes are common in certain genomic regions. Therefore, given a set of possible haplotype resolutions, these methods choose those that use fewer different haplotypes overall. The specifics of these methods vary: some are based on combinatorial approaches (e.g., parsimony), whereas others use likelihood functions based on different models and assumptions such as the Hardy–Weinberg principle, the coalescent theory model, or perfect phylogeny. The parameters in these models are then estimated using algorithms such as the expectation-maximization algorithm (EM), Markov chain Monte Carlo (MCMC), or hidden Markov models (HMM).
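
As an illustration of the likelihood-based approach, the following is a minimal sketch of an expectation-maximization estimate of haplotype frequencies for two loci, assuming Hardy–Weinberg proportions. It is not the implementation of any particular phasing tool; the function names, the genotype encoding, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def phasings(g1, g2):
    """List the unordered haplotype pairs consistent with two unphased genotypes."""
    return sorted({
        tuple(sorted((a + b, g1[1 - i] + g2[1 - j])))
        for i, a in enumerate(g1)
        for j, b in enumerate(g2)
    })


def em_haplotype_frequencies(genotypes, iterations=50):
    """Estimate haplotype frequencies from unphased two-locus genotypes by EM."""
    haplotypes = sorted({h for g in genotypes for pair in phasings(*g) for h in pair})
    # Start from uniform frequencies over every haplotype that could occur.
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    for _ in range(iterations):
        counts = Counter()
        for g in genotypes:
            options = phasings(*g)
            # E-step: weight each phasing by its Hardy-Weinberg probability
            # (f1 * f1 for identical haplotypes, 2 * f1 * f2 otherwise).
            weights = [freq[h1] * freq[h2] * (1 if h1 == h2 else 2)
                       for h1, h2 in options]
            total = sum(weights)
            for (h1, h2), w in zip(options, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haplotypes}
    return freq


# Hypothetical sample: eight double homozygotes and two double heterozygotes.
sample = [("AA", "GG")] * 4 + [("TT", "CC")] * 4 + [("AT", "GC")] * 2
print(em_haplotype_frequencies(sample))
```

In this made-up sample the unambiguous individuals carry only AG and TC, so the estimate resolves the two double heterozygotes as AG with TC rather than AC with TG. This mirrors the idea of preferring resolutions that use fewer distinct haplotypes.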

Microfluidic whole genome haplotyping is a technique for the physical separation of individual chromosomes from a metaphase cell followed by direct resolution of the haplotype for each allele.

Gametic phase

In genetics, a gametic phase represents the original allelic combinations that a diploid individual inherits from both parents.^[11] It is therefore a particular association of alleles at different loci on the same chromosome. Gametic phase is influenced by genetic linkage.^[12]

Y-DNA haplotypes from genealogical DNA tests

Main article: Genealogical DNA test

Unlike other chromosomes, Y chromosomes generally do not come in pairs. Every human male (excepting those with XYY syndrome) has only one copy of that chromosome. This means that there is no chance variation in which copy is inherited, and also (for most of the chromosome) no shuffling between copies by recombination; so, unlike autosomal haplotypes, there is effectively no randomisation of the Y-chromosome haplotype between generations. A human male should largely share the same Y chromosome as his father, give or take a few mutations; thus Y chromosomes tend to pass largely intact from father to son, with a small but accumulating number of mutations that can serve to differentiate male lineages. In particular, the Y-DNA represented as the numbered results of a Y-DNA genealogical DNA test should match, except for mutations.

UEP results (SNP results)

Unique-event polymorphisms (UEPs) such as SNPs represent haplogroups. STRs represent haplotypes. The results that comprise the full Y-DNA haplotype from the Y chromosome DNA test can be divided into two parts: the results for UEPs, sometimes loosely called the SNP results as most UEPs are single-nucleotide polymorphisms, and the results for microsatellite short tandem repeat sequences (Y-STRs).

The UEP results represent the inheritance of events that are assumed to have happened only once in all of human history. These can be used to identify the individual's Y-DNA haplogroup, his place in the "family tree" of the whole of humanity. Different Y-DNA haplogroups identify genetic populations that are often distinctly associated with particular geographic regions; their appearance in more recent populations located in different regions represents the migrations tens of thousands of years ago of the direct patrilineal ancestors of current individuals.

Y-STR haplotypes

Genetic results also include the Y-STR haplotype, the set of results from the Y-STR markers tested.

Unlike the UEPs, the Y-STRs mutate much more easily, which allows them to be used to distinguish recent genealogy. But it also means that, rather than the population of descendants of a genetic event all sharing the same result, the Y-STR haplotypes are likely to have spread apart, to form a cluster of more or less similar results. Typically, this cluster will have a definite most probable center, the modal haplotype (presumably similar to the haplotype of the original founding event), and also a haplotype diversity, the degree to which it has become spread out. The further in the past the defining event occurred, and the more that subsequent population growth occurred early, the greater the haplotype diversity will be for a particular number of descendants. However, if the haplotype diversity is smaller for a particular number of descendants, this may indicate a more recent common ancestor, or a recent population expansion.
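
The modal haplotype of a cluster can be illustrated with a short sketch. It assumes each Y-STR haplotype is recorded as a tuple of repeat counts, one per marker and in the same marker order, and takes the most common value at each marker. The repeat counts below are made-up values, and both function names are hypothetical.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common repeat count at each marker across the cluster."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*haplotypes))


def mismatches(h1, h2):
    """Count the markers at which two haplotypes differ."""
    return sum(1 for a, b in zip(h1, h2) if a != b)


# Hypothetical cluster of four three-marker Y-STR haplotypes.
cluster = [(13, 24, 14), (13, 24, 15), (13, 23, 14), (14, 24, 14)]
center = modal_haplotype(cluster)
print(center)
print([mismatches(center, h) for h in cluster])
```

Counting mismatches from the center gives a crude picture of how far the cluster has spread. Genealogical tools typically use more refined distance measures, which this sketch does not attempt to reproduce.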

It is important to note that, unlike for UEPs, two individuals with a similar Y-STR haplotype may not necessarily share a similar ancestry. Y-STR events are not unique. Instead, the clusters of Y-STR haplotype results inherited from different events and different histories tend to overlap.

In most cases, it is a long time since the haplogroups' defining events, so typically the cluster of Y-STR haplotype results associated with descendants of that event has become rather broad. These results will tend to significantly overlap the (similarly broad) clusters of Y-STR haplotypes associated with other haplogroups. This makes it impossible for researchers to predict with absolute certainty to which Y-DNA haplogroup a Y-STR haplotype would point. If the UEPs are not tested, the Y-STRs may be used only to predict probabilities for haplogroup ancestry, but not certainties.

A similar scenario exists in trying to evaluate whether shared surnames indicate shared genetic ancestry. A cluster of similar Y-STR haplotypes may indicate a shared common ancestor, with an identifiable modal haplotype, but only if the cluster is sufficiently distinct from what may have happened by chance from different individuals who historically adopted the same name independently. Many names were adopted from common occupations, for instance, or were associated with habitation of particular sites. More extensive haplotype typing is needed to establish genetic genealogy. Commercial DNA-testing companies now offer their customers testing of more numerous sets of markers to improve definition of their genetic ancestry. The number of markers tested has increased from 12 during the early years to 111 more recently.

Establishing plausible relatedness between different surnames data-mined from a database is significantly more difficult. The researcher must establish that the very nearest member of the population in question, chosen purposely from the population for that reason, would be unlikely to match by accident. This is more than establishing that a randomly selected member of the population is unlikely to have such a close match by accident. Because of the difficulty, establishing relatedness between different surnames as in such a scenario is likely to be impossible, except in special cases where there is specific information to drastically limit the size of the population of candidates under consideration.
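
The gap between a randomly selected member and the nearest member can be made concrete with a small calculation. This sketch assumes, purely for illustration, that each candidate matches by accident independently and with the same probability; the numbers are hypothetical and the function name is introduced here.

```python
def chance_of_any_match(p_single, n_candidates):
    """Probability that at least one of n independent candidates matches by accident."""
    return 1 - (1 - p_single) ** n_candidates


# Hypothetical figures: a 1-in-1000 accidental match rate per candidate.
p = 0.001
for n in (1, 100, 1000, 10000):
    print(n, round(chance_of_any_match(p, n), 4))
```

Under these assumptions, a match that is very unlikely for any single candidate becomes more likely than not once the best match is picked from about a thousand candidates. This is why limiting the size of the candidate population matters.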

Diversity

Haplotype diversity is a measure of the uniqueness of a particular haplotype in a given population. The haplotype diversity (H) is computed as:^[13]
$H = \frac{N}{N-1}\left(1 - \sum_{i} x_{i}^{2}\right)$
where $x_{i}$ is the (relative) haplotype frequency of each haplotype in the sample and $N$ is the sample size. Haplotype diversity is given for each sample.
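
A direct computation of this formula can be sketched as follows, assuming the sample is given as a list of haplotype labels; the labels and the function name are hypothetical.

```python
from collections import Counter


def haplotype_diversity(sample):
    """Compute H = N / (N - 1) * (1 - sum of squared relative haplotype frequencies)."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    frequencies = [count / n for count in Counter(sample).values()]
    return n / (n - 1) * (1 - sum(x * x for x in frequencies))


# Hypothetical sample of four haplotypes, two of which are identical.
print(haplotype_diversity(["H1", "H1", "H2", "H3"]))
```

With this definition, a sample in which every haplotype is distinct gives H = 1, and a sample in which all haplotypes are identical gives H = 0.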

History

The term "haplotype" was first introduced by MHC biologist Ruggero Ceppellini during the Third International Histocompatibility Workshop to substitute "pheno-group".^[14]^[15]

References

[1] Cox, C. Barry; Moore, Peter D.; Ladle, Richard (2016). Biogeography: An Ecological and Evolutionary Approach. Wiley-Blackwell. p. 106. ISBN 978-1-118-96858-1.
[2] Editorial Board (2012). Concise Dictionary of Science. V&S Publishers. p. 137. ISBN 9381588643.
[3] Kimball's Biology Pages, BiologyPages/H/Haplotypes.html (Creative Commons Attribution 3.0).
[4] "haplotype / haplotypes | Learn Science at Scitable". www.nature.com.
[5] Yoosefzadeh-Najafabadi, Mohsen; Rajcan, Istvan; Eskandari, Milad (2022). "Optimizing genomic selection in soybean: An important improvement in agricultural genomics". Heliyon. 8 (11) e11873. Bibcode:2022Heliy...811873Y. doi:10.1016/j.heliyon.2022.e11873. PMC 9713349. PMID 36468106.
[6] The International HapMap Consortium (2003). "The International HapMap Project" (PDF). Nature. 426 (6968): 789–796. Bibcode:2003Natur.426..789G. doi:10.1038/nature02168. hdl:2027.42/62838. PMID 14685227. S2CID 4387110.
[7] The International HapMap Consortium (2005). "A haplotype map of the human genome". Nature. 437 (7063): 1299–1320. Bibcode:2005Natur.437.1299T. doi:10.1038/nature04226. PMC 1880871. PMID 16255080. This article speaks of a haplotype length, which is the length of a contiguous run of the chromosome inherited from a single parent.
[8] Arora, Devender; Singh, Ajeet; Sharma, Vikrant; Bhaduria, Harvendra Singh; Patel, Ram Bahadur (2015). "HgsDb: Haplogroups Database to understand migration and molecular risk assessment". Bioinformation. 11 (6): 272–5. doi:10.6026/97320630011272. PMC 4512000. PMID 26229286.
[9] International Society of Genetic Genealogy (2015). Genetics Glossary, Haplogroup.
[10] "Facts & Genes. Volume 7, Issue 3". Archived from the original on May 9, 2008.
[11] Taylor, Duncan; Bright, Jo-Anne; Buckleton, John S. (2016). "Biological basis for DNA evidence". In Buckleton, John S.; Bright, Jo-Anne; Taylor, Duncan (eds.). Forensic DNA Evidence Interpretation (2nd ed.). Boca Raton, FL: CRC Press. pp. 1–36. ISBN 978-1-4822-5889-9.
[12] Excoffier, Laurent (1 November 2003). "Gametic phase estimation over large genomic regions using an adaptive window approach". Human Genomics. 1 (1): 7–19. doi:10.1186/1479-7364-1-1-7. PMC 3525008. PMID 15601529.
[13] Nei, Masatoshi; Tajima, Fumio (1981). "DNA polymorphism detectable by restriction endonucleases". Genetics. 97: 145.
[14] Petersdorf, E. W. (Feb 2017). "In celebration of Ruggero Ceppellini: HLA in transplantation". HLA. 89 (2): 71–76. doi:10.1111/tan.12955. ISSN 2059-2302. PMC 5267337. PMID 28102037.
[15] Flajnik, M. F.; Singh, Nevil; Holland, Steven M., eds. (2023). "Chapter 19 The Major Histocompatibility Complex". Paul's Fundamental Immunology (8th ed.). Philadelphia: Wolters Kluwer/Lippincott Williams & Wilkins. p. 586. ISBN 978-1-9751-4253-7.

External links

HapMap (archived 2014-04-16 at the Wayback Machine): homepage for the International HapMap Project.
Haplotype versus Haplogroup: the difference between haplogroup and haplotype explained.
