Genetic genealogy is the use of genealogical DNA tests, i.e., DNA profiling and DNA testing, in combination with traditional genealogical methods, to infer genetic relationships between individuals. This application of genetics came to be used by family historians in the 21st century, as DNA tests became affordable. The tests have been promoted by amateur groups, such as surname study groups or regional genealogical groups, as well as research projects such as the Genographic Project.
As of 2019, about 30 million people had been tested. As the field developed, the aims of practitioners broadened, with many seeking knowledge of their ancestry beyond the recent centuries for which traditional pedigrees can be constructed.
The investigation of surnames in genetics can be said to go back to George Darwin, a son of Charles Darwin and his first cousin Emma Darwin. In 1875, George Darwin used surnames to estimate the frequency of first-cousin marriages and calculated the expected incidence of marriage between people of the same surname (isonymy). He arrived at a figure of 1.5% for cousin marriage in the population of London, compared with 3%-3.5% among the upper classes and 2.25% among the general rural population.[1]
A famous study in 1998 examined the lineage of descendants of Thomas Jefferson's paternal line and male-lineage descendants of the freed slave Sally Hemings.[2]
Bryan Sykes, a molecular biologist at Oxford University, tested the new methodology in general surname research.[3] His study of the Sykes surname, published in 2000, obtained results by looking at four STR markers on the Y chromosome. It pointed the way to genetics becoming a valuable assistant in the service of genealogy and history.[4]
In 2000, Family Tree DNA was the first company to provide direct-to-consumer genetic testing for genealogy research. It initially offered eleven-marker Y-chromosome STR tests and HVR1 mitochondrial DNA tests but not multi-generational genealogy tests.[5][6][7][8][9] In 2001, GeneTree was acquired by Sorenson Molecular Genealogy Foundation (SMGF),[10] which provided free Y-chromosome and mitochondrial DNA (mtDNA) tests.[11] GeneTree later returned to genetic testing in conjunction with its Sorenson parent company until it was acquired by Ancestry.com in 2012.[12]
In 2007, 23andMe was the first company to offer saliva-based direct-to-consumer testing,[13] and the first to use autosomal DNA for ancestry testing.[14][15] An autosome is any of the 22 chromosomes other than the X and Y chromosomes. Autosomes are transmitted from all ancestors in recent generations and so can be used to match with other testers who may be related. Companies were later also able to use this data to estimate how much of each ethnicity a customer has. FamilyTreeDNA entered this market in 2010, followed by AncestryDNA in 2012, and the number of tests grew rapidly. By 2018 autosomal testing had become the predominant type of test, and for many companies the only test they offered.[16]
MyHeritage launched its testing service in 2016, allowing users to use cheek swabs to collect samples,[17] and introduced new analysis tools in 2019: autoclusters (grouping matches visually into clusters)[18] and family tree theories (suggesting conceivable relations between DNA matches by combining several MyHeritage trees and the Geni global family tree).[19] Living DNA, founded in 2015, uses SNP chips to provide reports on autosomal ancestry, Y, and mtDNA ancestry.[20][21]
By 2019, the combined total of customers at the four largest companies was 26 million.[22][23][14][15] By August 2019, it was reported that about 30 million people had had their DNA tested for genealogical purposes.[24][22]
GEDmatch said in 2018 that about half of their one million profiles were American.[25] Because DNA testers are concentrated in a limited number of geographical regions, the databases and their results reveal little about the variation present in other racial groups. This can only be remedied by testing more individuals from currently underrepresented groups, making geneticists aware of the genetic variation present in them.
The publication of The Seven Daughters of Eve by Sykes in 2001, which described the seven major haplogroups of European ancestors, helped push personal ancestry testing through DNA tests into wide public notice. With the growing availability and affordability of genealogical DNA testing, genetic genealogy as a field grew rapidly. By 2003, the field of DNA testing of surnames was declared officially to have "arrived" in an article by Jobling and Tyler-Smith in Nature Reviews Genetics.[26] The number of firms offering tests, and the number of consumers ordering them, rose dramatically.[27] In 2018, a paper in Science Magazine estimated that a DNA genealogy search on anybody of European descent would result in a third cousin or closer match 60% of the time.[28]
The original Genographic Project was a five-year research study launched in 2005 by the National Geographic Society and IBM, in partnership with the University of Arizona and Family Tree DNA. Its goals were primarily anthropological. The project announced that by April 2010 it had sold more than 350,000 of its public participation testing kits, which test the general public for either twelve STR markers on the Y chromosome or mutations on the HVR1 region of the mtDNA.[29]
The phase of the project in 2016 was Geno 2.0 Next Generation.[30] As of 2018, almost one million participants in over 140 countries had joined the project.[31]
Genetic genealogy has enabled groups of people to trace their ancestry even though they are not able to use conventional genealogical techniques. This may be because they do not know one or both of their birth parents or because conventional genealogical records have been lost, destroyed or never existed. These groups include adoptees, foundlings, Holocaust survivors, GI babies, child migrants, descendants of children from orphan trains and people with slave ancestry.[32][33]
The earliest test takers were most often customers who started with a Y-chromosome test to determine their father's paternal ancestry. These men often took part in surname projects. The first phase of the Genographic Project brought new participants into genetic genealogy. Those who tested were as likely to be interested in their direct maternal heritage as in their paternal heritage, and the number of those taking mtDNA tests increased. The introduction of autosomal SNP tests based on microarray chip technology changed the demographics: women became as likely as men to test themselves.
Members of the genetic genealogy community have been credited with making useful contributions to knowledge in the field, an example of citizen science.[34]
One of the earliest interest groups to emerge was the International Society of Genetic Genealogy (ISOGG). Their stated goal is to promote DNA testing for genealogy.[35] Members advocate the use of genetics in genealogical research and the group facilitates networking among genetic genealogists.[36] Since 2006 ISOGG has maintained the regularly updated ISOGG Y-chromosome phylogenetic tree.[36][37] ISOGG aims to keep the tree as up-to-date as possible, incorporating new SNPs.[38] However, academics have described the tree as one of the most up-to-date, if not completely academically verified, phylogenetic trees of Y chromosome haplogroups.[39]
Mitochondrial DNA (mtDNA) testing involves sequencing at least part of the mitochondrial genome. Mitochondrial DNA is transmitted from mother to child, and so can reveal information about the unbroken maternal line. When two individuals have matching or near-matching mitochondrial DNA, it can be inferred that they share a common maternal-line ancestor at some point in the "recent" past.[40] Care should be taken, however, to avoid overstating the recency of a relationship, as a mutation in the mitochondrial genome will only occur every 1000 to 3000 years on average.[41] For this reason, it is usually impossible to distinguish between two individuals related within the last one or two millennia on the basis of mtDNA alone.
Y-chromosome DNA (Y-DNA) testing involves short tandem repeat (STR) and, sometimes, single nucleotide polymorphism (SNP) testing of the Y chromosome, which is present only in males and reveals information only on the unbroken paternal line. As with mitochondrial DNA, close matches with individuals indicate a recent common ancestor. However, because a permanent SNP mutation occurs much more frequently on the Y chromosome than in mitochondrial DNA, male lineages are much more temporally resolved, with the average lineage producing a new permanent, unique mutation every 83 years.[42] Because surnames in many cultures are transmitted down the paternal line, this testing is often used by surname DNA projects.[43]
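The difference in temporal resolution between the two tests follows from simple arithmetic on the mutation intervals quoted above. The sketch below is only illustrative: it assumes mutations accumulate at a constant average rate on a single lineage, and the function name and the 500-year span are hypothetical choices, not part of any testing company's method.

```python
# Average interval (in years) between new mutations on a single lineage,
# taken from the figures quoted in the text above.
Y_SNP_INTERVAL_YEARS = 83
MTDNA_INTERVAL_YEARS = (1000, 3000)  # quoted range for the mitochondrial genome


def expected_mutations(span_years: float, interval_years: float) -> float:
    """Expected number of mutations on one lineage over span_years.

    Assumes a constant average rate, which is a simplification.
    """
    if interval_years <= 0:
        raise ValueError("interval_years must be positive")
    return span_years / interval_years


if __name__ == "__main__":
    span = 500  # hypothetical time back to a common ancestor, in years
    y = expected_mutations(span, Y_SNP_INTERVAL_YEARS)
    mt_low = expected_mutations(span, MTDNA_INTERVAL_YEARS[1])
    mt_high = expected_mutations(span, MTDNA_INTERVAL_YEARS[0])
    print(f"Y-DNA SNPs per lineage over {span} years: {y:.1f}")
    print(f"mtDNA mutations per lineage over {span} years: {mt_low:.2f} to {mt_high:.2f}")
```

Under these assumptions a paternal lineage would be expected to pick up several distinguishing SNPs over a few centuries, while a maternal lineage would usually pick up none, which is why mtDNA alone rarely separates relatives within the last one or two millennia.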
While early studies using STRs made bold claims that large numbers of men descend from prominent historical individuals (e.g. Niall of the Nine Hostages and Genghis Khan), more recent SNP studies have shown many of these to be invalid. In particular, STR mutations are now known to be largely unreliable in proving kinship, as these mutations can appear in multiple unrelated lineages by chance. SNP testing is necessary to prove a true relationship, as these mutations are considered so rare that they could only have arisen in one individual in history. In the few cases where the same SNP mutation occurs in different lineages, the accompanying SNPs ensure its recognition as a de novo mutation. Even so, studies based ostensibly on SNP mutations can still be misleading, as in the case of Fehér (2024),[44] which presented few if any results from individuals with verified patrilines, and associated kin-groups with various SNP mutations that predate their formation by hundreds or thousands of years.
To prove descent from a common ancestor in the male line, a Y-DNA clade generally requires triangulation back to a most recent common ancestor (MRCA), who is generally referred to by the name of the mutation (e.g. L21, U106, etc.) as a shorthand. A SNP mutation unique to a family or kin group is referred to as a "defining mutation", the testing of which can exclude men not related through the male line within one or two centuries at the most. This has been exploited in recent times to identify the defining mutations of noble and royal lineages, such as the Stewarts of Scotland[45] and the Uí Briúin dynasty of Ireland.[46]
Pedigree family trees have traditionally been prepared from recollections of individuals about their parents and grandparents. These family trees may be extended if recollections of earlier generations were preserved through oral tradition or written documents. Some genealogists regard oral tradition as myths unless confirmed[47] with written documentation like birth certificates, marriage certificates, census reports, headstones, or notes in family bibles.[48] Few written records are kept by illiterate populations, and many documents have been destroyed by warfare or natural disasters. DNA comparison may offer an alternative means of confirming family relationships of biological parents, but may be confused by adoption or when a mother conceals the identity of the father of her child.[49]
While mitochondrial and Y-chromosome DNA matching offer the most definitive confirmation of ancestral relationships, the information from a tested individual is relevant to a decreasing fraction of their ancestors from earlier generations. Potential ambiguity must be considered when seeking confirmation from comparison of autosomal DNA. The first source of ambiguity arises from the underlying similarity of every individual's DNA sequence. Many short gene segments will be identical by coincidental recombination (Identical by State: IBS) rather than inheritance from a single ancestor (Identical by Descent: IBD). Segments of greater length offer increased confidence of a shared ancestor. A second source of ambiguity results from the random distribution of genes to each child of a parent. Only identical twins inherit exactly the same gene segments. Although a child inherits exactly half of their DNA from each parent, the percentage inherited from any given ancestor in an earlier generation (with the exception of X chromosome DNA) varies within a normal distribution around a median value of 100% divided by the number of ancestors in that generation. An individual comparing autosomal DNA with ancestors of successively earlier generations will encounter an increasing number of ancestors from whom they inherited no DNA segments of significant length. Since individuals inherit only a small portion of their DNA from each of their great-grandparents, cousins descended from the same ancestor may not inherit the same DNA segments from that ancestor. All descendants of the same parent or grandparent, and nearly all descendants of the same great-grandparent, will share gene segments of significant length; but approximately 10% of 3rd cousins, 55% of 4th cousins, 85% of 5th cousins, and more than 95% of more distant cousins will share no gene segments of significant length. Failure to share a gene segment of significant length does not disprove the shared ancestry of a distant cousin.[50]
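The arithmetic in the paragraph above can be sketched in a few lines. This is a minimal illustration, not a model used by any testing company: the function names are hypothetical, the ancestor count assumes no pedigree collapse (no ancestor appearing more than once in the tree), and the cousin figures are simply the approximate percentages quoted in the text.

```python
# Approximate fraction of cousin pairs sharing NO segment of significant
# length, copied from the figures quoted in the text (3rd to 5th cousins).
NO_SHARED_SEGMENT = {3: 0.10, 4: 0.55, 5: 0.85}


def ancestors_in_generation(generations_back: int) -> int:
    """Ancestor slots n generations back (1 = parents, 2 = grandparents).

    Assumes every slot is a distinct person (no pedigree collapse).
    """
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 2 ** generations_back


def median_share_percent(generations_back: int) -> float:
    """Median autosomal share from one ancestor: 100% / number of ancestors."""
    return 100.0 / ancestors_in_generation(generations_back)


def chance_of_shared_segment(cousin_degree: int) -> float:
    """Fraction of cousin pairs expected to share a significant segment."""
    return 1.0 - NO_SHARED_SEGMENT[cousin_degree]


if __name__ == "__main__":
    for g in range(1, 6):
        print(f"{g} generations back: {ancestors_in_generation(g)} ancestors, "
              f"median share {median_share_percent(g):.3f}%")
    for degree in sorted(NO_SHARED_SEGMENT):
        print(f"cousin degree {degree}: "
              f"{chance_of_shared_segment(degree):.0%} share a significant segment")
```

The actual share from any one ancestor beyond the parents varies around these median values, so the numbers are a guide to what is typical rather than what a particular comparison will show.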
The best autosomal DNA method for confirming ancestry is to compare DNA with known relatives. A more complicated task is using a DNA database to identify previously unknown individuals who share DNA with the individual of interest, and then attempting to find shared ancestors with those individuals.[51] The first problem with the latter procedure involves the relatively poor family history knowledge of most database populations. A significant percentage of individuals in many DNA databases have done DNA testing because they are uncertain of their parentage, and many who confidently identify their parents are unable or unwilling to share information about earlier generations. It may be easier to identify a shared ancestor in the fortunate situation of shared DNA between two individuals with comprehensive family trees, but finding multiple shared ancestors raises the question of which of those ancestors the shared segment was inherited from. Resolving that ambiguity typically requires finding a third individual sharing both the ancestor and the gene segment of interest.[52]
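The idea of using a third individual who shares the same segment can be illustrated with a simple interval check. The sketch below is an assumption-laden simplification: the `Segment` type, the function names, and the base-pair coordinates are hypothetical, and real tools generally measure segments in centimorgans and account for which parental copy of the chromosome a match lies on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Segment:
    """A matching DNA segment between two testers (hypothetical layout)."""
    chromosome: int
    start: int  # start position in base pairs, inclusive
    end: int    # end position in base pairs, exclusive


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region common to both segments, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    if start >= end:
        return None
    return Segment(a.chromosome, start, end)


def triangulated_region(ab: Segment, ac: Segment, bc: Segment,
                        min_length: int = 0) -> Optional[Segment]:
    """Region on which A matches B, A matches C, and B matches C.

    Returns None when the three pairwise matches do not all overlap, or
    when the common region is shorter than min_length base pairs.
    """
    first = overlap(ab, ac)
    if first is None:
        return None
    common = overlap(first, bc)
    if common is None or common.end - common.start < min_length:
        return None
    return common


if __name__ == "__main__":
    # Hypothetical pairwise matches among testers A, B and C on chromosome 7.
    ab = Segment(7, 10_000_000, 40_000_000)
    ac = Segment(7, 25_000_000, 55_000_000)
    bc = Segment(7, 20_000_000, 45_000_000)
    print(triangulated_region(ab, ac, bc))
```

A non-empty result only suggests that all three testers may have inherited the region from one shared ancestor; it still has to be reconciled with their documented family trees.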
A common component of many autosomal tests is a prediction of biogeographical origin, often called ethnicity. A company offering the test uses computer algorithms and calculations to make a prediction of what percentage of an individual's DNA comes from particular ancestral groups. A typical number of populations is at least 20. Despite this aspect of the tests being heavily promoted and advertised, many genetic genealogists have warned consumers that the results may be inaccurate, and at best are only approximate.[53]
Modern DNA sequencing has identified various ancestral components in contemporary populations. A number of these genetic elements have West Eurasian origins. They include the following ancestral components, with their geographical hubs and main associated populations:
# | West Eurasian component | Geographical hub | Peak population | Notes |
---|---|---|---|---|
1 | Ancestral North Indian | Bangladesh, North India, Pakistan | Bangladeshis, North Indians, Pakistanis | Main West Eurasian component in the Indian subcontinent. Peaks among Indo-European-speaking caste populations in the northern areas, but also found at significant frequencies among some Dravidian-speaking caste groups. Associated with either the arrival of Indo-European speakers from West Asia or Central Asia between 3,000 and 4,000 years before present (ybp), or with the spread of agriculture and West Asian crops beginning around 8,000-9,000 ybp, or with migrations from West Asia in the pre-agricultural period. Contrasted with the indigenous Ancestral South Indian component, which peaks among the Onge Andamanese inhabiting the Andaman Islands.[54][55] |
2 | Arabian | Arabian Peninsula | Yemenis, Saudis, Qataris, Bedouins | Main West Eurasian component in the Persian Gulf region. Most closely associated with local Arabic, Semitic-speaking populations.[56] Also found at significant frequencies in parts of the Levant, Egypt and Libya.[56][57] |
3 | Coptic | Nile Valley | Copts, Beja, Afro-Asiatic Ethiopians, Sudanese Arabs, Nubians | Main West Eurasian component in Northeast Africa.[58] Roughly equivalent with the Ethio-Somali component.[58][59] Peaks among Egyptian Copts in Sudan. Also found at high frequencies among other Afro-Asiatic (Hamito-Semitic) speakers in Ethiopia and Sudan, as well as among many Nubians. Associated with Ancient Egyptian ancestry, without the later Arabian influence present among modern Egyptians. Contrasted with the indigenous Nilo-Saharan component, which peaks among Nilo-Saharan- and Kordofanian-speaking populations inhabiting the southern part of the Nile Valley.[58] |
4 | Ethio-Somali | Horn of Africa | Somalis, Afars, Amhara, Oromos, Tigrinya | Main West Eurasian component in the Horn.[59] Roughly equivalent with the Coptic component.[58][59] Associated with the arrival of Afro-Asiatic speakers in the region during antiquity. Peaks among Cushitic- and Ethiopian Semitic-speaking populations in the northern areas. Diverged from the Maghrebi component around 23,000 ybp, and from the Arabian component about 25,000 ybp. Contrasted with the indigenous Omotic component, which peaks among the Omotic-speaking Ari ironworkers inhabiting southern Ethiopia.[59] |
5 | European | Europe | Europeans | Main West Eurasian component in Europe. Also found at significant frequencies in adjacent geographical areas outside of the continent, in Anatolia, the Caucasus, the Iranian plateau, and parts of the Levant.[56] |
6 | Levantine | Near East, Caucasus | Druze, Lebanese, Cypriots, Syrians, Jordanians, Palestinians, Armenians, Georgians, Sephardic Jews, Ashkenazi Jews, Iranians, Turks, Sardinians, Adygei | Main West Eurasian component in the Near East and Caucasus. Peaks among Druze populations in the Levant. Found amongst local Afro-Asiatic, Indo-European, Caucasus and Turkish speakers alike. Diverged from the European component around 9,100-15,900 ybp, and from the Arabian component about 15,500-23,700 ybp. Also found at significant frequencies in Southern Europe as well as parts of the Arabian Peninsula.[56] |
7 | Maghrebi | Northwest Africa | Berbers, Maghrebis, Sahrawis, Tuareg | Main West Eurasian component in the Maghreb. Peaks among the Berber (non-Arabized) populations in the region.[57] Diverged from the Ethio-Somali/Coptic, Arabian, Levantine and European components prior to the Holocene.[57][59] |
Genealogical DNA testing methods have been used on a longer time scale to trace human migratory patterns. For example, they have been used to estimate when the first humans came to North America and what path they followed.
For several years, researchers and laboratories from around the world sampled indigenous populations from around the globe in an effort to map historical human migration patterns. The National Geographic Society's Genographic Project aims to map historical human migration patterns by collecting and analyzing DNA samples from over 100,000 people across five continents. The DNA Clans Genetic Ancestry Analysis measures a person's precise genetic connections to indigenous ethnic groups from around the world.[60]
Law enforcement may use genetic genealogy to track down perpetrators of violent crimes such as murder or sexual assault, and may also use it to identify deceased individuals. Initially, the genetic genealogy sites GEDmatch and Family Tree DNA allowed their databases to be used by law enforcement and DNA technology companies[61][62] to do DNA testing for violent criminal cases and genetic genealogy research at the request of law enforcement. This investigative, or forensic, genetic genealogy technique became popular after the arrest of the alleged Golden State Killer in 2018,[63] but has received significant backlash from privacy experts.[64][65] In May 2019, GEDmatch made their privacy rules more restrictive, thereby reducing the incentive for law enforcement agencies to use their site.[66][67] Other sites such as Ancestry.com, 23andMe and MyHeritage have data policies stating that they would not allow their customer data to be used for crime solving without a warrant from law enforcement, as they believed it violated users' privacy.[68][69]
The [DNA] test results show a genetic link between the Jefferson and Hemings descendants: A man with the Jefferson Y chromosome fathered Eston Hemings (born 1808). While there were other adult males with the Jefferson Y chromosome living in Virginia at that time, most historians now believe that the documentary and genetic evidence, considered together, strongly support the conclusion that [Thomas] Jefferson was the father of Sally Hemings's children.
Years of researching his family tree through records and documents revealed roots in Argentina, but he ran out of leads looking for his maternal great-grandfather. After hearing about new genetic testing at the University of Arizona, he persuaded a scientist there to test DNA samples from a known cousin in California and a suspected distant cousin in Buenos Aires. It was a match. But the real find was the idea for Family Tree DNA, which the former film salesman launched in early 2000 to provide the same kind of service for others searching for their ancestors.
Businessman Bennett Greenspan hoped that the approach used in the Jefferson and Cohen research would help family historians. After reaching a brick wall on his mother's surname, Nitz, he discovered an Argentine researching the same surname. Greenspan enlisted the help of a male Nitz cousin. A scientist involved in the original Cohen investigation tested the Argentine's and Greenspan's cousin's Y chromosomes. Their haplotypes matched perfectly.
A real estate developer and entrepreneur, Greenspan has been interested in genealogy since his preteen days.
Greenspan, born and raised in Omaha, Nebraska, has been interested in genealogy from a very young age; he drew his first family tree at age 11.
The growth of interest in genetic genealogy has inspired a group of individuals outside the academic area who are passionate about the subject and who have an impressive grasp of the research issues. Two focal points for this group are the International Society of Genetic Genealogy and the Journal of Genetic Genealogy. The ISOGG is a non-profit, non-commercial organization that provides resources and maintains one of the most up-to-date, if not completely academically verified, phylogenetic trees of Y chromosome haplogroups.
Meanwhile, new SNPs are being announced or published almost every month. ISOGG's role will be to maintain a tree that is as up-to-date as possible, allowing us to see where each new SNP fits in.
Early book on adoptions, paternity and other relationship testing. Carmichael is a founder of GeneTree.
Jennifer Beamish (producer); Clive Maltby (director); Spencer Wells (host) (2003). The Journey of Man (DVD). Alexandria, VA: PBS Home Video. ASIN B0000AYL48. ISBN 978-0-7936-9625-3. OCLC 924430061.