Author manuscript; available in PMC: 2014 Jul 21.

Published in final edited form as: Nature. 2013 Nov 20;505(7481):87–91. doi:10.1038/nature12736

Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans

Maanasa Raghavan^1,*, Pontus Skoglund^2,*, Kelly E Graf³, Mait Metspalu^4,5,6, Anders Albrechtsen⁷, Ida Moltke^7,8, Simon Rasmussen⁹, Thomas W Stafford Jr^1,10, Ludovic Orlando¹, Ene Metspalu⁶, Monika Karmin^4,6, Kristiina Tambets⁴, Siiri Rootsi⁴, Reedik Mägi¹¹, Paula F Campos¹, Elena Balanovska¹², Oleg Balanovsky^12,13, Elza Khusnutdinova^14,15, Sergey Litvinov^4,14, Ludmila P Osipova¹⁶, Sardana A Fedorova¹⁷, Mikhail I Voevoda^16,18, Michael DeGiorgio⁵, Thomas Sicheritz-Ponten^9,19, Søren Brunak^9,19, Svetlana Demeshchenko²⁰, Toomas Kivisild^4,21, Richard Villems^4,6,22, Rasmus Nielsen⁵, Mattias Jakobsson^2,23, Eske Willerslev¹

¹Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5–7, 1350 Copenhagen, Denmark

²Department of Evolutionary Biology, Uppsala University, Norbyvägen 18D, Uppsala 752 36, Sweden

³Center for the Study of the First Americans, Texas A&M University, TAMU-4352, College Station, Texas 77845-4352, USA

⁴Estonian Biocentre, Evolutionary Biology group, Tartu 51010, Estonia

⁵Department of Integrative Biology, University of California, Berkeley, California 94720, USA

⁶Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia

⁷The Bioinformatics Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, Copenhagen 2200, Denmark

⁸Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA

⁹Center for Biological Sequence Analysis, Technical University of Denmark, Kongens Lyngby 2800, Denmark

¹⁰AMS ¹⁴C Dating Centre, Department of Physics and Astronomy, University of Aarhus, Ny Munkegade 120, Aarhus DK-8000, Denmark

¹¹Estonian Genome Center, University of Tartu, Tartu 51010, Estonia

¹²Research Centre for Medical Genetics, Russian Academy of Medical Sciences, Moskvorechie Street 1, Moscow 115479, Russia

¹³Vavilov Institute of General Genetics, Russian Academy of Sciences, Gubkina Street 3, Moscow 119991, Russia

¹⁴Institute of Biochemistry and Genetics, Ufa Scientific Centre, Russian Academy of Sciences, Ufa, Bashkorostan 450054, Russia

¹⁵Biology Department, Bashkir State University, Ufa, Bashkorostan 450074, Russia

¹⁶The Institute of Cytology and Genetics, Center for Brain Neurobiology and Neurogenetics, Siberian Branch of the Russian Academy of Sciences, Lavrentyeva Avenue, Novosibirsk 630090, Russia

¹⁷Department of Molecular Genetics, Yakut Research Center of Complex Medical Problems, Russian Academy of Medical Sciences and North-Eastern Federal University, Yakutsk, Sakha (Yakutia) 677010, Russia

¹⁸Institute of Internal Medicine, Siberian Branch of the Russian Academy of Medical Sciences, Borisa Bogatkova 175/1, Novosibirsk 630089, Russia

¹⁹Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark

²⁰The State Hermitage Museum, 2, Dvortsovaya Ploshchad, St. Petersburg 190000, Russia

²¹Department of Biological Anthropology, University of Cambridge, Cambridge CB2 1QH, UK

²²Estonian Academy of Sciences, Tallinn 10130, Estonia

²³Science for Life Laboratory, Uppsala University, Norbyvägen 18D, 752 36 Uppsala, Sweden

Correspondence and requests for materials should be addressed to E.W. (ewillerslev@snm.ku.dk)

*These authors contributed equally to this work.

Issue date 2014 Jan 2.

PMCID: PMC4105016 NIHMSID: NIHMS583477 PMID: 24256729

Abstract

The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians^1–3, there is no consensus with regard to which specific Old World populations they are closest to^4–8. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal’ta in south-central Siberia⁹, to an average depth of 1X. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic and Mesolithic European hunter-gatherers^10–12, and the Y chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most Native American lineages⁵. Similarly, we find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically closely related to modern-day Native Americans, with no close affinity to east Asians. This suggests that populations related to contemporary western Eurasians had a more north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we estimate that 14 to 38% of Native American ancestry may originate through gene flow from this ancient population. This is likely to have occurred after the divergence of Native American ancestors from east Asian ancestors, but before the diversification of Native American populations in the New World. Gene flow from the MA-1 lineage into Native American ancestors could explain why several crania from the First Americans have been reported as bearing morphological characteristics that do not resemble those of east Asians^2,13. Sequencing of another south-central Siberian, Afontova Gora-2 dating to approximately 17,000 years ago¹⁴, revealed similar autosomal genetic signatures as MA-1, suggesting that the region was continuously occupied by humans throughout the Last Glacial Maximum. Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.

In 2009 we visited Hermitage State Museum in St. Petersburg, Russia, and sampled skeletal remains of a juvenile individual (MA-1) from the Mal’ta Upper Palaeolithic site in south-central Siberia. Mal’ta, located along the Belaya River near Lake Baikal, was excavated between 1928 and 1958 (ref. 9) and yielded a plethora of archaeological finds including 30 anthropomorphic Venus figurines, which are rare for Siberia but found at a number of Upper Palaeolithic sites across western Eurasia^15–17 (Fig. 1a and Supplementary Information, section 1). Accelerator mass spectrometry (AMS) ¹⁴C dating of MA-1 produced an age of 20,240 ± 60 ¹⁴C years before present or 24,423–23,891 calendar years before present (cal. BP) (Supplementary Information, section 2).

Sample locations and MA-1 genetic affinities. a, Geographical locations of Mal’ta and Afontova Gora-2 in south-central Siberia. For reference, Palaeolithic sites with individuals belonging to mtDNA haplogroup U are shown (red and black triangles): 1, Oberkassel; 2, Hohle Fels; 3, Dolni Vestonice; 4, Kostenki-14. A Palaeolithic site with an individual belonging to mtDNA haplogroup B is represented by the square: 5, Tianyuan Cave. Notable Palaeolithic sites with Venus figurines are marked by brown circles: 6, Laussel; 7, Lespugue; 8, Grimaldi; 9, Willendorf; 10, Gargarino. Other notable Palaeolithic sites are shown by grey circles: 11, Sungir; 12, Yana RHS. b, PCA (PC1 versus PC2) of MA-1 and worldwide human populations for which genomic tracts from recent European admixture in American and Siberian populations have been excluded¹⁹. c, Heat map of the statistic f₃(Yoruba; MA-1, X) where X is one of 147 worldwide non-African populations (standard errors shown in Supplementary Fig. 21). The graded heat key represents the magnitude of the computed f₃ statistics.

DNA from 0.15 g of bone from MA-1 was sequenced to an average depth of 1X (Supplementary Information, section 3). From one library (referred to as MA-1_1st extraction in Supplementary Information, section 3.1), approximately 17% of the total reads generated mapped uniquely to the human genome, in agreement with good DNA preservation (see Supplementary Information Table 2). Low contamination rates were inferred for both mitochondrial DNA (mtDNA) (1.1%) and the X chromosome (1.6 to 2%; MA-1 is male) (Supplementary Information, section 5). The overall error rate for the data set was estimated to be 0.27%, with the most dominant errors being transitions typical of ancient DNA damage deriving from post-mortem deamination of cytosine¹⁸ (Supplementary Information, section 6.1).

Phylogenetic analysis of the MA-1 mtDNA genome (76.6X) places it within mtDNA haplogroup U without affiliation to any known subclades, implying a lineage that is rare or extinct in sampled modern populations (Supplementary Information, section 7 and Supplementary Fig. 4a). Present-day distribution of haplogroup U encompasses a large area including North Africa, the Middle East, south and central Asia, western Siberia and Europe (Supplementary Fig. 4b), although it is rare or absent east of the Altai Mountains; that is, in populations living in the region surrounding Mal’ta. Haplogroup U has also been found at high frequency (>80%) in ancient hunter-gatherers from Upper Palaeolithic and Mesolithic Europe^10–12. Our result therefore suggests a connection between pre-agricultural Europe and Upper Palaeolithic Siberia. The Y chromosome of MA-1 was sequenced to an average depth of 1.5X, with coverage across 5.8 million bases. Acknowledging the low depth of coverage, we determined the most likely phylogenetic affiliation of the MA-1 Y chromosome to be a basal lineage of haplogroup R (Supplementary Information, section 8 and Supplementary Fig. 5a). The extant sub-lineages of haplogroup R show regional spread patterns within western Eurasia, south Asia and also extend to the Altai region in southern Siberia (Supplementary Fig. 5b). The sister lineage to these extant sub-lineages of haplogroup R, haplogroup Q, is the most common haplogroup in Native Americans⁵ and it was recently shown that, in Eurasia, haplogroup Q lineages closest to Native Americans are found in southern Altai⁷.

To get an overview of the genomic signature of MA-1, we conducted principal component analysis (PCA) using a large data set from worldwide human populations for which genomic tracts of recent European admixture in American and Siberian populations have been excluded¹⁹ (Supplementary Information, section 10). In the first two principal components, MA-1 is intermediate between modern western Eurasians and Native Americans, but distant from east Asians (Fig. 1b). To investigate the relationship of MA-1 to global human populations in further detail, we used the f-statistics framework²⁰ to compute an ‘outgroup’ f₃-statistic, which is expected to be proportional to the amount of shared genetic history between MA-1 and each of 147 non-African populations from a large worldwide human single-nucleotide polymorphism (SNP) array data set (see Supplementary Information, section 14.2 for details on the f₃-statistics). We find that genetic affinity to MA-1 is greatest in two regions: first, the Americas; and second, northeast Europe and northwest Siberia, with north-to-south latitudinal clines in shared drift with MA-1 in both Europe and Asia (Fig. 1c and Supplementary Figs 21 and 22). Notably, the lack of genetic affinity between MA-1 and most populations in south-central Siberia today suggests that there was substantial gene flow into the region after the Last Glacial Maximum (LGM), most probably from east Asian sources (Supplementary Information, section 9.1.3).
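
As a rough illustration of the outgroup f₃-statistic described above, the sketch below averages the product (o − a)(o − b) over SNPs, where o, a and b are allele frequencies in the outgroup and the two test populations. The function name and the toy frequencies are hypothetical, and the finite-sample correction used in practice is omitted, so this is a minimal sketch rather than the procedure used in the study.

```python
import numpy as np

def outgroup_f3(p_out, p_a, p_b):
    """Uncorrected f3(Outgroup; A, B): mean of (o - a)(o - b) across SNPs.

    Larger values indicate more drift shared by A and B after their
    split from the outgroup.
    """
    p_out, p_a, p_b = (np.asarray(p, dtype=float) for p in (p_out, p_a, p_b))
    return float(np.mean((p_out - p_a) * (p_out - p_b)))

# Toy allele frequencies at four SNPs (illustrative values only).
yoruba = [0.5, 0.5, 0.5, 0.5]
ma1 = [1.0, 0.0, 1.0, 0.0]        # haploid calls, so frequencies are 0 or 1
pop_close = [0.9, 0.1, 0.9, 0.1]  # tracks MA-1 closely
pop_far = [0.6, 0.4, 0.5, 0.5]    # tracks MA-1 only weakly

f3_close = outgroup_f3(yoruba, ma1, pop_close)
f3_far = outgroup_f3(yoruba, ma1, pop_far)
```

In this toy case the population whose frequencies track MA-1 more closely yields the larger statistic, which is the ordering that a heat map of f₃ values summarizes.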

We reconstructed admixture graphs using TreeMix²¹ to relate the population history of MA-1 to 11 modern genomes from worldwide populations²², 4 new genomes from Eurasia (Mari, Avar, Indian and Tajik ancestry) and the Denisova genome²² (Supplementary Information, section 11). The maximum-likelihood population tree inferred without admixture events places MA-1 on a branch that is basal to western Eurasians (Supplementary Fig. 12). However, a significant residual was observed between the empirical covariance for MA-1 and Karitiana, a Native American population, and the covariance predicted by the tree model (Supplementary Fig. 12). Consequently, gene flow between these lineages was inferred in all graphs incorporating two or more migration events (Fig. 2 and Supplementary Fig. 13). Bootstrap support for the migration edge from MA-1 to Karitiana, rather than from Karitiana to MA-1, was 99% in this analysis.

Admixture graph for MA-1 and 16 complete genomes. An admixture graph with two migration edges (depicted by arrows) was fitted using TreeMix²¹ to relate MA-1 to 11 modern genomes from worldwide populations²², 4 modern genomes produced in this study (Avar, Mari, Indian and Tajik), and the Denisova genome²². Trees without migration, graphs with different number of migration edges, and residual matrices are shown in Supplementary Information, section 11. The drift parameter is proportional to 2N_e generations, where N_e is the effective population size. The migration weight represents the fraction of ancestry derived from the migration edge. The scale bar shows ten times the average standard error (s.e.) of the entries in the sample covariance matrix. Note that the length of the branch leading to MA-1 is affected by this ancient genome being represented by haploid genotypes.

We investigated further the population history of MA-1 by conducting sequence read-based D-statistic tests²³ on proposed tree-like histories comprising MA-1 and combinations of 11 modern genomes (Supplementary Information, section 13). In agreement with the TreeMix results, these tests reject the tree ((X, Han), MA-1) where X represents Avar, French, Indian, Mari, Sardinian and Tajik, consistent with the MA-1 lineage sharing more recent ancestry with the western Eurasian branch after the split of Europeans and east Asians (Supplementary Table 13). This result also holds true when the Han Chinese is replaced with Dai, another east Asian population (Supplementary Table 13). Notably, we can also reject the tree ((Han, Karitiana), MA-1) (Z = 10.8), suggesting gene flow between MA-1 and ancestral Native Americans, in accordance with the admixture graphs (Supplementary Table 13). This result is consistent with allele frequency-based D-statistic tests²⁰ on SNP arrays for 48 Native American populations of entirely First American ancestry¹⁹, indicating that all tested populations are equally related to MA-1 and that the admixture event occurred before the population diversification of the First American gene pool (Fig. 3a, Supplementary Information, section 14.4 and Supplementary Fig. 24).

Evidence of gene flow from a population related to MA-1 and western Eurasians into Native American ancestors. Allele frequency-based D-statistic tests²⁰ of the following forms. a, D (Yoruba, MA-1; Han, X), where X represents modern-day populations from North and South America. The D-statistic is significantly positive for all the tests, providing evidence for gene flow between Native American ancestors and the MA-1 population lineage; however, it is not informative with respect to the direction of gene flow. b, D (Yoruba, X; Han, Karitiana), where X represents non-African populations. Since all of the 17 tested western Eurasian populations are closer to Karitiana than to Han Chinese, the most parsimonious explanation is that Native Americans have western Eurasian-related ancestry. c, D (Sardinian, X; Papuan, Han), where X represents non-African populations. MA-1 is not significantly closer to Han Chinese than to Papuans, which is compatible with MA-1 having no Native American-related admixture in its ancestry. Thick and thin error bars correspond to 1 and 3 standard errors of the D-statistic, respectively.
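
To make the allele frequency-based D-statistic more concrete, the following sketch computes D(P1, P2; P3, P4) as the sum of (p1 − p2)(p3 − p4) over SNPs divided by the sum of (p1 + p2 − 2p1p2)(p3 + p4 − 2p3p4), and obtains a Z-score from a delete-one-block jackknife. It is a minimal sketch under stated assumptions: the function names and simulated frequencies are hypothetical, blocks are equal-sized runs of SNPs rather than genomic windows, sign conventions differ between implementations, and it is not the software used in the study.

```python
import numpy as np

def d_statistic(p1, p2, p3, p4):
    """D(P1, P2; P3, P4) from per-SNP allele frequencies."""
    p1, p2, p3, p4 = (np.asarray(p, dtype=float) for p in (p1, p2, p3, p4))
    numerator = np.sum((p1 - p2) * (p3 - p4))
    denominator = np.sum((p1 + p2 - 2 * p1 * p2) * (p3 + p4 - 2 * p3 * p4))
    return float(numerator / denominator)

def jackknife_z(p1, p2, p3, p4, n_blocks=10):
    """Return (D, standard error, Z) using a delete-one-block jackknife.

    Assumption: blocks are equal-sized runs of consecutive SNPs; real
    analyses usually define blocks by genomic distance.
    """
    freqs = np.array([p1, p2, p3, p4], dtype=float)
    n_snps = freqs.shape[1]
    estimates = []
    for block in np.array_split(np.arange(n_snps), n_blocks):
        keep = np.ones(n_snps, dtype=bool)
        keep[block] = False  # leave this block out
        estimates.append(d_statistic(*freqs[:, keep]))
    estimates = np.array(estimates)
    se = np.sqrt((n_blocks - 1) / n_blocks
                 * np.sum((estimates - estimates.mean()) ** 2))
    d = d_statistic(*freqs)
    return d, float(se), float(d / se)

# Simulated frequencies (illustrative only): population X carries 30%
# ancestry that tracks the MA-1 alleles and 70% that tracks Han.
rng = np.random.default_rng(0)
n = 2000
yoruba = rng.uniform(0, 1, n)
ma1 = rng.integers(0, 2, n).astype(float)
han = rng.uniform(0, 1, n)
x = 0.7 * han + 0.3 * ma1

d, se, z = jackknife_z(yoruba, ma1, han, x)
```

With this sign convention, D(Yoruba, MA-1; Han, X) is positive when MA-1 shares more alleles with X than with Han, which matches the direction of the tests reported in panel a.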

The genetic affinity between Native Americans and MA-1 could be explained by gene flow after the split between east Asians and Native Americans, either from the MA-1 lineage into Native American ancestors or from Native American ancestors to the ancestors of MA-1. However, MA-1, at approximately 24,000 cal. BP, pre-dates time estimates of the Native American–east Asian population divergence event^24,25. This presents little time for the formation of a diverged Native American gene pool that could have contributed ancestry to MA-1, suggesting gene flow from the MA-1 lineage into Native American ancestors. Such gene flow should also be detectable using modern-day western Eurasian populations in place of MA-1. Consistent with this, D-statistic tests estimated from outgroup-ascertained SNP data²⁰ reveal significant evidence (Z > 3) for Middle Eastern, European, central Asian and south Asian populations being closer to Karitiana than to Han Chinese²⁰ (Fig. 3b and Supplementary Information, section 14.5). Similar signals were also observed when we replaced modern-day Han Chinese with data from chromosome 21 from a 40,000-year-old east Asian individual (Tianyuan Cave, China), which has been found to be ancestral to modern-day Asians and Native Americans²⁶ (Supplementary Information, section 14.5). Thus, if the gene flow direction was from Native Americans into western Eurasians it would have had to spread subsequently to European, Middle Eastern, south Asian and central Asian populations, including MA-1 before 24,000 years ago. Moreover, as Native Americans are closer to Han Chinese than to Papuans (Fig. 3c), Native American-related gene flow into the ancestors of MA-1 is expected to result in MA-1 also being closer to Han Chinese than to Papuans. However, our results suggest that this is not the case (D (Papuan, Han; Sardinian, MA-1) = −0.002 ± 0.005 (Z = −0.36)), which is compatible with all or almost all of the gene flow being into Native Americans (Supplementary Information, section 14.6). Similar results are obtained when MA-1 is replaced with most modern-day western Eurasian populations, except populations with recent admixture from east Asia (Russian, Adygei and Burusho) and Africa (Middle Eastern populations) (Fig. 3c). The most parsimonious explanation for these results is that Native Americans have mixed origins, resulting from admixture between peoples related to modern-day east Asians and western Eurasians. Admixture graphs fitted with MixMapper²⁷ model Karitiana as having 14–38% western Eurasian ancestry and 62–86% east Asian ancestry, but we caution that these estimates assume unadmixed ancestral populations (Supplementary Information, section 12).

Importantly, in addition to the low contamination rates and rare or extinct uniparental lineages, we exclude modern DNA contamination as being the source of the observed population affinities of MA-1 for three reasons. First, we corrected the sequence read-based D-statistics tests for differing amounts of contamination, using a European individual as the contamination source (Supplementary Information, section 13.5). We find similar outcomes for corrected and uncorrected tests (Supplementary Fig. 20), even when contamination levels larger than that estimated for MA-1 are considered, confirming that our results are not affected by contamination from a European source. Second, restricting the PCA to sequences with evidence of post-mortem degradation gives results that are comparable with those using the complete data set (Supplementary Information, section 15). Finally, the genome sequence of the researcher (Indian ancestry) who carried out DNA extraction and library preparation of MA-1 enables us to exclude the researcher as a source of contamination (Supplementary Information, sections 11 and 13). In addition, we exclude post-Columbian European admixture (after 1492 AD) as an explanation for the genetic affinity between MA-1 and Native Americans for three reasons. First, for SNP array-based analyses, we take recent European admixture into account by using a data set masked for inferred admixed genomic regions¹⁹. Second, allele frequency-based D-statistic tests²⁰ show that all 48 tested modern-day populations with First American ancestry¹⁹ are equally related to MA-1 within the resolution of our data (Supplementary Information, section 14.4), which would not be expected if the signal was driven by recent European admixture. Third, MA-1 is closer to Native Americans than any of the 15 tested European populations (Supplementary Information, section 14.8).

Human dispersals in northeast Asia immediately before and after the LGM are most likely to have led to the settlement of Beringia, and ultimately the Americas²⁸. As MA-1 pre-dates the LGM, we investigated whether the genetic composition of southern Siberia changed during the LGM by generating a low-coverage data set (~0.1X) of a post-LGM individual from Afontova Gora-2 (AG-2) (ref. 14), located on the western bank of the Enisei River in south-central Siberia (Fig. 1a). We obtained a direct AMS ¹⁴C date of 13,810 ± 35 ¹⁴C years before present or 17,075–16,750 cal. BP for AG-2 (Supplementary Information, section 2). Despite substantial present-day DNA contamination in this sample (Supplementary Information, section 5), we find that AG-2 shows close similarity to the genetic profile of MA-1 on a PCA (Supplementary Information, section 15 and Supplementary Fig. 29) and is significantly closer to Karitiana than to Han (D(Yoruba, AG-2; Han, Karitiana) = 0.078 ± 0.004, Z = 19.9) (Supplementary Information, section 15). We observe consistent results when restricting analyses to sequences with evidence of post-mortem degradation (Supplementary Information, section 15 and Supplementary Fig. 29), implying that southern Siberia may have experienced genetic continuity through the environmentally harsh LGM.

Our study has four important implications. First, we find evidence that contemporary Native Americans and western Eurasians share ancestry through gene flow from a Siberian Upper Palaeolithic population into First Americans. Second, our findings may provide an explanation for the presence of mtDNA haplogroup X in Native Americans, which is related to western Eurasians but not found in east Asian populations²⁹. Third, such an easterly presence in Asia of a population related to contemporary western Eurasians provides a possibility that non-east Asian cranial characteristics of the First Americans¹³ derived from the Old World via migration through Beringia, rather than by a trans-Atlantic voyage from Iberia as proposed by the Solutrean hypothesis³⁰. Fourth, the presence of an ancient western Eurasian genomic signature in the Baikal area before and after the LGM suggests that parts of south-central Siberia were occupied by humans throughout the coldest stages of the last ice age.

METHODS

Samples

A humerus (MA-1) from Mal’ta and a humerus (AG-2) from Afontova Gora-2 were sampled at the Hermitage Museum, St. Petersburg, Russia in 2009 for ancient DNA analysis and accelerator mass spectrometry (AMS) ¹⁴C dating. In addition, four modern human samples (Avar, Mari, Tajik and Indian) were obtained for genome sequencing in accordance with informed consent requirements for human demographic studies. Ethical approval for genome sequencing of the above four modern samples was acquired from The National Committee on Health Research Ethics, Denmark (H-3-2012-FSP21).

Radiocarbon dating

AMS ¹⁴C dating was carried out on the two ancient bone samples following standard protocols^31,32 (Supplementary Information, section 2). Contemporary ¹⁴C standards included National Bureau of Standards Oxalic Acid-I and ANU sucrose. Respective chemistry and combustion backgrounds were determined by using >70,000-year-old collagen isolated from the fossil Eschrichtius robustus (grey whale)^32,33 and Sigma Aldrich L-Alanine (catalogue number A7627). The graphitized samples and standards were analysed at the University of California-Irvine WM Keck Carbon Cycle Accelerator Mass Spectrometry Laboratory (UCIAMS). The ¹⁴C dates were calibrated using OxCal 4.2 (ref. 34) and the INTCAL09 data set³⁵.

Genome sequencing and read processing

DNA extractions and library constructions for the ancient samples were performed in a laboratory facility dedicated to the analysis of ancient DNA (Centre for GeoGenetics, Copenhagen). Bone powder from MA-1 and AG-2 (149 mg and 119 mg, respectively) was extracted using a silica spin-column protocol^11,36,37 (Supplementary Information, section 3.1.1). Undigested pellets were subject to another round of digestion. Blood samples from one individual each of Avar, Mari and Tajik ancestry were extracted using a standard protocol³⁸ (Supplementary Information, section 3.2.2). A saliva sample from an individual of Indian ancestry was extracted using a prepIT•L2P extraction kit (DNA Genotek) (Supplementary Information, section 3.2.2). Illumina libraries were constructed on the ancient and modern extracts (Supplementary Information, sections 3.1.2 and 3.2.3). The protocols outlined in the kit manuals (GS FLX Titanium Rapid Library Preparation Kit, 454 Life Sciences, Roche, Branford, CO and NEBNext DNA Sample Prep Master Mix Set 2, New England Biolabs, E6070) as well as in a previous paper³⁹ were followed. Equimolar pools of the ancient (100 cycles, single-read mode) and modern libraries (100 cycles, paired-end mode) were sequenced on the Illumina HiSeq 2000 at the Danish National High-Throughput DNA Sequencing Centre. The ancient libraries were sequenced to near-saturation.

Read processing was performed on the ancient and modern genomes produced in this study as well as previously published genomes (Supplementary Information, sections 4.1 and 4.2). The latter genomes included 11 high-coverage modern genomes²², one low-coverage Cambodian genome⁴⁰, and the Denisovan²² and Tianyuan²⁶ ancient data sets. All sequences were trimmed using AdapterRemoval⁴¹ and mapped to the human reference genome builds hg18 and 37.1 using the Burrows-Wheeler Aligner (BWA)⁴². The seed length option was disabled for ancient reads to optimize the mapping efficiency⁴³. Polymerase chain reaction (PCR) duplicates were removed using Picard MarkDuplicates (http://picard.sourceforge.net). All modern samples (except the Cambodian genome) and the Denisova individual were genotyped using samtools mpileup and bcftools⁴⁴, and filtered to achieve a high-confidence SNP set (Supplementary Information, section 4.2). Only bi-allelic sites were included when producing the final call set and the individual calls were merged to a final set using Genome Analysis Toolkit (GATK) CombineVariants-2.5-2 (ref. 45).

Contamination and error rate estimation

Mitochondrial DNA (mtDNA) contamination rates for MA-1 and AG-2 were estimated by identifying consensus calls in the ancient mtDNA data set that are private or near-private to the ancient individual (at an allele frequency of less than 1% in a set of 311 modern human mtDNA genomes)⁴⁶ (Supplementary Information, section 5.1). The near-private consensus alleles and potential contaminating reads at these positions were counted, and a 95% confidence interval was obtained assuming that the allele observed in each read is a random outcome of drawing one of two alleles (endogenous and contaminant). Positions with a depth of less than 10X were excluded, as were positions where the consensus allele was either C or G in a transition polymorphism, as these are sensitive to post-mortem nucleotide misincorporations. A phred-scaled base quality of 30 was required.
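
The counting step described above amounts to a binomial proportion with an interval around it. The sketch below shows one way to compute such an estimate. The text does not state which interval construction was used, so the Wilson score interval here is an assumption, and the function name and read counts are hypothetical.

```python
import math

def contamination_estimate(consensus_reads, contaminant_reads, z=1.96):
    """Point estimate and approximate 95% interval for the contamination rate.

    Each read at a near-private position is treated as a draw of either
    the endogenous (consensus) or the contaminant allele. The Wilson
    score interval is an illustrative choice, not the study's method.
    """
    n = consensus_reads + contaminant_reads
    if n == 0:
        raise ValueError("no reads at the informative positions")
    p = contaminant_reads / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)

# Made-up counts pooled over informative positions (not data from the study).
point, low, high = contamination_estimate(consensus_reads=980,
                                          contaminant_reads=20)
```

Pooling reads across positions assumes that the contamination rate is the same at every informative site; an exact (Clopper-Pearson) interval would be a reasonable alternative when counts are small.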

As we found both individuals (MA-1 and AG-2) to be males by comparing the number of alignments to the X and Y chromosomes⁴⁷ (Supplementary Information, section 4.3), it was possible to obtain X chromosome-based contamination estimates using previously published methods⁴⁸ (Supplementary Information, section 5.2). These estimates were based on a fixed set of SNPs known to be polymorphic in European HapMap phase II release 27 data⁴⁹. This SNP data set was pruned such that polymorphic sites were more than 10 bases apart. The same HapMap data was used for estimating allele frequencies in Europeans. The MA-1 and AG-2 data sets were filtered to remove: regions homologous between the X and Y chromosomes; reads mapping non-uniquely to multiple regions of the genome with more than 98% identity; reads with mapping quality score less than 30 and base quality score less than 20; and sites with a read depth of less than 3 (or 2 depending on library depth) or above 40.

The error rates for the sequenced ancient and modern libraries were estimated using a method similar to a previously published method⁴⁰ that makes use of a high quality genome (Supplementary Information, section 6.1). The estimates were based on the rationale that any given human sample should have the same expected number of derived alleles compared to some outgroup, in this case the chimpanzee, panTro2, from the multiway alignment hg19 multiz46. The numbers of derived alleles were counted from the high-quality genome (individual NA06985 from the 1000 Genomes Project Consortium⁵⁰) and the error rate estimates were based on the assumption that any excess of derived alleles (compared to the high quality genome) observed in our sample is due to errors. The overall error rates were estimated using a method-of-moments estimator, while the type specific error rates were estimated using a maximum likelihood approach. The model and the estimation methods are described in detail elsewhere³⁹. All reads with a mapping quality score less than 30 and all bases with a base quality score less than 20 were excluded.

mtDNA and Y-chromosome haplogroup determination

Sequence reads from MA-1 were mapped to the revised Cambridge Reference Sequence (rCRS, NC_012920.1) and filtered for PCR duplicates and paralogues, requiring a minimum mapping quality of 25 (Supplementary Information, section 4.1). A file of variants, filtered for a minimum depth of 10, was generated (Supplementary Information, section 7). Indels were excluded from the analysis. The mtDNA sequence from the individual Dolni Vestonice 14 (DV-14; GenBank accession number KC521458), basal to the extant mtDNA haplogroup U5 (ref. 12), was included in the analysis for comparison. Both the MA-1 and DV-14 mtDNA sequences were analysed for the presence of diagnostic mutations of the major sub-haplogroups of extant mtDNA haplogroup U lineages, using information from mtDNA tree Build 15 (Sept 30, 2012)⁵¹. A phylogenetic tree including all major extant branches of mtDNA haplogroup U was built, with the age estimates (kilo years ± s.d.) of the different sub-haplogroups⁵² (Supplementary Fig. 4a). To show the present spread of haplogroup U and its different sub-haplogroups, the average frequencies, divided into four frequency classes, were calculated in regional groups, using a data set consisting of approximately 30,000 partial mtDNA genomes (references in Supplementary Information, section 7).

Owing to low depth of coverage of the MA-1 individual, genotyping at each site on the Y chromosome was performed by selecting the allele with the highest frequency of bases with a base quality of 13 or higher (Supplementary Information, section 8). A multi-FASTA file was generated from the variable positions on the Y chromosomes available from 24 Complete Genomics public genomes⁵³. SNPs were filtered for quality (using the threshold VQHIGH as defined by Complete Genomics), with tri-allelic positions excluded and only Y-chromosome regions determined as phylogenetically informative being used⁵⁴. This yielded a final data set of 22,492 positions that was merged with MA-1 Y chromosome data. A neighbour joining tree with default parameters in MEGA phylogenetic software⁵⁵ was constructed (Supplementary Fig. 5a). Phylogenetically informative positions and their state in MA-1 were then determined to confirm the placement of MA-1 on the tree. Non-informative positions, including those with more than four Ns in the public data set, were excluded (633 positions). Moreover, the following positions were also excluded: those in reference state in all individuals, including MA-1 (7,172 positions); N in MA-1 and either N or reference state among the rest of the individuals (9,682 positions); ‘N-ref’, those with only N or reference state among all individuals (586 positions), and ‘N-alt’, positions with alternative alleles, but difficult to classify (11 positions); ‘reference-specific’ (79 positions); and ‘recurrent’ (28 positions). This resulted in 4,301 positions being retained that were classified according to their haplogroup affiliations. Among those phylogenetically informative positions, 1,889 non-N positions were retrieved from MA-1.
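
The majority-allele genotyping rule and the bookkeeping of excluded positions can be sketched as follows. The function name is hypothetical, and returning N when two alleles tie is an assumption that the text does not specify. The position counts are those given in the paragraph above.

```python
from collections import Counter

def call_majority_allele(bases, quals, min_qual=13):
    """Pick the most frequent base among those with quality >= min_qual.

    Returns "N" when no base passes the filter or when the top two
    alleles tie (the tie rule is an assumption for this sketch).
    """
    counts = Counter(b for b, q in zip(bases, quals)
                     if q >= min_qual and b != "N")
    ranked = counts.most_common(2)
    if not ranked:
        return "N"
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return "N"
    return ranked[0][0]

# Position counts as reported in the text.
total_positions = 22_492
excluded = {
    "more than four Ns in the public data set": 633,
    "reference state in all individuals": 7_172,
    "N in MA-1, N or reference elsewhere": 9_682,
    "N-ref": 586,
    "N-alt": 11,
    "reference-specific": 79,
    "recurrent": 28,
}
retained = total_positions - sum(excluded.values())
print(retained)  # 4301
```

The subtraction reproduces the 4,301 retained positions, which confirms that the listed exclusion categories account for the full difference from the 22,492 starting positions.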

Principal component analysis

A single read was sampled from each position in the MA-1 data set that overlapped with SNPs in a data set compiled from a previous paper¹⁹, in which the authors had used local ancestry inference to mask segments of European and African ancestry in Siberian and Native American populations^56–59 (Supplementary Information, section 10). A phred-scaled mapping quality of 30 and base quality score of 30 was required in the sequence data for a haploid genotype to be called, and reads with indels were excluded. SNPs with minor allele frequency of <1% in the total data set were removed. To reduce the effect of nucleotide misincorporations, the first and last three bases of each sequence read in the MA-1 data were excluded. SNPs where there was no information from MA-1 were excluded, and a single haploid genotype was randomly sampled from each modern individual to match the single-pass nature of the shotgun data⁶⁰. PCA was performed on various population subsets separately using EIGENSOFT 4.0 (ref. 61), removing one SNP from each pair for which linkage disequilibrium exceeded a low arbitrary threshold (r² > 0.2). Transition SNPs, where the ancient individual displayed a T or an A⁶², as well as triallelic SNPs, were excluded.
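The read-level filters and the random haploid call described above could look roughly as follows in Python. This is a sketch under stated assumptions, not the code used for the paper: the dictionary keys describing each observation, the function names and the treatment of the quality values as minimum thresholds are hypothetical. The values 30, 30 and three trimmed bases come from the text.

```python
import random

# Unordered allele pairs that constitute transitions.
TRANSITIONS = {frozenset("CT"), frozenset("GA")}


def sample_haploid_call(observations, rng, min_mapq=30, min_baseq=30, trim=3):
    """Randomly pick one base from the observations that pass all filters.

    Each observation is a dict with the (hypothetical) keys 'base', 'mapq',
    'baseq', 'pos_in_read' (0-based), 'read_length' and 'has_indel'.
    Returns None when nothing passes, i.e. the SNP has no MA-1 information.
    """
    usable = [
        obs["base"]
        for obs in observations
        if obs["mapq"] >= min_mapq
        and obs["baseq"] >= min_baseq
        and not obs["has_indel"]
        # drop the first and last `trim` bases of each read
        and trim <= obs["pos_in_read"] < obs["read_length"] - trim
    ]
    return rng.choice(usable) if usable else None


def is_damage_prone(snp_alleles, ancient_base):
    """True for a transition SNP at which the ancient individual shows T or A."""
    return frozenset(snp_alleles) in TRANSITIONS and ancient_base in ("T", "A")
```

A caller would pass a seeded `random.Random` instance so that the sampled pseudo-haploid data set is reproducible, and would then discard SNPs for which `is_damage_prone` returns True.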

To look more closely at the genetic affinities of AG-2 to modern-day populations, data from non-African populations^59,63,64 were used as a reference panel and PCA was performed as detailed above (Supplementary Information, section 15). To compare the PCA results from MA-1 and AG-2, Procrustes transformation was performed as described in a previous paper⁶², rotating the PC1–PC2 configurations obtained for the two individuals to the configuration obtained using only the reference panel (Supplementary Information, section 15). The analysis was repeated using only those sequences which displayed a C→T mismatch consistent with post-mortem ancient DNA nucleotide misincorporations (PMD) in the first five bases of the sequence read (requiring a base quality of at least 30) (Supplementary Information, section 15).
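To make the Procrustes step concrete, the following Python sketch aligns one two-dimensional configuration (for example, PC1–PC2 coordinates of the reference panel from a run that included an ancient individual) to a target configuration (the panel-only run). It is a simplified illustration, not the procedure of ref. 62: it fits translation, uniform scaling and rotation by least squares, it does not handle reflections (which matter in practice because the sign of a principal component is arbitrary), and the function names are hypothetical.

```python
import math


def center(points):
    """Return the points shifted to have zero mean, plus the original centroid."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    return [(p[0] - mean_x, p[1] - mean_y) for p in points], (mean_x, mean_y)


def procrustes_2d(source, target):
    """Least-squares alignment of `source` onto `target` in two dimensions.

    Both arguments are equal-length lists of (x, y) pairs in matching order.
    Returns the transformed source points.
    """
    src, _ = center(source)
    tgt, (tx, ty) = center(target)
    dot = sum(sx * ux + sy * uy for (sx, sy), (ux, uy) in zip(src, tgt))
    cross = sum(sx * uy - sy * ux for (sx, sy), (ux, uy) in zip(src, tgt))
    theta = math.atan2(cross, dot)            # optimal rotation angle
    norm = sum(sx * sx + sy * sy for sx, sy in src)
    scale = math.hypot(dot, cross) / norm     # optimal uniform scale
    c, s = math.cos(theta), math.sin(theta)
    return [
        (scale * (c * sx - s * sy) + tx, scale * (s * sx + c * sy) + ty)
        for sx, sy in src
    ]
```

In an analysis of this kind the transformation would be estimated on the shared reference individuals and then applied to the ancient individual's coordinates; that second step is omitted here for brevity.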

Admixture graph inference

To infer admixture graphs, a total of 17 individuals were used: the archaic Denisova genome²²; 11 present-day individuals²²; the 4 novel genomes from this study (Supplementary Information, section 4.2); and the MA-1 genome (Supplementary Information, section 11). Haploid genotypes from MA-1 were added to variants identified in the other individuals, as in the PCA, to alleviate the increased rate of errors in low-coverage ancient DNA sequence data. If multiple sequence reads overlapped a position, one read was randomly sampled²³. This avoids biasing for, or against, heterozygotes and renders the MA-1 data haploid. All transition SNPs were excluded and MA-1 sequence reads with a mapping quality less than 30 and bases with base quality less than 30 were discarded. Positions at which there was no data from one of the individuals in the analysis were also excluded. This resulted in a final count of 156,250 SNPs for the main analysis. TreeMix²¹ (version 1.12) was used to build ancestry graphs assuming 0 to 10 migration edges, the placement and weight of each being optimized by the algorithm. TreeMix was run using the ‘-global’ option, which corresponds to performing a round of global rearrangements of the graph after initial fitting. Sample size correction was also disabled, as all the populations consisted of single individuals (‘-noss’). Standard errors were estimated in blocks with 500 SNPs in each. For those analyses that included one or more a priori specified events, a round of optimization was performed on the original migration edge (option ‘-climb’).
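As a rough illustration of the run configuration described above, the Python sketch below assembles one TreeMix command line per assumed number of migration edges (0 to 10). The options ‘-global’ and ‘-noss’ are named in the text; the options ‘-i’, ‘-o’, ‘-m’ and ‘-k’ (input, output prefix, number of migration edges, SNPs per block) follow the TreeMix command-line interface as commonly documented and should be checked against the manual for version 1.12. The file names are hypothetical, and the ‘-climb’ runs with a priori events are not covered.

```python
def treemix_command(input_path, output_prefix, migration_edges, block_size=500):
    """Build the argument list for one TreeMix run (nothing is executed here)."""
    return [
        "treemix",
        "-i", input_path,             # allele-count input file (hypothetical name)
        "-o", output_prefix,          # prefix for output files (hypothetical name)
        "-m", str(migration_edges),   # number of migration edges to fit
        "-k", str(block_size),        # SNPs per block for standard errors
        "-global",                    # global rearrangements after initial fitting
        "-noss",                      # disable sample size correction
    ]


# One command per number of migration edges, 0 through 10.
commands = [
    treemix_command("ma1_panel.treemix.gz", f"graph_m{m}", m)
    for m in range(0, 11)
]
```

Each list could be passed to `subprocess.run` on a system where TreeMix is installed.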

Admixture graphs relating MA-1 to modern groups were also inferred using MixMapper v1.0 (ref. 27) (Supplementary Information, section 12). A scaffold tree was constructed using four African genomes (San, Yoruba, Mandenka, Dinka), and Sardinian and Han²² genomes, to which MA-1 and other genomes were fitted. All transitions were excluded, and standard errors of the f-statistics were estimated using 500 bootstrap replicates over 50 blocks of the autosomal genome.
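A block bootstrap of the kind mentioned above can be sketched in Python as follows. This is a generic illustration rather than MixMapper's implementation: it assumes each genomic block is summarised by a pair (sum of per-SNP contributions, number of SNPs), that the statistic is the ratio of the totals, and that blocks are resampled with replacement. The defaults of 500 replicates and the use of 50 blocks in the example follow the text; all names are hypothetical.

```python
import random
import statistics


def block_bootstrap_se(blocks, n_replicates=500, seed=0):
    """Bootstrap standard error of a genome-wide mean over resampled blocks.

    `blocks` is a list of (block_sum, n_snps) pairs, one per genomic block.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_replicates):
        # draw as many blocks as there are in the data, with replacement
        resampled = [rng.choice(blocks) for _ in blocks]
        total = sum(block_sum for block_sum, _ in resampled)
        count = sum(n_snps for _, n_snps in resampled)
        estimates.append(total / count)
    return statistics.stdev(estimates)


# Example with 50 toy blocks of 10 SNPs each.
example_blocks = [(float(i % 5), 10) for i in range(50)]
example_se = block_bootstrap_se(example_blocks)
```

Resampling whole blocks rather than individual SNPs keeps linked sites together, so the standard error is not understated by linkage disequilibrium.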

D-statistics

To investigate the relationship between MA-1 and a number of modern populations, a sequence read-based D-statistic test (‘ABBA-BABA test’), equivalent to previously published tests^23,40, was applied to sequencing data from a single genome from each of the populations of interest (Supplementary Information, section 13). MA-1 and 11 high-coverage present-day genomes were included in this test. For the chimpanzee outgroup, the multiway alignment, which includes both chimpanzee and human (pantro2 from the hg19 multiz46), was used. The data were filtered as follows before calculating the sequence read-based D-statistic³⁹. First, all reads with mapping quality below 30 were removed. Subsequently, bases of low quality were removed by dividing all bases into eight base categories: A, C, G, T on the plus strand and A, C, G, T on the minus strand. The lowest-scoring 50% of bases from each of the eight categories were then discarded. More specifically, within each base category, we found the highest base quality score, Q, for which less than half of the bases in the base category had a quality score smaller than Q. We then removed all bases with quality score smaller than Q, and randomly sampled and removed bases with quality score equal to Q until 50% of the bases from the base category had been removed in total. The data were filtered separately for each of the eight base categories to avoid bias in the test in case of significant difference in the base quality between the categories. After filtering, a single base was sampled at each site for each individual in order to avoid introducing bias due to differences in sequencing depth. Finally, all sites containing transitions were removed. Based on the filtered data, D-statistics were calculated and, to assess whether these were significantly different from 0, standard errors and Z scores were obtained using a method known as ‘delete-m Jackknife for unequal m’, with a block size of 5 megabases⁶⁵.
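The per-category base-quality filter described above can be written compactly. The Python sketch below follows the stated rule (find Q, drop everything below Q, then randomly drop bases equal to Q until half are gone); it is an illustration, not the authors' code. The tuple layout (base, strand, quality), the function names and the choice to round down when a category holds an odd number of bases are assumptions.

```python
import random


def filter_lowest_half(qualities, rng):
    """Return indices of the bases kept after discarding the lowest-scoring half."""
    n = len(qualities)
    if n == 0:
        return []
    n_remove = n // 2  # assumption: round down for odd counts
    # Q: highest score for which fewer than half of the bases score below it
    threshold = max(
        q for q in set(qualities)
        if sum(other < q for other in qualities) < n / 2
    )
    below = [i for i, q in enumerate(qualities) if q < threshold]
    ties = [i for i, q in enumerate(qualities) if q == threshold]
    # remove everything below Q, then random bases equal to Q up to 50% in total
    removed = set(below) | set(rng.sample(ties, n_remove - len(below)))
    return [i for i in range(n) if i not in removed]


def filter_by_category(observations, seed=0):
    """Apply the filter separately to each of the eight base/strand categories.

    `observations` is a list of (base, strand, quality) tuples.
    """
    rng = random.Random(seed)
    groups = {}
    for index, (base, strand, _quality) in enumerate(observations):
        groups.setdefault((base, strand), []).append(index)
    kept = []
    for indices in groups.values():
        qualities = [observations[i][2] for i in indices]
        kept.extend(indices[j] for j in filter_lowest_half(qualities, rng))
    return [observations[i] for i in sorted(kept)]
```

Filtering each category on its own quality distribution, instead of applying one global cutoff, keeps a category with systematically lower scores from being depleted more than the others.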

For genotype data from SNP arrays we computed an allele frequency-based D-statistic test, which is a generalization of the sequence read-based test (Supplementary Information, section 14.3). We used previously presented estimators^20,66, obtaining standard errors using a block jackknife procedure over 5-megabase blocks in the genome, except for the tests with the Tianyuan data (chromosome 21), in which case we used 100-kb blocks to increase power. Two main data sets were used: first, a published SNP data set (364,470 SNPs) masked for European and African ancestries in Siberian and Native American populations¹⁹, which was merged with additional data from Finnish populations⁶³; and second, SNPs ascertained in San and Yoruban individuals and typed in worldwide populations²⁰. As the San and Yoruba populations are approximate outgroups to non-African populations, these data are unbiased for all comparisons between non-Africans. Transition SNPs were included but the first and last three bases of each sequence read were excluded since the majority of nucleotide misincorporations occur at the ends of ancient DNA templates (Supplementary Information, section 6.2). For other tests (in Supplementary Information, section 14), SNP data described in Supplementary Table 11 were used. We sampled a single read at each position from the MA-1 data as in the principal component analysis.
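For readers who want to see the shape of such a calculation, here is a minimal Python sketch of an allele frequency-based D-statistic with a weighted block jackknife. It is a sketch under assumptions and not the code used in the study: the estimator is written in the ratio-of-sums form commonly associated with refs 20 and 66, the sign convention and population ordering are arbitrary choices, each block is simply a list of per-SNP frequency tuples, and the jackknife follows the delete-m formulas for unequal block sizes as usually stated. At least two blocks are required.

```python
import math


def d_statistic(snps):
    """D from per-SNP allele frequencies (p1, p2, p3, p4) in four populations."""
    numerator = sum((p1 - p2) * (p3 - p4) for p1, p2, p3, p4 in snps)
    denominator = sum(
        (p1 + p2 - 2 * p1 * p2) * (p3 + p4 - 2 * p3 * p4)
        for p1, p2, p3, p4 in snps
    )
    return numerator / denominator


def weighted_block_jackknife(blocks, statistic):
    """Return (estimate, standard_error) using a delete-m block jackknife.

    `blocks` is a list of lists of SNP records; blocks may differ in size.
    """
    all_snps = [snp for block in blocks for snp in block]
    n = len(all_snps)
    g = len(blocks)
    theta = statistic(all_snps)
    # leave-one-block-out estimates
    leave_out = [
        statistic([snp for k, block in enumerate(blocks) if k != j for snp in block])
        for j in range(g)
    ]
    sizes = [len(block) for block in blocks]
    jackknife_mean = g * theta - sum(
        (1 - m / n) * t for m, t in zip(sizes, leave_out)
    )
    variance = 0.0
    for m, t in zip(sizes, leave_out):
        h = n / m
        pseudo_value = h * theta - (h - 1) * t
        variance += (pseudo_value - jackknife_mean) ** 2 / (h - 1)
    variance /= g
    return theta, math.sqrt(variance)
```

A Z score is then the estimate divided by its standard error. In a real analysis the blocks would be defined by genomic coordinates (5 megabases, or 100 kb for the Tianyuan tests) and not by list position.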

Outgroup f₃-statistics

Classical measures of pairwise genetic distance, such as Wright’s fixation index F_ST, are sensitive to genetic drift that has occurred since the divergence of the two test populations. If such lineage-specific genetic drift differs between populations that share an equal amount of genetic history with an ancient individual, the ancient individual would be observed as being closer to the modern populations with the least degree of historical genetic drift using distance-based methods such as F_ST. To circumvent these issues and obtain a statistic that is informative of the genetic relatedness between a particular sample and each candidate population in a reference set, an ‘outgroup f₃-statistic’ was computed (Supplementary Information, section 14.2). The expected value of the f₃-statistic²⁰, f₃(Outgroup; A, B), equals the sum of expected squared change in allele frequency (normalized for heterozygosity in the outgroup) due to genetic drift on the path in the population tree from the outgroup to the root and from the root to the ancestor of populations A and B. As genetic drift in the lineage specific to the outgroup is expected to be constant regardless of which populations A and B are used (in the absence of gene flow), the remaining variation between statistics will depend on how much genetic history is shared between populations A and B. We used Yoruba as an outgroup to non-African populations and computed the statistic f₃(Yoruba; MA-1, X) to investigate the shared history of MA-1 and a set of 147 worldwide candidate populations (as X) obtained by merging several data sets (Supplementary Figs 21 and 22), and we corroborated major patterns using SNPs from a San individual from southern Africa (Supplementary Information, section 14).
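The logic of the outgroup f₃-statistic can be illustrated with a naive Python estimator. This is a simplified sketch, not the estimator of ref. 20: it averages (o − a)(o − b) over SNPs and divides by the summed outgroup heterozygosity 2o(1 − o), and it omits the finite-sample bias correction that a proper implementation applies to the outgroup term. The function name and the input layout are hypothetical.

```python
def outgroup_f3(snps):
    """Naive outgroup f3 from per-SNP allele frequencies (p_out, p_a, p_b).

    Larger values indicate more drift shared by A and B after their
    lineage split from the outgroup.
    """
    numerator = sum((o - a) * (o - b) for o, a, b in snps)
    denominator = sum(2 * o * (1 - o) for o, _a, _b in snps)
    return numerator / denominator


# Toy comparison: A and B drifted in the same direction versus opposite directions.
shared = outgroup_f3([(0.5, 1.0, 1.0)])
unshared = outgroup_f3([(0.5, 1.0, 0.0)])
```

In the setting described above, the outgroup would be Yoruba, A would be the pseudo-haploid MA-1 calls and B would be each candidate population X in turn; ranking X by the statistic orders populations by shared drift with MA-1.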

Supplementary Material

Supplemental Information - Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans

NIHMS583477-supplement-Supp.pdf (12 MB, pdf)

Acknowledgements

We thank the Hermitage State Museum for providing access to the Mal’ta and Afontova Gora-2 human remains. We also thank the Danish National High-Throughput DNA Sequencing Centre and T. Reisberg for technical assistance. This work was supported by the Danish National Research Foundation and the Lundbeck Foundation (E.W. and M.R.) and the Arctic Social Sciences Program, National Science Foundation (grant PLR-1003725 to K.E.G.). R.V., M.M., M.K., E.M., K.T., S.Ro. and R.M. were supported by the European Regional Development Fund (European Union) through the Centre of Excellence in Genomics to Estonian Biocentre and University of Tartu and Estonian Basic Research grant SF0270177As08. M.M. thanks the Estonian Science Foundation grant no. 8973 and Baltic-American Freedom Foundation Research Scholarship program and M.I.V. thanks the Government of the Russian Federation grant no. 14.B25.31.0033 (to E. I. Rogaev). M.D. was supported by the US National Science Foundation (grant DBI-1103639). Computational analyses were carried out at the High Performance Computing Center, University of Tartu, and the Swedish National Infrastructure for Computing (SNIC-UPPMAX, project b2012063).

Footnotes

Supplementary Information is available in the online version of the paper.

Author Contributions E.W. and K.E.G. conceived the project. E.W. headed the project. E.W. and M.R. designed the experimental research project setup. S.D. and K.E.G. provided access to the Mal’ta and Afontova Gora-2 samples, and K.E.G. provided archaeological context for the samples. T.W.S. Jr performed AMS dating. E.B. and O.B. (Tajik individual), E.K. and S.L. (Mari and Avar individuals) provided modern DNA extracts for complete genome sequencing. E.K. and S.L. (Kazakh, Kirghiz, Uzbek and Mari individuals), L.P.O. (Selkup individuals), S.A.F. (Even, Dolgan and Yakut individuals) and M.I.V. (Altai individuals) provided access to modern DNA extracts for genotyping. R.V. carried out Illumina chip analysis on modern samples. P.F.C. performed DNA extraction from the Indian individual. M.R. performed the ancient extractions and library constructions on the modern and ancient samples, the latter with input from L.O. M.R. coordinated the sequencing. M.R. and S.Ra. performed mapping of MA-1 and AG-2 data sets with input from L.O. S.Ra., T.S.-P. and S.B. provided super-computing resources, developed the next-generation sequencing pipeline and performed mapping and genotyping for all the modern genomes. M.R. performed DNA damage analysis with input from L.O. M.M. performed the admixture analysis. M.M., E.M., K.T. and R.V. performed the mtDNA analysis. M.M., M.K., S.Ro., T.K., E.X. and R.M. performed the Y-chromosome analysis. A.A. and I.M. performed the autosomal contamination estimates, error rate estimates, D-statistics tests based on sequence reads and ngsAdmix analyses. P.S. performed biological sexing, mtDNA contamination estimates, PCA, TreeMix, MixMapper, D-statistic tests based on allele frequencies, f₃-statistics and phenotypic analyses, and analysis of AG-2 using nucleotide misincorporation patterns under the supervision of R.N. and M.J. M.R., P.S. and E.W. wrote the majority of the manuscript with critical input from R.N., M.J., M.M., K.E.G., A.A., I.M. and M.D. M.M., A.A. and I.M.

Author Information Sequence data for MA-1 and AG-2, produced in this study, are available for download through NCBI SRA accession number SRP029640. Data from the Illumina genotyping analysis generated in this study are available through GEO Series accession number GSE50727; PLINK files can be accessed from http://www.ebc.ee/free_data. In addition, the above data and alignments for the published modern genomes, Denisova genome, Tianyuan individual and the two ancient genomes are available at http://www.cbs.dtu.dk/suppl/malta. Raw reads and alignments for the four modern genomes sequenced in this study are available for demographic research under data access agreement with E.W.

The authors declare no competing financial interests.

References

1. Turner CG. Advances in the dental search for Native American origins. Acta Anthropogenet. 1984;8:23–78.
2. Hubbe M, Harvati K, Neves W. Paleoamerican morphology in the context of European and East Asian Pleistocene variation: implications for human dispersion into the New World. Am. J. Phys. Anthropol. 2011;144:442–453. doi: 10.1002/ajpa.21425.
3. Schurr T. The peopling of the New World: perspectives from molecular anthropology. Annu. Rev. Anthropol. 2004;33:551–583.
4. O’Rourke DH, Raff JA. The human genetic history of the Americas: the final frontier. Curr. Biol. 2010;20:R202–R207. doi: 10.1016/j.cub.2009.11.051.
5. Lell JT, et al. The dual origin and Siberian affinities of Native American Y chromosomes. Am. J. Hum. Genet. 2002;70:192–206. doi: 10.1086/338457.
6. Starikovskaya EB, et al. Mitochondrial DNA diversity in indigenous populations of the southern extent of Siberia, and the origins of Native American haplogroups. Ann. Hum. Genet. 2005;69:67–89. doi: 10.1046/j.1529-8817.2003.00127.x.
7. Dulik MC, et al. Mitochondrial DNA and Y chromosome variation provides evidence for a recent common ancestry between Native American and Indigenous Altaians. Am. J. Hum. Genet. 2012;90:229–246. doi: 10.1016/j.ajhg.2011.12.014.
8. Regueiro M, Alvarez J, Rowold D, Herrera RJ. On the origins, rapid expansion and genetic diversity of Native Americans from hunting-gatherers to agriculturalists. Am. J. Phys. Anthropol. 2013;150:333–348. doi: 10.1002/ajpa.22207.
9. Gerasimov MM. The Archaeology and Geomorphology of Northern Asia: Selected Works 5–32. University of Toronto Press; 1964.
10. Bramanti B, et al. Genetic discontinuity between local hunter-gatherers and central Europe’s first farmers. Science. 2009;326:137–140. doi: 10.1126/science.1176869.
11. Malmström H, et al. Ancient DNA reveals lack of continuity between Neolithic hunter-gatherers and contemporary Scandinavians. Curr. Biol. 2009;19:1758–1762. doi: 10.1016/j.cub.2009.09.017.
12. Fu Q, et al. A revised timescale for human evolution based on ancient mitochondrial genomes. Curr. Biol. 2013;23:553–559. doi: 10.1016/j.cub.2013.02.044.
13. Owsley DW, Jantz RL. Claiming the Stones-Naming the Bones: Cultural Property and the Negotiation of National and Ethnic Identity. Getty Research Institute; 2002.
14. Astakhov SN. Paleolit Eniseia: Paleoliticheskie Stoianki Afontovoi Gore v G. Krasnoiarske. Evropaiskii Dom; 1999.
15. Gamble C. Interaction and alliance in Palaeolithic society. Man (Lond) 1982;17:92–107.
16. Abramova Z. L’art Paléolithique d’Europe Orientale et de Sibérie. Jérôme Millon; 1995.
17. White R. The women of Brassempouy: a century of research and interpretation. J. Archaeol. Method and Theory. 2006;13:250–303.
18. Hansen AJ, Willerslev E, Wiuf C, Mourier T, Arctander P. Statistical evidence for miscoding lesions in ancient DNA templates. Mol. Biol. Evol. 2001;18:262–265. doi: 10.1093/oxfordjournals.molbev.a003800.
19. Reich D, et al. Reconstructing Native American population history. Nature. 2012;488:370–374. doi: 10.1038/nature11258.
20. Patterson N, et al. Ancient admixture in human history. Genetics. 2012;192:1065–1093. doi: 10.1534/genetics.112.145037.
21. Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 2012;8:e1002967. doi: 10.1371/journal.pgen.1002967.
22. Meyer M, et al. A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012;338:222–226. doi: 10.1126/science.1224344.
23. Green RE, et al. A draft sequence of the Neandertal genome. Science. 2010;328:710–722. doi: 10.1126/science.1188021.
24. Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 2009;5:e1000695. doi: 10.1371/journal.pgen.1000695.
25. Wall JD, et al. Genetic variation in Native Americans, inferred from Latino SNP and resequencing data. Mol. Biol. Evol. 2011;28:2231–2237. doi: 10.1093/molbev/msr049.
26. Fu Q, et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl Acad. Sci. USA. 2013;110:2223–2227. doi: 10.1073/pnas.1221359110.
27. Lipson M, et al. Efficient moment-based inference of admixture parameters and sources of gene flow. Mol. Biol. Evol. 2013. doi: 10.1093/molbev/mst099.
28. Goebel T. Pleistocene human colonization of Siberia and peopling of the Americas: an ecological approach. Evol. Anthropol. 1999;8:208–227.
29. Brown MD, et al. mtDNA haplogroup X: an ancient link between Europe/Western Asia and North America? Am. J. Hum. Genet. 1998;63:1852–1861. doi: 10.1086/302155.
30. Bradley B, Stanford D. The North Atlantic ice-edge corridor: a possible Palaeolithic route to the New World. World Archaeol. 2004;36:459–478.
31. Stafford TW, Jr, Jull AJT, Brendel K, Duhamel R, Donahue D. Study of bone radiocarbon dating accuracy at the University of Arizona NSF accelerator facility for radioisotope analysis. Radiocarbon. 1987;29:24–44.
32. Stafford TW, Jr, Brendel K, Duhamel R. Radiocarbon, 13C and 15N analysis of fossil bone: removal of humates with XAD-2 resin. Geochim. Cosmochim. Acta. 1988;52:2257–2267.
33. Stafford TW, Jr, Hare PE, Currie L, Jull AJT, Donahue D. Accelerator radiocarbon dating at the molecular level. J. Archaeol. Sci. 1991;18:35–72.
34. Ramsey CB. Bayesian analysis of radiocarbon dates. Radiocarbon. 2009;51:337–360.
35. Reimer PJ, et al. IntCal09 and Marine09 radiocarbon age calibration curves, 0–50,000 years cal bp. Radiocarbon. 2009;51:1111–1150.
36. Yang DY, Eng B, Waye JS, Dudar JC, Sanders SR. Technical note: improved DNA extraction from ancient bones using silica-based spin columns. Am. J. Phys. Anthropol. 1998;105:539–543. doi: 10.1002/(SICI)1096-8644(199804)105:4<539::AID-AJPA10>3.0.CO;2-1.
37. Svensson EM, et al. Tracing genetic change over time using nuclear SNPs in ancient and modern cattle. Anim. Genet. 2007;38:378–383. doi: 10.1111/j.1365-2052.2007.01620.x.
38. Powell R, Gannon F. Purification of DNA by phenol extraction and ethanol precipitation. Oxford Practical Approach Series. 2002. http://fds.oup.com/www.oup.co.uk/pdf/pas/9v1-7-3.pdf
39. Orlando L, et al. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature. 2013;499:74–78. doi: 10.1038/nature12323.
40. Reich D, et al. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature. 2010;468:1053–1060. doi: 10.1038/nature09710.
41. Lindgreen S. AdapterRemoval: easy cleaning of next-generation sequencing reads. BMC Res. Notes. 2012;5:337. doi: 10.1186/1756-0500-5-337.
42. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
43. Schubert M, et al. Improving ancient DNA read mapping against modern reference genomes. BMC Genomics. 2012;13:178. doi: 10.1186/1471-2164-13-178.
44. Li H, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
45. DePristo MA, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genet. 2011;43:491–498. doi: 10.1038/ng.806.
46. Krause J, et al. A complete mtDNA genome of an early modern human from Kostenki, Russia. Curr. Biol. 2010;20:231–236. doi: 10.1016/j.cub.2009.11.068.
47. Skoglund P, Storå J, Götherström A, Jakobsson M. Accurate sex identification in ancient human remains using DNA shotgun sequencing. J. Archaeol. Sci. 2013;40:4477–4482.
48. Rasmussen M, et al. An Aboriginal Australian genome reveals separate human dispersals in Asia. Science. 2011;334:94–98. doi: 10.1126/science.1211177.
49. Frazer KA, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007;449:851–861. doi: 10.1038/nature06258.
50. The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491:56–65. doi: 10.1038/nature11632.
51. Van Oven M, Kayser M. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum. Mutat. 2009;30:E386–E394. doi: 10.1002/humu.20921.
52. Behar DM, et al. A “Copernican” reassessment of the human mitochondrial DNA tree from its root. Am. J. Hum. Genet. 2012;90:675–684. doi: 10.1016/j.ajhg.2012.03.002.
53. Drmanac R, et al. Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science. 2010;327:78–81. doi: 10.1126/science.1181498.
54. Wei W, et al. A calibrated human Y-chromosomal phylogeny based on resequencing. Genome Res. 2013;23:388–395. doi: 10.1101/gr.143198.112.
55. Tamura K, et al. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 2011;28:2731–2739. doi: 10.1093/molbev/msr121.
56. Hancock AM, et al. Adaptations to climate-mediated selective pressures in humans. PLoS Genet. 2011;7:e1001375. doi: 10.1371/journal.pgen.1001375.
57. Rasmussen M, et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature. 2010;463:757–762. doi: 10.1038/nature08835.
58. International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467:52–58. doi: 10.1038/nature09298.
59. Li JZ, et al. Worldwide human relationships inferred from genome-wide patterns of variation. Science. 2008;319:1100–1104. doi: 10.1126/science.1153717.
60. Skoglund P, Jakobsson M. Archaic human ancestry in East Asia. Proc. Natl Acad. Sci. USA. 2011;108:18301–18306. doi: 10.1073/pnas.1108181108.
61. Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLoS Genet. 2006;2:e190. doi: 10.1371/journal.pgen.0020190.
62. Skoglund P, et al. Origins and genetic legacy of Neolithic farmers and hunter-gatherers in Europe. Science. 2012;336:466–469. doi: 10.1126/science.1216304.
63. Surakka I, et al. Founder population-specific HapMap panel increases power in GWA studies through improved imputation accuracy and CNV tagging. Genome Res. 2010;20:1344–1351. doi: 10.1101/gr.106534.110.
64. International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature. 2010;467:52–58. doi: 10.1038/nature09298.
65. Busing FMTA, Meijer E, Van der Leeden R. Delete-m Jackknife for Unequal m. Stat. Comput. 1999;9:3–8.
66. Durand EY, Patterson N, Reich D, Slatkin M. Testing for ancient admixture between closely related populations. Mol. Biol. Evol. 2011;28:2239–2252. doi: 10.1093/molbev/msr048.
