Reference genome

From Wikipedia, the free encyclopedia
Digital nucleic acid sequence database
The first printout of the human reference genome presented as a series of books, displayed at the Wellcome Collection, London

A reference genome is a genome assembly that represents the complete genetic sequence of an organism as a continuous string of nucleotides (A, T, C, and G). For an assembly to serve as a reference genome, it is typically accompanied by annotations, produced through a process known as DNA or genome annotation. The annotations specify the genomic coordinates (start and end locations) of genes, exons, introns, and mRNA, and are often paired with corresponding transcript (mRNA) and protein sequences (algorithm-predicted or experimentally validated).[1]

Reference genomes exist for a wide variety of species, including viruses, bacteria, fungi, plants and animals, and they differ in how they are constructed and represented. A reference may be derived from a single individual or from multiple individuals whose sequences are collapsed into a single representative (haploid) assembly. Two main factors determine the quality of a reference genome assembly: the sequencing technology, which affects sequence accuracy, and the assembly level, which indicates how complete the genome representation is.[2][3]

The ideal is a chromosome-level assembly, which is a complete DNA sequence for each chromosome with no unplaced segments. However, achieving this remains technically challenging, especially for large genomes or genomes dense in repetitive elements. Earlier sequencing technologies often produced assemblies at the contig (short contiguous sequences) or scaffold (ordered sets of contigs) level, with limited chromosomal context. The exact size of these fragments depends on the sequencing platform and bioinformatic methods available at the time.[4]

For assemblies that are not fully resolved, summary statistics such as N50 and L50 are commonly used to characterise contiguity and assembly fragmentation; these metrics are explained in the Contigs and scaffolds section.

Reference genomes are central to omics research, particularly genomics. They provide a reference for "mapping" DNA sequence data from many individuals, enabling efficient identification of the genomic location of these sequences and the detection of polymorphisms (sequence differences among individuals) through a process known as variant calling.[5]
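As a rough illustration of the idea behind mapping and variant detection, the following Python sketch compares a sample sequence with a reference at a known alignment position and reports the positions where they differ. The function name `find_mismatches` and the example sequences are hypothetical, and the sketch assumes an ungapped alignment whose position is already known; real variant callers work from many overlapping reads and take base qualities, insertions and deletions into account.

```python
def find_mismatches(reference, sample, start):
    """List (position, ref_base, sample_base) where an aligned sample differs.

    `start` is the 0-based reference coordinate at which `sample` aligns.
    Assumes an ungapped alignment (no insertions or deletions).
    """
    if start < 0 or start + len(sample) > len(reference):
        raise ValueError("sample does not fit on the reference at this position")
    variants = []
    for offset, base in enumerate(sample):
        ref_base = reference[start + offset]
        if base != ref_base:
            variants.append((start + offset, ref_base, base))
    return variants


reference = "ACGTTGCAAGGCTA"  # hypothetical reference sequence
sample = "TGCTAGG"            # hypothetical read aligned at position 4
print(find_mismatches(reference, sample, 4))  # [(7, 'A', 'T')]
```

Reporting each difference in reference coordinates is what allows observations from many individuals to be compared at the same genomic location.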

The limitations of this practice, such as reference bias and under-representation of population diversity, have led to the development of population-level reference sets and pangenomes.[6]

Reference genomes and their annotations are publicly accessible through online genome browsers and archives such as Ensembl,[7] the European Nucleotide Archive (ENA) at EMBL-EBI, the UCSC Genome Browser, and NCBI.

Properties of reference genomes

Measures of length

The length of a genome can be measured in several ways.

A simple way to measure genome length is to count the number of base pairs in the assembly.[8]

The golden path is an alternative measure of length that omits redundant regions such as haplotypes and pseudoautosomal regions.[9][10] It is usually constructed by layering sequencing information over a physical map to combine scaffold information. It is a 'best estimate' of what the genome will look like and typically includes gaps, making it longer than the typical base pair assembly.[11]
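The simple base-pair count described in this section can be sketched in Python as follows. The function name `assembly_lengths` and the example sequences are hypothetical, and the sketch assumes the common convention that gaps in a FASTA-formatted assembly are written as runs of `N`. It does not compute a golden path length, which depends on how a particular assembly defines its redundant regions.

```python
def assembly_lengths(fasta_text):
    """Return (total_bases, ungapped_bases) for FASTA-formatted text."""
    total = 0
    ungapped = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and sequence headers
        seq = line.upper()
        total += len(seq)
        # Assumption: gap positions are represented by the letter N.
        ungapped += len(seq) - seq.count("N")
    return total, ungapped


# Hypothetical two-sequence assembly; the run of N marks a gap.
example = """>chr1
ACGTACGTNNNNACGT
>chr2
GGCCTTAA
"""
print(assembly_lengths(example))  # (24, 20)
```

The difference between the two numbers shows how gap placeholders can make one length measure larger than another for the same assembly.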

Contigs and scaffolds

Diagram of reads being arranged into contigs, which can in turn be assembled into scaffolds, during the sequencing and assembly of a reference genome. The gap between contigs 1 and 2 is indicated as sequenced, forming a scaffold, while the other gap is not sequenced and separates scaffolds 1 and 2.

Assembling a reference genome requires overlapping reads, which are merged into contigs: contiguous DNA regions represented by consensus sequences.[12] Gaps between contigs can be bridged by scaffolding, either by amplifying the contigs with PCR and sequencing the products or by Bacterial Artificial Chromosome (BAC) cloning.[13][12] Filling these gaps is not always possible; in that case, the reference assembly consists of multiple scaffolds.[14] Scaffolds are classified into three types: 1) placed, whose chromosome, genomic coordinates and orientation are known; 2) unlocalised, for which only the chromosome is known but not the coordinates or orientation; 3) unplaced, whose chromosome is not known.[15]

The number of contigs and scaffolds, as well as their average lengths, are among the many parameters relevant to assessing the quality of a reference genome assembly, since they describe the continuity of the final assembly relative to the original genome. The fewer scaffolds per chromosome, down to a single scaffold spanning an entire chromosome, the greater the continuity of the genome assembly.[16][17][18] Two related parameters are N50 and L50. N50 is the largest contig or scaffold length such that sequences of this length or greater together contain at least 50% of the assembly, while L50 is the smallest number of contigs or scaffolds whose combined length reaches 50% of the assembly. A higher N50 generally goes with a lower L50, and both indicate high continuity in the assembly.[19][20][21]
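The two statistics can be computed together by sorting the sequences from longest to shortest and accumulating their lengths until half of the assembly is covered. The Python sketch below follows that common definition; the function name `n50_l50` and the contig lengths are hypothetical, and assembly tools may differ in details such as whether very short sequences are filtered out first.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # `length` is the shortest sequence needed to reach 50%: N50.
            # `count` is how many sequences were needed: L50.
            return length, count


# Hypothetical contig lengths in base pairs (total 300).
print(n50_l50([80, 70, 50, 40, 30, 20, 10]))  # (70, 2)
```

In this example the two longest contigs (80 + 70 = 150) already cover half of the 300 bp assembly, so N50 is 70 and L50 is 2.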

Mammalian genomes

The human and mouse reference genomes are maintained and improved by the Genome Reference Consortium (GRC), a group of fewer than 20 scientists from a number of genome research institutes, including the European Bioinformatics Institute, the National Center for Biotechnology Information, the Sanger Institute and the McDonnell Genome Institute at Washington University in St. Louis. The GRC continues to improve reference genomes by building new alignments that contain fewer gaps and by fixing misrepresentations in the sequence.

Human reference genome

The original human reference genome was derived from thirteen anonymous volunteers from Buffalo, New York. Donors were recruited by advertisement in The Buffalo News, on Sunday, March 23, 1997. The first ten male and ten female volunteers were invited to make an appointment with the project's genetic counselors and donate blood from which DNA was extracted. As a result of how the DNA samples were processed, about 80 percent of the reference genome came from eight people, and one male, designated RP11, accounts for 66 percent of the total. The ABO blood group system differs among humans, but the human reference genome contains only an O allele, although the others are annotated.[22][23][24][25][26]

Evolution of the cost of sequencing a human genome from 2001 to 2021

As the cost of DNA sequencing falls and new full genome sequencing technologies emerge, more genome sequences continue to be generated. In several cases, people such as James D. Watson had their genome assembled using massively parallel DNA sequencing.[27][28] Comparison between the reference (assembly NCBI36/hg18) and Watson's genome revealed 3.3 million single nucleotide polymorphism differences, while about 1.4 percent of his DNA could not be matched to the reference genome at all.[26][27] For regions where there is known to be large-scale variation, sets of alternate loci are assembled alongside the reference locus.

Chromosome ideogram of the human reference genome assembly GRCh38/hg38. Characteristic banding patterns are displayed in black, grey and white, while the gaps and partially assembled regions are displayed in blue and rose, respectively. Reference: Genome Data Viewer of the NCBI.[29]

The latest human reference genome assembly released by the Genome Reference Consortium is GRCh38, released in December 2013.[30] Several patches were added to update it, the latest being GRCh38.p14, published on 3 February 2022.[31][32] This build has only 349 gaps across the entire assembly, a great improvement over the first version, which had roughly 150,000 gaps.[23] The gaps are mostly in areas such as telomeres, centromeres, and long repetitive sequences, with the biggest gap along the long arm of the Y chromosome, a region of ~30 Mb in length (~52% of the Y chromosome's length).[33] The number of genomic clone libraries contributing to the reference has increased steadily to >60 over the years, although individual RP11 still accounts for 70% of the reference genome.[34] Genomic analysis of this anonymous male suggests that he is of African-European ancestry.[34] According to the GRC website, their next assembly release for the human genome (version GRCh39) is currently "indefinitely postponed".[35]

In 2022, the Telomere-to-Telomere (T2T) Consortium,[36] an open, community-based effort, published the first completely assembled reference genome (version T2T-CHM13), without any gaps in the assembly. It did not contain a Y chromosome until version 2.0.[37][38] This assembly allows for the examination of centromeric and pericentromeric sequence evolution. The consortium employed rigorous methods to assemble, clean, and validate complex repeat regions, which are particularly difficult to sequence.[39] It used ultra-long-read (>100 kb) sequencing to accurately sequence segmental duplications.[40]

T2T-CHM13 is sequenced from CHM13hTERT, a cell line from an essentially haploid hydatidiform mole. "CHM" stands for "Complete Hydatidiform Mole", and "13" is its line number. "hTERT" stands for "human Telomerase Reverse Transcriptase". The cell line has been transfected with the TERT gene, which is responsible for maintaining telomere length and thus contributes to the cell line's immortality.[41] A hydatidiform mole contains two copies of the same parental genome, and thus is essentially haploid. This eliminates allelic variation and allows better sequencing accuracy.[40]

Recent genome assemblies are as follows:[42]

Release name | Date of release | Equivalent UCSC version
GRCh39 | Indefinitely postponed[35] | -
T2T-CHM13 | January 2022 | hs1
GRCh38 | Dec 2013 | hg38
GRCh37 | Feb 2009 | hg19
NCBI Build 36.1 | Mar 2006 | hg18
NCBI Build 35 | May 2004 | hg17
NCBI Build 34 | Jul 2003 | hg16

Limitations

For much of a genome, the reference provides a good approximation of the DNA of any single individual. But in regions with high allelic diversity, such as the major histocompatibility complex in humans and the major urinary proteins of mice, the reference genome may differ significantly from other individuals.[43][44][45] Because the reference genome is a single distinct sequence, which is what makes it useful as an index or locator of genomic features, it is limited in how faithfully it can represent the human genome and its variability. Most of the initial samples used for reference genome sequencing came from people of European ancestry. In 2010, de novo assembly of genomes from African and Asian populations and their comparison with the NCBI reference genome (version NCBI36) showed that these genomes contained ~5 Mb of sequence that did not align to any region of the reference genome.[46]

Projects following the Human Genome Project have sought a deeper and more diverse characterization of human genetic variability, which the reference genome is not able to represent. The HapMap Project, active from 2002 to 2010, aimed to create a map of haplotypes and their most common variations among different human populations. Up to 11 populations of different ancestry were studied, such as individuals of the Han ethnic group from China, Gujaratis from India, the Yoruba people from Nigeria or Japanese people, among others.[47][48][49][50] The 1000 Genomes Project, carried out between 2008 and 2015, aimed to create a database that includes more than 95% of the variations present in the human genome and whose results can be used in disease association studies (GWAS) for conditions such as diabetes and cardiovascular or autoimmune diseases. A total of 26 ethnic groups were studied in this project, expanding the scope of the HapMap project to new ethnic groups such as the Mende people of Sierra Leone, the Vietnamese people or the Bengali people.[51][52][53][54] The Human Pangenome Project, which started its initial phase in 2019 with the creation of the Human Pangenome Reference Consortium, seeks to create the largest map of human genetic variability, taking the results of previous studies as a starting point.[55][56]

Mouse reference genome

Recent mouse genome assemblies are as follows:[42]

Release name | Date of release | Equivalent UCSC version
GRCm39 | June 2020 | mm39
GRCm38 | Dec 2011 | mm10
NCBI Build 37 | Jul 2007 | mm9
NCBI Build 36 | Feb 2006 | mm8
NCBI Build 35 | Aug 2005 | mm7
NCBI Build 34 | Mar 2005 | mm6

Other genomes

Since the Human Genome Project was finished, multiple international projects have started, focused on assembling reference genomes for many organisms. Model organisms (e.g., zebrafish (Danio rerio), chicken (Gallus gallus) and Escherichia coli) are of special interest to the scientific community, as are, for example, endangered species (e.g., Asian arowana (Scleropages formosus) or the American bison (Bison bison)). As of August 2022, the NCBI database supports 71,886 partially or completely sequenced and assembled genomes from different species, including 676 mammals, 590 birds and 865 fishes. Also noteworthy are the 1,796 insect genomes, 3,747 fungi, 1,025 plants, 33,724 bacteria, 26,004 viruses and 2,040 archaea.[57] Many of these species have annotation data associated with their reference genomes that can be publicly accessed and visualized in genome browsers such as Ensembl and the UCSC Genome Browser.[58][59]

Some examples of these international projects are: the Chimpanzee Genome Project, carried out between 2005 and 2013 jointly by the Broad Institute and the McDonnell Genome Institute of Washington University in St. Louis, which generated the first reference genomes for 4 subspecies of Pan troglodytes;[60][61] the 100K Pathogen Genome Project, which started in 2012 with the main goal of creating a database of reference genomes for 100,000 pathogenic microorganisms for use in public health, outbreak detection, agriculture and the environment;[62] and the Earth BioGenome Project, which started in 2018 and aims to sequence and catalog the genomes of all the eukaryotic organisms on Earth to promote biodiversity conservation projects. Within this big-science project there are up to 50 smaller-scale affiliated projects, such as the Africa BioGenome Project or the 1000 Fungal Genomes Project.[63][64][65]

References

  1. Ejigu, Girum Fitihamlak; Jung, Jaehee (2020-09-18). "Review on the Computational Genome Annotation of Sequences Obtained by Next-Generation Sequencing". Biology. 9 (9): 295. doi:10.3390/biology9090295. ISSN 2079-7737. PMC 7565776. PMID 32962098.
  2. Giani, Alice Maria; Gallo, Guido Roberto; Gianfranceschi, Luca; Formenti, Giulio (2020-01-01). "Long walk to genomics: History and current approaches to genome sequencing and assembly". Computational and Structural Biotechnology Journal. 18: 9–19. doi:10.1016/j.csbj.2019.11.002. ISSN 2001-0370. PMC 6926122. PMID 31890139.
  3. Ballouz, Sara; Dobin, Alexander; Gillis, Jesse A. (2019-08-09). "Is it time to change the reference genome?". Genome Biology. 20 (1): 159. doi:10.1186/s13059-019-1774-4. ISSN 1474-760X. PMC 6688217. PMID 31399121.
  4. Nurk, Sergey; Koren, Sergey; Rhie, Arang; Rautiainen, Mikko; Bzikadze, Andrey V.; Mikheenko, Alla; Vollger, Mitchell R.; Altemose, Nicolas; Uralsky, Lev; Gershman, Ariel; Aganezov, Sergey; Hoyt, Savannah J.; Diekhans, Mark; Logsdon, Glennis A.; Alonge, Michael (April 2022). "The complete sequence of a human genome". Science. 376 (6588): 44–53. Bibcode:2022Sci...376...44N. doi:10.1126/science.abj6987. ISSN 0036-8075. PMC 9186530. PMID 35357919.
  5. Aganezov, Sergey; Yan, Stephanie M.; Soto, Daniela C.; Kirsche, Melanie; Zarate, Samantha; Avdeyev, Pavel; Taylor, Dylan J.; Shafin, Kishwar; Shumate, Alaina; Xiao, Chunlin; Wagner, Justin; McDaniel, Jennifer; Olson, Nathan D.; Sauria, Michael E. G.; Vollger, Mitchell R. (April 2022). "A complete reference genome improves analysis of human genetic variation". Science. 376 (6588) eabl3533. doi:10.1126/science.abl3533. ISSN 0036-8075. PMC 9336181. PMID 35357935.
  6. Miga, Karen H.; Wang, Ting (2021-08-31). "The Need for a Human Pangenome Reference Sequence". Annual Review of Genomics and Human Genetics. 22 (1): 81–102. doi:10.1146/annurev-genom-120120-081921. ISSN 1527-8204. PMC 8410644. PMID 33929893.
  7. Flicek, P.; Aken, B. L.; Beal, K.; Ballester, B.; Caccamo, M.; Chen, Y.; Clarke, L.; Coates, G.; Cunningham, F.; Cutts, T.; Down, T.; Dyer, S. C.; Eyre, T.; Fitzgerald, S.; Fernandez-Banet, J. (2007-12-23). "Ensembl 2008". Nucleic Acids Research. 36 (Database): D707–D714. doi:10.1093/nar/gkm988. ISSN 0305-1048. PMC 2238821. PMID 18000006.
  8. "Help - Glossary - Homo sapiens - Ensembl genome browser 87". www.ensembl.org.
  9. "Golden path length | VectorBase". www.vectorbase.org. Archived from the original on 2020-08-07. Retrieved 2016-12-12.
  10. "Help - Glossary - Homo sapiens - Ensembl genome browser 87". www.ensembl.org.
  11. "Whole assembly vs Golden path length in Ensembl? - SEQanswers". seqanswers.com. 31 July 2014. Retrieved 2016-12-12.
  12. Gibson, Greg; Muse, Spencer V. (2009). A Primer of Genome Science (3rd ed.). Sinauer Associates. p. 84. ISBN 978-0-878-93236-8.
  13. "Help - Glossary - Homo_sapiens - Ensembl genome browser 107". www.ensembl.org. Retrieved 2022-09-26.
  14. Luo, Junwei; Wei, Yawei; Lyu, Mengna; Wu, Zhengjiang; Liu, Xiaoyan; Luo, Huimin; Yan, Chaokun (2021-09-02). "A comprehensive review of scaffolding methods in genome assembly". Briefings in Bioinformatics. 22 (5) bbab033. doi:10.1093/bib/bbab033. ISSN 1477-4054. PMID 33634311.
  15. "Chromosomes, scaffolds and contigs". www.ensembl.org. Retrieved 2022-09-26.
  16. Meader, Stephen; Hillier, LaDeana W.; Locke, Devin; Ponting, Chris P.; Lunter, Gerton (May 2010). "Genome assembly quality: Assessment and improvement using the neutral indel model". Genome Research. 20 (5): 675–684. doi:10.1101/gr.096966.109. ISSN 1088-9051. PMC 2860169. PMID 20305016.
  17. Rice, Edward S.; Green, Richard E. (2019-02-15). "New Approaches for Genome Assembly and Scaffolding". Annual Review of Animal Biosciences. 7 (1): 17–40. doi:10.1146/annurev-animal-020518-115344. ISSN 2165-8102. PMID 30485757. S2CID 54121772.
  18. Cao, Minh Duc; Nguyen, Son Hoang; Ganesamoorthy, Devika; Elliott, Alysha G.; Cooper, Matthew A.; Coin, Lachlan J. M. (2017-02-20). "Scaffolding and completing genome assemblies in real-time with nanopore sequencing". Nature Communications. 8 (1) 14515. Bibcode:2017NatCo...814515C. doi:10.1038/ncomms14515. ISSN 2041-1723. PMC 5321748. PMID 28218240.
  19. Mende, Daniel R.; Waller, Alison S.; Sunagawa, Shinichi; Järvelin, Aino I.; Chan, Michelle M.; Arumugam, Manimozhiyan; Raes, Jeroen; Bork, Peer (2012-02-23). "Assessment of Metagenomic Assembly Using Simulated Next Generation Sequencing Data". PLOS ONE. 7 (2) e31386. Bibcode:2012PLoSO...731386M. doi:10.1371/journal.pone.0031386. ISSN 1932-6203. PMC 3285633. PMID 22384016.
  20. Alhakami, Hind; Mirebrahim, Hamid; Lonardi, Stefano (2017-05-18). "A comparative evaluation of genome assembly reconciliation tools". Genome Biology. 18 (1): 93. doi:10.1186/s13059-017-1213-3. ISSN 1474-7596. PMC 5436433. PMID 28521789.
  21. Castro, Christina J.; Ng, Terry Fei Fan (2017-11-01). "U50: A New Metric for Measuring Assembly Output Based on Non-Overlapping, Target-Specific Contigs". Journal of Computational Biology. 24 (11): 1071–1080. doi:10.1089/cmb.2017.0013. PMC 5783553. PMID 28418726.
  22. Scherer S (2008). A short guide to the human genome. CSHL Press. p. 135. ISBN 978-0-87969-791-4.
  23. "E pluribus unum". Nature Methods. 7 (5): 331. May 2010. doi:10.1038/nmeth0510-331. PMID 20440876.
  24. Ballouz S, Dobin A, Gillis JA (August 2019). "Is it time to change the reference genome?". Genome Biology. 20 (1) 159. doi:10.1186/s13059-019-1774-4. PMC 6688217. PMID 31399121.
  25. Rosenfeld JA, Mason CE, Smith TM (11 July 2012). "Limitations of the human reference genome for personalized genomics". PLOS ONE. 7 (7) e40294. Bibcode:2012PLoSO...740294R. doi:10.1371/journal.pone.0040294. PMC 3394790. PMID 22811759.
  26. Wade N (May 31, 2007). "Genome of DNA Pioneer Is Deciphered". New York Times. Retrieved February 21, 2009.
  27. Wheeler DA, Srinivasan M, Egholm M, Shen Y, Chen L, McGuire A, et al. (April 2008). "The complete genome of an individual by massively parallel DNA sequencing". Nature. 452 (7189): 872–876. Bibcode:2008Natur.452..872W. doi:10.1038/nature06884. PMID 18421352.
  28. The exception to this is J. Craig Venter, whose DNA was sequenced and assembled using shotgun sequencing methods.
  29. "Genome Data Viewer - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2022-08-18.
  30. Schneider VA, Graves-Lindsay T, Howe K, Bouk N, Chen HC, Kitts PA, et al. (May 2017). "Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly". Genome Research. 27 (5): 849–864. doi:10.1101/gr.213611.116. PMC 5411779. PMID 28396521.
  31. "GRCh38.p14 - hg38 - Genome - Assembly - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2022-08-19.
  32. Genome Reference Consortium (2022-05-09). "GenomeRef: GRCh38.p14 is now released!". GRC Blog (GenomeRef). Retrieved 2022-08-19.
  33. "GRCh38.p14 - hg38 - Genome - Assembly - NCBI - Statistics Report". www.ncbi.nlm.nih.gov. Retrieved 2022-08-18.
  34. "How many individuals were sequenced for the human reference genome assembly?". Genome Reference Consortium. Retrieved 7 April 2022.
  35. "Genome Reference Consortium". www.ncbi.nlm.nih.gov. Retrieved 2022-08-18.
  36. "Telomere-to-Telomere". NHGRI. Retrieved 2022-08-16.
  37. Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, et al. (April 2022). "The complete sequence of a human genome". Science. 376 (6588): 44–53. Bibcode:2022Sci...376...44N. doi:10.1126/science.abj6987. PMC 9186530. PMID 35357919. S2CID 247854936.
  38. "T2T-CHM13v2.0 - Genome - Assembly - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2022-08-16.
  39. Altemose, Nicolas; Logsdon, Glennis A.; Bzikadze, Andrey V.; Sidhwani, Pragya; Langley, Sasha A.; Caldas, Gina V.; Hoyt, Savannah J.; Uralsky, Lev; Ryabov, Fedor D.; Shew, Colin J.; Sauria, Michael E. G.; Borchers, Matthew; Gershman, Ariel; Mikheenko, Alla; Shepelev, Valery A. (April 2022). "Complete genomic and epigenetic maps of human centromeres". Science. 376 (6588) eabl4178. doi:10.1126/science.abl4178. ISSN 0036-8075. PMC 9233505. PMID 35357911.
  40. Church, Deanna M. (April 2022). "A next-generation human genome sequence". Science. 376 (6588): 34–35. Bibcode:2022Sci...376...34C. doi:10.1126/science.abo5367. ISSN 0036-8075. PMID 35357937.
  41. Steinberg, Karyn Meltz; Schneider, Valerie A.; Graves-Lindsay, Tina A.; Fulton, Robert S.; Agarwala, Richa; Huddleston, John; Shiryev, Sergey A.; Morgulis, Aleksandr; Surti, Urvashi; Warren, Wesley C.; Church, Deanna M.; Eichler, Evan E.; Wilson, Richard K. (December 2014). "Single haplotype assembly of the human genome from a hydatidiform mole". Genome Research. 24 (12): 2066–2076. doi:10.1101/gr.180893.114. ISSN 1088-9051. PMC 4248323. PMID 25373144.
  42. "UCSC Genome Bioinformatics: FAQ". genome.ucsc.edu. Retrieved 2016-08-18.
  43. MHC Sequencing Consortium (October 1999). "Complete sequence and gene map of a human major histocompatibility complex. The MHC sequencing consortium". Nature. 401 (6756): 921–923. Bibcode:1999Natur.401..921T. doi:10.1038/44853. PMID 10553908. S2CID 186243515.
  44. Logan DW, Marton TF, Stowers L (September 2008). Vosshall LB (ed.). "Species specificity in major urinary proteins by parallel evolution". PLOS ONE. 3 (9) e3280. Bibcode:2008PLoSO...3.3280L. doi:10.1371/journal.pone.0003280. PMC 2533699. PMID 18815613.
  45. Hurst J, Beynon RJ, Roberts SC, Wyatt TD (October 2007). Urinary Lipocalins in Rodenta: is there a Generic Model?. Chemical Signals in Vertebrates 11. Springer New York. ISBN 978-0-387-73944-1.
  46. Li R, Li Y, Zheng H, Luo R, Zhu H, Li Q, et al. (January 2010). "Building the sequence map of the human pan-genome". Nature Biotechnology. 28 (1): 57–63. doi:10.1038/nbt.1596. PMID 19997067. S2CID 205274447.
  47. The International HapMap Consortium (October 2005). "A haplotype map of the human genome". Nature. 437 (7063): 1299–1320. Bibcode:2005Natur.437.1299T. doi:10.1038/nature04226. PMC 1880871. PMID 16255080.
  48. Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, et al. (October 2007). "A second generation human haplotype map of over 3.1 million SNPs". Nature. 449 (7164): 851–861. Bibcode:2007Natur.449..851F. doi:10.1038/nature06258. PMC 2689609. PMID 17943122.
  49. Altshuler DM, Gibbs RA, Peltonen L, et al. (September 2010). "Integrating common and rare genetic variation in diverse human populations". Nature. 467 (7311): 52–58. Bibcode:2010Natur.467...52T. doi:10.1038/nature09298. PMC 3173859. PMID 20811451.
  50. "International HapMap Project". Genome.gov. Retrieved 2022-08-18.
  51. Abecasis GR, Altshuler D, Auton A, Brooks LD, Durbin RM, Gibbs RA, et al. (October 2010). "A map of human genome variation from population-scale sequencing". Nature. 467 (7319): 1061–1073. Bibcode:2010Natur.467.1061T. doi:10.1038/nature09534. PMC 3042601. PMID 20981092.
  52. Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, et al. (November 2012). "An integrated map of genetic variation from 1,092 human genomes". Nature. 491 (7422): 56–65. Bibcode:2012Natur.491...56T. doi:10.1038/nature11632. PMC 3498066. PMID 23128226.
  53. Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, et al. (October 2015). "A global reference for human genetic variation". Nature. 526 (7571): 68–74. Bibcode:2015Natur.526...68T. doi:10.1038/nature15393. PMC 4750478. PMID 26432245.
  54. Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, et al. (October 2015). "An integrated map of structural variation in 2,504 human genomes". Nature. 526 (7571): 75–81. Bibcode:2015Natur.526...75.. doi:10.1038/nature15394. PMC 4617611. PMID 26432246.
  55. Miga KH, Wang T (August 2021). "The Need for a Human Pangenome Reference Sequence". Annual Review of Genomics and Human Genetics. 22 (1): 81–102. doi:10.1146/annurev-genom-120120-081921. PMC 8410644. PMID 33929893.
  56. Wang T, Antonacci-Fulton L, Howe K, Lawson HA, Lucas JK, Phillippy AM, et al. (April 2022). "The Human Pangenome Project: a global resource to map genomic diversity". Nature. 604 (7906): 437–446. Bibcode:2022Natur.604..437W. doi:10.1038/s41586-022-04601-8. PMC 9402379. PMID 35444317. S2CID 248297723.
  57. "Genome List - Genome - NCBI". www.ncbi.nlm.nih.gov. Archived from the original on November 28, 2011. Retrieved 2022-08-18.
  58. "Species List". uswest.ensembl.org. Archived from the original on 2022-08-06. Retrieved 2022-08-18.
  59. "GenArk: UCSC Genome Archive". hgdownload.soe.ucsc.edu. Retrieved 2022-08-18.
  60. "Chimpanzee Genome Project". BCM-HGSC. 2016-03-04. Retrieved 2022-08-18.
  61. Prado-Martinez J, Sudmant PH, Kidd JM, Li H, Kelley JL, Lorente-Galdos B, et al. (July 2013). "Great ape genetic diversity and population history". Nature. 499 (7459): 471–475. Bibcode:2013Natur.499..471P. doi:10.1038/nature12228. PMC 3822165. PMID 23823723.
  62. "100K Pathogen Genome Project – Genomes for Public Health & Food Safety". Retrieved 2022-08-18.
  63. Lewin HA, Robinson GE, Kress WJ, Baker WJ, Coddington J, Crandall KA, et al. (April 2018). "Earth BioGenome Project: Sequencing life for the future of life". Proceedings of the National Academy of Sciences of the United States of America. 115 (17): 4325–4333. Bibcode:2018PNAS..115.4325L. doi:10.1073/pnas.1720115115. PMC 5924910. PMID 29686065.
  64. "African BioGenome Project – Genomics in the service of conservation and improvement of African biological diversity". Retrieved 2022-08-18.
  65. "1000 Fungal Genomes Project". mycocosm.jgi.doe.gov. Retrieved 2022-08-18.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Reference_genome&oldid=1336147765"