- Analysis
Human gene essentiality
Nature Reviews Genetics volume 19, pages 51–62 (2018)
Key Points
A gene is considered essential when loss of its function compromises the viability or fitness of the organism.
Large-scale, population genome analyses in humans allow the observation of genes that do not tolerate loss of function, that is, are essential, and genes that tolerate biallelic loss of function, that is, are dispensable.
Human essential genes may not be captured in knockout mouse models or recapitulated in cellular assays.
Observing the phenotypic consequences of loss-of-function variants is now used to anticipate drug safety and efficacy and guide drug discovery.
Abstract
A gene can be defined as essential when loss of its function compromises viability of the individual (for example, embryonic lethality) or results in profound loss of fitness. At the population level, identification of essential genes is accomplished by observing intolerance to loss-of-function variants. Several computational methods are available to score gene essentiality, and recent progress has been made in defining essentiality in the non-coding genome. Haploinsufficiency is emerging as a critical aspect of gene essentiality: approximately 3,000 human genes cannot tolerate loss of one of the two alleles. Genes identified as essential in human cell lines or knockout mice may be distinct from those in living humans. Reconciling these discrepancies in how we evaluate gene essentiality has applications in clinical genetics and may offer insights for drug development.
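As a rough illustration of what "intolerance to loss-of-function variants" can mean at the population level, the sketch below compares the number of loss-of-function variants observed in a gene with the number expected under a neutral mutation model. The function name, the gene labels and the counts are hypothetical and chosen only for illustration; this is not the pLI score or any other published method, which rely on more elaborate statistical models.

```python
from typing import Dict, Tuple


def lof_obs_exp_ratio(observed_lof: int, expected_lof: float) -> float:
    """Ratio of observed to expected loss-of-function (LoF) variant counts.

    Values close to 0 indicate strong depletion of LoF variants, which is
    the population-level signature of intolerance described in the abstract.
    """
    if expected_lof <= 0:
        raise ValueError("expected_lof must be positive")
    return observed_lof / expected_lof


# Hypothetical (observed, expected) counts, for illustration only.
genes: Dict[str, Tuple[int, float]] = {
    "GENE_A": (1, 40.0),   # strongly depleted: candidate essential gene
    "GENE_B": (35, 38.0),  # close to expectation: tolerant of LoF
}

ratios = {name: lof_obs_exp_ratio(obs, exp) for name, (obs, exp) in genes.items()}

# Rank genes from most to least depleted of LoF variants.
ranked = sorted(ratios, key=ratios.get)

for name in ranked:
    print(f"{name}: observed/expected = {ratios[name]:.3f}")
```

A simple ratio of this kind ignores gene length, sequencing coverage and sampling noise, which is one reason published scores model the expected count and its uncertainty explicitly.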
Acknowledgements
The authors thank Drs Ewen Kirkness and Michael Hicks for valuable comments. The authors are employees of Human Longevity, Inc.
Author information
Authors and Affiliations
Human Longevity Inc., San Diego, 92121, California, USA
István Bartha, Julia di Iulio, J. Craig Venter & Amalio Telenti
J. Craig Venter Institute, Capricorn Lane, La Jolla, 92037, California, USA
J. Craig Venter & Amalio Telenti
Contributions
All authors substantially contributed to discussion of content and to reviewing/editing the manuscript before submission. I.B., J.d.I. and A.T. researched data for the article and contributed to writing the manuscript.
Corresponding authors
Correspondence to J. Craig Venter or Amalio Telenti.
Ethics declarations
Competing interests
The authors are employees of Human Longevity, Inc. There is no commercial interest or intellectual property associated with this work.
Supplementary information
Supplementary information S1 (box) (PDF 422 kb)
Supplementary information S2 (table) (XLSX 3420 kb)
Supplementary information S3 (table) (TXT 70 kb)
Glossary
- Minimal genome
A genome limited to the essential genes for life.
- Robustness
The ability of a biological system to keep its behaviour unchanged under perturbation.
- Redundancy
The possibility of having a function encoded by more than one gene.
- Evolvability
The degree to which an organism can generate adaptive solutions to future environments through heritable phenotypic variation.
- Exome
The subset of the genome that is part of mature RNAs and translated into proteins.
- Protein truncation
A truncated, incomplete and usually nonfunctional protein product. Generally, the result of stop-gain, frameshift or splice-donor genetic variants.
- Loss-of-function variants
Genetic variants that severely disrupt the function of a protein. These can be missense (a change of the codon resulting in a change in the amino acid) or nonsense and protein-truncating variants.
- Haploinsufficiency
In a diploid organism, having only a single functional copy of a gene (with the other copy inactivated by mutation), which is insufficient to maintain proper gene function.
- Stop-gain variants
Also known as nonsense variants, changes in the genetic material that result in premature termination of the translated protein.
- Saturate
When referring to the generation of gene variants genome-wide, to reach the sample size at which every position in the genome has been observed to carry a variant at least once.
- Frameshift variants
Deletions or insertions in the protein-coding region, the lengths of which are not divisible by three, thus disrupting the reading frame of the gene.
- Synonymous variants
A change of nucleotide that does not lead to changes in the amino-acid sequence of a protein.
- Neutral variation
Genetic variants that are not subject to natural selection.
- ROC curve
(Receiver operating characteristic curve). A visual and quantitative method of evaluating the performance of binary classifiers. The true-positive rate of a classifier is plotted against the false-positive rate.
- Expression quantitative trait loci
(eQTLs). Loci where variation is associated with differential expression of a gene.
- Haploid
Of cells, containing a single set of chromosomes.
- Ploidy
The number of sets of chromosomes in a cell.
- Hemizygosity
The absence of one copy of a gene in diploid cells.
- Compound heterozygosity
The state in which both alleles of a gene carry a (deleterious) variant, but those variants are different.
- Nonsense-mediated mRNA decay
(NMD). A cellular pathway that serves to recognize and degrade mRNAs with translation termination codons that are positioned in abnormal contexts.
- Haplotype phasing
The assignment of an allele to one of the two copies of the chromosomes (maternal and paternal).
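Three of the glossary definitions above (frameshift variants, compound heterozygosity and the ROC curve) can be made concrete with a minimal Python sketch. All function names and input encodings here are hypothetical conventions chosen for illustration: variants are given as reference and alternate allele strings assumed to lie within a protein-coding region, and phased genotypes are given as (maternal, paternal) pairs in which 1 marks the variant allele.

```python
from typing import List, Sequence, Tuple


def is_frameshift(ref: str, alt: str) -> bool:
    """True when an insertion or deletion changes the coding length by a
    number of bases that is not divisible by three."""
    return (len(alt) - len(ref)) % 3 != 0


def is_compound_heterozygous(phased: Sequence[Tuple[int, int]]) -> bool:
    """True when different deleterious variants of one gene sit on both the
    maternal and the paternal copy.

    `phased` holds one (maternal, paternal) pair per variant; 1 marks the
    variant allele. Haplotype phasing is what makes this distinction possible.
    """
    maternal_only = any(m == 1 and p == 0 for m, p in phased)
    paternal_only = any(m == 0 and p == 1 for m, p in phased)
    return maternal_only and paternal_only


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """(false-positive rate, true-positive rate) pairs obtained by using each
    distinct score as a threshold, from the highest score to the lowest."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# A 2-base deletion shifts the reading frame; a 3-base deletion does not.
print(is_frameshift("ACG", "A"), is_frameshift("ACGT", "A"))

# Two variants in trans (one per parental copy) versus in cis (same copy).
print(is_compound_heterozygous([(1, 0), (0, 1)]))
print(is_compound_heterozygous([(1, 0), (1, 0)]))

# Toy essentiality scores with known labels (1 = essential).
print(roc_points([0.9, 0.8, 0.4, 0.1], [1, 0, 1, 0]))
```

The compound-heterozygosity check deliberately ignores homozygous variants, since the glossary definition requires the two alleles to carry different variants.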
Cite this article
Bartha, I., di Iulio, J., Venter, J. et al. Human gene essentiality. Nat Rev Genet 19, 51–62 (2018). https://doi.org/10.1038/nrg.2017.75