- Brief Communication
Deep learning improves prediction of CRISPR–Cpf1 guide RNA activity
- Hui Kwon Kim1,2 na1,
- Seonwoo Min3 na1,
- Myungjae Song1,4,
- Soobin Jung1,2,
- Jae Woo Choi1,5,
- Younggwang Kim1,2,
- Sangeun Lee1,2,
- Sungroh Yoon ORCID: orcid.org/0000-0002-2367-197X 3,6 &
- …
- Hyongbum (Henry) Kim ORCID: orcid.org/0000-0002-4693-738X 1,2,5,7,8
Nature Biotechnology volume 36, pages 239–241 (2018)
Abstract
We present two algorithms to predict the activity of AsCpf1 guide RNAs. Indel frequencies for 15,000 target sequences were used in a deep-learning framework based on a convolutional neural network to train Seq-deepCpf1. We then incorporated chromatin accessibility information to create DeepCpf1, which performs better in cell lines for which such information is available. We show that both algorithms outperform previous machine learning algorithms on our own and published data sets.
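As a rough illustration of the idea in the abstract, the sketch below one-hot encodes a target sequence, applies a single 1D convolution with ReLU and global max pooling, and adds a binary chromatin accessibility flag to the score. It is not the published Seq-deepCpf1 or DeepCpf1 architecture: the filter count, kernel width, random untrained weights, example sequence, and every function and parameter name here are assumptions made for illustration only.

```python
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as one 4-element one-hot row (A, C, G, T) per base."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_relu(x, filters):
    """Valid 1D convolution along the sequence, followed by ReLU.

    x is L rows of 4 values; each filter is K rows of 4 weights.
    Returns (L - K + 1) rows with one activation per filter.
    """
    k = len(filters[0])
    out = []
    for start in range(len(x) - k + 1):
        window = x[start:start + k]
        row = []
        for f in filters:
            s = sum(f[i][j] * window[i][j] for i in range(k) for j in range(4))
            row.append(max(0.0, s))
        out.append(row)
    return out

def global_max_pool(feature_map):
    """Keep the strongest activation of each filter across all positions."""
    return [max(col) for col in zip(*feature_map)]

def predict(seq, accessible, filters, w_seq, w_access, bias):
    """Toy score: pooled sequence features plus a weighted accessibility flag."""
    pooled = global_max_pool(conv1d_relu(one_hot(seq), filters))
    seq_term = sum(w * p for w, p in zip(w_seq, pooled))
    return seq_term + w_access * float(accessible) + bias

# Hypothetical, untrained parameters: 3 filters of width 5.
rng = random.Random(0)
filters = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
           for _ in range(3)]
w_seq = [rng.uniform(-1, 1) for _ in range(3)]

target = "TTTAGCTAGCTAGGATCCGATCGATCGATCGTAGC"  # made-up target sequence

# Sequence-only score versus a score that also uses the accessibility flag.
seq_only = predict(target, False, filters, w_seq, w_access=0.0, bias=0.0)
with_access = predict(target, True, filters, w_seq, w_access=0.5, bias=0.0)
```

In a trained model the weights would be fitted to measured indel frequencies; here they are random, so the scores carry no biological meaning and only the data flow is shown.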
Acknowledgements
The authors thank E.-S. Lee for proofreading and R. Gopalappa, N. Kim, S. Park, and J. Park for assisting in sample preparation. This work was supported in part by the National Research Foundation of Korea (grants 2017R1A2B3004198 (H.K.), 2017M3A9B4062403 (H.K.), 2013M3A9B4076544 (H.K.), 2014M3C9A3063541 (S.Y.)), Brain Korea 21 Plus Project (Yonsei University College of Medicine), Brain Korea 21 Plus Project (SNU ECE) in 2017, Institute for Basic Science (IBS; IBS-R026-D1), and the Korean Health Technology R&D Project, Ministry of Health and Welfare, Republic of Korea (grants HI17C0676 (H.K.), and HI16C1012 (H.K.)).
Author information
Hui Kwon Kim and Seonwoo Min: These authors contributed equally to this work.
Authors and Affiliations
Department of Pharmacology, Yonsei University College of Medicine, Seoul, Republic of Korea
Hui Kwon Kim, Myungjae Song, Soobin Jung, Jae Woo Choi, Younggwang Kim, Sangeun Lee & Hyongbum (Henry) Kim
Brain Korea 21 Plus Project for Medical Sciences, Yonsei University College of Medicine, Seoul, Republic of Korea
Hui Kwon Kim, Soobin Jung, Younggwang Kim, Sangeun Lee & Hyongbum (Henry) Kim
Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea
Seonwoo Min & Sungroh Yoon
Graduate School of Biomedical Science and Engineering, Hanyang University, Seoul, Republic of Korea
Myungjae Song
Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Republic of Korea
Jae Woo Choi & Hyongbum (Henry) Kim
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
Sungroh Yoon
Center for Nanomedicine, Institute for Basic Science (IBS), Seoul, Republic of Korea
Hyongbum (Henry) Kim
Yonsei-IBS Institute, Yonsei University, Seoul, Republic of Korea
Hyongbum (Henry) Kim
Contributions
H.K.K., M.S., and S.J. performed experiments to build data sets of AsCpf1 indel frequencies. S.M. and S.Y. developed the framework, and carried out the model training and computational validation. J.W.C. performed bioinformatic analyses. Y.K. and S.L. made substantial contributions to the performance of the experiments including cell culture and deep-sequencing. H.H.K. conceived and designed the study. H.K.K., S.M., S.Y., and H.H.K. analyzed the data and wrote the manuscript.
Corresponding authors
Correspondence to Sungroh Yoon or Hyongbum (Henry) Kim.
Ethics declarations
Competing interests
Yonsei University and Seoul National University have filed a patent based on this work, in which H.K.K., S.M., M.S., S.J., S.Y., and H.K. are co-inventors.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–14 and Supplementary Note (PDF 2816 kb)
Supplementary Tables
Supplementary Tables 2, 4, and 6, combined in one file (PDF 521 kb)
Supplementary Table 1
Source data used for this study. (XLSX 2463 kb)
Supplementary Table 3
Model selection results of Seq-deepCpf1 (XLSX 19 kb)
Supplementary Table 5
Oligonucleotides used in this study (XLSX 40 kb)
Supplementary Table 7
Confidence intervals for the result values (XLSX 15 kb)
Supplementary Code
The source code of Seq-deepCpf1 and DeepCpf1 (ZIP 750 kb)
About this article
Cite this article
Kim, H., Min, S., Song, M. et al. Deep learning improves prediction of CRISPR–Cpf1 guide RNA activity. Nat Biotechnol 36, 239–241 (2018). https://doi.org/10.1038/nbt.4061