- Nuthan Munaiah,
- Benjamin S. Meyers,
- Cecilia O. Alm,
- Andrew Meneely,
- Pradeep K. Murukannaiah,
- Emily Prud’hommeaux,
- Josephine Wolff &
- Yang Yu
Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10379)
Included in the following conference series: Engineering Secure Software and Systems (ESSoS)
Abstract
Engineering secure software is challenging. Software development organizations leverage a host of processes and tools to enable developers to prevent vulnerabilities. Code review is one such approach, and it has been instrumental in improving the overall quality of software systems. In a typical code review, developers critique a proposed change to uncover potential vulnerabilities before the change is merged. Despite developers' best efforts, some vulnerabilities inevitably slip through the reviews. In this study, we characterized linguistic features of conversations between developers in code reviews (inquisitiveness, sentiment, and syntactic complexity) to identify factors that could explain developers missing a vulnerability. We used natural language processing to collect these linguistic features from 3,994,976 messages in 788,437 code reviews from the Chromium project, and we collected 1,462 Chromium vulnerabilities to empirically analyze the features. We found that code reviews with lower inquisitiveness, higher sentiment, and lower complexity were more likely to miss a vulnerability. We then used a Naïve Bayes classifier to assess whether the words (lemmas) in a code review could differentiate reviews that are likely to miss vulnerabilities. The classifier used a subset of all lemmas (over 2 million) as features, with their corresponding TF-IDF scores as values, and achieved an average precision, recall, and F-measure of 14%, 73%, and 23%, respectively. We believe that our linguistic characterization will help developers identify problematic code reviews before they result in a missed vulnerability.
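The classification step described in the abstract (lemma features weighted by TF-IDF, fed to a Naïve Bayes classifier) can be illustrated with a minimal, self-contained sketch. This is not the paper's implementation: the token lists, labels, and helper names below are hypothetical stand-ins for lemmatized review messages.

```python
import math
from collections import Counter

def tfidf_vectors(token_docs):
    """Map each tokenized document to {token: TF-IDF weight}; also return the IDF table."""
    n = len(token_docs)
    df = Counter()
    for doc in token_docs:
        df.update(set(doc))                       # document frequency per token
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = [{t: (c / len(doc)) * idf[t] for t, c in Counter(doc).items()}
            for doc in token_docs]
    return vecs, idf

def transform(doc, idf):
    """Vectorize an unseen document with the training IDF table (unknown tokens dropped)."""
    tf = Counter(doc)
    return {t: (c / len(doc)) * idf[t] for t, c in tf.items() if t in idf}

class MultinomialNB:
    """Multinomial Naive Bayes over fractional TF-IDF 'counts', with Laplace smoothing."""
    def fit(self, vecs, labels):
        self.vocab = {t for v in vecs for t in v}
        self.log_prior, self.log_lik = {}, {}
        for c in set(labels):
            docs_c = [v for v, y in zip(vecs, labels) if y == c]
            self.log_prior[c] = math.log(len(docs_c) / len(vecs))
            mass = Counter()
            for v in docs_c:
                for t, w in v.items():
                    mass[t] += w                  # per-class TF-IDF mass
            total = sum(mass.values()) + len(self.vocab)
            self.log_lik[c] = {t: math.log((mass[t] + 1.0) / total) for t in self.vocab}
        return self

    def predict(self, vec):
        scores = {c: self.log_prior[c] +
                     sum(w * self.log_lik[c][t] for t, w in vec.items() if t in self.vocab)
                  for c in self.log_prior}
        return max(scores, key=scores.get)

# Hypothetical lemmatized review messages: terse approvals vs. probing questions.
docs = [["lgtm", "great", "ship"], ["looks", "great", "lgtm"],
        ["why", "null", "check", "missing"], ["why", "overflow", "here"]]
labels = ["missed", "missed", "neutral", "neutral"]
vecs, idf = tfidf_vectors(docs)
clf = MultinomialNB().fit(vecs, labels)
```

Under these toy labels, a terse approving message scores as "missed" while an inquisitive one scores as "neutral", mirroring the finding that lower inquisitiveness and higher sentiment correlate with missed vulnerabilities.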
Author information
Authors and Affiliations
B. Thomas Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester, NY, 14623, USA
Nuthan Munaiah, Benjamin S. Meyers, Andrew Meneely & Pradeep K. Murukannaiah
College of Liberal Arts, Rochester Institute of Technology, Rochester, NY, 14623, USA
Cecilia O. Alm, Emily Prud’hommeaux & Josephine Wolff
Saunders College of Business, Rochester Institute of Technology, Rochester, NY, 14623, USA
Yang Yu
Corresponding author
Correspondence to Nuthan Munaiah.
Editor information
Editors and Affiliations
University of Paderborn, Paderborn, Germany
Eric Bodden
Purdue University, West Lafayette, USA
Mathias Payer
University of Cyprus, Nicosia, Cyprus
Elias Athanasopoulos
A Comparing Distribution of Inquisitiveness, Sentiment and Complexity Metrics
The comparison of the distribution of inquisitiveness, sentiment and complexity metrics for neutral and missed vulnerability code reviews is shown in Fig. 3.
Fig. 3. Comparing the distribution of inquisitiveness, sentiment and complexity metrics for neutral and missed vulnerability code reviews
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Munaiah, N. et al. (2017). Natural Language Insights from Code Reviews that Missed a Vulnerability. In: Bodden, E., Payer, M., Athanasopoulos, E. (eds.) Engineering Secure Software and Systems. ESSoS 2017. Lecture Notes in Computer Science, vol. 10379. Springer, Cham. https://doi.org/10.1007/978-3-319-62105-0_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62104-3
Online ISBN: 978-3-319-62105-0
eBook Packages: Computer Science, Computer Science (R0)