Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering

Authors

  • Kaixin Ma, Language Technologies Institute, School of Computer Science, Carnegie Mellon University
  • Filip Ilievski, Information Sciences Institute, Viterbi School of Engineering, University of Southern California
  • Jonathan Francis, Language Technologies Institute, School of Computer Science, Carnegie Mellon University; Human-Machine Collaboration, Bosch Research Pittsburgh
  • Yonatan Bisk, Language Technologies Institute, School of Computer Science, Carnegie Mellon University
  • Eric Nyberg, Language Technologies Institute, School of Computer Science, Carnegie Mellon University
  • Alessandro Oltramari, Human-Machine Collaboration, Bosch Research Pittsburgh

DOI:

https://doi.org/10.1609/aaai.v35i15.17593

Keywords:

Question Answering, Common-Sense Reasoning, Unsupervised & Self-Supervised Learning, Neuro-Symbolic AI (NSAI)

Abstract

Recent developments in pre-trained neural language modeling have led to leaps in accuracy on commonsense question-answering benchmarks. However, there is increasing concern that models overfit to specific tasks, without learning to utilize external knowledge or perform general semantic reasoning. In contrast, zero-shot evaluations have shown promise as a more robust measure of a model’s general reasoning abilities. In this paper, we propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks. Guided by a set of hypotheses, the framework studies how to transform various pre-existing knowledge resources into a form that is most effective for pre-training models. We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks. Extending prior work, we devise and compare four constrained distractor-sampling strategies. We provide empirical results across five commonsense question-answering tasks with data generated from five external knowledge resources. We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks. In addition, both preserving the structure of the task and generating fair and informative questions help language models learn more effectively.
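
To give a rough sense of the knowledge-to-QA transformation the abstract describes, the Python sketch below turns a ConceptNet-style triple into a synthetic multiple-choice item, sampling distractors under a simple constraint (tails of other triples with the same relation, so all options are type-consistent). The toy triples, the question template, and the `make_question` helper are illustrative assumptions for exposition, not the paper's actual sampling strategies.

```python
import random

# Hypothetical toy knowledge graph of (head, relation, tail) triples.
# These entries are invented for illustration only.
TRIPLES = [
    ("fork", "UsedFor", "eating"),
    ("pen", "UsedFor", "writing"),
    ("bed", "UsedFor", "sleeping"),
    ("knife", "UsedFor", "cutting"),
]

def make_question(triple, triples, num_distractors=2):
    """Turn one triple into a synthetic multiple-choice item.

    Distractors are constrained to tails of *other* triples sharing the
    same relation, so every option is plausible for the question type.
    """
    head, relation, answer = triple
    pool = [t for _, r, t in triples if r == relation and t != answer]
    distractors = random.sample(pool, k=min(num_distractors, len(pool)))
    options = distractors + [answer]
    random.shuffle(options)
    return {
        "question": f"What is a {head} used for?",
        "options": options,
        "answer": answer,
    }

print(make_question(TRIPLES[0], TRIPLES))
```

Constraining the distractor pool in this way is one concrete instance of the "fair and informative questions" idea: unconstrained random sampling would often yield distractors a model can reject without any commonsense knowledge.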

Published

2021-05-18

How to Cite

Ma, K., Ilievski, F., Francis, J., Bisk, Y., Nyberg, E., & Oltramari, A. (2021). Knowledge-driven Data Construction for Zero-shot Evaluation in Commonsense Question Answering. Proceedings of the AAAI Conference on Artificial Intelligence, 35(15), 13507-13515. https://doi.org/10.1609/aaai.v35i15.17593

Issue

Vol. 35 No. 15 (2021)

Section

AAAI Technical Track on Speech and Natural Language Processing II