We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: “Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?” A: “Make sure no one else could hear”). Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on pretrained language models, compared to human performance (>20% gap). Notably, we further establish Social IQa as a resource for transfer learning of commonsense knowledge, achieving state-of-the-art performance on multiple commonsense reasoning tasks (Winograd Schemas, COPA).
Maarten Sap, Hannah Rashkin, Derek Chen, Ronan Le Bras, and Yejin Choi. 2019.Social IQa: Commonsense Reasoning about Social Interactions. InProceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4463–4473, Hong Kong, China. Association for Computational Linguistics.
@inproceedings{sap-etal-2019-social, title = "Social {IQ}a: Commonsense Reasoning about Social Interactions", author = "Sap, Maarten and Rashkin, Hannah and Chen, Derek and Le Bras, Ronan and Choi, Yejin", editor = "Inui, Kentaro and Jiang, Jing and Ng, Vincent and Wan, Xiaojun", booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)", month = nov, year = "2019", address = "Hong Kong, China", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/D19-1454/", doi = "10.18653/v1/D19-1454", pages = "4463--4473", abstract = "We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: {\textquotedblleft}Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?{\textquotedblright} A: {\textquotedblleft}Make sure no one else could hear{\textquotedblright}). Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on pretrained language models, compared to human performance ({\ensuremath{>}}20{\%} gap). Notably, we further establish Social IQa as a resource for transfer learning of commonsense knowledge, achieving state-of-the-art performance on multiple commonsense reasoning tasks (Winograd Schemas, COPA)."}
<?xml version="1.0" encoding="UTF-8"?><modsCollection xmlns="http://www.loc.gov/mods/v3"><mods ID="sap-etal-2019-social"> <titleInfo> <title>Social IQa: Commonsense Reasoning about Social Interactions</title> </titleInfo> <name type="personal"> <namePart type="given">Maarten</namePart> <namePart type="family">Sap</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Hannah</namePart> <namePart type="family">Rashkin</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Derek</namePart> <namePart type="family">Chen</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Ronan</namePart> <namePart type="family">Le Bras</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Yejin</namePart> <namePart type="family">Choi</namePart> <role> <roleTerm authority="marcrelator" type="text">author</roleTerm> </role> </name> <originInfo> <dateIssued>2019-11</dateIssued> </originInfo> <typeOfResource>text</typeOfResource> <relatedItem type="host"> <titleInfo> <title>Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)</title> </titleInfo> <name type="personal"> <namePart type="given">Kentaro</namePart> <namePart type="family">Inui</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Jing</namePart> <namePart type="family">Jiang</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Vincent</namePart> <namePart type="family">Ng</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <name type="personal"> <namePart type="given">Xiaojun</namePart> <namePart type="family">Wan</namePart> <role> <roleTerm authority="marcrelator" type="text">editor</roleTerm> </role> </name> <originInfo> <publisher>Association for Computational Linguistics</publisher> <place> <placeTerm type="text">Hong Kong, China</placeTerm> </place> </originInfo> <genre authority="marcgt">conference publication</genre> </relatedItem> <abstract>We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: “Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?” A: “Make sure no one else could hear”). Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on pretrained language models, compared to human performance (\ensuremath>20% gap). Notably, we further establish Social IQa as a resource for transfer learning of commonsense knowledge, achieving state-of-the-art performance on multiple commonsense reasoning tasks (Winograd Schemas, COPA).</abstract> <identifier type="citekey">sap-etal-2019-social</identifier> <identifier type="doi">10.18653/v1/D19-1454</identifier> <location> <url>https://aclanthology.org/D19-1454/</url> </location> <part> <date>2019-11</date> <extent unit="page"> <start>4463</start> <end>4473</end> </extent> </part></mods></modsCollection>
%0 Conference Proceedings%T Social IQa: Commonsense Reasoning about Social Interactions%A Sap, Maarten%A Rashkin, Hannah%A Chen, Derek%A Le Bras, Ronan%A Choi, Yejin%Y Inui, Kentaro%Y Jiang, Jing%Y Ng, Vincent%Y Wan, Xiaojun%S Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)%D 2019%8 November%I Association for Computational Linguistics%C Hong Kong, China%F sap-etal-2019-social%X We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: “Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?” A: “Make sure no one else could hear”). Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on pretrained language models, compared to human performance (\ensuremath>20% gap). Notably, we further establish Social IQa as a resource for transfer learning of commonsense knowledge, achieving state-of-the-art performance on multiple commonsense reasoning tasks (Winograd Schemas, COPA).%R 10.18653/v1/D19-1454%U https://aclanthology.org/D19-1454/%U https://doi.org/10.18653/v1/D19-1454%P 4463-4473
Maarten Sap, Hannah Rashkin, Derek Chen, Ronan Le Bras, and Yejin Choi. 2019.Social IQa: Commonsense Reasoning about Social Interactions. InProceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4463–4473, Hong Kong, China. Association for Computational Linguistics.