- Maksim Savkin, ORCID: orcid.org/0009-0003-5465-2274
- Vasily Konovalov, ORCID: orcid.org/0000-0002-4745-4718
Part of the book series: Communications in Computer and Information Science (CCIS, volume 1905)
Included in the following conference series: International Conference on Analysis of Images, Social Networks and Texts (AIST)
Abstract
Few-shot intent classification and out-of-scope (OOS) detection are core components of task-oriented dialogue systems. Solving both tasks can be challenging because of limited data availability. In this study, we aim to develop a few-shot intent classification model capable of OOS detection that does not require fine-tuning on target data. We adopt the discriminative nearest neighbor classification architecture and replace the fine-tuning phase with a consecutive pre-training approach involving natural language inference and paraphrasing tasks. Our approach leverages the training set for predictions, offering a quick and convenient way to adjust the model's behavior by modifying a small set of labeled examples. Compared to other methods that do not require fine-tuning, the developed model achieves higher scores on various few-shot intent classification datasets.
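To make the prediction scheme concrete, the sketch below shows nearest-neighbor intent detection with an OOS threshold over a small labeled support set. This is an illustration, not the authors' implementation: the sentence-transformers library, the "all-MiniLM-L6-v2" checkpoint, the example utterances, and the threshold value are all illustrative stand-ins for an encoder pre-trained on NLI and paraphrase data.

```python
# Minimal sketch (assumptions noted above): nearest-neighbor intent detection
# with an OOS threshold, using only a few labeled examples per intent.
from sentence_transformers import SentenceTransformer, util

# A handful of labeled examples per intent serve as the "training set" at prediction time;
# changing this dictionary is all it takes to change the classifier's behavior.
support_set = {
    "check_balance": ["what is my account balance", "how much money do I have"],
    "transfer_money": ["send 20 dollars to Alice", "wire funds to my savings"],
}

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint

# Pre-compute embeddings of the support examples once.
support_texts, support_labels = [], []
for intent, examples in support_set.items():
    support_texts.extend(examples)
    support_labels.extend([intent] * len(examples))
support_emb = model.encode(support_texts, convert_to_tensor=True, normalize_embeddings=True)

def classify(query: str, oos_threshold: float = 0.5) -> str:
    """Return the intent of the nearest support example, or 'oos' if the
    best similarity falls below the threshold."""
    query_emb = model.encode(query, convert_to_tensor=True, normalize_embeddings=True)
    sims = util.cos_sim(query_emb, support_emb)[0]  # similarity to every support example
    best = int(sims.argmax())
    return support_labels[best] if float(sims[best]) >= oos_threshold else "oos"

print(classify("how much is left in my checking account"))  # -> "check_balance"
print(classify("tell me a joke"))                            # likely below threshold -> "oos"
```

Because the support set is consulted only at prediction time, adding or editing a few labeled examples immediately changes the classifier without any retraining, which is the convenience the abstract refers to.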
Acknowledgments
This work was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Moscow Institute of Physics and Technology dated November 1, 2021 No. 70-2021-00138.
Author information
Authors and Affiliations
Moscow Institute of Physics and Technology, Dolgoprudny, Russia
Maksim Savkin & Vasily Konovalov
- Maksim Savkin
- Vasily Konovalov
Corresponding author
Correspondence to Maksim Savkin.
Editor information
Editors and Affiliations
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov
Krasovskii Institute of Mathematics and Mechanics of Russian Academy of Sciences, Yekaterinburg, Russia
Michael Khachay
University of Oslo, Oslo, Norway
Andrey Kutuzov
American University of Armenia, Yerevan, Armenia
Habet Madoyan
Artificial Intelligence Research Institute, Moscow, Russia
Ilya Makarov
Universität Hamburg, Hamburg, Germany
Irina Nikishina
Skolkovo Institute of Science and Technology, Moscow, Russia
Alexander Panchenko
Mohamed bin Zayed University of Artificial Intelligence and Technology Innovation Institute, Abu Dhabi, United Arab Emirates
Maxim Panov
Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA
Panos M. Pardalos
National Research University Higher School of Economics, Nizhny Novgorod, Russia
Andrey V. Savchenko
Apptek, Aachen, Nordrhein-Westfalen, Germany
Evgenii Tsymbalov
Kazan Federal University and HSE University, Moscow, Russia
Elena Tutubalina
MTS AI, Moscow, Russia
Sergey Zagoruyko
A Appendix
Comparison of kNN-based methods on the test sets of three intent classification datasets. A higher area under the curve indicates greater robustness to threshold selection.
Comparison between fine-tuned baselines and our approach on the test sets of three intent classification datasets. A higher area under the curve indicates greater robustness to threshold selection.
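The area-under-the-curve measure referenced in these captions summarizes performance across OOS thresholds rather than at a single operating point. The exact curve used in the appendix is not specified in this preview; the sketch below shows one plausible construction (OOS recall vs. in-scope accuracy, swept over thresholds), and all function and variable names are hypothetical.

```python
# Illustrative sketch, not the paper's exact metric: sweep the OOS threshold,
# record (OOS recall, in-scope accuracy) at each setting, and integrate.
import numpy as np

def threshold_curve_auc(scores, is_oos, labels, preds, thresholds=None):
    """scores : max nearest-neighbor similarity for every test utterance
    is_oos  : True where the gold label is out-of-scope (aligned with scores)
    labels, preds : gold and predicted intents (aligned with scores;
                    entries for OOS utterances are ignored)
    Returns the area under the (OOS recall, in-scope accuracy) curve."""
    scores, is_oos = np.asarray(scores), np.asarray(is_oos, dtype=bool)
    labels, preds = np.asarray(labels), np.asarray(preds)
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 101)

    oos_recall, in_acc = [], []
    for t in thresholds:
        rejected = scores < t                      # utterances flagged as OOS at this threshold
        oos_recall.append(rejected[is_oos].mean()) # fraction of true OOS that gets rejected
        kept_correct = (~rejected) & (~is_oos) & (preds == labels)
        in_acc.append(kept_correct[~is_oos].mean())  # in-scope accuracy (rejections count as errors)

    order = np.argsort(oos_recall)                 # integrate along increasing OOS recall
    return np.trapz(np.asarray(in_acc)[order], np.asarray(oos_recall)[order])
```

A model whose curve stays high across a wide range of thresholds accumulates more area, which is why a larger value is read as robustness to threshold selection.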
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Savkin, M., Konovalov, V. (2024). Tuning-Free Discriminative Nearest Neighbor Few-Shot Intent Detection via Consecutive Knowledge Transfer. In: Ignatov, D.I., et al. Recent Trends in Analysis of Images, Social Networks and Texts. AIST 2023. Communications in Computer and Information Science, vol 1905. Springer, Cham. https://doi.org/10.1007/978-3-031-67008-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-67007-7
Online ISBN: 978-3-031-67008-4
eBook Packages: Computer Science, Computer Science (R0)