- Maksim Savkin, ORCID: orcid.org/0009-0003-5465-2274
- Vasily Konovalov, ORCID: orcid.org/0000-0002-4745-4718
Part of the book series: Communications in Computer and Information Science (CCIS, volume 1905)
Included in the following conference series: International Conference on Analysis of Images, Social Networks and Texts (AIST)
Abstract
Few-shot intent classification and out-of-scope (OOS) detection are core components of task-oriented dialogue systems. Solving both tasks can be challenging because of limited data availability. In this study, we aim to develop a few-shot intent classification model capable of OOS detection that does not require fine-tuning on target data. We adopt the discriminative nearest neighbor classification architecture and replace the fine-tuning phase with a consecutive pre-training approach involving natural language inference and paraphrasing tasks. Our approach leverages the training set for predictions, offering a quick and convenient way to adjust the model's behavior by modifying a small set of labeled examples. Compared to other methods that do not require fine-tuning, the developed model achieves higher scores on various few-shot intent classification datasets.
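To make the prediction scheme concrete, the sketch below shows nearest-neighbor intent detection with an OOS threshold over a small labeled support set. This is an illustration, not the authors' implementation: the sentence-transformers library, the "all-MiniLM-L6-v2" checkpoint, the example utterances, and the threshold value are all illustrative stand-ins for an encoder pre-trained on NLI and paraphrase data.

```python
# Minimal sketch (assumptions noted above): nearest-neighbor intent detection
# with an OOS threshold, using only a few labeled examples per intent.
from sentence_transformers import SentenceTransformer, util

# A handful of labeled examples per intent serve as the "training set" at prediction time;
# changing this dictionary is all it takes to change the classifier's behavior.
support_set = {
    "check_balance": ["what is my account balance", "how much money do I have"],
    "transfer_money": ["send 20 dollars to Alice", "wire funds to my savings"],
}

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative checkpoint

# Pre-compute embeddings of the support examples once.
support_texts, support_labels = [], []
for intent, examples in support_set.items():
    support_texts.extend(examples)
    support_labels.extend([intent] * len(examples))
support_emb = model.encode(support_texts, convert_to_tensor=True, normalize_embeddings=True)

def classify(query: str, oos_threshold: float = 0.5) -> str:
    """Return the intent of the nearest support example, or 'oos' if the
    best similarity falls below the threshold."""
    query_emb = model.encode(query, convert_to_tensor=True, normalize_embeddings=True)
    sims = util.cos_sim(query_emb, support_emb)[0]  # similarity to every support example
    best = int(sims.argmax())
    return support_labels[best] if float(sims[best]) >= oos_threshold else "oos"

print(classify("how much is left in my checking account"))  # -> "check_balance"
print(classify("tell me a joke"))                            # likely below threshold -> "oos"
```

Because the support set is consulted only at prediction time, adding or editing a few labeled examples immediately changes the classifier without any retraining, which is the convenience the abstract refers to.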
Acknowledgments
This work was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Moscow Institute of Physics and Technology dated November 1, 2021 No. 70-2021-00138.
Author information
Authors and Affiliations
Moscow Institute of Physics and Technology, Dolgoprudny, Russia
Maksim Savkin & Vasily Konovalov
- Maksim Savkin
- Vasily Konovalov
Corresponding author
Correspondence to Maksim Savkin.
Editor information
Editors and Affiliations
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov
Krasovskii Institute of Mathematics and Mechanics of Russian Academy of Sciences, Yekaterinburg, Russia
Michael Khachay
University of Oslo, Oslo, Norway
Andrey Kutuzov
American University of Armenia, Yerevan, Armenia
Habet Madoyan
Artificial Intelligence Research Institute, Moscow, Russia
Ilya Makarov
Universität Hamburg, Hamburg, Germany
Irina Nikishina
Skolkovo Institute of Science and Technology, Moscow, Russia
Alexander Panchenko
Mohamed bin Zayed University of Artificial Intelligence and Technology Innovation Institute, Abu Dhabi, United Arab Emirates
Maxim Panov
Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA
Panos M. Pardalos
National Research University Higher School of Economics, Nizhny Novgorod, Russia
Andrey V. Savchenko
Apptek, Aachen, Nordrhein-Westfalen, Germany
Evgenii Tsymbalov
Kazan Federal University and HSE University, Moscow, Russia
Elena Tutubalina
MTS AI, Moscow, Russia
Sergey Zagoruyko
A Appendix
Comparison of kNN-based methods on the test sets of three intent classification datasets. A higher area under the curve indicates greater robustness to threshold selection.
Comparison between fine-tuned baselines and our approach on the test sets of three intent classification datasets. A higher area under the curve indicates greater robustness to threshold selection.
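The area-under-the-curve measure referenced in these captions summarizes performance across OOS thresholds rather than at a single operating point. The exact curve used in the appendix is not specified in this preview; the sketch below shows one plausible construction (OOS recall vs. in-scope accuracy, swept over thresholds), and all function and variable names are hypothetical.

```python
# Illustrative sketch, not the paper's exact metric: sweep the OOS threshold,
# record (OOS recall, in-scope accuracy) at each setting, and integrate.
import numpy as np

def threshold_curve_auc(scores, is_oos, labels, preds, thresholds=None):
    """scores : max nearest-neighbor similarity for every test utterance
    is_oos  : True where the gold label is out-of-scope (aligned with scores)
    labels, preds : gold and predicted intents (aligned with scores;
                    entries for OOS utterances are ignored)
    Returns the area under the (OOS recall, in-scope accuracy) curve."""
    scores, is_oos = np.asarray(scores), np.asarray(is_oos, dtype=bool)
    labels, preds = np.asarray(labels), np.asarray(preds)
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 101)

    oos_recall, in_acc = [], []
    for t in thresholds:
        rejected = scores < t                      # utterances flagged as OOS at this threshold
        oos_recall.append(rejected[is_oos].mean()) # fraction of true OOS that gets rejected
        kept_correct = (~rejected) & (~is_oos) & (preds == labels)
        in_acc.append(kept_correct[~is_oos].mean())  # in-scope accuracy (rejections count as errors)

    order = np.argsort(oos_recall)                 # integrate along increasing OOS recall
    return np.trapz(np.asarray(in_acc)[order], np.asarray(oos_recall)[order])
```

A model whose curve stays high across a wide range of thresholds accumulates more area, which is why a larger value is read as robustness to threshold selection.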
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Savkin, M., Konovalov, V. (2024). Tuning-Free Discriminative Nearest Neighbor Few-Shot Intent Detection via Consecutive Knowledge Transfer. In: Ignatov, D.I., et al. Recent Trends in Analysis of Images, Social Networks and Texts. AIST 2023. Communications in Computer and Information Science, vol 1905. Springer, Cham. https://doi.org/10.1007/978-3-031-67008-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-67007-7
Online ISBN: 978-3-031-67008-4
eBook Packages: Computer Science, Computer Science (R0)