
Tuning-Free Discriminative Nearest Neighbor Few-Shot Intent Detection via Consecutive Knowledge Transfer

  • Conference paper

Abstract

Few-shot intent classification and out-of-scope (OOS) detection are core components of task-oriented dialogue systems. Solving both tasks is challenging because of limited data availability. In this study, we develop a few-shot intent classification model capable of OOS detection that requires no fine-tuning on target data. We adopt the discriminative nearest neighbor classification architecture and replace the fine-tuning phase with consecutive pre-training on natural language inference and paraphrase detection tasks. Because the model makes predictions directly from the training set, its behavior can be adjusted quickly and conveniently by modifying a small set of labeled examples. Compared with other methods that require no fine-tuning, our model achieves higher scores on several few-shot intent classification datasets.
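The prediction scheme the abstract describes can be sketched as follows: embed the query, compare it against the few labeled support examples, return the nearest label, and fall back to OOS when no example is similar enough. The toy bag-of-words embedding, the `predict` helper, and the threshold value below are illustrative assumptions; the paper itself uses a pre-trained transformer encoder.

```python
# Minimal sketch of nearest-neighbor intent classification with an OOS
# fallback. The bag-of-words embedding is a toy stand-in for the
# pre-trained sentence encoder used in the paper; all names and the
# threshold value are illustrative assumptions.
from collections import Counter
from math import sqrt

def embed(text):
    # Toy embedding: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def predict(query, support, threshold=0.4):
    # support: list of (utterance, intent_label) few-shot examples.
    q = embed(query)
    best_label, best_sim = "oos", 0.0
    for utterance, label in support:
        sim = cosine(q, embed(utterance))
        if sim > best_sim:
            best_label, best_sim = label, sim
    # Predict out-of-scope when no support example is similar enough.
    return best_label if best_sim >= threshold else "oos"

support = [
    ("what is my account balance", "balance"),
    ("transfer money to savings", "transfer"),
]
print(predict("check my balance please", support))  # -> "balance"
print(predict("tell me a joke", support))           # -> "oos"
```

Because the classifier consults the support set at prediction time, changing the model's behavior is as simple as editing that list, which is the "quick and convenient" adjustment the abstract refers to.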




Acknowledgments

This work was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Moscow Institute of Physics and Technology dated November 1, 2021 No. 70-2021-00138.

Author information

Authors and Affiliations

  1. Moscow Institute of Physics and Technology, Dolgoprudny, Russia

    Maksim Savkin & Vasily Konovalov

Authors
  1. Maksim Savkin


  2. Vasily Konovalov


Corresponding author

Correspondence to Maksim Savkin.

Editor information

Editors and Affiliations

  1. National Research University Higher School of Economics, Moscow, Russia

    Dmitry I. Ignatov

  2. Krasovskii Institute of Mathematics and Mechanics of Russian Academy of Sciences, Yekaterinburg, Russia

    Michael Khachay

  3. University of Oslo, Oslo, Norway

    Andrey Kutuzov

  4. American University of Armenia, Yerevan, Armenia

    Habet Madoyan

  5. Artificial Intelligence Research Institute, Moscow, Russia

    Ilya Makarov

  6. Universität Hamburg, Hamburg, Germany

    Irina Nikishina

  7. Skolkovo Institute of Science and Technology, Moscow, Russia

    Alexander Panchenko

  8. Mohamed bin Zayed University of Artificial Intelligence and Technology Innovation Institute, Abu Dhabi, United Arab Emirates

    Maxim Panov

  9. Industrial and Systems Engineering, University of Florida, Gainesville, FL, USA

    Panos M. Pardalos

  10. National Research University Higher School of Economics, Nizhny Novgorod, Russia

    Andrey V. Savchenko

  11. Apptek, Aachen, Nordrhein-Westfalen, Germany

    Evgenii Tsymbalov

  12. Kazan Federal University and HSE University, Moscow, Russia

    Elena Tutubalina

  13. MTS AI, Moscow, Russia

    Sergey Zagoruyko

A Appendix


See Table 6 and Figs. 6 and 7.

Table 6. Hyperparameter settings used for fine-tuning
Fig. 6. Comparison of kNN-based methods on the test sets of three intent classification datasets. A higher area under the curve indicates greater robustness to threshold selection.

Fig. 7. Comparison between fine-tuned baselines and our approach on the test sets of three intent classification datasets. A higher area under the curve indicates greater robustness to threshold selection.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Savkin, M., Konovalov, V. (2024). Tuning-Free Discriminative Nearest Neighbor Few-Shot Intent Detection via Consecutive Knowledge Transfer. In: Ignatov, D.I., et al. (eds.) Recent Trends in Analysis of Images, Social Networks and Texts. AIST 2023. Communications in Computer and Information Science, vol. 1905. Springer, Cham. https://doi.org/10.1007/978-3-031-67008-4_8



