
All is attention for multi-label text classification

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

Multi-label text classification (MLTC) is a key task in natural language processing. Its central challenge is to extract latent semantic features from text and to exploit label-associated features effectively. This work proposes an MLTC model driven solely by attention mechanisms, comprising Graph Attention (GA), Class-Specific Attention (CSA), and Multi-Head Attention (MHA) modules. The GA module captures label dependencies by treating label semantic features as attributes of graph nodes, and uses graph embedding to preserve the structural relationships within the label graph. The CSA module produces distinctive features for each category from spatial attention scores, improving classification accuracy. The MHA module then enables extensive feature interactions, enriching the expressiveness of text features and supporting long-range dependencies. Experiments on two MLTC datasets show that the proposed model outperforms existing MLTC algorithms and achieves state-of-the-art performance, highlighting the effectiveness of the attention-based approach for tackling the complexity of MLTC tasks.
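The full text is paywalled, so as a purely illustrative aid to the abstract, the following is a minimal NumPy sketch of scaled dot-product multi-head attention, the generic mechanism behind the MHA module described above. All names, shapes, and weight initializations here are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product multi-head self-attention over a token sequence.

    x: (seq_len, d_model); w_q, w_k, w_v, w_o: (d_model, d_model).
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split_heads(m):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return m.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(x @ w_q), split_heads(x @ w_k), split_heads(x @ w_v)
    # Attention scores per head: (num_heads, seq_len, seq_len)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    attn = softmax(scores, axis=-1)
    out = attn @ v                                   # (num_heads, seq_len, d_head)
    # Concatenate heads back to (seq_len, d_model), then apply output projection.
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ w_o

rng = np.random.default_rng(0)
d_model, seq_len, heads = 16, 5, 4
w_q, w_k, w_v, w_o = (rng.standard_normal((d_model, d_model)) * 0.1
                      for _ in range(4))
tokens = rng.standard_normal((seq_len, d_model))
y = multi_head_attention(tokens, w_q, w_k, w_v, w_o, num_heads=heads)
print(y.shape)  # (5, 16)
```

Each head attends over the full sequence, which is what lets attention-only models capture long-range dependencies without recurrence; the label-graph (GA) and class-specific (CSA) modules build further attention computations on top of representations like this.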


[Figures 1–6 appear in the full article.]


Author information

Authors and Affiliations

  1. School of Artificial Intelligence, Chongqing University of Technology, Chongqing, China

    Zhi Liu, Yunjie Huang, Xincheng Xia & Yihao Zhang


Corresponding author

Correspondence to Zhi Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, Z., Huang, Y., Xia, X. et al. All is attention for multi-label text classification. Knowl Inf Syst 67, 1249–1270 (2025). https://doi.org/10.1007/s10115-024-02253-w

