- Zhi Wang
- Yi Zhu (ORCID: 0000-0003-3045-2588)
- Yun Li
- Jipeng Qiang
- Yunhao Yuan
- Chaowei Zhang
Abstract
Short-text clustering, which has attracted much attention with the rapid development of social media in recent decades, remains a great challenge due to feature sparsity, high ambiguity, and massive data volume. Recently, methods based on pre-trained language models (PLMs) have achieved fairly good results on this task. However, two main problems remain unresolved: (1) the significant gap between the objective forms of pre-training and fine-tuning, which restricts models from taking full advantage of the knowledge in PLMs; and (2) most existing methods require a post-processing operation to learn clustering labels, potentially leading to label estimation errors under different data distributions. To address these problems, in this paper we propose Asymmetric Short-Text Clustering via Prompt (ASTCP), whose learned features are denser and more compact for clustering. Specifically, a subset of texts from the corpus is first selected by an asymmetric prompt-tuning network, which aims to obtain predicted labels that serve as clustering centers. Then, by propagating the predicted-label information, a fine-tuned model is designed for representation learning. Finally, a clustering module, such as K-means, is built on top of these representations to directly output clustering labels. Extensive experiments conducted on three datasets demonstrate that ASTCP significantly and consistently outperforms other state-of-the-art clustering methods. The source code is available at https://github.com/zhuyi_yzu/ASTCP.
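To make the three-stage pipeline concrete, the sketch below illustrates it under our own assumptions: the bert-base-uncased checkpoint, the cloze template "Topic: [MASK].", the verbalizer words, and the confidence threshold are all illustrative choices made for this example, and the fine-tuning stage is replaced by a frozen encoder. It is not the authors' released implementation, which is available in the linked repository.

```python
# Minimal sketch of a prompt-then-cluster pipeline (illustrative only; not the
# authors' released code). Model name, template, verbalizer, and threshold are
# assumptions made for this example.
import torch
from transformers import AutoTokenizer, AutoModel, AutoModelForMaskedLM
from sklearn.cluster import KMeans

texts = [
    "stock markets rally after fed decision",
    "new smartphone camera impresses reviewers",
    "team wins championship in overtime thriller",
]
label_words = ["business", "technology", "sports"]  # hypothetical verbalizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

# Stage 1: prompt-based pseudo-labeling. Wrap each text in a cloze template and
# read the masked-LM probability of each label word at the [MASK] position.
def prompt_predict(text):
    enc = tok(f"{text} Topic: {tok.mask_token}.", return_tensors="pt")
    with torch.no_grad():
        logits = mlm(**enc).logits
    mask_pos = (enc["input_ids"][0] == tok.mask_token_id).nonzero()[0, 0]
    word_ids = [tok.convert_tokens_to_ids(w) for w in label_words]
    probs = logits[0, mask_pos, word_ids].softmax(-1)
    return probs.argmax().item(), probs.max().item()

# Keep only confidently labeled texts as the pseudo-labeled subset
# (the 0.4 threshold is arbitrary for this sketch).
confident = []
for t in texts:
    y, p = prompt_predict(t)
    if p > 0.4:
        confident.append((t, label_words[y]))
print("pseudo-labeled subset:", confident)

# Stage 2 (placeholder): the paper fine-tunes the encoder by propagating the
# pseudo-label information; here we simply encode with a frozen model.
encoder = AutoModel.from_pretrained("bert-base-uncased").eval()
def embed(text):
    enc = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return encoder(**enc).last_hidden_state[0, 0].numpy()  # [CLS] vector

# Stage 3: K-means on top of the representations directly yields cluster labels.
X = [embed(t) for t in texts]
print("cluster labels:", KMeans(n_clusters=3, n_init=10).fit_predict(X))
```

In the full method, the confidently pseudo-labeled subset would drive fine-tuning of the encoder before clustering; the frozen encoder above only marks where that step belongs.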
Data availability
The data of this paper are available from the corresponding author upon reasonable request.
Acknowledgements
This research is partially supported by the National Natural Science Foundation of China under Grants 61906060 and 62076217, the Yangzhou Science and Technology Plan City School Cooperation Special Project (YZ2023199), and the Open Project of the Anhui Provincial Key Laboratory for Intelligent Manufacturing of Construction Machinery (IMCM-2023-01).
Author information
Authors and Affiliations
School of Information Engineering, Yangzhou University, 88 South Daxue Road, Jiangsu, 225127, China
Zhi Wang, Yi Zhu, Yun Li, Jipeng Qiang, Yunhao Yuan & Chaowei Zhang
Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei University of Technology, 193 Tunxi Road, Anhui, 230009, China
Yi Zhu
School of Computer Science and Information Engineering, Hefei University of Technology, 193 Tunxi Road, Anhui, 230009, China
Yi Zhu
Corresponding author
Correspondence to Yi Zhu.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wang, Z., Zhu, Y., Li, Y. et al.: Asymmetric Short-Text Clustering via Prompt. New Gener. Comput. 42, 599–615 (2024). https://doi.org/10.1007/s00354-024-00244-7