Distributed Document and Phrase Co-embeddings for Descriptive Clustering

Motoki Sato, Austin J. Brockmeier, Georgios Kontonatsios, Tingting Mu, John Y. Goulermas, Jun’ichi Tsujii, Sophia Ananiadou


Abstract
Descriptive document clustering aims to automatically discover groups of semantically related documents and to assign a meaningful label to characterise the content of each cluster. In this paper, we present a descriptive clustering approach that employs a distributed representation model, namely the paragraph vector model, to capture semantic similarities between documents and phrases. The proposed method uses a joint representation of phrases and documents (i.e., a co-embedding) to automatically select a descriptive phrase that best represents each document cluster. We evaluate our method by comparing its performance to an existing state-of-the-art descriptive clustering method that also uses co-embedding but relies on a bag-of-words representation. Results obtained on benchmark datasets demonstrate that the paragraph vector-based method obtains superior performance over the existing approach in both identifying clusters and assigning appropriate descriptive labels to them.
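The labelling idea described in the abstract can be illustrated with a minimal sketch. It assumes document and phrase vectors already live in one shared space; here they are hand-made 2-D toy vectors rather than output of the paragraph vector model, and plain k-means plus a nearest-phrase lookup by cosine similarity stands in for whatever clustering and selection procedure the paper actually uses. The function names (`normalize`, `kmeans`, `label_clusters`) and the example phrases are hypothetical.

```python
import numpy as np


def normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d)])
    centroids = np.array(centroids, dtype=float)

    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)

    # Final assignment against the updated centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1), centroids


def label_clusters(doc_vecs, phrase_vecs, phrases, k):
    """Cluster documents, then pick the phrase closest to each cluster centroid."""
    docs = normalize(np.asarray(doc_vecs, dtype=float))
    phr = normalize(np.asarray(phrase_vecs, dtype=float))
    labels, centroids = kmeans(docs, k)
    sims = normalize(centroids) @ phr.T  # cosine similarity, shape (k, n_phrases)
    best = np.argmax(sims, axis=1)
    return labels, [phrases[i] for i in best]


# Toy co-embedding (assumed values): four documents and three candidate phrases.
doc_vecs = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])
phrase_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
phrases = ["neural networks", "protein folding", "general science"]

labels, names = label_clusters(doc_vecs, phrase_vecs, phrases, k=2)
for cluster_id, name in enumerate(names):
    members = np.where(labels == cluster_id)[0].tolist()
    print(f"cluster {cluster_id}: label={name!r}, documents={members}")
```

Because documents and phrases share one vector space, the label for a cluster reduces to a nearest-neighbour query from the cluster centroid into the set of phrase vectors; a generic phrase that sits between clusters scores lower than one aligned with a single cluster.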
Anthology ID:
E17-1093
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
991–1001
URL:
https://aclanthology.org/E17-1093/
Cite (ACL):
Motoki Sato, Austin J. Brockmeier, Georgios Kontonatsios, Tingting Mu, John Y. Goulermas, Jun’ichi Tsujii, and Sophia Ananiadou. 2017. Distributed Document and Phrase Co-embeddings for Descriptive Clustering. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 991–1001, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Distributed Document and Phrase Co-embeddings for Descriptive Clustering (Sato et al., EACL 2017)
PDF:
https://aclanthology.org/E17-1093.pdf