
Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3406)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

In this paper, we propose a method to improve the precision of the top retrieved documents by re-ordering the documents returned by the initial retrieval. To re-order the documents, we first automatically extract key terms from the top N (N ≤ 30) retrieved documents. We then collect the key terms that occur in the query, together with their document frequencies in the top N retrieved documents. Finally, we use these collected terms to re-order the initially retrieved documents. Each collected term is assigned a weight based on its length and its document frequency in the top N retrieved documents, and each document is re-ranked by the sum of the weights of the collected terms it contains. In our experiments on 42 query topics from the NTCIR3 Cross Lingual Information Retrieval (CLIR) dataset, the method yields an average improvement of 17.8%-27.5% for the top 10 documents and of 6.6%-12% for the top 100 documents, under relaxed and rigid relevance judgments and different parameter settings.
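The re-ordering step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the weighting formula, so `term_weight` (term length multiplied by document frequency) is an assumption, as are the function names, the use of substring matching, and the premise that key terms have already been extracted by some earlier step.

```python
from typing import Dict, List


def collect_terms(query: str, key_terms: List[str], top_docs: List[str]) -> Dict[str, int]:
    """Keep the key terms that occur in the query, mapped to their
    document frequency within the top-N retrieved documents."""
    collected: Dict[str, int] = {}
    for term in key_terms:
        if term in query:  # substring match is an assumption of this sketch
            collected[term] = sum(1 for doc in top_docs if term in doc)
    return collected


def term_weight(term: str, doc_freq: int) -> float:
    """Hypothetical weight: the abstract only says the weight depends on
    term length and document frequency, not how they are combined."""
    return float(len(term) * doc_freq)


def rerank(query: str, key_terms: List[str], ranked_docs: List[str], top_n: int = 30) -> List[str]:
    """Re-order the initially ranked documents by the summed weights of the
    collected terms each document contains."""
    top_docs = ranked_docs[:top_n]
    collected = collect_terms(query, key_terms, top_docs)
    weights = {term: term_weight(term, df) for term, df in collected.items()}

    def score(doc: str) -> float:
        return sum(w for term, w in weights.items() if term in doc)

    # sorted() is stable, so documents with equal scores keep their initial order.
    return sorted(ranked_docs, key=score, reverse=True)


# Toy data, invented for illustration only.
docs = [
    "sports news today",
    "information retrieval of chinese text",
    "chinese cooking",
    "retrieval models",
]
key_terms = ["information retrieval", "chinese", "cooking"]
reranked = rerank("chinese information retrieval", key_terms, docs)
```

In this toy run, "cooking" is discarded because it does not occur in the query, and the documents that contain the remaining collected terms move ahead of those that contain none.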



Author information

Authors and Affiliations

Institute for Infocomm Research, 21, Heng Mui Keng Terrace, 119613, Singapore
Yang Lingpeng, Ji Donghong, Nie Yu & Zhou Guodong


Editor information

Editors and Affiliations

National Polytechnic Institute, Center for Computing Research, 07738, Mexico City, México
Alexander Gelbukh


About this paper

Cite this paper

Lingpeng, Y., Donghong, J., Yu, N., Guodong, Z. (2005). Document Re-ordering Based on Key Terms in Top Retrieved Documents. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2005. Lecture Notes in Computer Science, vol 3406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30586-6_61


DOI: https://doi.org/10.1007/978-3-540-30586-6_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24523-0
Online ISBN: 978-3-540-30586-6
eBook Packages: Computer Science, Computer Science (R0)

