Paper

A Word Association Based Approach for Improving Retrieval Performance from Noisy OCRed Text

Topics: Clustering and Classification Methods

In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2014) - KDIR, pages 450-456, 2014, Rome, Italy

Authors: Anirban Chakraborty¹; Kripabandhu Ghosh¹ and Utpal Roy²

Affiliations: ¹Indian Statistical Institute, India; ²Visva-Bharati, India

Keyword(s): Erroneous Text, Cooccurrence, Pointwise Mutual Information.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Clustering and Classification Methods; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Symbolic Systems

Abstract: OCR errors hurt retrieval performance to a great extent. Research has been done on modelling and correction of OCR errors. However, most of the existing systems use language-dependent resources or training texts for studying the nature of errors. Not much research has been reported on improving retrieval performance from erroneous text when no training data is available. We propose an algorithm for detecting OCR errors and improving retrieval performance from the erroneous corpus. We present two versions of the algorithm: one based on word cooccurrence and the other based on Pointwise Mutual Information. Our algorithm does not use any training data or any language-specific resources such as a thesaurus. It also does not use any knowledge about the language except that the word delimiter is a blank space. We have tested our algorithm on the erroneous Bangla FIRE collection and obtained significant improvements.
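
For readers unfamiliar with the second association measure named in the abstract, the sketch below computes Pointwise Mutual Information in its standard form, PMI(a, b) = log2(P(a, b) / (P(a) P(b))), over a corpus tokenized only on blank spaces. It is an illustration of the measure, not the authors' algorithm: the abstract does not say how cooccurrence is counted or how the scores are used to detect errors, so the document-level counting, the base-2 logarithm, the function name `pmi_table`, and the toy corpus are all assumptions made here.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Document-level PMI for every word pair that cooccurs at least once.

    Hypothetical helper: counting cooccurrence per document is an assumption,
    not something stated in the abstract.
    """
    n_docs = len(documents)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for text in documents:
        # The only language knowledge used: words are separated by blanks.
        vocab = sorted(set(text.split()))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))
    scores = {}
    for (a, b), n_ab in pair_df.items():
        # P(a, b) = n_ab / n_docs, P(a) = df(a) / n_docs, P(b) = df(b) / n_docs
        scores[(a, b)] = math.log2(n_ab * n_docs / (word_df[a] * word_df[b]))
    return scores


# Toy corpus, invented for illustration; "rivcr" stands in for an OCR error.
docs = [
    "river bank water",
    "river bank flood",
    "rivcr bank water",
    "money loan interest",
]
scores = pmi_table(docs)
```

Pairs are keyed in alphabetical order, so the score for "bank" and "river" is stored under `("bank", "river")`. In this toy corpus the misrecognized "rivcr" shares the context words "bank" and "water" with "river", which is the kind of association signal a cooccurrence-based method could exploit.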

CC BY-NC-ND 4.0

Paper citation in several formats:

Chakraborty, A., Ghosh, K. and Roy, U. (2014). A Word Association Based Approach for Improving Retrieval Performance from Noisy OCRed Text. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2014) - KDIR; ISBN 978-989-758-048-2; ISSN 2184-3228, SciTePress, pages 450-456. DOI: 10.5220/0005157304500456

@conference{kdir14,
author={Anirban Chakraborty and Kripabandhu Ghosh and Utpal Roy},
title={A Word Association Based Approach for Improving Retrieval Performance from Noisy OCRed Text},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2014) - KDIR},
year={2014},
pages={450-456},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005157304500456},
isbn={978-989-758-048-2},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2014) - KDIR
TI - A Word Association Based Approach for Improving Retrieval Performance from Noisy OCRed Text
SN - 978-989-758-048-2
IS - 2184-3228
AU - Chakraborty, A.
AU - Ghosh, K.
AU - Roy, U.
PY - 2014
SP - 450
EP - 456
DO - 10.5220/0005157304500456
PB - SciTePress