Information and Media Technologies
Online ISSN : 1881-0896
ISSN-L : 1881-0896
Media (processing) and Interaction
Noise-aware Character Alignment for Extracting Transliteration Fragments
Katsuhito Sudoh, Shinsuke Mori, Masaaki Nagata
Author information
  • Katsuhito Sudoh

    Communication Science Laboratories, NTT Corporation
    Graduate School of Informatics, Kyoto University

  • Shinsuke Mori

    Academic Center for Computing and Media Studies, Kyoto University

  • Masaaki Nagata

    Communication Science Laboratories, NTT Corporation

Keywords: Statistical Machine Transliteration, Bayesian Many-to-many Alignment, Machine Translation
JOURNAL FREE ACCESS

2015 Volume 10 Issue 1 Pages 88-112

DOI https://doi.org/10.11185/imt.10.88
Details
  • Published: 2015
  • Received: December 27, 2013
  • Released on J-STAGE: March 15, 2015
  • Accepted: -
  • Advance online publication: -
  • Revised: April 11, 2014
Abstract
This paper proposes a novel noise-aware character alignment method for automatically extracting transliteration fragments in phrase pairs that are extracted from parallel corpora. The proposed method extends a many-to-many Bayesian character alignment method by distinguishing transliteration (signal) parts from non-transliteration (noise) parts. The model can be trained efficiently by a state-based blocked Gibbs sampling algorithm with signal and noise states. The proposed method bootstraps statistical machine transliteration using the extracted transliteration fragments to train transliteration models. In experiments using Japanese-English patent data, the proposed method was able to extract transliteration fragments with much less noise than an IBM-model-based baseline, and achieved better transliteration performance than sample-wise extraction in transliteration bootstrapping.
© 2015 The Association for Natural Language Processing