A phenomenon-wise evaluation dataset for Japanese-English machine translation robustness. The dataset is based on the MTNT dataset, with additional annotations of four linguistic phenomena; Proper Noun, Abbreviated Noun, Colloquial Expression, and Variant. COLING 2020.


cl-tohoku/PheMT


Introduction

PheMT is a phenomenon-wise dataset designed for evaluating the robustness of Japanese-English machine translation systems. The dataset is based on the MTNT dataset [1], with additional annotations of four linguistic phenomena common in user-generated content (UGC): Proper Noun, Abbreviated Noun, Colloquial Expression, and Variant. COLING 2020.

See the paper for more information.

New!! Ready-to-use evaluation tools are now available! (Feb. 2021)

About this repository

This repository contains the following.

.
├── README.md
├── mtnt_approp_annotated.tsv # pre-filtered MTNT dataset with annotated appropriateness (See Appendix A)
├── proper
│   ├── proper.alignment # translations of targeted expressions
│   ├── proper.en # references
│   ├── proper.ja # source sentences
│   └── proper.tsv
├── abbrev
│   ├── abbrev.alignment
│   ├── abbrev.en
│   ├── abbrev.norm.ja # normalized source sentences
│   ├── abbrev.orig.ja # original source sentences
│   └── abbrev.tsv
├── colloq
│   ├── colloq.alignment
│   ├── colloq.en
│   ├── colloq.norm.ja
│   ├── colloq.orig.ja
│   └── colloq.tsv
├── variant
│   ├── variant.alignment
│   ├── variant.en
│   ├── variant.norm.ja
│   ├── variant.orig.ja
│   └── variant.tsv
└── src
    └── calc_acc.py # script for calculating translation accuracy
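As a rough illustration of the layout above, the following Python sketch maps each phenomenon to the file paths it is expected to have. The helper name `phemt_paths` and the `root` argument are hypothetical and not part of the repository; only the file names are taken from the tree.

```python
from pathlib import Path

# Phenomena that ship both original and normalized sources, per the tree above.
WITH_NORMALIZED = ("abbrev", "colloq", "variant")


def phemt_paths(phenomenon: str, root: str = ".") -> dict:
    """Return the expected file paths for one phenomenon (hypothetical helper)."""
    if phenomenon != "proper" and phenomenon not in WITH_NORMALIZED:
        raise ValueError(f"unknown phenomenon: {phenomenon}")
    base = Path(root) / phenomenon
    paths = {
        "alignment": base / f"{phenomenon}.alignment",
        "reference": base / f"{phenomenon}.en",
        "tsv": base / f"{phenomenon}.tsv",
    }
    if phenomenon == "proper":
        # Proper Noun has a single source file and no normalized version.
        paths["source"] = base / "proper.ja"
    else:
        paths["source_orig"] = base / f"{phenomenon}.orig.ja"
        paths["source_norm"] = base / f"{phenomenon}.norm.ja"
    return paths
```

The function only builds paths and does not touch the file system, so it can be used to check a local checkout before running a model.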

To measure robustness, please feed both the original and the normalized versions of the source sentences to your model and take the difference in any metric of your choice between the two sets of outputs. We also extracted the translations of the expressions that present the targeted phenomena. We recommend using src/calc_acc.py to measure the effect of each phenomenon more directly with the help of translation accuracy.

USAGE: python calc_acc.py system_output {proper, abbrev, colloq, variant}.alignment
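The idea of translation accuracy and of a metric difference can be sketched as follows. This is not the code of src/calc_acc.py. It assumes, purely for illustration, that each line of an .alignment file holds one expected English expression for the corresponding output line, and that an output counts as correct when it contains that expression (case-insensitively). The function names and the sample system outputs are hypothetical.

```python
def translation_accuracy(outputs, expected):
    """Fraction of outputs that contain their expected expression.

    Assumption: one expected expression per output line, matched as a
    case-insensitive substring. The real script may match differently.
    """
    if len(outputs) != len(expected):
        raise ValueError("outputs and expected must have the same length")
    if not outputs:
        return 0.0
    hits = sum(
        1
        for out, exp in zip(outputs, expected)
        if exp.strip().lower() in out.lower()
    )
    return hits / len(outputs)


def robustness_gap(score_norm, score_orig):
    """Difference of a metric between normalized and original inputs."""
    return score_norm - score_orig


# Made-up system outputs for the two example sentences in this README.
expected = ["update", "sad"]
outputs_orig = ["That's a plain apude though",
                "Got bored after drawing this much, how sad."]
outputs_norm = ["That's a plain update though",
                "Got bored after drawing this much, how sad."]

acc_orig = translation_accuracy(outputs_orig, expected)
acc_norm = translation_accuracy(outputs_norm, expected)
gap = robustness_gap(acc_norm, acc_orig)
```

A larger gap suggests that the model loses more when it sees the original, non-normalized expression. The same `robustness_gap` helper applies to any other metric computed on the two sets of outputs.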

Basic statistics and examples from the dataset

  • Statistics

| Dataset | # sent. | # unique expressions (ratio) | average edit distance |
|---|---|---|---|
| Proper Noun | 943 | 747 (79.2%) | (no normalized version) |
| Abbreviated Noun | 348 | 234 (67.2%) | 5.04 |
| Colloquial Expression | 172 | 153 (89.0%) | 1.77 |
| Variant | 103 | 97 (94.2%) | 3.42 |

  • Examples

- Abbreviated Noun
  original source : 地味なアプデ (apude, meaning update) だが
  normalized source : 地味なアップデート (update) だが
  reference : That’s a plain update though
  alignment : update

- Colloquial Expression
  original source : ここまで描いて飽きた、かなちい (kanachii, meaning sad)
  normalized source : ここまで描いて飽きた、かなしい (kanashii)
  reference : Drawing this much then getting bored, how sad.
  alignment : sad
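The "average edit distance" column can be made concrete with the examples above. The sketch below assumes a character-level Levenshtein distance between the original and the normalized source. The README does not state the exact definition used, so treat this as one plausible reading. The function name is hypothetical, and the romanized glosses in parentheses are left out of the strings.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Original vs. normalized sources from the two examples.
d_abbrev = levenshtein("地味なアプデだが", "地味なアップデートだが")
d_colloq = levenshtein("ここまで描いて飽きた、かなちい",
                       "ここまで描いて飽きた、かなしい")
```

Under this assumption the abbreviated-noun pair differs by three inserted characters and the colloquial pair by one substituted character.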

Citation

If you use our dataset for your research, please cite the following paper:

@inproceedings{fujii-etal-2020-phemt,
    title = "{P}he{MT}: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents",
    author = "Fujii, Ryo  and
      Mita, Masato  and
      Abe, Kaori  and
      Hanawa, Kazuaki  and
      Morishita, Makoto  and
      Suzuki, Jun  and
      Inui, Kentaro",
    booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
    month = dec,
    year = "2020",
    address = "Barcelona, Spain (Online)",
    publisher = "International Committee on Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.coling-main.521",
    pages = "5929--5943",
}

Reference

[1] Michel and Neubig (2018), MTNT: A Testbed for Machine Translation of Noisy Text.
