COPA Dataset in Japanese


nlp-titech/copa-japanese


Choice of Plausible Alternatives (COPA) is a dataset for open-domain commonsense causal reasoning. This dataset (Japanese COPA) provides Japanese translations of all sentences (premise, answer1, and answer2) in the original English dataset.

Dataset Description

Each line of the dataset presents one question (premise, answer1, answer2, etc.) in JSON Lines format.

| Name | Description |
|---|---|
| id | Question id |
| premise | Premise for the question |
| asks_for | Type of the answer: 原因 (reason) or 結果 (result) |
| correct_answer | Correct answer: 1 or 2 |
| answer1 | Answer 1 |
| answer2 | Answer 2 |

    {
      "id": 1,
      "premise": "草の上に私の影ができた。",
      "asks_for": "原因",
      "correct_answer": 1,
      "answer1": "太陽が昇っていた。",
      "answer2": "草が刈られていた。"
    }

Questions with IDs 1 to 500 form the development set, and those with IDs 501 to 1000 form the test set.
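The ID-based split can be sketched as follows. This is a minimal illustration, not part of the repository: the helper names `parse_jsonl` and `split_dev_test` are hypothetical, and the input is the single sample record from this README held in a string, since the name of the data file is not assumed here.

```python
import json


def parse_jsonl(text):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def split_dev_test(records):
    """Split records by ID: 1-500 is the development set, 501-1000 the test set."""
    dev = [r for r in records if 1 <= r["id"] <= 500]
    test = [r for r in records if 501 <= r["id"] <= 1000]
    return dev, test


# The sample record from this README, written as one JSON Lines row.
sample = (
    '{"id": 1, "premise": "草の上に私の影ができた。", "asks_for": "原因", '
    '"correct_answer": 1, "answer1": "太陽が昇っていた。", '
    '"answer2": "草が刈られていた。"}\n'
)

records = parse_jsonl(sample)
dev, test = split_dev_test(records)
print(len(dev), len(test))  # 1 0
```

To use it on the real data, read the dataset file into a string and pass it to `parse_jsonl` in place of `sample`.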

Anaphora resolution

Some premises and answers contain Japanese pronouns such as 彼 (he, him, his), 彼女 (she, her), それ (it), and 彼/彼女ら (they, their, them). The antecedents of these pronouns were resolved and annotated in the dataset.

Format: [pronoun](antecedent)

Example (id: 3)

  • Premise: 女性たちは会ってコーヒーを飲みに行った。
  • Answer(原因): [彼女ら](女性たち)は互いの近況を語り合いたかった。

Example (id: 14)

  • Premise: 犯罪者が仮釈放の条件に違反した。
  • Answer(結果): [彼女](犯罪者)は刑務所に送り返された。

Each annotation can be reduced to either the pronoun or its antecedent with the following Python code:

    import re

    # Matches an annotation of the form [pronoun](antecedent).
    p = re.compile(r'\[([^]]*)\]\(([^)]*)\)')
    input_text = '[彼女ら](女性たち)は互いの近況を語り合いたかった。'
    text_with_pronoun = p.sub(r'\1', input_text)     # keep the pronoun
    text_with_antecedent = p.sub(r'\2', input_text)  # keep the antecedent
    print(text_with_pronoun)     # 彼女らは互いの近況を語り合いたかった。
    print(text_with_antecedent)  # 女性たちは互いの近況を語り合いたかった。
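The same substitution can be wrapped in a small reusable function. This is a sketch under assumptions: the name `resolve_anaphora` and its `use_antecedent` flag are hypothetical and not part of the repository. It is applied here to the answer text from the id 14 example.

```python
import re

# Matches an annotation of the form [pronoun](antecedent).
ANNOTATION = re.compile(r'\[([^]]*)\]\(([^)]*)\)')


def resolve_anaphora(text, use_antecedent=True):
    """Replace each [pronoun](antecedent) annotation in text.

    With use_antecedent=True the antecedent is kept; otherwise the pronoun.
    Text without annotations is returned unchanged.
    """
    group = r'\2' if use_antecedent else r'\1'
    return ANNOTATION.sub(group, text)


answer = '[彼女](犯罪者)は刑務所に送り返された。'
print(resolve_anaphora(answer))                        # 犯罪者は刑務所に送り返された。
print(resolve_anaphora(answer, use_antecedent=False))  # 彼女は刑務所に送り返された。
```

Applying the function to `premise`, `answer1`, and `answer2` of each record yields a fully resolved or fully pronominal version of the dataset, depending on the flag.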

License

BSD 2-Clause License
