jqk09a/japanese-daily-dialogue


Japanese Daily Dialogue (日本語日常対話コーパス in Japanese) is a high-quality multi-turn dialogue dataset of everyday conversations on five topics: daily life, school, travel, health, and entertainment. All dialogues are written in standard Japanese using basic vocabulary and word order. The dataset was created and processed manually.

Statistics

Here are the statistics for the dataset:

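| Topic                   | # of dialogues | # of utterances |
|-------------------------|----------------|-----------------|
| Topic 1 - Dailylife     | 1,070          | 8,462           |
| Topic 2 - School        | 1,058          | 8,197           |
| Topic 3 - Travel        | 1,021          | 8,459           |
| Topic 4 - Health        | 1,061          | 8,344           |
| Topic 5 - Entertainment | 1,051          | 8,318           |
| Total                   | 5,261          | 41,780          |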
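
As a quick sanity check on the figures above, the following minimal sketch sums the per-topic counts and derives the average number of utterances per dialogue. The numbers are copied from the table; the names `STATS`, `totals`, and `mean_length` are illustrative and not part of the dataset.

```python
# Per-topic counts copied from the statistics table: (dialogues, utterances).
STATS = {
    "Dailylife": (1070, 8462),
    "School": (1058, 8197),
    "Travel": (1021, 8459),
    "Health": (1061, 8344),
    "Entertainment": (1051, 8318),
}


def totals(stats):
    """Return (total dialogues, total utterances) summed over all topics."""
    n_dialogues = sum(d for d, _ in stats.values())
    n_utterances = sum(u for _, u in stats.values())
    return n_dialogues, n_utterances


def mean_length(stats):
    """Return the average number of utterances per dialogue for each topic."""
    return {topic: u / d for topic, (d, u) in stats.items()}


if __name__ == "__main__":
    print(totals(STATS))  # (5261, 41780), matching the Total row
    for topic, avg in mean_length(STATS).items():
        print(f"{topic}: {avg:.2f} utterances per dialogue")
```

The per-topic rows add up to the stated totals, and each topic averages roughly eight utterances per dialogue.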

Data Format

The dataset consists of separate JSON files for each topic, stored in the data directory. Each JSON file follows this format:

    {
        "topic_id": 3,
        "topic_name": "Travel",
        "dialogue_id": 611,
        "dialogue_length": 8,
        "utterances": [
            {
                "turn_num": 1,
                "speaker": "A",
                "utterance": "おはようございます。高原の朝は冷えますね。"
            },
            {
                "turn_num": 2,
                "speaker": "B",
                "utterance": "おはようございます。本当ですね。羽織るものが欲しいです。"
            },
            ,,,
            {
                "turn_num": 8,
                "speaker": "B",
                "utterance": "ロッジのオーナーに聞いてみましょう。"
            }
        ]
    }
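
The format above can be read with Python's standard `json` module. The sketch below is illustrative only and rests on assumptions not stated in this README: that a topic file holds either a single dialogue object or a JSON array of them, and that `dialogue_length` equals the number of utterances. The helper names and `SAMPLE` (a shortened excerpt of the example above, with `dialogue_length` adjusted to 2) are hypothetical.

```python
import json
from pathlib import Path

# Shortened excerpt of the README example (only the first two turns),
# with dialogue_length adjusted to match. For illustration only.
SAMPLE = {
    "topic_id": 3,
    "topic_name": "Travel",
    "dialogue_id": 611,
    "dialogue_length": 2,
    "utterances": [
        {"turn_num": 1, "speaker": "A",
         "utterance": "おはようございます。高原の朝は冷えますね。"},
        {"turn_num": 2, "speaker": "B",
         "utterance": "おはようございます。本当ですね。羽織るものが欲しいです。"},
    ],
}


def parse_dialogues(text):
    """Parse JSON text into a list of dialogue dicts.

    Assumption: the text is either one dialogue object or an array of them.
    """
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def load_dialogues(path):
    """Read a topic file (UTF-8) and return its dialogues as a list."""
    return parse_dialogues(Path(path).read_text(encoding="utf-8"))


def length_matches(dialogue):
    """Check the assumed invariant: dialogue_length == number of utterances."""
    return dialogue["dialogue_length"] == len(dialogue["utterances"])


def format_dialogue(dialogue):
    """Render a dialogue as 'speaker: utterance' lines ordered by turn_num."""
    turns = sorted(dialogue["utterances"], key=lambda u: u["turn_num"])
    return "\n".join(f'{u["speaker"]}: {u["utterance"]}' for u in turns)


if __name__ == "__main__":
    for d in parse_dialogues(json.dumps(SAMPLE, ensure_ascii=False)):
        print(f'[{d["topic_name"]} #{d["dialogue_id"]}]')
        print(format_dialogue(d))
```

For a real file, call `load_dialogues` with the path to a topic file in the data directory. The actual file names are not listed here, so check the directory contents first.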

Reference

For more details, please refer to the following paper:

    @InProceedings{akama2023jdd,
      title = {{日本語日常対話コーパスの構築}},
      author = {赤間 怜奈 and 磯部 順子 and 鈴木 潤 and 乾 健太郎},
      booktitle = {言語処理学会 第29回年次大会 発表論文集},
      pages = {108--113},
      year = {2023},
      url = {https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/H1-1.pdf}
    }

Copyright

All rights to this dataset belong to our research group.
The dataset is only for research purposes. It is accessible to commercial companies for research and evaluation, but it may not be utilized as a service, such as a chatbot or any other similar application.
The dataset may not be distributed to others.

License

The Japanese Daily Dialogue dataset is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

CC BY-NC-ND 4.0
