A Japanese counseling dialogue dataset collected through role-playing


UEC-InabaLab/KokoroChat


CC BY-NC-ND 4.0 | Hugging Face Dataset | Hugging Face Models | arXiv

KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors

KokoroChat is the largest human-collected Japanese psychological counseling dialogue dataset to date (as of June 2025). It was created through role-playing between trained counselors and includes rich, long-form dialogues and detailed client feedback on counseling quality. The dataset supports research on empathetic response generation, dialogue evaluation, and mental health-oriented language modeling.

This work has been accepted to the main conference of ACL 2025. 📄 View Paper (arXiv)

[Figure: Example Dialogue and Feedback]

🌟 Key Features

  • 6,589 dialogues, collected between 2020 and 2024
  • Avg. 91.2 utterances per dialogue
  • 480 trained counselors simulating online text-based counseling sessions
  • 20-dimension Likert-scale client feedback for every session
  • Broad topic coverage: mental health, school, family, workplace, romantic issues, etc.

[Figure: Topic Distribution]

📊 Dataset Statistics

| Category | Total | Counselor | Client |
|---|---|---|---|
| # Dialogues | 6,589 | - | - |
| # Speakers | 480 | 424 | 463 |
| # Utterances | 600,939 | 306,495 | 294,444 |
| Avg. utterances/dialogue | 91.20 | 46.52 | 44.69 |
| Avg. length/utterance | 28.39 | 35.84 | 20.63 |
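
As a quick consistency check, the per-dialogue averages in the table can be recomputed from the raw counts. The sketch below uses only numbers from the table; the variable and function names are illustrative and are not part of the dataset or any accompanying tooling.

```python
# Counts copied from the statistics table above.
N_DIALOGUES = 6_589
UTTERANCES = {"total": 600_939, "counselor": 306_495, "client": 294_444}


def avg_per_dialogue(count: int, n_dialogues: int = N_DIALOGUES) -> float:
    """Average number of utterances per dialogue, rounded to two decimals."""
    return round(count / n_dialogues, 2)


# The per-role counts should add up to the overall total.
assert UTTERANCES["counselor"] + UTTERANCES["client"] == UTTERANCES["total"]

averages = {role: avg_per_dialogue(n) for role, n in UTTERANCES.items()}

for role, value in averages.items():
    print(f"{role}: {value:.2f}")
# total: 91.20
# counselor: 46.52
# client: 44.69
```

The counselor and client averages sum to 91.21 rather than 91.20 only because each value is rounded independently.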

📁 Dataset Structure

Each sample contains:

  • A full counseling dialogue with role labels (counselor / client) and message timestamps
  • Structured client feedback on 20 dimensions (0–5 Likert scale)
  • Flags for ethical concern checks (optional)
  • Predicted topic label (automatically annotated by GPT-4o-mini)
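
To make the structure above concrete, here is a minimal Python sketch of how one sample could be modeled and sanity-checked. The class names, field names, feedback keys, and timestamp format are all assumptions made for illustration; the actual files in the dataset may use different keys, so check them before adapting this. The toy sample is invented and is not taken from the dataset.

```python
from collections import Counter
from dataclasses import dataclass, field

N_FEEDBACK_DIMS = 20        # from the README: 20 feedback dimensions
SCORE_RANGE = range(0, 6)   # from the README: 0-5 Likert scale


@dataclass
class Utterance:
    role: str       # "counselor" or "client"
    text: str
    timestamp: str  # hypothetical format; the real one is not specified here


@dataclass
class Sample:
    dialogue: list[Utterance]
    feedback: dict[str, int]   # hypothetical keys, one per feedback dimension
    topic: str                 # predicted topic label
    ethical_flags: dict[str, bool] = field(default_factory=dict)  # optional


def feedback_is_valid(sample: Sample) -> bool:
    """True if there are exactly 20 scores and each lies in 0..5."""
    return len(sample.feedback) == N_FEEDBACK_DIMS and all(
        score in SCORE_RANGE for score in sample.feedback.values()
    )


def utterances_per_role(sample: Sample) -> Counter:
    """Count utterances by speaker role."""
    return Counter(u.role for u in sample.dialogue)


# Invented toy sample, for illustration only.
toy = Sample(
    dialogue=[
        Utterance("counselor", "Hello. What would you like to talk about?", "2024-01-01T10:00:00"),
        Utterance("client", "I have been feeling stressed at work.", "2024-01-01T10:00:40"),
        Utterance("counselor", "That sounds hard. Can you tell me more?", "2024-01-01T10:01:10"),
    ],
    feedback={f"dim_{i:02d}": 4 for i in range(1, N_FEEDBACK_DIMS + 1)},
    topic="workplace",
)

print(feedback_is_valid(toy))
print(utterances_per_role(toy))
```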

👉 See the kokorochat_dialogues folder for the complete dataset.

🤗 Access on Hugging Face

You can also access our full dataset and fine-tuned models via Hugging Face.

We fine-tuned three counseling dialogue models based on Llama-3.1-Swallow-8B-Instruct-v0.3, using different subsets of the KokoroChat dataset filtered by client feedback score.
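
One plausible way to build score-filtered subsets is to sum the 20 feedback scores per dialogue (giving a total between 0 and 100) and split on a threshold. The sketch below illustrates that idea only; the threshold value, the `feedback` key, the subset names, and the toy records are hypothetical and do not reproduce the selection criteria used for the released models.

```python
N_DIMS = 20             # from the README: 20 feedback dimensions
MAX_SCORE_PER_DIM = 5   # from the README: 0-5 Likert scale


def total_score(feedback: dict[str, int]) -> int:
    """Sum of all feedback scores for one dialogue (0..100)."""
    return sum(feedback.values())


def split_by_score(samples: list[dict], threshold: int) -> dict[str, list[dict]]:
    """Partition samples by total feedback score around a threshold.

    `threshold` is a hypothetical cut-off chosen by the caller.
    """
    at_or_above = [s for s in samples if total_score(s["feedback"]) >= threshold]
    below = [s for s in samples if total_score(s["feedback"]) < threshold]
    return {"all": list(samples), "at_or_above": at_or_above, "below": below}


# Invented toy records, for illustration only.
toy_samples = [
    {"id": "a", "feedback": {f"dim_{i}": 5 for i in range(N_DIMS)}},  # total 100
    {"id": "b", "feedback": {f"dim_{i}": 3 for i in range(N_DIMS)}},  # total 60
    {"id": "c", "feedback": {f"dim_{i}": 1 for i in range(N_DIMS)}},  # total 20
]

subsets = split_by_score(toy_samples, threshold=60)
for name, items in subsets.items():
    print(name, [s["id"] for s in items])
```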

📄 Citation

If you use this dataset, please cite the following paper:

@inproceedings{qi2025kokorochat,
  title     = {KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors},
  author    = {Zhiyang Qi and Takumasa Kaneko and Keiko Takamizo and Mariko Ukiyo and Michimasa Inaba},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics},
  year      = {2025},
  url       = {https://github.com/UEC-InabaLab/KokoroChat}
}

⚖️ License

KokoroChat is released under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.
