# KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors
KokoroChat is the largest human-collected Japanese psychological counseling dialogue dataset to date (as of June 2025). It was created through role-playing between trained counselors and includes rich, long-form dialogues and detailed client feedback on counseling quality. The dataset supports research on empathetic response generation, dialogue evaluation, and mental health-oriented language modeling.
This work has been accepted to the main conference of ACL 2025. 📄 View Paper (arXiv)
- 6,589 dialogues, collected between 2020 and 2024
- Avg. 91.2 utterances per dialogue
- 480 trained counselors simulating online text-based counseling sessions
- 20-dimension Likert-scale client feedback for every session
- Broad topic coverage: mental health, school, family, workplace, romantic issues, etc.
| Category | Total | Counselor | Client |
|---|---|---|---|
| # Dialogues | 6,589 | - | - |
| # Speakers | 480 | 424 | 463 |
| # Utterances | 600,939 | 306,495 | 294,444 |
| Avg. utterances/dialogue | 91.20 | 46.52 | 44.69 |
| Avg. length/utterance | 28.39 | 35.84 | 20.63 |
Each sample contains:
- A full counseling dialogue with role labels (counselor / client) and message timestamps
- Structured client feedback on 20 dimensions (0–5 Likert scale)
- Flags for ethical concern checks (optional)
- Predicted topic label (automatically annotated by GPT-4o-mini)
👉 See the `kokorochat_dialogues` folder for the complete dataset.
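As a sketch of how the per-sample fields listed above fit together, the snippet below parses one hypothetical record and computes per-role utterance counts and a mean feedback score. The field names (`messages`, `role`, `text`, `timestamp`, `feedback`, `topic`) and the two feedback dimensions shown are illustrative assumptions, not the dataset's actual schema; consult the files in `kokorochat_dialogues` for the real format.

```python
import json

# Hypothetical record mirroring the fields described above; the actual
# key names in kokorochat_dialogues may differ.
sample = json.loads("""
{
  "messages": [
    {"role": "counselor", "text": "...", "timestamp": "2023-04-01T10:00:00"},
    {"role": "client", "text": "...", "timestamp": "2023-04-01T10:01:12"}
  ],
  "feedback": {"empathy": 5, "clarity": 4},
  "topic": "workplace"
}
""")

# Count utterances per role, as in the statistics table.
counts = {}
for m in sample["messages"]:
    counts[m["role"]] = counts.get(m["role"], 0) + 1

# Aggregate the 0-5 Likert feedback dimensions into a single mean score.
mean_feedback = sum(sample["feedback"].values()) / len(sample["feedback"])
print(counts, mean_feedback)  # {'counselor': 1, 'client': 1} 4.5
```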
You can also access our full dataset and fine-tuned models via Hugging Face:
- 📁 Dataset: KokoroChat-dataset
We fine-tuned three counseling dialogue models based on Llama-3.1-Swallow-8B-Instruct-v0.3, using different subsets of the KokoroChat dataset filtered by client feedback score:
- 🔵 Llama-3.1-KokoroChat-Low: Fine-tuned on 3,870 dialogues with feedback scores < 70
- 🟢 Llama-3.1-KokoroChat-High: Fine-tuned on 2,601 dialogues with feedback scores between 70 and 98
- ⚫ Llama-3.1-KokoroChat-Full: Fine-tuned on 6,471 dialogues with feedback scores ≤ 98
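The subset definitions above imply a simple score-based filter, sketched below. The function name and the assumption that each dialogue has a single total feedback score (with scores above 98 excluded from all subsets) are illustrative, not taken from the released code.

```python
def feedback_subset(total_score: int) -> list[str]:
    """Return which fine-tuning subsets (by model suffix) a dialogue
    with the given total client-feedback score would fall into:
    Low (< 70), High (70-98), Full (<= 98)."""
    subsets = []
    if total_score < 70:
        subsets.append("Low")
    elif total_score <= 98:
        subsets.append("High")
    if total_score <= 98:
        subsets.append("Full")
    return subsets

print(feedback_subset(65))   # ['Low', 'Full']
print(feedback_subset(85))   # ['High', 'Full']
print(feedback_subset(100))  # []
```

Note that the Low and High subsets partition the Full subset (3,870 + 2,601 = 6,471 dialogues), matching the counts listed above.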
If you use this dataset, please cite the following paper:
```bibtex
@inproceedings{qi2025kokorochat,
  title     = {KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors},
  author    = {Zhiyang Qi and Takumasa Kaneko and Keiko Takamizo and Mariko Ukiyo and Michimasa Inaba},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics},
  year      = {2025},
  url       = {https://github.com/UEC-InabaLab/KokoroChat}
}
```
KokoroChat is released under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.