Psychological counseling human-machine dialogue dataset construction for dialogue generation and mental disorder detection

doi:10.11772/j.issn.1001-9081.2024050705

Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (5): 1395-1402.DOI: 10.11772/j.issn.1001-9081.2024050705

China Conference on Data Mining 2024 (CCDM 2024)

Bo XU¹, Dezhi HAO¹, Erchen YU¹, Hongfei LIN¹, Linlin ZONG²

1. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
2. School of Software, Dalian University of Technology, Dalian, Liaoning 116024, China

Received: 2024-05-29; Revised: 2024-08-02; Accepted: 2024-08-20; Online: 2024-09-25; Published: 2025-05-10
Contact: Bo XU
About the authors:
XU Bo, born in 1988, a native of Dalian, Liaoning, Ph.D., associate professor, CCF member. His research interests include mental health computing and natural language processing.
HAO Dezhi, born in 1998, a native of Linyi, Shandong, M.S. His research interests include psychological counseling human-machine dialogue.
YU Erchen, born in 2001, a native of Anshan, Liaoning, M.S. candidate. His research interests include multimodal affective computing.
LIN Hongfei, born in 1962, a native of Tongliao, Inner Mongolia, Ph.D., professor. His research interests include natural language processing.
ZONG Linlin, born in 1987, a native of Cangzhou, Hebei, Ph.D., associate professor. Her research interests include multimodal affective computing.
Supported by: Liaoning Provincial Social Science Planning Fund (L21CXW003)

Abstract:

To address the lack of publicly available data for building effective dialogue models for psychological counseling human-machine dialogue, a psychological counseling dialogue dataset was constructed for dialogue generation and mental disorder detection. First, a multi-turn dialogue dataset containing 3 268 doctor-patient conversations was collected from an online medical consultation platform and enriched with metadata including hospital affiliations, medical departments, disease categories, and patient self-descriptions. Second, a knowledge-enhanced dialogue model named Empathy Bidirectional and Auto-Regressive Transformers (EmBART) was proposed to enhance the empathic capabilities of the dialogue model. Finally, the usability of the dataset was evaluated experimentally on two tasks: psychological response generation and mental disorder detection. In psychological response generation, EmBART trained on this dataset performed well on all metrics in both automatic and human evaluations, with perplexity reduced by 2.31 compared with the baseline model CDial-GPT (Chinese Dialogue Generative Pre-trained Transformer). In mental disorder detection, CPT (Chinese Pre-trained unbalanced Transformer) and RoBERTa (Robustly optimized Bidirectional Encoder Representations from Transformers approach) trained on this dataset demonstrated strong mental disorder prediction capabilities. Experimental results confirm the utility of this dataset for generating empathic dialogues and detecting mental disorders, providing a data foundation for future research on psychological counseling human-machine dialogue.

Key words: psychological counseling dialogue, mental disorder detection, dialogue generation, empathic response, emotion analysis

CLC Number: TP391.1

Bo XU, Dezhi HAO, Erchen YU, Hongfei LIN, Linlin ZONG. Psychological counseling human-machine dialogue dataset construction for dialogue generation and mental disorder detection[J]. Journal of Computer Applications, 2025, 45(5): 1395-1402.

Dataset	Dialogues	Utterances	Diseases	Main domain
MZ	710	—	4	Pediatrics
DX	527	2 168	5	Pediatrics
CMDD	2 067	87 005	4	Pediatrics
COVID-EN	603	—	1	COVID
COVID-CN	1 088	—	1	COVID
DAIC-WOZ	189	—	1	Depression
MotiVAte	4 000	14 809	1	Depression
D⁴	1 339	81 558	1	Depression
MedDialog-CN	3 407 494	11 260 564	174	General medicine
PsychDialog	3 268	18 869	11	Mental disorders

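Category	Total	Doctor	Patient
Dialogues	3 268	—	—
Average turns per dialogue	2.9	—	—
Average utterances per dialogue	5.8	2.8	2.9
Average words per dialogue	250.7	117.0	134.0
Average words per utterance	43.6	20.3	23.3
Average words per patient self-description	272.1	—	—
Mental disorder categories	11	—	—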
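
Per-dialogue and per-utterance averages such as those in the table above can be derived from raw conversations along the following lines. This is only a minimal sketch: the record layout (a list of `(speaker, text)` pairs), the toy English utterances, and the whitespace tokenisation are all assumptions made for illustration, not the dataset's actual format or the authors' counting procedure.

```python
from statistics import mean

# Hypothetical record layout: each dialogue is a list of (speaker, text) pairs.
dialogues = [
    [("patient", "I cannot sleep lately"), ("doctor", "How long has this lasted")],
    [("patient", "I feel anxious"), ("doctor", "Any triggers"),
     ("patient", "Work stress"), ("doctor", "Let us talk about it")],
]

def count_words(text):
    # Whitespace tokenisation is an assumption; Chinese text would need a word segmenter.
    return len(text.split())

def count_speaker(dialogue, speaker):
    # Number of utterances in one dialogue spoken by the given role.
    return sum(1 for s, _ in dialogue if s == speaker)

def dialogue_stats(dialogues):
    utterances = [u for d in dialogues for u in d]
    return {
        "dialogues": len(dialogues),
        "avg_utterances_per_dialogue": mean(len(d) for d in dialogues),
        "avg_doctor_utterances": mean(count_speaker(d, "doctor") for d in dialogues),
        "avg_patient_utterances": mean(count_speaker(d, "patient") for d in dialogues),
        "avg_words_per_utterance": mean(count_words(t) for _, t in utterances),
    }

stats = dialogue_stats(dialogues)
print(stats)
```

On real data, the tokeniser choice would directly affect every word-level average, so it should be fixed before comparing against published figures.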

Model	Perplexity	Dist-1	Dist-2	BLEU-2	Entropy-4	METEOR
Transformer	26.45	0.181	0.384	0.014	4.152	0.019
CDial-GPT	25.55	0.116	0.501	0.856	6.213	1.145
BERT-GPT	27.09	0.125	0.571	0.017	5.316	0.075
BART	26.12	0.051	0.268	0.033	7.814	0.074
EmBART	23.24	0.196	0.601	0.904	7.959	1.215
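
Two of the automatic metrics in the table above, Dist-n and perplexity, have simple standard definitions that can be sketched as follows. The function names and toy inputs are hypothetical, and the paper's exact tokenisation and aggregation (for example, corpus-level versus per-response Dist-n) are not stated here, so this shows the common corpus-level form rather than the authors' implementation.

```python
import math

def distinct_n(responses, n):
    """Ratio of unique n-grams to total n-grams over all tokenised responses."""
    total = 0
    unique = set()
    for tokens in responses:
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique) / total if total else 0.0

def perplexity(token_log_probs):
    """Exponential of the mean negative natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Toy generated responses, already tokenised (hypothetical data).
responses = [["i", "am", "here"], ["i", "am", "listening"]]
dist1 = distinct_n(responses, 1)  # 4 unique unigrams out of 6
dist2 = distinct_n(responses, 2)  # 3 unique bigrams out of 4

# A model that assigns probability 0.25 to every token has perplexity 4.
ppl = perplexity([math.log(0.25)] * 4)
print(dist1, dist2, ppl)
```

Lower perplexity and higher Dist-n are better, which is the direction in which EmBART differs from the baselines in the table.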

Model	Fluency	Emotional richness	Sentence accuracy
Transformer	2.02	1.15	2.11
CDial-GPT	2.21	1.22	2.15
BERT-GPT	2.11	1.04	2.07
BART	2.07	1.13	2.03
EmBART	2.39	2.03	2.27

Model	Accuracy	P_Mi	R_Mi	F1_Mi	P_Wei	R_Wei	F1_Wei
BERT	0.414	0.404	0.395	0.394	0.411	0.414	0.409
RoBERTa	0.454	0.443	0.374	0.386	0.451	0.454	0.442
CPT	0.431	0.397	0.402	0.457	0.441	0.457	0.445
BERT*	0.503	0.489	0.489	0.484	0.498	0.503	0.499
RoBERTa*	0.536	0.507	0.472	0.473	0.520	0.526	0.517
CPT*	0.564	0.531	0.528	0.501	0.511	0.524	0.515
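
As an illustration of how averaged classification scores of this kind are typically computed, the sketch below derives accuracy and support-weighted precision, recall, and F1 for a multi-class disorder label. It assumes that the `_Wei` columns denote support-weighted averaging, which this page does not define; the labels and predictions are invented toy data.

```python
from collections import Counter

def per_class_prf(y_true, y_pred, label):
    """Precision, recall and F1 for a single class, treated one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def weighted_prf(y_true, y_pred):
    """Average the per-class scores, weighting each class by its share of true labels."""
    support = Counter(y_true)
    n = len(y_true)
    totals = [0.0, 0.0, 0.0]
    for label, count in support.items():
        for i, value in enumerate(per_class_prf(y_true, y_pred, label)):
            totals[i] += value * count / n
    return tuple(totals)

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy labels, not drawn from the dataset.
y_true = ["depression", "depression", "anxiety", "insomnia"]
y_pred = ["depression", "anxiety", "anxiety", "insomnia"]

p_w, r_w, f1_w = weighted_prf(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(acc, p_w, r_w, f1_w)
```

Under this definition, support-weighted recall is always equal to accuracy for single-label classification, which offers a quick sanity check when reproducing such a table.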
