Attention mechanism based Stack-CNN model to support Chinese medical questions and answers

Journal of Computer Applications, 2022, Vol. 42, Issue (4): 1125-1130. DOI: 10.11772/j.issn.1001-9081.2021071272

The 36 CCF National Conference of Computer Applications (CCF NCCA 2020)

Teng TENG, Haiwei PAN, Kejia ZHANG, Xuelian MU, Ximing ZHANG, Weipeng CHEN

College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, China

Received: 2021-07-16 Revised: 2022-01-01 Accepted: 2022-01-04 Online: 2022-04-28 Published: 2022-04-10
Contact: Haiwei PAN
About the authors: TENG Teng, born in 1996, native of Harbin, Heilongjiang, M.S. candidate. His research interests include intelligent healthcare and intelligent question answering.
ZHANG Kejia, born in 1983, native of Harbin, Heilongjiang, Ph.D., associate professor. His research interests include medical images and edge computing.
MU Xuelian, born in 1997, native of Jiamusi, Heilongjiang, M.S. candidate. Her research interests include intelligent healthcare and machine learning.
ZHANG Ximing, born in 1997, native of Guangzhou, Guangdong, M.S. candidate. His research interests include intelligent healthcare, intelligent question answering, and machine learning.
CHEN Weipeng, born in 1998, native of Linyi, Shandong, M.S. candidate. His research interests include intelligent healthcare, natural language processing, and machine learning.
Supported by:
National Natural Science Foundation of China (62072135)

Abstract

Most current Chinese question-answer matching techniques require word segmentation first, and segmenting Chinese medical text requires maintaining medical dictionaries to reduce the impact of segmentation errors on downstream tasks. However, maintaining such dictionaries demands considerable manpower and domain knowledge, which makes word segmentation a persistent challenge. Moreover, existing Chinese medical question-answer matching methods model the questions and the answers separately and do not consider the relationships between the keywords that each contains. Therefore, an Attention mechanism based Stack Convolutional Neural Network (Att-StackCNN) model was proposed for Chinese medical question-answer matching. Firstly, character embedding was used to encode the questions and answers to obtain their respective character embedding matrices. Then, an attention matrix was constructed from these character embedding matrices and used to obtain the respective feature attention mapping matrices of the questions and answers. After that, a Stack Convolutional Neural Network (Stack-CNN) model was used to convolve the above matrices simultaneously to obtain the respective semantic representations of the questions and answers. Finally, the similarity between the two representations was calculated and used to compute the max-margin loss for updating the network parameters. On the cMedQA dataset, the Top-1 accuracy of the proposed model was about 1 percentage point higher than that of the Stack-CNN model and about 0.5 percentage points higher than that of the Multi-CNNs model. Experimental results show that the Att-StackCNN model can improve the matching performance for Chinese medical questions and answers.

Key words: character embedding, attention, Stack Convolutional Neural Network (Stack-CNN), Chinese medical text, questions and answers matching
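
The attention and loss steps summarized in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the attention score `1 / (1 + Euclidean distance)` and the projection of the attention matrix into feature maps follow the ABCNN style and are assumptions here, as are the names `w_q` and `w_a`, the toy dimensions, and the margin value `0.05`. The Stack-CNN encoder itself is omitted.

```python
import math

def attention_matrix(q_emb, a_emb):
    """A[i][j] = 1 / (1 + ||q_i - a_j||_2); assumed ABCNN-style match score."""
    return [[1.0 / (1.0 + math.dist(q, a)) for a in a_emb] for q in q_emb]

def transpose(x):
    return [list(col) for col in zip(*x)]

def matmul(x, y):
    """Plain-Python product of x (n x k) and y (k x m)."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]

def attention_feature_maps(q_emb, a_emb, w_q, w_a):
    """Project the attention matrix into one feature map per sentence.

    q_emb: (Lq x d) character embeddings of the question
    a_emb: (La x d) character embeddings of the answer
    w_q:   (La x d) hypothetical learned weights -> question map (Lq x d)
    w_a:   (Lq x d) hypothetical learned weights -> answer map (La x d)
    """
    a_mat = attention_matrix(q_emb, a_emb)
    return matmul(a_mat, w_q), matmul(transpose(a_mat), w_a)

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def max_margin_loss(q_vec, pos_vec, neg_vec, margin=0.05):
    """Hinge loss pushing the correct answer above a wrong one by `margin`."""
    return max(0.0, margin - cosine(q_vec, pos_vec) + cosine(q_vec, neg_vec))

# Toy example: a 3-character question and a 2-character answer, d = 2.
q_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a_emb = [[1.0, 0.0], [0.0, 2.0]]
w_q = [[0.5, 0.1], [0.2, 0.3]]
w_a = [[0.1, 0.0], [0.0, 0.1], [0.2, 0.2]]
f_q, f_a = attention_feature_maps(q_emb, a_emb, w_q, w_a)
print(len(f_q), len(f_q[0]), len(f_a), len(f_a[0]))
```

In a full model, each feature map would be fed to the convolution together with the corresponding character embedding matrix, and the loss would be computed on the resulting sentence vectors.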

CLC Number: TP391.4

Teng TENG, Haiwei PAN, Kejia ZHANG, Xuelian MU, Ximing ZHANG, Weipeng CHEN. Attention mechanism based Stack-CNN model to support Chinese medical questions and answers[J]. Journal of Computer Applications, 2022, 42(4): 1125-1130.

Dataset	Question sentences	Answer sentences	Avg. characters per question	Avg. characters per answer
Total	54 000	101 743	119	212
Training set	50 000	94 134	120	212
Development set	2 000	3 774	117	216
Test set	2 000	3 835	119	211

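No.	Embedding	Model	Accuracy/% (development set)	Accuracy/% (training set)
1	None	Random selection	0.10	0.10
2	None	Word matching (jieba)	37.05	36.60
3	None	Word matching (ICTCLAS)	35.11	36.22
4	None	Character matching	33.65	34.90
5	None	BM25 (jieba)	37.60	40.00
6	None	BM25 (ICTCLAS)	40.25	41.25
7	None	BM25 (character)	44.80	45.40
8	Word (jieba)	Average embedding	15.60	16.80
9	Word (ICTCLAS)	Average embedding	18.05	18.75
10	Character	Average embedding	24.90	24.00
11	Word (jieba)	Embedding matching	24.55	23.65
12	Word (ICTCLAS)	Embedding matching	27.85	29.10
13	Character	Embedding matching	30.80	32.30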
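
BM25 over characters appears among the baselines above. As a rough illustration of how such a segmentation-free baseline can score candidate answers, the sketch below implements a common Okapi BM25 variant with every character treated as a term. The parameter values `k1=1.5` and `b=0.75`, the smoothed IDF form, and the toy sentences are assumptions, not settings taken from the article.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query, one term per character."""
    tokenized = [list(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents containing each character.
    doc_freq = Counter(ch for d in tokenized for ch in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for ch in set(query):  # each distinct query character counts once
            if ch not in tf:
                continue
            idf = math.log((n_docs - doc_freq[ch] + 0.5) / (doc_freq[ch] + 0.5) + 1.0)
            norm = tf[ch] + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf[ch] * (k1 + 1.0) / norm
        scores.append(score)
    return scores

# Hypothetical question and candidate answers, for illustration only.
question = "头痛发烧"
answers = ["感冒引起头痛发烧", "骨折需要固定", "发烧多喝水"]
print(bm25_scores(question, answers))
```

Because the terms are single characters, no segmentation dictionary is needed, which is the property the character-level rows in the table rely on.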

Model	Accuracy (development set)	Accuracy (test set)
Multi-CNNs	48.40	51.15
Stack-CNN	46.03	47.62
Att-StackCNN	46.22	47.60
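
The abstract reports Top-1 accuracy. Assuming the usual definition, under which a question counts as correct when its highest-scoring candidate is one of its correct answers, the metric can be sketched as follows. The function name, the data layout, and the toy scores are hypothetical.

```python
def top1_accuracy(candidate_scores, correct_ids):
    """Percentage of questions whose top-scored candidate is a correct answer.

    candidate_scores: one dict per question, mapping candidate id -> score
    correct_ids:      one set per question, holding the correct candidate ids
    """
    hits = 0
    for scores, gold in zip(candidate_scores, correct_ids):
        best = max(scores, key=scores.get)  # candidate with the highest score
        hits += best in gold
    return 100.0 * hits / len(candidate_scores)

# Toy example: the first question is ranked correctly, the second is not.
scores = [{"a1": 0.9, "a2": 0.2}, {"b1": 0.1, "b2": 0.7}]
gold = [{"a1"}, {"b1"}]
print(top1_accuracy(scores, gold))  # prints 50.0
```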
