Attention mechanism based Stack-CNN model to support Chinese medical questions and answers

Journal of Computer Applications, 2022, Vol. 42, Issue (4): 1125-1130. DOI: 10.11772/j.issn.1001-9081.2021071272

Special Issue: The 36th CCF National Conference of Computer Applications (CCF NCCA 2021)

Teng TENG, Haiwei PAN, Kejia ZHANG, Xuelian MU, Ximing ZHANG, Weipeng CHEN

College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, China

Received: 2021-07-16 Revised: 2022-01-01 Accepted: 2022-01-04 Online: 2022-04-28 Published: 2022-04-10
Contact: Haiwei PAN
About author: TENG Teng, born in 1996, M. S. candidate. His research interests include intelligent healthcare and intelligent question-answering.
ZHANG Kejia, born in 1983, Ph. D., associate professor. His research interests include medical images and edge computing.
MU Xuelian, born in 1997, M. S. candidate. Her research interests include intelligent healthcare and machine learning.
ZHANG Ximing, born in 1997, M. S. candidate. His research interests include intelligent healthcare, intelligent question-answering and machine learning.
CHEN Weipeng, born in 1998, M. S. candidate. His research interests include intelligent healthcare, natural language processing and machine learning.
Supported by:
National Natural Science Foundation of China (62072135)

Chinese title: 支持中文医疗问答的基于注意力机制的栈卷积神经网络模型

Authors (Chinese names): 滕腾, 潘海为, 张可佳, 牟雪莲, 张锡明, 陈伟鹏

Author hometowns: TENG Teng, Harbin, Heilongjiang; ZHANG Kejia, Harbin, Heilongjiang; MU Xuelian, Jiamusi, Heilongjiang; ZHANG Ximing, Guangzhou, Guangdong; CHEN Weipeng, Linyi, Shandong.

Abstract:

Most current Chinese question-answer matching techniques require word segmentation as a first step, and segmenting Chinese medical text depends on maintained medical dictionaries to limit the impact of segmentation errors on downstream tasks. Maintaining such dictionaries, however, demands considerable manpower and domain knowledge, so word segmentation remains a major challenge. In addition, existing Chinese medical question-answer matching methods all model the questions and the answers separately and do not consider the relationships between the keywords that each contains. To address these problems, an Attention mechanism based Stack Convolutional Neural Network (Att-StackCNN) model was proposed for Chinese medical question-answer matching. Firstly, character embedding was used to encode the questions and answers into their respective character embedding matrices. Then, an attention matrix was constructed from the two character embedding matrices and used to obtain a feature attention mapping matrix for each side. After that, a Stack Convolutional Neural Network (Stack-CNN) model convolved the above matrices simultaneously to obtain the semantic representations of the questions and answers. Finally, the similarity between the representations was calculated, and the max-margin loss computed from this similarity was used to update the network parameters. On the cMedQA dataset, the Top-1 accuracy of the proposed model was about 1 percentage point higher than that of the Stack-CNN model and about 0.5 percentage points higher than that of the Multi-CNNs model. Experimental results show that the Att-StackCNN model can improve the matching performance for Chinese medical questions and answers.

Key words: character embedding, attention, Stack Convolutional Neural Network (Stack-CNN), Chinese medical text, questions and answers matching
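
The attention and loss steps described in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the attention score 1 / (1 + Euclidean distance) and the projection weights `w_q` and `w_a` follow the ABCNN style of attention, cosine similarity is assumed as the similarity measure, and the margin of 0.1 is a hypothetical value. The Stack-CNN encoder that would sit between the feature maps and the similarity is omitted.

```python
import math


def attention_matrix(q_emb, a_emb):
    """A[i][j] = 1 / (1 + Euclidean distance) between question character i
    and answer character j (assumed ABCNN-style match score)."""
    return [[1.0 / (1.0 + math.dist(q, a)) for a in a_emb] for q in q_emb]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(p * r for p, r in zip(row, col)) for col in zip(*y)] for row in x]


def attention_feature_maps(q_emb, a_emb, w_q, w_a):
    """Project the attention matrix into the embedding dimension d.

    w_q has shape (len_a x d) and w_a has shape (len_q x d); both are
    hypothetical learned weights, which presumes fixed (padded) lengths.
    """
    att = attention_matrix(q_emb, a_emb)        # len_q x len_a
    att_t = [list(col) for col in zip(*att)]    # len_a x len_q
    return matmul(att, w_q), matmul(att_t, w_a)


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


def max_margin_loss(sim_pos, sim_neg, margin=0.1):
    """Hinge loss: zero once the correct answer beats the wrong one by `margin`."""
    return max(0.0, margin - sim_pos + sim_neg)


if __name__ == "__main__":
    q_emb = [[1.0, 0.0], [0.0, 1.0]]                  # 2 question characters, d = 2
    a_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # 3 answer characters
    w_q = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]        # toy weights, len_a x d
    w_a = [[1.0, 0.0], [0.0, 1.0]]                    # toy weights, len_q x d
    f_q, f_a = attention_feature_maps(q_emb, a_emb, w_q, w_a)
    print(len(f_q), len(f_a))
    print(max_margin_loss(sim_pos=0.9, sim_neg=0.3))
```

In a full model the feature maps would be fed to the convolution layers together with the character embedding matrices, and the loss would be averaged over sampled pairs of correct and incorrect answers.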

CLC Number:

TP391.4

Teng TENG, Haiwei PAN, Kejia ZHANG, Xuelian MU, Ximing ZHANG, Weipeng CHEN. Attention mechanism based Stack-CNN model to support Chinese medical questions and answers[J]. Journal of Computer Applications, 2022, 42(4): 1125-1130.

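Dataset	Number of questions	Number of answers	Average characters per question	Average characters per answer
Total	54 000	101 743	119	212
Training set	50 000	94 134	120	212
Development set	2 000	3 774	117	216
Test set	2 000	3 835	119	211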

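No.	Embedding	Model	Accuracy on development set/%	Accuracy on training set/%
1	None	Random selection	0.10	0.10
2	None	Word matching (jieba)	37.05	36.60
3	None	Word matching (ICTCLAS)	35.11	36.22
4	None	Character matching	33.65	34.90
5	None	BM25 (jieba)	37.60	40.00
6	None	BM25 (ICTCLAS)	40.25	41.25
7	None	BM25 (character)	44.80	45.40
8	Word (jieba)	Average embedding	15.60	16.80
9	Word (ICTCLAS)	Average embedding	18.05	18.75
10	Character	Average embedding	24.90	24.00
11	Word (jieba)	Embedding matching	24.55	23.65
12	Word (ICTCLAS)	Embedding matching	27.85	29.10
13	Character	Embedding matching	30.80	32.30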
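
To make the character-level BM25 baseline in the table above concrete, the following sketch ranks candidate answers against a question by treating every character as a term, so no word segmentation is needed. It is an illustration only: the parameters `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the sample sentences are assumptions, not settings reported for the experiments.

```python
import math
from collections import Counter


def bm25_char_scores(query, docs, k1=1.5, b=0.75):
    """Score each candidate answer against the query with character-level BM25.

    Each distinct character of the query counts once as a query term.
    k1, b and the smoothed IDF are common defaults, assumed here.
    """
    doc_terms = [Counter(doc) for doc in docs]       # character frequencies
    doc_lens = [len(doc) for doc in docs]
    avg_len = sum(doc_lens) / len(docs)
    n_docs = len(docs)
    scores = []
    for tf, length in zip(doc_terms, doc_lens):
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            df = sum(1 for other in doc_terms if term in other)
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            norm = tf[term] + k1 * (1.0 - b + b * length / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    question = "头痛发烧"                       # "headache and fever"
    answers = [
        "头痛可以吃止痛药",                      # "take a painkiller for headache"
        "多喝水注意休息",                        # "drink water and rest"
        "发烧头痛建议就医",                      # "see a doctor for fever and headache"
    ]
    scores = bm25_char_scores(question, answers)
    best = max(range(len(answers)), key=scores.__getitem__)
    print(best, scores)
```

The candidate that shares the most informative characters with the question receives the highest score, and the candidate with no shared characters scores zero.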

Model	Accuracy on development set/%	Accuracy on test set/%
Multi-CNNs	48.40	51.15
Stack-CNN	46.03	47.62
Att-StackCNN	46.22	47.60
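
The accuracy figures reported here are Top-1 accuracies. A minimal sketch of how such a figure could be computed is given below, assuming each question comes with a list of candidate answers that carry a model score and a correctness label; the function name `top1_accuracy` and the data layout are hypothetical.

```python
def top1_accuracy(questions):
    """Percentage of questions whose highest-scoring candidate is a correct answer.

    `questions` is a list with one entry per question; each entry is a list of
    (score, is_correct) pairs, one pair per candidate answer.
    """
    hits = 0
    for candidates in questions:
        _, best_is_correct = max(candidates, key=lambda c: c[0])
        hits += int(best_is_correct)
    return 100.0 * hits / len(questions)


if __name__ == "__main__":
    scored = [
        [(0.91, True), (0.40, False), (0.10, False)],   # correct answer ranked first
        [(0.35, True), (0.80, False), (0.20, False)],   # wrong answer ranked first
    ]
    print(top1_accuracy(scored))
```

Ties between candidates are broken by list order in this sketch; a real evaluation would need an explicit tie-breaking rule.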
