Abstract: For Chinese question-answer matching, a deep-learning-based matching method was proposed to address the feature sparsity and low accuracy caused by manually engineered features in traditional machine learning approaches. The method comprises three models. The first combines a Recurrent Neural Network (RNN) with a Convolutional Neural Network (CNN) to learn deep semantic features of a sentence and to compute the similarity distance between feature vectors. The other two models each add a different attention mechanism to this base model, constructing the feature representation of the answer conditioned on the question so as to capture the fine-grained semantic matching relation between them. Experimental results show that the combined deep neural network model outperforms machine-learning methods based on hand-crafted features, and that the attention-based hybrid models further improve matching accuracy, with the best results reaching 80.05% Mean Reciprocal Rank (MRR) and 68.73% Top-1 accuracy in the standard evaluation.
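The paper provides no source code; purely as an illustration, the following PyTorch sketch shows one plausible reading of the base RNN-CNN encoder and of a question-aware attention pooling of the kind the abstract describes. All class names, layer sizes, and the choice of cosine similarity as the "similarity distance" are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RNNCNNEncoder(nn.Module):
    """Hypothetical base model: BiLSTM over word embeddings, followed by a
    1-D convolution and max-pooling to yield a fixed-size sentence vector."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128,
                 num_filters=200, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden_dim, num_filters, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embed(token_ids)                      # (batch, seq_len, embed_dim)
        h, _ = self.bilstm(x)                          # (batch, seq_len, 2*hidden_dim)
        c = torch.relu(self.conv(h.transpose(1, 2)))   # (batch, num_filters, seq_len)
        return F.max_pool1d(c, c.size(2)).squeeze(2)   # (batch, num_filters)

class AttentivePooling(nn.Module):
    """One reading of the attention idea: the answer's hidden states are
    weighted by their relevance to the question vector, so the answer
    representation is constructed according to the question."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)

    def forward(self, answer_states, question_vec):
        # answer_states: (batch, seq_len, dim); question_vec: (batch, dim)
        scores = torch.bmm(self.W(answer_states),
                           question_vec.unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)             # attention over answer tokens
        return torch.bmm(weights.unsqueeze(1), answer_states).squeeze(1)

def matching_score(encoder, question_ids, answer_ids):
    """Cosine similarity between the two sentence vectors; a max-margin
    ranking loss over positive/negative answers would train the encoder."""
    q, a = encoder(question_ids), encoder(answer_ids)
    return F.cosine_similarity(q, a, dim=1)
```

In this sketch the question and answer share one encoder, a common design for answer selection; the attention variants would replace the answer's plain max-pooling with AttentivePooling driven by the question vector.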