Abstract: For Chinese question-answer matching, a deep-learning-based matching method was proposed to address the feature sparsity and low accuracy caused by manually engineered features in traditional machine learning approaches. The method comprises three models. The first combines a Recurrent Neural Network (RNN) with a Convolutional Neural Network (CNN) to learn deep semantic features of a sentence and to compute the similarity distance between feature vectors. The other two models each add a different attention mechanism to this base model, constructing the feature representation of the answer conditioned on the question so as to capture the fine-grained semantic matching relation between them. Experimental results show that the combined deep neural network model outperforms machine-learning methods based on hand-crafted features, and that the attention-based hybrid models further improve matching accuracy, with the best results reaching 80.05% Mean Reciprocal Rank (MRR) and 68.73% Top-1 accuracy in the standard evaluation.
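The paper provides no source code; purely as an illustration, the following PyTorch sketch shows one plausible reading of the base RNN-CNN encoder and of a question-aware attention pooling of the kind the abstract describes. All class names, layer sizes, and the choice of cosine similarity as the "similarity distance" are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RNNCNNEncoder(nn.Module):
    """Hypothetical base model: BiLSTM over word embeddings, followed by a
    1-D convolution and max-pooling to yield a fixed-size sentence vector."""
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128,
                 num_filters=200, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden_dim, num_filters, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embed(token_ids)                      # (batch, seq_len, embed_dim)
        h, _ = self.bilstm(x)                          # (batch, seq_len, 2*hidden_dim)
        c = torch.relu(self.conv(h.transpose(1, 2)))   # (batch, num_filters, seq_len)
        return F.max_pool1d(c, c.size(2)).squeeze(2)   # (batch, num_filters)

class AttentivePooling(nn.Module):
    """One reading of the attention idea: the answer's hidden states are
    weighted by their relevance to the question vector, so the answer
    representation is constructed according to the question."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)

    def forward(self, answer_states, question_vec):
        # answer_states: (batch, seq_len, dim); question_vec: (batch, dim)
        scores = torch.bmm(self.W(answer_states),
                           question_vec.unsqueeze(2)).squeeze(2)
        weights = F.softmax(scores, dim=1)             # attention over answer tokens
        return torch.bmm(weights.unsqueeze(1), answer_states).squeeze(1)

def matching_score(encoder, question_ids, answer_ids):
    """Cosine similarity between the two sentence vectors; a max-margin
    ranking loss over positive/negative answers would train the encoder."""
    q, a = encoder(question_ids), encoder(answer_ids)
    return F.cosine_similarity(q, a, dim=1)
```

In this sketch the question and answer share one encoder, a common design for answer selection; the attention variants would replace the answer's plain max-pooling with AttentivePooling driven by the question vector.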