Abstract: Semantic matching between questions and relations in a single-fact question answering system is difficult to perform accurately when only small-scale labeled samples are available. To address this problem, a transfer learning model based on Recurrent Neural Networks (RNN) was proposed. First, an RNN-based sequence-to-sequence unsupervised learning algorithm reconstructed input sequences to learn the semantic distribution (word vectors and RNN parameters) of questions from a large number of unlabeled samples. Then, this semantic distribution was transferred to the supervised semantic matching algorithm by assigning it to the parameters of the matching network. Finally, the semantic matching model was trained on labeled samples, scoring each candidate by the inner product of question features and relation features. Experimental results show that, compared with the supervised learning methods Embed-AVG and RNNrandom, the proposed model improves semantic matching accuracy by an average of 5.6 and 8.8 percentage points respectively in an environment with few labeled samples and many unlabeled samples. By pre-learning the semantic distribution of a large number of unlabeled samples, the proposed model can significantly improve semantic matching accuracy when labeled samples are scarce.
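The transfer-and-match pipeline in the abstract can be illustrated with a minimal sketch: an RNN encoder whose parameters (word vectors and recurrent weights) would come from the unsupervised reconstruction phase is reused to initialize the supervised matcher, which scores a (question, relation) pair by the inner product of their feature vectors. All names, dimensions, and the randomly generated stand-in parameters below are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 50, 8            # toy vocabulary size and hidden/embedding size

# Stand-ins for parameters learned on unlabeled questions; in the paper's
# setting these would come from the seq2seq sequence-reconstruction phase.
pretrained = {
    "emb": rng.normal(scale=0.1, size=(V, D)),   # word vectors
    "W_x": rng.normal(scale=0.1, size=(D, D)),   # input-to-hidden weights
    "W_h": rng.normal(scale=0.1, size=(D, D)),   # hidden-to-hidden weights
}

def encode_question(token_ids, params):
    """Simple tanh RNN; the final hidden state is the question feature."""
    h = np.zeros(D)
    for t in token_ids:
        x = params["emb"][t]
        h = np.tanh(x @ params["W_x"] + h @ params["W_h"])
    return h

def match_score(question_ids, relation_vec, params):
    """Inner product of question features and relation features."""
    return float(encode_question(question_ids, params) @ relation_vec)

# Transfer step: the supervised matcher starts from the pretrained
# parameters (assigning the learned semantic distribution to the network)
# instead of a random initialization, before fine-tuning on labeled pairs.
matcher_params = {k: v.copy() for k, v in pretrained.items()}

relations = rng.normal(scale=0.1, size=(3, D))   # toy relation embeddings
question = [4, 17, 23, 8]                        # toy question token ids
scores = [match_score(question, r, matcher_params) for r in relations]
best = int(np.argmax(scores))
print("predicted relation index:", best)
```

The matcher would then be trained on the labeled (question, relation) pairs, updating the transferred parameters so that the correct relation receives the highest inner-product score.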
LU Qiang, LIU Xingyu. Semantic matching model of knowledge graph in question answering system based on transfer learning. Journal of Computer Applications, 2018, 38(7): 1846-1852.