Abstract: Semantic matching between questions and relations in a single-fact question answering system is difficult to perform accurately when only small-scale labeled samples are available. To address this problem, a transfer learning model based on Recurrent Neural Networks (RNN) was proposed. First, an RNN-based sequence-to-sequence unsupervised learning algorithm reconstructed input sequences to learn the semantic distribution (word vectors and RNN parameters) of questions from a large number of unlabeled samples. Then, this semantic distribution was transferred to the supervised semantic matching algorithm by assigning it to the parameters of the matching network. Finally, the semantic matching model was trained on labeled samples, scoring each candidate by the inner product of question features and relation features. Experimental results show that, compared with the supervised learning methods Embed-AVG and RNNrandom, the proposed model improves semantic matching accuracy by an average of 5.6 and 8.8 percentage points respectively in an environment with few labeled samples and many unlabeled samples. By pre-learning the semantic distribution of a large number of unlabeled samples, the proposed model can significantly improve semantic matching accuracy when labeled samples are scarce.
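The transfer-and-match pipeline in the abstract can be illustrated with a minimal sketch: an RNN encoder whose parameters (word vectors and recurrent weights) would come from the unsupervised reconstruction phase is reused to initialize the supervised matcher, which scores a (question, relation) pair by the inner product of their feature vectors. All names, dimensions, and the randomly generated stand-in parameters below are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 50, 8            # toy vocabulary size and hidden/embedding size

# Stand-ins for parameters learned on unlabeled questions; in the paper's
# setting these would come from the seq2seq sequence-reconstruction phase.
pretrained = {
    "emb": rng.normal(scale=0.1, size=(V, D)),   # word vectors
    "W_x": rng.normal(scale=0.1, size=(D, D)),   # input-to-hidden weights
    "W_h": rng.normal(scale=0.1, size=(D, D)),   # hidden-to-hidden weights
}

def encode_question(token_ids, params):
    """Simple tanh RNN; the final hidden state is the question feature."""
    h = np.zeros(D)
    for t in token_ids:
        x = params["emb"][t]
        h = np.tanh(x @ params["W_x"] + h @ params["W_h"])
    return h

def match_score(question_ids, relation_vec, params):
    """Inner product of question features and relation features."""
    return float(encode_question(question_ids, params) @ relation_vec)

# Transfer step: the supervised matcher starts from the pretrained
# parameters (assigning the learned semantic distribution to the network)
# instead of a random initialization, before fine-tuning on labeled pairs.
matcher_params = {k: v.copy() for k, v in pretrained.items()}

relations = rng.normal(scale=0.1, size=(3, D))   # toy relation embeddings
question = [4, 17, 23, 8]                        # toy question token ids
scores = [match_score(question, r, matcher_params) for r in relations]
best = int(np.argmax(scores))
print("predicted relation index:", best)
```

The matcher would then be trained on the labeled (question, relation) pairs, updating the transferred parameters so that the correct relation receives the highest inner-product score.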
LU Qiang, LIU Xingyu. Semantic matching model of knowledge graph in question answering system based on transfer learning. Journal of Computer Applications, 2018, 38(7): 1846-1852.