Journal of Computer Applications, 2022, Vol. 42, Issue (4): 1050-1056. DOI: 10.11772/j.issn.1001-9081.2021071227

• The 36th CCF National Conference of Computer Applications (CCF NCCA 2021) •

Knowledge representation learning method incorporating entity description information and neighbor node features

Shoulong JIAO(), Youxiang DUAN, Qifeng SUN, Zihao ZHUANG, Chenhao SUN   

1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266555, China
• Received: 2021-07-14  Revised: 2021-08-22  Accepted: 2021-08-23  Online: 2022-04-28  Published: 2022-04-10
  • Contact: Shoulong JIAO
• About author: DUAN Youxiang, born in 1964 in Dongying, Shandong, Ph.D., professor, CCF member. His research interests include network and service computing and the application of computer technology in the field of oil and gas.
    SUN Qifeng, born in 1976 in Dongying, Shandong, Ph.D., lecturer. His research interests include the application of computer technology in the field of oil and gas.
    ZHUANG Zihao, born in 1997 in Weihai, Shandong, M.S. candidate. His research interests include artificial intelligence.
    SUN Chenhao, born in 1997 in Linyi, Shandong, M.S. candidate. His research interests include artificial intelligence.
  • Supported by:
Fundamental Research Funds for the Central Universities (20CX05017A); Major Scientific and Technological Project of CNPC (ZD2019-183-006)

Abstract:

Knowledge graph representation learning aims to map entities and relations into a low-dimensional dense vector space. Most existing models focus on learning the structural features of triples while ignoring the semantic information of entity relations within triples and the entity description information outside triples, which leaves their knowledge expression ability poor. To address this problem, a knowledge representation learning model named BAGAT (knowledge representation learning based on BERT model And Graph Attention Network) was proposed to fuse multi-source information. First, the target nodes and neighbor nodes of triple entities were constructed by combining knowledge graph features, and a Graph Attention Network (GAT) was used to aggregate the semantic representation of the triple structure. Then, the Bidirectional Encoder Representations from Transformers (BERT) word vector model was used to embed the entity description information. Finally, the two representations were mapped into the same vector space for joint knowledge representation learning. Experimental results show that BAGAT outperforms the compared models by a clear margin: on the Hits@1 and Hits@10 metrics of the link prediction task on the public dataset FB15K-237, it improves on the translation model TransE (Translating Embeddings) by 25.9 and 22.0 percentage points respectively, and on the graph neural network model KBGAT (Learning attention-based embeddings for relation prediction in knowledge graphs) by 1.8 and 3.5 percentage points respectively. These results indicate that a multi-source representation method incorporating entity description information and the semantic information of the triple structure achieves stronger representation learning capability.
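The abstract outlines a three-step pipeline: GAT aggregation over a triple's target and neighbor nodes, BERT encoding of entity descriptions, and joint learning in a shared vector space. Below is a minimal sketch of that pipeline, assuming PyTorch with PyTorch Geometric and Hugging Face Transformers; the class names, the frozen-BERT choice, and the TransE-style scoring function are illustrative assumptions, not the authors' released code.

```python
# A minimal sketch of the BAGAT pipeline as described in the abstract.
# Names and design choices marked below are assumptions for illustration.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv
from transformers import BertModel


class BAGATSketch(nn.Module):
    def __init__(self, num_entities: int, num_relations: int, dim: int = 200):
        super().__init__()
        # Structural branch: learned embeddings refined by graph attention.
        self.ent_emb = nn.Embedding(num_entities, dim)
        self.rel_emb = nn.Embedding(num_relations, dim)
        # The abstract does not say how relations enter the attention;
        # a plain GAT layer over the entity graph stands in here.
        self.gat = GATConv(dim, dim, heads=2, concat=False)
        # Description branch: BERT encodes each entity's description text.
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        for p in self.bert.parameters():
            p.requires_grad = False  # assumption: BERT is kept frozen
        # Project BERT's output into the same vector space as the GAT output.
        self.proj = nn.Linear(self.bert.config.hidden_size, dim)

    def structural(self, edge_index: torch.Tensor) -> torch.Tensor:
        # Aggregate neighbor-node features of the triple graph with GAT.
        return self.gat(self.ent_emb.weight, edge_index)

    def descriptive(self, input_ids: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
        # Use the [CLS] vector of the tokenized description as the text feature.
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.proj(out.last_hidden_state[:, 0])

    def score(self, h_struct, h_desc, t_struct, t_desc, r):
        # Joint representation: fuse both views additively in the shared
        # space, then score with a TransE-style L1 distance (an assumption;
        # the abstract does not name the scoring function).
        h = h_struct + h_desc
        t = t_struct + t_desc
        return -torch.norm(h + r - t, p=1, dim=-1)
```

A full implementation would additionally need negative sampling and a margin-based ranking loss to train for link prediction, details the abstract does not provide.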

Key words: knowledge graph, knowledge representation learning, Graph Attention Network (GAT), Bidirectional Encoder Representations from Transformers (BERT), multi-source information fusion

CLC Number: