Journal of Computer Applications, 2022, Vol. 42, Issue (4): 1050-1056. DOI: 10.11772/j.issn.1001-9081.2021071227

• The 36th CCF National Conference of Computer Applications (CCF NCCA 2021) •

Knowledge representation learning method incorporating entity description information and neighbor node features

Shoulong JIAO(), Youxiang DUAN, Qifeng SUN, Zihao ZHUANG, Chenhao SUN   

1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266555, China
• Received: 2021-07-14  Revised: 2021-08-22  Accepted: 2021-08-23  Online: 2022-04-28  Published: 2022-04-10
  • Contact: Shoulong JIAO
• About author: DUAN Youxiang, born in 1964 in Dongying, Shandong, Ph.D., professor, CCF member. His research interests include network and service computing and the application of computer technology in the field of oil and gas.
    SUN Qifeng, born in 1976 in Dongying, Shandong, Ph.D., lecturer. His research interests include the application of computer technology in the field of oil and gas.
    ZHUANG Zihao, born in 1997 in Weihai, Shandong, M.S. candidate. His research interests include artificial intelligence.
    SUN Chenhao, born in 1997 in Linyi, Shandong, M.S. candidate. His research interests include artificial intelligence.
  • Supported by:
Fundamental Research Funds for the Central Universities (20CX05017A); Major Scientific and Technological Project of CNPC (ZD2019-183-006)

Abstract:

Knowledge graph representation learning aims to map entities and relations into a low-dimensional dense vector space. Most existing models focus on learning the structural features of triples while ignoring the semantic information of entity relations within triples and the entity description information outside triples, which leaves their knowledge expression ability poor. To address this problem, a knowledge representation learning model named BAGAT (knowledge representation learning based on BERT model And Graph Attention Network) was proposed to fuse multi-source information. First, the target nodes and neighbor nodes of triple entities were constructed by combining knowledge graph features, and a Graph Attention Network (GAT) was used to aggregate the semantic representation of the triple structure. Then, the Bidirectional Encoder Representations from Transformers (BERT) word vector model was used to embed the entity description information. Finally, the two representations were mapped into the same vector space for joint knowledge representation learning. Experimental results show that BAGAT outperforms the compared models by a clear margin: on the Hits@1 and Hits@10 metrics of the link prediction task on the public dataset FB15K-237, it improves on the translation model TransE (Translating Embeddings) by 25.9 and 22.0 percentage points respectively, and on the graph neural network model KBGAT (Learning attention-based embeddings for relation prediction in knowledge graphs) by 1.8 and 3.5 percentage points respectively. These results indicate that a multi-source representation method incorporating entity description information and the semantic information of the triple structure achieves stronger representation learning capability.
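The abstract outlines a three-step pipeline: GAT aggregation over a triple's target and neighbor nodes, BERT encoding of entity descriptions, and joint learning in a shared vector space. Below is a minimal sketch of that pipeline, assuming PyTorch with PyTorch Geometric and Hugging Face Transformers; the class names, the frozen-BERT choice, and the TransE-style scoring function are illustrative assumptions, not the authors' released code.

```python
# A minimal sketch of the BAGAT pipeline as described in the abstract.
# Names and design choices marked below are assumptions for illustration.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv
from transformers import BertModel


class BAGATSketch(nn.Module):
    def __init__(self, num_entities: int, num_relations: int, dim: int = 200):
        super().__init__()
        # Structural branch: learned embeddings refined by graph attention.
        self.ent_emb = nn.Embedding(num_entities, dim)
        self.rel_emb = nn.Embedding(num_relations, dim)
        # The abstract does not say how relations enter the attention;
        # a plain GAT layer over the entity graph stands in here.
        self.gat = GATConv(dim, dim, heads=2, concat=False)
        # Description branch: BERT encodes each entity's description text.
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        for p in self.bert.parameters():
            p.requires_grad = False  # assumption: BERT is kept frozen
        # Project BERT's output into the same vector space as the GAT output.
        self.proj = nn.Linear(self.bert.config.hidden_size, dim)

    def structural(self, edge_index: torch.Tensor) -> torch.Tensor:
        # Aggregate neighbor-node features of the triple graph with GAT.
        return self.gat(self.ent_emb.weight, edge_index)

    def descriptive(self, input_ids: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
        # Use the [CLS] vector of the tokenized description as the text feature.
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.proj(out.last_hidden_state[:, 0])

    def score(self, h_struct, h_desc, t_struct, t_desc, r):
        # Joint representation: fuse both views additively in the shared
        # space, then score with a TransE-style L1 distance (an assumption;
        # the abstract does not name the scoring function).
        h = h_struct + h_desc
        t = t_struct + t_desc
        return -torch.norm(h + r - t, p=1, dim=-1)
```

A full implementation would additionally need negative sampling and a margin-based ranking loss to train for link prediction, details the abstract does not provide.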

Key words: knowledge graph, knowledge representation learning, Graph Attention Network (GAT), Bidirectional Encoder Representations from Transformers (BERT), multi-source information fusion

CLC Number: