Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (11): 3089-3093. DOI: 10.11772/j.issn.1001-9081.2018041238

• The 7th China Conference on Data Mining (CCDM 2018) •

Knowledge graph completion algorithm based on similarity between entities

WANG Zihan1, SHAO Mingguang1, LIU Guojun1, GUO Maozu2, BI Jiandong1, LIU Yang1

  1. College of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China;
  2. Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
  • Received: 2018-04-30 Revised: 2018-06-07 Online: 2018-11-10 Published: 2018-11-10
  • Corresponding author: LIU Yang
  • About the authors: WANG Zihan (born 1995), female, from Xuzhou, Jiangsu, M.S. candidate; main research interest: knowledge graphs. SHAO Mingguang (born 1995), male, from Jining, Shandong, M.S. candidate; main research interest: knowledge graphs. LIU Guojun (born 1979), male, from Baodi, Tianjin, associate professor, Ph.D., CCF member; main research interests: machine learning, computer vision, image processing, pattern recognition. GUO Maozu (born 1966), male, from Xiajin, Shandong, professor, Ph.D., CCF member; main research interests: machine learning, bioinformatics, urban computing. BI Jiandong (born 1964), male, from Qiqihar, Heilongjiang, associate professor, Ph.D.; main research interest: machine learning. LIU Yang (born 1976), male, from Harbin, Heilongjiang, associate professor, Ph.D.; main research interest: machine learning.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61671188, 61571164, 61502122) and the National Key R&D Program of China (2016YFC0901902).

Abstract: To address the link prediction problem in knowledge graphs, a shared-variable neural network model named LCPE (Local Combination Projection Embedding) is proposed, which predicts links by embedding entities and relations into a vector space. Analysis of the Unstructured Model shows that the embeddings of two related entities lie closer together in the vector space; in other words, similar entities are more likely to be related. LCPE fuses the ProjE model with similarity information between entities: it first judges whether two entities are related and, on that basis, determines the specific relation type. In triple prediction experiments, with the same number of parameters as ProjE, LCPE improves Mean Rank (the average score rank of positive triples) by 11 and raises Hits@10 (the proportion of positive triples ranked in the top 10) by 0.2 percentage points on the public dataset WN18; on FB15k, it improves Mean Rank by 7.5 and raises Hits@10 by 3.05 percentage points on average. These results show that entity similarity, as auxiliary information, can be incorporated into ProjE and effectively improves its prediction accuracy.

Key words: knowledge graph, link prediction, embedding vector, neural network, similarity
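
As a rough illustration of the Mean Rank and Hits@10 metrics reported in the abstract, the following sketch ranks candidate tail entities with a distance-based score in the spirit of the Unstructured Model (squared L2 distance between the two entity embeddings, lower meaning more plausible). It is not the LCPE or ProjE scoring function: the toy embeddings, the function names, and the choice to rank against all entities without filtering are assumptions made only for this example.

```python
import numpy as np


def unstructured_score(h, t):
    """Squared L2 distance between head and tail embeddings.

    Lower means the pair is considered more likely to be related.
    """
    return np.sum((h - t) ** 2, axis=-1)


def rank_of_true_tail(entity_emb, head_idx, true_tail_idx):
    """Rank of the true tail among all entities (1 is best).

    Assumption: every entity, including the head itself, is a candidate.
    """
    scores = unstructured_score(entity_emb[head_idx], entity_emb)
    # Rank = 1 + number of candidates with a strictly lower (better) score.
    return 1 + int(np.sum(scores < scores[true_tail_idx]))


def mean_rank_and_hits(ranks, k=10):
    """Mean Rank and Hits@k (fraction of test triples ranked within the top k)."""
    ranks = np.asarray(ranks, dtype=float)
    return float(ranks.mean()), float(np.mean(ranks <= k))


if __name__ == "__main__":
    # Hypothetical 2-D embeddings for four entities.
    emb = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
    # Hypothetical (head, tail) test pairs.
    test_pairs = [(0, 1), (3, 2), (0, 3)]
    ranks = [rank_of_true_tail(emb, h, t) for h, t in test_pairs]
    mr, hits = mean_rank_and_hits(ranks, k=10)
    print("ranks:", ranks, "Mean Rank:", mr, "Hits@10:", hits)
```

A lower Mean Rank and a higher Hits@10 both indicate better link prediction, which is why the abstract describes Mean Rank as moving forward and Hits@10 as rising.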

CLC Number: