Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (9): 2489-2493. DOI: 10.11772/j.issn.1001-9081.2019020357

• Artificial intelligence •

Adversarial negative sample generation for knowledge representation learning

ZHANG Zhao, JI Jianmin, CHEN Xiaoping   

  1. School of Computer Science and Technology, University of Science and Technology of China, Hefei Anhui 230027, China
  • Received: 2019-03-06 Revised: 2019-04-20 Online: 2019-09-10 Published: 2019-05-14
  • Supported by:

    This work is partially supported by the National Natural Science Foundation of China (U1613216, 61573386) and the Science and Technology Planning Project of Guangdong Province (2017B010110011).

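  • Corresponding author: JI Jianmin
  • About the authors: ZHANG Zhao (born 1994), male, from Hengshui, Hebei, M.S. candidate; research interests: knowledge representation learning, knowledge graphs, natural language processing. JI Jianmin (born 1984), male, from Dingxi, Gansu, associate professor, Ph.D., CCF member; research interests: cognitive robotics, knowledge representation and reasoning. CHEN Xiaoping (born 1955), male, from Beijing, professor, doctoral supervisor, Ph.D.; research interests: logic for artificial intelligence, multi-agent systems, intelligent robots.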

Abstract:

Knowledge graph embedding maps the symbolic relations and entities of a knowledge graph into a low-dimensional continuous vector space. Training knowledge graph embedding models requires negative samples, yet most knowledge graphs store only positive examples, in the form of triplets. Moreover, the negative samples produced by the negative sampling of conventional knowledge graph embedding methods are easily discriminated by the model and contribute less and less as training proceeds. To address this problem, an Adversarial Negative Generator (ANG) model is proposed. The generator uses an encoder-decoder architecture: the encoder reads in a positive triplet whose head or tail entity has been replaced, as context information, and the decoder then uses the encoding provided by the encoder to fill in the replaced entity, thereby generating a negative sample. Several existing knowledge graph embedding models are trained adversarially against the proposed generator to optimize the knowledge representation vectors. The method is evaluated on link prediction and triple classification; compared with the existing knowledge graph embedding models alone, it achieves a better mean rank in link prediction and higher triple classification accuracy on the FB15K237, WN18 and WN18RR datasets.
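
For context, the conventional negative sampling that the abstract contrasts ANG with can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper name `corrupt_triplet`, the toy entities, and the uniform random choice of the replacement entity are all hypothetical. In ANG, the replacement entity would instead come from the encoder-decoder generator.

```python
import random


def corrupt_triplet(triplet, entities, known_triplets, rng=random):
    """Return a negative sample by replacing the head or tail entity.

    Hypothetical helper for illustration only: the replacement entity is
    drawn uniformly at random, and candidates that reproduce a known
    positive triplet are rejected.
    """
    head, relation, tail = triplet
    replace_head = rng.random() < 0.5
    # Exclude the entity being replaced so the result always differs.
    original = head if replace_head else tail
    candidates = [e for e in entities if e != original]
    rng.shuffle(candidates)
    for entity in candidates:
        if replace_head:
            negative = (entity, relation, tail)
        else:
            negative = (head, relation, entity)
        if negative not in known_triplets:
            return negative
    raise ValueError("no valid negative sample exists for this triplet")


# Toy knowledge graph (assumed data, not from the paper).
entities = ["Hefei", "Anhui", "China", "Beijing"]
positives = {
    ("Hefei", "located_in", "Anhui"),
    ("Anhui", "located_in", "China"),
}
negative = corrupt_triplet(
    ("Hefei", "located_in", "Anhui"), entities, positives, random.Random(0)
)
```

Because the replacement is drawn without regard to the current embeddings, many such negatives are implausible and become easy for the model to reject as training proceeds, which is the weakness the abstract describes.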

Key words: knowledge representation learning, knowledge graph, generative adversarial network, deep learning, knowledge graph embedding
