Journal of Computer Applications, 2025, Vol. 45, Issue (8): 2470-2476. DOI: 10.11772/j.issn.1001-9081.2024081076

• The 21st CCF Conference on Web Information Systems and Applications (WISA 2024)

Heterogeneous graph attention network for relation extraction based on feature combination

Jiaxin YAN1,2,3, Yanping CHEN1,2,3, Weizhe YANG1,2,3, Ruizhang HUANG1,2,3, Yongbin QIN1,2,3

  1. Engineering Research Center of Ministry of Education for Text Computing and Cognitive Intelligence, Guizhou University, Guiyang, Guizhou 550025, China
    2. State Key Laboratory of Public Big Data (Guizhou University), Guiyang, Guizhou 550025, China
    3. College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
  • Received: 2024-08-01 Revised: 2024-08-12 Accepted: 2024-08-15 Online: 2024-09-12 Published: 2025-08-10
  • Contact: Yanping CHEN
  • About the authors: YAN Jiaxin, born in 2000, M. S. candidate. His research interests include natural language processing and information extraction.
    YANG Weizhe, born in 1996, Ph. D. candidate. His research interests include natural language processing.
    HUANG Ruizhang, born in 1979, Ph. D., professor. Her research interests include data fusion analysis, text mining, network mining, and knowledge discovery.
    QIN Yongbin, born in 1980, Ph. D., professor. His research interests include big data management and application, and multi-source data fusion.
  • Supported by:
    Major Science and Technology Foundation of Guizhou Province ([2024]003); National Key Research and Development Program of China (2023YFC3304500); National Natural Science Foundation of China (62166007)

Relation extraction with a heterogeneous graph attention network based on feature combination

闫家鑫1,2,3, 陈艳平1,2,3, 杨卫哲1,2,3, 黄瑞章1,2,3, 秦永彬1,2,3

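  1. Engineering Research Center of Ministry of Education for Text Computing and Cognitive Intelligence, Guizhou University, Guiyang 550025
    2. State Key Laboratory of Public Big Data (Guizhou University), Guiyang 550025
    3. College of Computer Science and Technology, Guizhou University, Guiyang 550025
  • Corresponding author: CHEN Yanping (陈艳平)
  • About the authors: YAN Jiaxin (闫家鑫), born in 2000, male, from Chengdu, Sichuan, M. S. candidate, CCF member. His research interests include natural language processing and information extraction.
    YANG Weizhe (杨卫哲), born in 1996, male, from Xinxiang, Henan, Ph. D. candidate. His research interests include natural language processing.
    HUANG Ruizhang (黄瑞章), born in 1979, female, from Tianjin, Ph. D., professor, CCF member. Her research interests include data fusion analysis, text mining, network mining, and knowledge discovery.
    QIN Yongbin (秦永彬), born in 1980, male, from Yantai, Shandong, Ph. D., professor, CCF senior member. His research interests include big data management and application, and multi-source data fusion.
  • Supported by:
    Key Project of the Science and Technology Foundation of Guizhou Province ([2024]003); National Key Research and Development Program of China (2023YFC3304500); National Natural Science Foundation of China (62166007)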

Abstract:

Relation extraction aims to identify the predefined semantic relation between two entities in a sentence. Traditional relation extraction methods based on graph neural networks generally use a dependency tree to construct the graph representation of the sentence. However, a graph built from a dependency tree has limited expressive power and cannot fully capture the rich syntactic structure information of the target entities. To address these issues, a relation extraction method using a Heterogeneous Graph ATtention network (HGAT) based on feature combination was proposed. Firstly, atomic features were extracted from the sentence, and composite features were obtained by combining these atomic features. Secondly, the composite features and relation labels were represented as two types of nodes in a heterogeneous graph to construct a “feature-relation bipartite graph”. Finally, a graph attention network was used to update the nodes dynamically and perform relation extraction. The method makes effective use of the composite features and the syntactic structure information in the sentence, thereby improving relation extraction performance. Experimental results on the ACE05 English dataset and the SemEval-2010 task 8 dataset show that the method achieves F1-scores of 84.11% and 90.67%, respectively, demonstrating its effectiveness.

Key words: relation extraction, atomic feature, feature combination, heterogeneous graph, graph attention network
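
The three steps named in the abstract (atomic features, composite features, feature-relation bipartite graph with attention-based updates) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify which atomic features are used, how they are combined, or how nodes are embedded, so the feature names, the pairwise combination rule, the hash-based toy embeddings, the relation labels, and all function names below are assumptions chosen only for illustration. The attention score follows the common graph-attention form (LeakyReLU of a dot product between an attention vector and the concatenated node vectors), without learned weights, and only the relation nodes are updated.

```python
import hashlib
import math
from itertools import combinations

DIM = 8  # toy embedding size (assumption, not from the paper)

def atomic_features(tokens, e1, e2):
    """Hypothetical atomic features for an entity pair in a sentence."""
    i, j = tokens.index(e1), tokens.index(e2)
    lo, hi = min(i, j), max(i, j)
    return {
        "head1": e1,
        "head2": e2,
        "between": "_".join(tokens[lo + 1:hi]),
        "order": "e1_first" if i < j else "e2_first",
    }

def composite_features(atoms):
    """Combine atomic features pairwise (one possible combination rule)."""
    keys = sorted(atoms)
    return [f"{a}={atoms[a]}&{b}={atoms[b]}" for a, b in combinations(keys, 2)]

def embed(name, dim=DIM):
    """Deterministic toy vector standing in for a learned embedding."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:dim]]

def build_bipartite_graph(features, relations):
    """Connect every composite-feature node to every relation-label node."""
    return [(f, r) for r in relations for f in features]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_update(relations, edges):
    """Update each relation node as an attention-weighted sum of its feature neighbours."""
    a = embed("attention-vector", 2 * DIM)  # fixed stand-in for a learned vector
    updated, weights = {}, {}
    for r in relations:
        h_r = embed(r)
        neighbours = [f for f, rel in edges if rel == r]
        # Score each edge from the concatenation [h_r || h_f].
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, h_r + embed(f))))
            for f in neighbours
        ]
        # Numerically stable softmax over the neighbours of r.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        alphas = [e / z for e in exps]
        weights[r] = dict(zip(neighbours, alphas))
        updated[r] = [
            sum(al * embed(f)[k] for al, f in zip(alphas, neighbours))
            for k in range(DIM)
        ]
    return updated, weights

tokens = "the company hired a new engineer".split()
atoms = atomic_features(tokens, "company", "engineer")
features = composite_features(atoms)
relations = ["Employment", "Other"]  # hypothetical label set
edges = build_bipartite_graph(features, relations)
updated, weights = attention_update(relations, edges)
```

In a trained model the embeddings and the attention vector would be learned, and the updated relation-node representations would feed a classifier; the sketch only shows how the graph is assembled and how one attention step aggregates feature nodes into relation nodes.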

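Abstract (Chinese version):

Relation extraction aims to extract the predefined semantic relation between two entities in a sentence. Traditional relation extraction methods based on graph neural networks generally build the graph representation of a sentence from its dependency tree; however, a graph built from a dependency tree has limited expressive power and cannot fully capture the rich syntactic structure information of the target entities. To address these problems, a relation extraction method using a Heterogeneous Graph ATtention network (HGAT) based on feature combination is proposed. First, atomic features are extracted from the sentence and combined to obtain the composite features of the sentence. Second, the composite features and relation labels are represented as two types of nodes in a heterogeneous graph to construct a "feature-relation bipartite graph". Finally, a graph attention network dynamically updates the nodes to perform relation extraction. The proposed method makes effective use of the composite features and the syntactic structure information in the sentence, thereby improving relation extraction performance. Experimental results on the ACE05 English dataset and the SemEval-2010 task 8 dataset show that the proposed method achieves F1 scores of 84.11% and 90.67%, respectively, demonstrating its effectiveness.

Key words: relation extraction, atomic feature, feature combination, heterogeneous graph, graph attention network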
