Journal of Computer Applications (《计算机应用》), official website ›› 2022, Vol. 42 ›› Issue (10): 3003-3010. DOI: 10.11772/j.issn.1001-9081.2021101792

• Artificial Intelligence •

基于层次结构感知的细粒度实体分类方法

谢斌红, 李书宁, 张英俊   

  1. 太原科技大学 计算机科学与技术学院,太原 030024
  • 收稿日期:2021-10-20 修回日期:2021-12-16 接受日期:2021-12-23 发布日期:2022-04-08 出版日期:2022-10-10
  • 通讯作者: 李书宁
  • 作者简介:第一联系人:谢斌红(1972—),男,山西万荣人,副教授,硕士,CCF会员,主要研究方向:智能化软件工程、机器学习
    李书宁(1996—),男,山西晋中人,硕士研究生,主要研究方向:深度学习、自然语言处理 18435998756@163.com
    张英俊(1969—),男,山西河津人,教授级高级工程师,硕士,CCF高级会员,主要研究方向:软件体系结构、智能软件、软件工程。
  • 基金资助:
    山西省重点研发计划项目(201803D121055)

Fine-grained entity typing method based on hierarchy awareness

Binhong XIE, Shuning LI, Yingjun ZHANG   

  1. College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, Shanxi 030024, China
  • Received: 2021-10-20 Revised: 2021-12-16 Accepted: 2021-12-23 Online: 2022-04-08 Published: 2022-10-10
  • Contact: Shuning LI
  • About author: XIE Binhong, born in 1972, M.S., associate professor. His research interests include intelligent software engineering and machine learning.
    LI Shuning, born in 1996, M.S. candidate. His research interests include deep learning and natural language processing.
    ZHANG Yingjun, born in 1969, M.S., professor-level senior engineer. His research interests include software architecture, intelligent software and software engineering.
  • Supported by:
    Shanxi Province Key Research and Development Program (201803D121055)

摘要:

针对现有细粒度实体分类(FGET)任务的工作多着眼于如何更好地编码实体和上下文的语义信息,而忽略了标签层次结构中标签之间的依赖关系及其本身的语义信息的问题,提出了一种基于层次结构感知的细粒度实体分类(HAFGET)方法。首先,利用基于图卷积网络(GCN)的层次结构编码器对不同层级标签之间的依赖关系进行建模,提出了基于层次结构感知的细粒度实体分类多标签注意力(HAFGET-MLA)模型和基于层次结构感知的细粒度实体分类实体特征传播(HAFGET-MFP)模型;然后,利用HAFGET-MLA模型和HAFGET-MFP模型对实体上下文特征进行层次结构感知和分类,前者通过层次编码器学习层次结构感知标签嵌入,并与实体特征通过注意力融合后进行标签分类,后者则直接将实体特征输入到层次结构编码器更新特征表示后进行分类。在FIGER、OntoNotes和KNET三个公开数据集上的实验结果表明,与基线模型相比,HAFGET-MLA模型和HAFGET-MFP模型的准确率和宏平均F1值均提升了2%以上,验证了所提方法能够有效提升分类效果。

关键词: 细粒度实体分类, 图卷积网络, 注意力机制, 条件概率, 层次结构编码器

Abstract:

Most existing work on Fine-Grained Entity Typing (FGET) focuses on how to better encode the semantic information of the mention and its context, while ignoring both the dependencies between labels in the label hierarchy and the semantic information of the labels themselves. To address this problem, a Hierarchy-Aware Fine-Grained Entity Typing (HAFGET) method was proposed. Firstly, a hierarchical encoder based on Graph Convolutional Network (GCN) was used to model the dependencies between labels at different levels, and on this basis two models were proposed: the Hierarchy-Aware Fine-Grained Entity Typing Multi-Label Attention (HAFGET-MLA) model and the Hierarchy-Aware Fine-Grained Entity Typing Mention Feature Propagation (HAFGET-MFP) model. Then, the two models were used to perform hierarchy-aware encoding and classification of the mention-context features. In the former, hierarchy-aware label embeddings were learned through the hierarchical encoder and fused with the mention features by attention before label classification. In the latter, the mention features were fed directly into the hierarchical encoder to update the feature representation before classification. Experimental results on three public datasets, FIGER, OntoNotes and KNET, show that, compared with the baseline model, both HAFGET-MLA and HAFGET-MFP improve accuracy and macro F1 score by more than 2%, which verifies that the proposed method can effectively improve typing performance.

Key words: Fine-Grained Entity Typing (FGET), Graph Convolutional Network (GCN), attention mechanism, conditional probability, hierarchical encoder
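
The two variants named in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: the toy label hierarchy, the undirected row-normalized adjacency, the single propagation step, and every function name (`normalized_adjacency`, `propagate`, `mla_scores`, `mfp_scores`) are assumptions chosen for illustration, and all weights are hand-set rather than learned.

```python
import math

# Hypothetical toy hierarchy (not from the paper): index -> type label.
LABELS = ["/person", "/person/artist", "/person/athlete", "/organization"]
EDGES = [(0, 1), (0, 2)]  # (parent, child) pairs


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matmul(x, y):
    return [[dot(row, col) for col in zip(*y)] for row in x]


def normalized_adjacency(n, edges):
    """Undirected adjacency with self-loops, row-normalized (an assumed choice)."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for parent, child in edges:
        a[parent][child] = a[child][parent] = 1.0
    return [[v / sum(row) for v in row] for row in a]


def propagate(a_hat, h, w):
    """Linear part of one graph-convolution step: A_hat @ H @ W."""
    return matmul(matmul(a_hat, h), w)


def relu(h):
    return [[max(0.0, v) for v in row] for row in h]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mla_scores(a_hat, label_emb, w, tokens):
    """MLA-style: encode labels over the hierarchy, attend over tokens, score."""
    labels = relu(propagate(a_hat, label_emb, w))  # hierarchy-aware label embeddings
    scores = []
    for lab in labels:
        att = softmax([dot(lab, tok) for tok in tokens])  # one weight per token
        ctx = [sum(a * tok[k] for a, tok in zip(att, tokens))
               for k in range(len(lab))]  # label-specific mention summary
        scores.append(sigmoid(dot(lab, ctx)))  # independent per-label probability
    return scores


def mfp_scores(a_hat, mention, proj, w):
    """MFP-style: turn the mention into per-label node features, then propagate."""
    h0 = [[dot(mention, p)] for p in proj]  # one scalar feature per label node
    h1 = propagate(a_hat, h0, w)  # evidence flows along hierarchy edges
    return [sigmoid(row[0]) for row in h1]


A_HAT = normalized_adjacency(len(LABELS), EDGES)

# Toy usage: the mention only carries direct evidence for "/person".
mention = [1.0, 0.0]
proj = [[3.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
print(dict(zip(LABELS, mfp_scores(A_HAT, mention, proj, [[1.0]]))))
```

In this toy run, the evidence for `/person` spreads to its two child types through the hierarchy edges, while the unconnected `/organization` node stays at a score of 0.5. A real model would learn the projection and GCN weights, and the paper's encoder may treat parent-to-child and child-to-parent edges differently from the symmetric adjacency assumed here.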