Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (6): 1601-1606. DOI: 10.11772/j.issn.1001-9081.2019111959

• Artificial intelligence •

Relation extraction method based on dynamic label

XUE Lu, SONG Wei   

  1. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Received: 2019-11-18 Revised: 2020-01-08 Online: 2020-06-10 Published: 2020-06-18
  • Contact: SONG Wei (born 1981)
  • About author: XUE Lu, born in 1994 in Chongqing, female, M. S. candidate. Her research interests include natural language processing and relation extraction. SONG Wei, born in 1981 in Enshi, Hubei, male, Ph. D., associate professor, CCF member. His research interests include data mining and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61673193), the Fundamental Research Funds for the Central Universities (JUSRP51635B), the China Postdoctoral Science Foundation (2017M621625), the Natural Science Foundation of Jiangsu Province (BK20181341).

Abstract: Relation extraction methods trained on distant supervision datasets suffer from a large amount of label noise. To address this problem, a dynamic label method was proposed for a relation extraction model based on a hierarchical attention mechanism. Firstly, the concept of generating dynamic labels from the similarity of relation categories was introduced. Since instances with the same relation label contain similar feature information, computing the relation-category similarity of the feature information helps to generate a dynamic label that corresponds to that feature information. Secondly, the scoring function of the dynamic label method was used to evaluate whether a distant supervision label is noise, and thereby to decide whether a new label should be generated to replace it; adjusting the distant supervision labels in this way suppresses the influence of label noise on the model. Finally, the hierarchical attention mechanism was updated according to the dynamic labels so that it focuses on effective instances, relearns the importance of each effective instance, and further extracts key relation feature information. The experimental results indicate that, compared with the original hierarchical attention mechanism relation extraction model, the proposed method improves the Micro and Macro scores by 1.3 and 1.9 percentage points respectively, achieves dynamic correction of noisy labels, and improves the relation extraction ability of the model.

Key words: relation extraction, distant supervision, dynamic label method, scoring function, hierarchical attention mechanism
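
The label-adjustment step described in the abstract can be illustrated with a small sketch. This is not the authors' scoring function, which the abstract does not specify. It assumes, purely for illustration, that each relation category has a representative vector, that similarity is cosine similarity, and that a distant supervision label is replaced when its score falls below a fixed fraction of the best score. The names `dynamic_label`, `relation_vectors`, and `ratio` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_label(feature, relation_vectors, ds_label, ratio=0.5):
    """Return (label, scores) for one feature vector.

    Assumed rule (hypothetical, not taken from the paper): score every
    relation category by softmax over cosine similarities, then keep the
    distant supervision label unless its score is below `ratio` times
    the highest score, in which case the best-scoring category replaces it.
    """
    sims = [cosine(feature, r) for r in relation_vectors]
    scores = softmax(sims)
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[ds_label] < ratio * scores[best]:
        return best, scores  # treat the distant supervision label as noise
    return ds_label, scores  # the distant supervision label is plausible


# Toy data: three relation categories in a 2-dimensional feature space.
relation_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
feature = [0.1, 1.0]  # closest to category 1

noisy_label, _ = dynamic_label(feature, relation_vectors, ds_label=0)
kept_label, _ = dynamic_label(feature, relation_vectors, ds_label=2)
print(noisy_label, kept_label)
```

In this toy setting the label 0 is far from the feature vector and is replaced by category 1, while the label 2 is close enough to be kept. A real system would learn the relation representations and the decision rule jointly with the attention model rather than use a fixed ratio.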
