Journal of Computer Applications, 2025, Vol. 45, Issue (6): 1801-1808. DOI: 10.11772/j.issn.1001-9081.2024060776

• Artificial Intelligence •

Relation extraction method combining semantic enhancement and perception attention

Dawei YANG1, Xihai XU2, Wei SONG1,3

  1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  2. Wuxi Autolink Information Technology Company Limited, Wuxi, Jiangsu 214125, China
  3. Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence (Jiangnan University), Wuxi, Jiangsu 214122, China
  • Received: 2024-06-12 Revised: 2024-08-08 Accepted: 2024-08-16 Online: 2024-09-10 Published: 2025-06-10
  • Contact: Wei SONG
  • About the authors: YANG Dawei, born in 1995 in Tai'an, Shandong, M. S. candidate, CCF member. His research interests include natural language processing and relation extraction.
    XU Xihai, born in 1986 in Tai'an, Shandong, M. S. His research interests include intelligent driving and the development of intelligent driving products.
    SONG Wei, born in 1981 in Enshi, Hubei, Ph. D., professor. His research interests include data mining, machine learning, and pattern recognition. E-mail: songwei@jiangnan.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62076110)

Abstract:

To address two problems in text feature extraction, namely insufficient consideration of the contextual discriminative features of sentences and underuse of the association information between instances and relation labels, a method combining Semantic enhancement and Perception attention for Relation Extraction (SPRE) was proposed. Firstly, in the sentence feature encoding phase, a Semantic Enhancement Mechanism (SEM) was constructed to extract salient semantic features of sentences, and a sentence representation enhanced with salient information was obtained through entity-aware word embeddings and Salient Feature Perception (SFP). Secondly, a Perception Attention Mechanism (PAM) was designed to integrate sentence features. In this mechanism, the degree of matching between sentences and relation labels was evaluated by perceiving three kinds of information: the semantic information between sentences and relation labels, the consistency between the entity types of a sentence and the entity types of the corresponding relation, and the similarity between sentences. In this way, the dependencies between the instances in a bag and the relation labels were fully utilized, further improving the noise reduction capability of the method. Finally, a classifier was used for relation prediction, and the network parameters were adjusted according to the cross-entropy between the predicted and actual results. Experimental results on the NYT-10 (New York Times 10) and GDS (Google Distant Supervision) datasets show that, on NYT-10, compared with the BERT (Bidirectional Encoder Representations from Transformers)-based relation extraction method PARE (Passage-Attended Relation Extraction), the proposed method improves the Area Under Curve (AUC) by 2.1 percentage points and improves P@M by 2.4 percentage points, where P@M is the mean of Precision@N (P@N) over the top 100, 200, and 300 instances ranked in descending order of confidence; on GDS, the AUC and P@M of the proposed method reach 90.5% and 97.8%, respectively. The proposed method significantly outperforms mainstream distant supervision relation extraction methods on both datasets, which verifies its effectiveness. These results indicate that, in mainstream distant supervision relation extraction tasks, the proposed method effectively improves the model's ability to learn data features.

Key words: distant supervision, relation extraction, semantic enhancement, perception attention, noise reduction
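
The abstract reports P@M as the mean of Precision@N over the top 100, 200, and 300 predictions ranked by confidence. The following is a minimal sketch of that metric, not the authors' evaluation code. The function names, the toy scores and labels, and the small cutoffs (2, 4, 6) standing in for 100, 200, and 300 are assumptions made for illustration.

```python
from typing import Sequence


def precision_at_n(scores: Sequence[float], labels: Sequence[int], n: int) -> float:
    """Fraction of correct predictions among the n highest-confidence ones.

    scores: model confidence for each predicted relation fact (hypothetical input).
    labels: 1 if the prediction is correct, 0 otherwise.
    """
    if n <= 0 or n > len(scores):
        raise ValueError("n must be between 1 and the number of predictions")
    # Rank prediction indices by confidence, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:n]) / n


def precision_at_m(
    scores: Sequence[float],
    labels: Sequence[int],
    cutoffs: Sequence[int] = (100, 200, 300),
) -> float:
    """Mean of P@N over the given cutoffs (the P@M described in the abstract)."""
    return sum(precision_at_n(scores, labels, n) for n in cutoffs) / len(cutoffs)


# Toy data: only 6 predictions, so cutoffs (2, 4, 6) replace (100, 200, 300).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

p_at_2 = precision_at_n(scores, labels, 2)                    # 2 of top 2 correct
p_at_m = precision_at_m(scores, labels, cutoffs=(2, 4, 6))    # mean of 1.0, 0.75, 0.5
print(p_at_2, p_at_m)
```

On the toy data, P@2, P@4, and P@6 are 1.0, 0.75, and 0.5, so their mean is 0.75. With a real ranked prediction list, the default cutoffs reproduce the 100/200/300 setting named in the abstract.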

CLC Number: