Journal of Computer Applications (《计算机应用》): sole official website


结合语义增强和感知注意力的关系抽取方法

YANG Dawei (杨大伟)1, XU Xihai (徐西海)2, SONG Wei (宋威)1

  1. Jiangnan University (江南大学)
  2. Wuxi Chelian Tianxia Information Technology Co., Ltd. (无锡车联天下信息技术有限公司)
  • 收稿日期:2024-06-11 修回日期:2024-08-08 发布日期:2024-09-10 出版日期:2024-09-10
  • Corresponding author: YANG Dawei (杨大伟)
  • 基金资助:
    国家自然科学基金;江苏省自然科学基金

Relation extraction method combining semantic enhancement and perception attention

  • Received:2024-06-11 Revised:2024-08-08 Online:2024-09-10 Published:2024-09-10
  • Supported by:
    National Natural Science Foundation of China;Natural Science Foundation of Jiangsu Province

摘要: 针对文本特征提取时缺乏考虑句子的上下文判别性特征以及未能充分利用实例和关系标签之间的关联信息的问题,提出了一种结合语义增强和感知注意力的关系抽取方法。首先,在句子特征编码阶段,构建语义增强机制提取句子的显著性语义特征,通过实体感知词嵌入和显著特征感知得到显著信息增强的句子表示。其次,设计感知注意力机制来整合句子特征,通过感知句子与关系标签之间的语义信息、句子的实体类型与对应关系的实体类型之间的一致性信息和句子之间的相似性信息来评估句子与关系标签的匹配程度,以充分利用包中实例与关系标签的依赖关系,进一步提高方法的降噪能力。最后利用分类器进行关系预测,并根据预测结果与实际结果的交叉熵调整网络参数。本文在NYT-10和GDS数据集上进行了广泛的实验,实验结果表明,在NYT-10数据集上,与基于BERT(Bidirectional Encoder Representations from Transformers)的关系抽取方法PARE(Passage-Attended Relation Extraction)相比,所提方法的AUC值提升了2.1个百分点,按置信度降序排列后前100、200 和300条数据精准率Precision@N的平均值(P@M)提升了2.4个百分点;在GDS数据集上的AUC值和P@M也分别达到了最高90.5%和97.8%,所提方法在上述2个数据集上均明显优于其他主流的远程监督关系抽取方法,验证了所提方法的有效性。在主流的远程监督关系抽取任务中,所提方法能有效地提升模型对数据特征的学习能力。

关键词: 远程监督, 关系抽取, 语义增强, 感知注意力, 降噪

Abstract: To address the problems that text feature extraction neglects the contextual discriminative features of sentences and fails to fully exploit the association between instances and relation labels, a relation extraction method combining semantic enhancement and perception attention was proposed. Firstly, in the sentence feature encoding stage, a semantic enhancement mechanism was constructed to extract the salient semantic features of sentences, and a sentence representation enriched with salient information was obtained through entity-aware word embedding and salient feature perception. Secondly, a perception attention mechanism was designed to integrate sentence features. It evaluated how well each sentence matched a relation label by perceiving three kinds of information: the semantic information between the sentence and the relation label, the consistency between the entity types in the sentence and the entity types of the corresponding relation, and the similarity among sentences. In this way, the dependencies between the instances in a bag and the relation labels were fully exploited, and the noise reduction capability of the method was further improved. Finally, a classifier was used for relation prediction, and the network parameters were adjusted according to the cross-entropy between the predicted and actual results. Extensive experiments were conducted on the NYT-10 and GDS datasets. On the NYT-10 dataset, compared with PARE (Passage-Attended Relation Extraction), a relation extraction method based on BERT (Bidirectional Encoder Representations from Transformers), the proposed method improved the AUC by 2.1 percentage points and improved P@M, the mean of Precision@N over the top 100, 200, and 300 predictions ranked in descending order of confidence, by 2.4 percentage points. On the GDS dataset, the AUC and P@M reached the highest values of 90.5% and 97.8%, respectively. The proposed method clearly outperformed other mainstream distant supervision relation extraction methods on both datasets, which verifies its effectiveness. In mainstream distant supervision relation extraction tasks, the proposed method can effectively improve the model's ability to learn data features.
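
The P@M figure reported in the abstract is the mean of Precision@N over the top 100, 200, and 300 predictions after sorting by confidence. A minimal sketch of that computation follows. The function names `precision_at_n` and `mean_precision`, the binary `labels` encoding (1 for a correct prediction, 0 otherwise), and the toy data are assumptions made for illustration; none of this is the authors' evaluation code.

```python
from typing import Sequence


def precision_at_n(scores: Sequence[float], labels: Sequence[int], n: int) -> float:
    """Fraction of correct predictions among the n most confident ones.

    scores: confidence of each prediction (hypothetical input format).
    labels: 1 if the prediction is correct, 0 otherwise (assumed encoding).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Rank prediction indices by confidence, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Count correct predictions in the top n and divide by n.
    return sum(labels[i] for i in ranked[:n]) / n


def mean_precision(
    scores: Sequence[float],
    labels: Sequence[int],
    cutoffs: Sequence[int] = (100, 200, 300),
) -> float:
    """Mean of Precision@N over the given cutoffs (P@M in the abstract)."""
    return sum(precision_at_n(scores, labels, n) for n in cutoffs) / len(cutoffs)


if __name__ == "__main__":
    # Toy data with small cutoffs, purely for illustration.
    toy_scores = [0.9, 0.8, 0.7, 0.6]
    toy_labels = [1, 0, 1, 1]
    print(precision_at_n(toy_scores, toy_labels, 2))
    print(mean_precision(toy_scores, toy_labels, cutoffs=(1, 2, 3)))
```

Dividing by `n` rather than by the number of available predictions follows the usual Precision@N convention, so a cutoff larger than the prediction list lowers the score instead of being silently truncated.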

Key words: distant supervision, relation extraction, semantic enhancement, perception attention, noise reduction

CLC number: