Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (6): 1713-1719. DOI: 10.11772/j.issn.1001-9081.2023060818

• The 38th CCF National Conference of Computer Applications (CCF NCCA 2023) •

Relation extraction method based on mask prompt and gated memory network calibration

Chao WEI1,2,3, Yanping CHEN1,2,3, Kai WANG1,2,3, Yongbin QIN1,2,3, Ruizhang HUANG1,2,3

  1. Text Computing & Cognitive Intelligence Engineering Research Center of National Education Ministry (Guizhou University), Guiyang Guizhou 550025, China
    2. State Key Laboratory of Public Big Data (Guizhou University), Guiyang Guizhou 550025, China
    3. College of Computer Science and Technology, Guizhou University, Guiyang Guizhou 550025, China
  • Received: 2023-06-26 Revised: 2023-08-16 Accepted: 2023-08-21 Online: 2023-08-30 Published: 2024-06-10
  • Contact: Yanping CHEN
  • About author: WEI Chao, born in 1999 in Bijie, Guizhou, M. S. candidate. His research interests include natural language processing and relation extraction.
    WANG Kai, born in 1995 in Zunyi, Guizhou, Ph. D. candidate. His research interests include natural language processing and relation extraction.
    QIN Yongbin, born in 1980 in Yantai, Shandong, Ph. D., professor, senior member of CCF. His research interests include big data and multi-source data fusion.
    HUANG Ruizhang, born in 1979 in Tianjin, Ph. D., professor, member of CCF. Her research interests include big data and data mining, and information extraction.
  • Supported by:
    National Natural Science Foundation of China (62166007)

Abstract:

To tackle the difficulty of mining entity relation semantics and the bias in relation prediction in Relation Extraction (RE) tasks, an RE method based on Mask prompt and Gated Memory Network Calibration (MGMNC) was proposed. First, the latent semantics between entities in the semantic space of a Pre-trained Language Model (PLM) were learned through the masks in prompts, and the discrete mask semantic spaces were interconnected by constructing a mask attention weight matrix. Then, a gated calibration network was used to integrate the mask representations carrying entity and relation semantics into the global semantics of the sentence. These calibrated representations were then used as relation prompts to calibrate the relation information, and the final sentence representation was mapped to the corresponding relation class. In this way, the masks in prompts were exploited more effectively and, combined with the strength of traditional fine-tuning in learning global sentence semantics, the potential of the PLM was fully unlocked. Experimental results show that the proposed method achieves an F1 score of 91.4% on the SemEval (SemEval-2010 Task 8) dataset, 1.0 percentage point higher than that of the generative method RELA (Relation Extraction with Label Augmentation), and F1 scores of 91.0% and 82.8% on the SciERC (Entities, Relations, and Coreference for Scientific knowledge graph construction) and CLTC (Chinese Literature Text Corpus) datasets, respectively. The proposed method clearly outperforms all the compared methods on the three datasets, which verifies its effectiveness, and it achieves better extraction performance than generative methods.
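
To make the pipeline described above concrete, here is a minimal, hypothetical PyTorch sketch of the calibration step as we read it from the abstract: the PLM states at the prompt's [MASK] positions are first interrelated through a self-attention layer (standing in for the mask attention weight matrix), then written into the global sentence representation through a learned gate, and finally mapped to relation classes. All module names, tensor shapes, and the exact fusion formula are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the MGMNC calibration step (illustration only).
import torch
import torch.nn as nn

class GatedMaskCalibration(nn.Module):
    """Interrelate [MASK] representations, then gate them into the
    global sentence representation before relation classification."""

    def __init__(self, hidden: int, num_classes: int):
        super().__init__()
        # Self-attention over mask positions: connects the otherwise
        # discrete mask semantic spaces (the "mask attention weight matrix").
        self.mask_attn = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        self.gate = nn.Linear(2 * hidden, hidden)   # gate computed from [sentence; mask]
        self.proj = nn.Linear(hidden, hidden)       # transform pooled mask semantics
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, sent_repr: torch.Tensor, mask_reprs: torch.Tensor) -> torch.Tensor:
        # sent_repr:  (B, H)    global sentence vector, e.g. the [CLS] state
        # mask_reprs: (B, M, H) PLM states at the M [MASK] slots of the prompt
        attn_out, _ = self.mask_attn(mask_reprs, mask_reprs, mask_reprs)
        mask_sem = attn_out.mean(dim=1)             # (B, H) pooled mask semantics
        # Gated, memory-style calibration: per dimension, decide how much
        # mask-derived relation semantics to write into the sentence semantics.
        g = torch.sigmoid(self.gate(torch.cat([sent_repr, mask_sem], dim=-1)))
        calibrated = g * sent_repr + (1.0 - g) * torch.tanh(self.proj(mask_sem))
        # Map the calibrated representation to the relation classes.
        return self.classifier(calibrated)

# Toy usage: batch of 2 sentences, 3 mask slots, hidden size 768,
# and 19 relation classes as in SemEval-2010 Task 8.
model = GatedMaskCalibration(hidden=768, num_classes=19)
logits = model(torch.randn(2, 768), torch.randn(2, 3, 768))
print(logits.shape)  # torch.Size([2, 19])
```

The sigmoid gate is where the memory-network flavor enters: rather than concatenating prompt and sentence features, it decides per dimension how much mask-derived relation semantics overwrites the original sentence semantics, which matches the abstract's description of calibrating the sentence representation with relation prompts.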

Key words: Relation Extraction (RE), mask, gated neural network, Pre-trained Language Model (PLM), prompt tuning

CLC number: