Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (5): 1450-1459.DOI: 10.11772/j.issn.1001-9081.2025050583

• Artificial intelligence •

Continual few-shot event detection model based on hierarchical adaptive fusion mechanism and category boundary distillation

Jie HU1,2,3, Tong XU1, Yan ZHANG1,2,3

  1. School of Computer Science, Hubei University, Wuhan Hubei 430062, China
  2. Hubei Key Laboratory of Big Data Intelligent Analysis and Application (Hubei University), Wuhan Hubei 430062, China
  3. Key Laboratory of Intelligent Sensing System and Security, Ministry of Education (Hubei University), Wuhan Hubei 430062, China
  • Received: 2025-05-28 Revised: 2025-08-13 Accepted: 2025-08-20 Online: 2025-09-05 Published: 2026-05-10
  • Contact: Jie HU
  • About author: XU Tong, born in 2001, M. S. candidate. Her research interests include natural language processing.
    ZHANG Yan, born in 1974, Ph. D., professor. His research interests include software engineering and information security.
  • Supported by:
    National Natural Science Foundation of China (61977021)

基于层次化自适应融合机制和类别边界蒸馏的持续少样本事件检测模型

胡婕1,2,3, 徐彤1, 张龑1,2,3

  1. 湖北大学 计算机学院,武汉 430062
  2. 大数据智能分析与行业应用湖北省重点实验室(湖北大学),武汉 430062
  3. 智能感知系统与安全教育部重点实验室(湖北大学),武汉 430062
  • 通讯作者: 胡婕
  • 作者简介:徐彤(2001—),女,湖北武汉人,硕士研究生,主要研究方向:自然语言处理
    张龑(1974—),男,湖北宜昌人,教授,博士,CCF会员,主要研究方向:软件工程、信息安全。
  • 基金资助:
    国家自然科学基金资助项目(61977021)

Abstract:

To address catastrophic forgetting and limited few-shot generalization in Continual Few-shot Event Detection (CFED), a CFED model based on a hierarchical adaptive fusion mechanism and category boundary distillation was proposed. Firstly, feature reconstruction was introduced, combining global average pooling with a learnable mapping to enhance the structural modeling of text representations and optimize their feature distribution. Secondly, a hierarchical adaptive fusion mechanism was designed to dynamically integrate shallow, intermediate, and deep features from the pretrained model: Gaussian perturbation was introduced to improve feature robustness, and a self-attention mechanism was employed to achieve adaptive weighted fusion of cross-layer features. Finally, a category boundary distillation strategy was proposed, which aligned the class distributions of old and new tasks using Kullback-Leibler (KL) divergence and refined decision boundary features via cosine similarity, thereby mitigating knowledge forgetting. The proposed model was compared with 9 baseline models and the large language model GPT-3.5-Turbo on the MAVEN and ACE2005 datasets. On MAVEN, its average F1 score across 5 subtasks exceeded that of the second-best model, HANet (Hierarchical Augmentation Networks), by 2.92 and 1.80 percentage points under the 4-way 5-shot and 4-way 10-shot settings, respectively. On ACE2005, its average F1 score across 5 subtasks exceeded those of the second-best models, HANet and Combined Retrain, by 1.83 and 2.00 percentage points under the 2-way 5-shot and 2-way 10-shot settings, respectively. Compared with GPT-3.5-Turbo, the proposed model improved the average F1 score by 3.47 and 8.77 percentage points on MAVEN and by 4.47 and 2.39 percentage points on ACE2005 under the 2-way 1-shot and 2-way 2-shot settings, respectively. These results demonstrate the superior performance of the proposed model.
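
The distillation step described in the abstract, a KL-divergence term on class distributions plus a cosine-similarity term on features, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not give the exact loss, so the function names, the temperature, the weight `feat_weight`, and the choice to compare only the logits of previously seen classes are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def distillation_loss(old_logits, new_logits, old_feat, new_feat,
                      temperature=2.0, feat_weight=1.0):
    """Hypothetical combined loss: KL term on classes + cosine term on features.

    Assumption: new_logits starts with the logits of the old classes, so only
    that prefix is compared with the old model's output.
    """
    n_old = len(old_logits)
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits[:n_old], temperature)
    kl_term = kl_divergence(p_old, p_new)                    # class-distribution alignment
    cos_term = 1.0 - cosine_similarity(old_feat, new_feat)   # feature alignment
    return kl_term + feat_weight * cos_term


if __name__ == "__main__":
    # Old model knows 2 classes; new model has 3 (2 old + 1 new).
    loss = distillation_loss([2.0, 0.5], [1.5, 0.7, 0.1], [1.0, 0.0], [0.8, 0.6])
    print(f"distillation loss: {loss:.4f}")
```

Both terms vanish when the new model reproduces the old model's predictions and features on the old classes, and grow as the two drift apart, which is the behavior a forgetting penalty needs.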

Key words: Continual Few-shot Event Detection (CFED), hierarchical adaptive fusion mechanism, feature reconstruction, category boundary distillation

摘要:

针对持续少样本事件检测(CFED)任务中面临的灾难性遗忘与小样本泛化难题,提出一种基于层次化自适应融合机制与类别边界蒸馏的CFED模型。首先,引入特征重构,结合全局平均池化与可学习映射,增强文本表征的结构建模能力并优化其特征分布;其次,设计层次化自适应融合机制,动态整合预训练模型浅层、中层与深层特征,引入高斯扰动增强特征鲁棒性,并通过自注意力机制实现跨层次特征的自适应加权融合;最后,提出类别边界蒸馏策略,利用KL(Kullback-Leibler)散度对齐新旧任务类别分布,结合余弦相似度优化决策边界特征,缓解知识遗忘。在数据集MAVEN和ACE2005上与9个基线模型以及大语言模型GPT-3.5-Turbo进行实验对比,结果表明所提模型在MAVEN上4-way 5-shot和4-way 10-shot的5个子任务平均F1值比次优模型HANet(Hierarchical Augmentation Networks)分别提升了2.92和1.80个百分点;在ACE2005上2-way 5-shot和2-way 10-shot的5个子任务平均F1值比次优模型HANet和Combined Retrain分别提升了1.83和2.00个百分点;相较于GPT-3.5-Turbo,所提模型在MAVEN上2-way 1-shot和2-way 2-shot的平均F1值分别提升了3.47和8.77个百分点,在ACE2005上分别提升了4.37和2.39个百分点,验证了该模型性能更优。

关键词: 持续少样本事件检测, 层次化自适应融合机制, 特征重构, 类别边界蒸馏

CLC Number: