Journal of Computer Applications

Second-order meta-learning strategy-based few-shot object detection method

  

  • Received: 2025-01-06 Revised: 2025-04-18 Online: 2025-04-27 Published: 2025-04-27

Few-shot object detection algorithm based on a second-order meta-learning strategy

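LIU Zhoufeng1, SHAO Xinnan1, WU Wentao1, YU Miao1, LI Chunlei2

  1. Zhongyuan University of Technology
  2. School of Electronic and Information, Zhongyuan University of Technology, Zhengzhou 450007
  • Corresponding author: SHAO Xinnan
  • Funding:
    National Natural Science Foundation of China; National Natural Science Foundation of China; Zhongyuan Science and Technology Innovation Leading Talent Project; Henan Province Science and Technology Research Project; Henan Province Science and Technology Research Project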

Abstract: Current meta-learning-based few-shot object detection methods commonly rely on a region proposal network (RPN) to generate candidate boxes. However, the pre-defined candidate boxes usually contain many invalid regions, which introduce noise and interfere with the accurate localization of few-shot targets when training samples are scarce. In addition, existing first-order (single-stage) meta-learning strategies mainly use feature-level support information to guide query learning; because support samples are limited, the extracted features lack global consistency, which weakens the enhancement of query information and reduces classification and localization accuracy. To address these issues, a few-shot object detection method based on a second-order meta-learning strategy (SM-FSOD) was proposed. First, the RPN was replaced with a Transformer encoder-decoder structure, which, together with a ResNet-101 backbone, formed an end-to-end detection framework. Then, in the first meta-learning stage, a support-query feature parallel encoder was used to measure the correlation between query features and support information and thereby guide query discrimination. In the second stage, a global prototype feature integration method was applied, coupling the depth-aware information of the query features with the support features to enhance the perception of query information by the support information. Experimental results on the PASCAL-VOC 2007/2012 datasets showed that SM-FSOD improved AP50 (average precision at an IoU threshold of 0.5) by 1.40% to 4.02% compared with the RPN-based Meta-FRCNN. On the more challenging COCO dataset, SM-FSOD still achieved an AP50 improvement of 2.70% to 3.45% over Meta-FRCNN, even as the number of categories increased.

Key words: object detection, few-shot learning, meta-learning strategy, feature enhancement, Transformer
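
The AP50 results reported in the abstract count a predicted box as correct when its intersection over union (IoU) with a ground-truth box reaches 0.5. The following is a minimal sketch of that matching criterion only, not the evaluation code used in the paper; the function names `iou` and `is_match_at_50` and the (x1, y1, x2, y2) box format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: True when the prediction overlaps the ground truth enough for AP50."""
    return iou(pred_box, gt_box) >= threshold
```

Evaluation toolkits differ on whether an IoU of exactly 0.5 counts as a match; this sketch treats it as one. A full AP50 computation would additionally rank detections by confidence and integrate the precision-recall curve, which is omitted here.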

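Abstract (from the Chinese version): Current meta-learning-based few-shot object detection algorithms generally rely on a region proposal network (RPN) to generate candidate boxes. However, the pre-defined candidate boxes usually contain a large number of invalid regions; when training samples are scarce, these regions introduce noise and interfere with the accurate localization of few-shot targets. In addition, existing first-order meta-learning strategies rely mainly on the feature information of support samples to guide the model in learning few-shot query information. Because support samples are limited in the few-shot setting, the extracted features usually cover only local category information and cannot fully represent the category, which weakens the representation of query samples and lowers classification and localization accuracy. To solve these problems, a few-shot object detection algorithm based on a second-order meta-learning strategy (SM-FSOD) was proposed. First, the algorithm replaced the RPN with the encoder and decoder structure of the Transformer and used ResNet-101 as the backbone network, jointly building an end-to-end object detection algorithm. Then, on this basis, a second-order meta-learning strategy was constructed with two meta-learners. In the first stage of meta-learning, a support-query feature parallel encoder was adopted: two parallel self-attention mechanisms processed the input query features and support information and measured the correlation between them, which then guided the discrimination of few-shot query information. In the second stage, a global-information-oriented prototype feature integration method was adopted: depth-aware information was extracted from the query features in advance and coupled with the original support features, enhancing the ability of the support information to perceive few-shot query information. Experiments were conducted on the PASCAL-VOC 2007/2012 datasets, and average precision was compared with that of few-shot object detection algorithms such as Meta-FRCNN (Meta-learning based faster R-CNN) and FSOR-SR (Few-shot object detector with a spatial reasoning framework). Compared with the RPN-dependent Meta-FRCNN, the proposed SM-FSOD improved accuracy by 1.40% to 4.02% under the AP50 (Average precision) benchmark. Furthermore, experimental results on the more challenging COCO dataset showed that, with an increased number of sample categories, the proposed SM-FSOD still achieved an improvement of 2.70% to 3.45% over Meta-FRCNN under the AP50 benchmark.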

Key words (from the Chinese version): object detection, few-shot learning, meta-learning strategy, feature enhancement, Transformer
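
The first meta-learning stage described in the abstract passes query features and support information through two parallel self-attention branches and then measures the correlation between them. The abstract does not give the layer definitions, so the following is only a rough sketch under stated assumptions: single-head attention with identity projections (no learned weights), a mean-pooled support prototype, and cosine similarity as the correlation measure. All function names are hypothetical and none of this is claimed to be the SM-FSOD implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections), so there are no learned parameters.
    """
    d = len(tokens[0])
    encoded = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        encoded.append(
            [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        )
    return encoded


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def support_query_correlation(query_tokens, support_tokens):
    """Encode both inputs in parallel branches, then score each query token.

    Assumption: the support branch is mean-pooled into one class prototype,
    and correlation is measured with cosine similarity.
    """
    q_enc = self_attention(query_tokens)    # query branch
    s_enc = self_attention(support_tokens)  # support branch
    d = len(s_enc[0])
    prototype = [sum(t[j] for t in s_enc) / len(s_enc) for j in range(d)]
    return [cosine(q, prototype) for q in q_enc]
```

For example, `support_query_correlation([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.9, 0.1]])` returns one score per query token, and the first token, which points in the same direction as the support features, scores higher than the second. In a trained detector the projections would be learned and the scores would feed the classification head; this sketch only illustrates the data flow.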
