Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (12): 3790-3797. DOI: 10.11772/j.issn.1001-9081.2023121731
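Received:
2023-12-18
Revised:
2024-02-14
Accepted:
2024-02-28
Online:
2024-03-21
Published:
2024-12-10
Corresponding author:
Man WU
About the author:
Keyi FU (born 2000), female, from Hengyang, Hunan, M. S. candidate. Her main research interests are few-shot object detection and few-shot learning.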
Keyi FU1, Gaocai WANG1, Man WU1,2,3
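Abstract:
In existing few-shot object detection methods, the Region Proposal Network (RPN) is usually trained on base-class data and then used to generate candidate boxes for novel classes. However, novel-class data are much scarcer than base-class data, and introducing them may bring in complex backgrounds that differ from the target objects, so the RPN mistakes background for foreground and misses candidate boxes with high Intersection over Union (IoU) values. To address these problems, a few-shot object detection method based on an improved RPN and feature aggregation (IFA-FSOD) is proposed. First, the RPN is improved by designing a metric-based nonlinear classifier inside it, which computes the similarity between the features extracted by the backbone network and the novel-class features; this raises the recall of novel-class candidate boxes and thereby selects high-IoU candidate boxes. Second, an attention-based Feature Aggregation Module (FAM) is introduced into Region of Interest Align (RoI Align), and grids of different scales are designed to obtain more comprehensive information and feature representations, which alleviates the loss of feature information caused by scale differences. Experimental results show that, compared with the QA-FewDet (Query Adaptive Few-shot object Detection) method, IFA-FSOD improves the novel-class average precision at 50% IoU (nAP50) by 4.5 percentage points in the 10-shot setting on Novel Set 3 of the PASCAL VOC dataset. Compared with the FsDetView (Few-shot object Detection and Viewpoint estimation) method, IFA-FSOD improves the mean Average Precision (mAP) on the novel classes of the COCO dataset by 0.2 and 0.8 percentage points in the 10-shot and 30-shot settings, respectively. These results indicate that the improved RPN and feature aggregation (IFA) effectively improve detection performance on target classes in few-shot settings, and address the problems of missed high-IoU candidate boxes and incomplete capture of feature information.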
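To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It computes IoU between two boxes, ranks candidate boxes by plain cosine similarity to a class prototype (a simple stand-in for the learned metric-based nonlinear classifier, whose exact form is not given here), and measures the recall of ground-truth boxes at an IoU threshold. The function names, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions made for illustration.

```python
import math


def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def rank_by_similarity(candidates, prototype):
    """Sort (box, feature) pairs by similarity to the prototype, best first.

    Assumption: cosine similarity replaces the paper's learned classifier.
    """
    scored = [(cosine_similarity(feature, prototype), box)
              for box, feature in candidates]
    return sorted(scored, key=lambda item: item[0], reverse=True)


def recall_at_iou(proposals, gt_boxes, threshold=0.5):
    """Fraction of ground-truth boxes matched by at least one proposal."""
    if not gt_boxes:
        return 0.0
    hits = sum(1 for gt in gt_boxes
               if any(iou(p, gt) >= threshold for p in proposals))
    return hits / len(gt_boxes)


# Toy data: two candidates with 2-D features and one class prototype.
prototype = [0.0, 2.0]
candidates = [((0, 0, 1, 1), [1.0, 0.0]), ((0, 0, 2, 2), [0.0, 1.0])]
ranked = rank_by_similarity(candidates, prototype)
top_boxes = [box for _, box in ranked[:1]]
print(recall_at_iou(top_boxes, [(0, 0, 2, 2), (5, 5, 6, 6)]))  # 0.5
```

In this toy case the candidate whose feature points in the same direction as the prototype is ranked first, and it covers one of the two ground-truth boxes, so the recall is 0.5.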
Keyi FU, Gaocai WANG, Man WU. Few-shot object detection method based on improved region proposal network and feature aggregation[J]. Journal of Computer Applications, 2024, 44(12): 3790-3797.
Method | Set 1, 1-shot | Set 1, 2-shot | Set 1, 3-shot | Set 1, 5-shot | Set 1, 10-shot | Set 2, 1-shot | Set 2, 2-shot | Set 2, 3-shot | Set 2, 5-shot | Set 2, 10-shot | Set 3, 1-shot | Set 3, 2-shot | Set 3, 3-shot | Set 3, 5-shot | Set 3, 10-shot |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
FSRW | 14.8 | 15.5 | 26.7 | 33.9 | 47.2 | 15.7 | 15.3 | 22.7 | 30.1 | 40.5 | 21.3 | 25.6 | 28.4 | 42.8 | 45.9 |
Meta R-CNN | 19.9 | 25.5 | 35.0 | 45.7 | 51.5 | 10.4 | 19.4 | 29.6 | 34.8 | 45.4 | 14.3 | 18.2 | 27.5 | 41.2 | 48.1 |
MetaDet | 18.9 | 20.6 | 30.2 | 36.8 | 49.6 | 21.8 | 23.1 | 27.8 | 31.7 | 43.0 | 20.6 | 23.9 | 29.4 | 43.9 | 44.1 |
TFA | 39.8 | 36.1 | 44.7 | 55.7 | 56.0 | 23.5 | 26.9 | 34.1 | 35.1 | 39.1 | 30.8 | 34.8 | 42.8 | 49.5 | 49.8 |
SRR-FSD | 47.8 | 50.5 | 51.3 | 55.2 | 56.8 | 32.5 | 35.3 | 39.1 | 40.8 | 43.8 | 40.1 | 41.5 | 44.3 | 46.9 | 46.4 |
QA-FewDet | 42.4 | 51.9 | 55.7 | 62.6 | 63.4 | 25.9 | 37.8 | 46.6 | 48.9 | 51.1 | 35.2 | 42.9 | 47.8 | 54.8 | 53.5 |
DA-FSOD | 33.4 | 45.1 | 47.1 | 53.1 | 60.0 | 24.2 | 31.4 | 39.5 | 43.9 | 49.0 | 24.5 | 36.1 | 42.3 | 49.2 | 54.5 |
FSCE | 32.9 | 44.0 | 46.8 | 52.9 | 59.7 | 23.7 | 30.6 | 38.4 | 38.4 | 48.5 | 22.6 | 33.4 | 39.5 | 47.3 | 54.0 |
G-FSD | 42.4 | 45.8 | 45.9 | 53.7 | 56.1 | 21.7 | 27.8 | 35.2 | 37.0 | 40.3 | 30.2 | 37.6 | 43.0 | 49.7 | 50.1 |
FSOD-UP | 43.8 | 47.8 | 50.3 | 55.4 | 61.7 | 31.2 | 30.5 | 41.2 | 42.2 | 48.3 | 35.5 | 39.7 | 43.9 | 50.6 | 53.5 |
DCNet | 33.9 | 37.4 | 43.7 | 51.1 | 59.6 | 23.2 | 24.8 | 30.6 | 36.7 | 46.6 | 32.3 | 34.9 | 39.7 | 42.6 | 50.7 |
IFA-FSOD | 30.1 | 52.3 | 58.7 | 62.4 | 65.4 | 27.7 | 37.9 | 38.0 | 42.5 | 48.6 | 21.9 | 44.5 | 49.8 | 55.5 | 58.0 |
Tab. 1 nAP50 on PASCAL VOC Novel dataset (%); "Set" denotes Novel Set
Method | 10-shot mAP | 10-shot mAP50 | 10-shot mAP75 | 30-shot mAP | 30-shot mAP50 | 30-shot mAP75 |
---|---|---|---|---|---|---|
TFA w/fc | 10.0 | 19.2 | 9.2 | 13.4 | 24.7 | 13.2 |
TFA w/cos | 10.0 | 19.1 | 9.3 | 13.7 | 24.9 | 13.4 |
FSRW | 5.6 | 12.3 | 4.6 | 9.1 | 19.0 | 7.6 |
MetaDet | 7.1 | 14.6 | 6.1 | 11.3 | 21.7 | 8.1 |
Meta R-CNN | 8.7 | 19.1 | 6.6 | 12.4 | 25.3 | 10.8 |
MPSR | 9.8 | 17.9 | 9.7 | 14.1 | 25.4 | 14.2 |
FSCE | 11.9 | - | 10.5 | 15.3 | - | 14.2 |
FsDetView | 12.5 | 27.3 | 9.8 | 14.7 | 30.6 | 12.2 |
SRR-FSD | 11.3 | 23.0 | 9.8 | 14.7 | 29.2 | 13.5 |
IFA-FSOD | 12.7 | 24.9 | 10.6 | 15.5 | 31.0 | 15.1 |
Tab. 2 mAP on MS COCO dataset (%)
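The gains quoted in the abstract can be rechecked directly from Tab. 1 and Tab. 2. The short Python sketch below copies the relevant cells and computes the differences in percentage points; the dictionary layout and the helper name `gain` are illustrative choices, not part of the source.

```python
# nAP50 (%), PASCAL VOC Novel Set 3, 10-shot (values copied from Tab. 1).
voc_set3_10shot = {"QA-FewDet": 53.5, "IFA-FSOD": 58.0}

# mAP (%), MS COCO novel classes (values copied from Tab. 2).
coco_map = {
    "FsDetView": {"10-shot": 12.5, "30-shot": 14.7},
    "IFA-FSOD": {"10-shot": 12.7, "30-shot": 15.5},
}


def gain(ours, baseline):
    """Absolute difference in percentage points, rounded to one decimal."""
    return round(ours - baseline, 1)


voc_gain = gain(voc_set3_10shot["IFA-FSOD"], voc_set3_10shot["QA-FewDet"])
coco_gains = {
    shot: gain(coco_map["IFA-FSOD"][shot], coco_map["FsDetView"][shot])
    for shot in ("10-shot", "30-shot")
}

print(voc_gain)    # 4.5
print(coco_gains)  # {'10-shot': 0.2, '30-shot': 0.8}
```

These are absolute differences between two percentages (percentage points), not relative improvements.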
Metric RPN | FAM | 1-shot | 2-shot | 3-shot | 5-shot | 10-shot |
---|---|---|---|---|---|---|
× | × | 29.1 | 48.5 | 53.0 | 56.2 | 60.8 |
× | √ | 29.4 | 50.9 | 57.9 | 60.7 | 63.6 |
√ | × | 29.9 | 51.5 | 58.5 | 59.8 | 64.1 |
√ | √ | 30.1 | 52.3 | 58.7 | 62.4 | 65.4 |
Tab. 3 Ablation experimental results on PASCAL VOC Novel Set 1 (inference nAP50 under different few-shot settings, %)
Metric RPN | FAM | 10-shot | 30-shot |
---|---|---|---|
× | × | 20.4 | 28.1 |
× | √ | 22.8 | 29.4 |
√ | × | 23.3 | 30.2 |
√ | √ | 24.9 | 31.0 |
Tab. 4 Ablation experimental results on MS COCO dataset (mAP50, %)
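The contribution of each component in Tab. 3 and Tab. 4 can be read off as a gain over the baseline row (neither Metric RPN nor FAM). The following is a small Python sketch with values copied from the two tables; the configuration names `baseline`, `fam_only`, `rpn_only`, and `full` are hypothetical labels introduced here for the four rows.

```python
# nAP50 (%) on PASCAL VOC Novel Set 1 for 1/2/3/5/10-shot (from Tab. 3).
voc_nap50 = {
    "baseline": [29.1, 48.5, 53.0, 56.2, 60.8],
    "fam_only": [29.4, 50.9, 57.9, 60.7, 63.6],
    "rpn_only": [29.9, 51.5, 58.5, 59.8, 64.1],
    "full": [30.1, 52.3, 58.7, 62.4, 65.4],
}

# mAP50 (%) on MS COCO for 10/30-shot (from Tab. 4).
coco_map50 = {
    "baseline": [20.4, 28.1],
    "fam_only": [22.8, 29.4],
    "rpn_only": [23.3, 30.2],
    "full": [24.9, 31.0],
}


def gains_over_baseline(table):
    """Per-setting gain of each configuration over the baseline row."""
    base = table["baseline"]
    return {
        name: [round(value - ref, 1) for value, ref in zip(values, base)]
        for name, values in table.items()
        if name != "baseline"
    }


voc_gains = gains_over_baseline(voc_nap50)
coco_gains = gains_over_baseline(coco_map50)

print(voc_gains["full"])   # [1.0, 3.8, 5.7, 6.2, 4.6]
print(coco_gains["full"])  # [4.5, 2.9]
```

In every setting of both tables, the gain of the full model is smaller than the sum of the two single-component gains, which suggests that the two modules partly overlap in what they correct. This is a reading of the reported numbers, not a claim made by the source.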
1 | SHI Y Y, SHI D X, QIAO Z T, et al. A survey on recent advances in few-shot object detection[J]. Chinese Journal of Computers, 2023, 46(8): 1753-1780. |
2 | HUANG Y W, DOU H, XIAO G G. Few-shot object detection based on fusion of classification correction and sample amplification[J]. Computer Engineering and Applications, 2024, 60(1): 254-262. |
3 | KÖHLER M, EISENBACH M, GROSS H M. Few-shot object detection: a comprehensive survey[J]. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(9): 11958-11978. |
4 | KANG B, LIU Z, WANG X, et al. Few-shot object detection via feature reweighting[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8419-8428. |
5 | WANG X, HUANG T E, DARRELL T, et al. Frustratingly simple few-shot object detection[C]// Proceedings of the 37th International Conference on Machine Learning. New York: JMLR.org, 2020: 9919-9928. |
6 | YAN X, CHEN Z, XU A, et al. Meta R-CNN: towards general solver for instance-level low-shot learning[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9576-9585. |
7 | REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. |
8 | HU H, BAI S, LI A, et al. Dense relation distillation with context-aware aggregation for few-shot object detection[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10180-10189. |
9 | HAN G, HUANG S, MA J, et al. Meta Faster R-CNN: towards accurate few-shot object detection with attentive feature alignment[C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 780-789. |
10 | LE JEUNE P, MOKRAOUI A. A comparative attention framework for better few-shot object detection on aerial images[EB/OL]. [2023-11-10]. . |
11 | XIAO Y, LEPETIT V, MARLET R. Few-shot object detection and viewpoint estimation for objects in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(3): 3090-3106. |
12 | WANG Y X, RAMANAN D, HEBERT M. Meta-learning to detect rare objects[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9924-9933. |
13 | LI Y, FENG W, LYU S, et al. Feature reconstruction and metric based network for few-shot object detection[J]. Computer Vision and Image Understanding, 2023, 227: No.103600. |
14 | LI A, LI Z. Transformation invariant few-shot object detection[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 3093-3101. |
15 | LI B, YANG B, LIU C, et al. Beyond max-margin: class margin equilibrium for few-shot object detection[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 7359-7368. |
16 | HAN G, MA J, HUANG S, et al. Few-shot object detection with fully cross-transformer[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 5311-5320. |
17 | FAN Q, ZHUO W, TANG C K, et al. Few-shot object detection with Attention-RPN and Multi-Relation Detector[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 4012-4021. |
18 | ZHANG L, ZHOU S, GUAN J, et al. Accurate few-shot object detection with support-query mutual guidance and hybrid loss[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 14419-14427. |
19 | HAN G, HE Y, HUANG S, et al. Query adaptive few-shot object detection with heterogeneous graph convolutional networks[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 3243-3252. |
20 | LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. |
21 | FAN Z, MA Y, LI Z, et al. Generalized few-shot object detection without forgetting[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 4525-4534. |
22 | WU A, HAN Y, ZHU L, et al. Universal-prototype enhancing for few-shot object detection[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 9547-9556. |
23 | HAN J, REN Y, DING J, et al. Few-shot object detection via variational feature aggregation[C]// Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2023: 755-763. |
24 | SUN B, LI B, CAI S, et al. FSCE: few-shot object detection via contrastive proposal encoding[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 7348-7358. |
25 | ZHU C, CHEN F, AHMED U, et al. Semantic relation reasoning for shot-stable few-shot object detection[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8778-8787. |
26 | QIAO L, ZHAO Y, LI Z, et al. DeFRCN: decoupled Faster R-CNN for few-shot object detection[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 8661-8670. |
27 | KAUL P, XIE W, ZISSERMAN A. Label, verify, correct: a simple few shot object detection method[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 14217-14227. |
28 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
29 | KRIZHEVSKY A, SUTSKEVER I, HINTON G. ImageNet classification with deep convolutional neural networks[C]// Proceedings of the 25th International Conference on Neural Information Processing Systems — Volume 1. Red Hook: Curran Associates Inc., 2012: 1097-1105. |
30 | EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL Visual Object Classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303-338. |
31 | EVERINGHAM M, ESLAMI S M A, VAN GOOL L, et al. The PASCAL visual object classes challenge: a retrospective[J]. International Journal of Computer Vision, 2015, 111(1): 98-136. |
32 | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755. |
33 | YAO J, SHI T Y, CHE X P, et al. DA-FSOD: a novel data augmentation scheme for few-shot object detection[J]. IEEE Access, 2023, 11: 92100-92110. |
34 | WU J, LIU S, HUANG D, et al. Multi-scale positive sample refinement for few-shot object detection[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12361. Cham: Springer, 2020: 456-472. |