Journal of Computer Applications, 2024, Vol. 44, Issue 5: 1437-1444. DOI: 10.11772/j.issn.1001-9081.2023050699
Special topics: Artificial Intelligence; 2023 CCF Conference on Artificial Intelligence (CCFAI 2023)
Hongtian LI1, Xinhao SHI1, Weiguo PAN1, Cheng XU1, Bingxin XU1, Jiazheng YUAN1,2
Received: 2023-05-08
Revised: 2023-06-11
Accepted: 2023-06-16
Online: 2023-08-01
Published: 2024-05-10
Contact: Weiguo PAN
About author: LI Hongtian, born in 1998 in Zhaoqing, Guangdong, M.S. candidate. His research interests include image processing and computer vision.
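Abstract:
Existing fine-tuning-based two-stage few-shot object detection methods are insensitive to novel-class features and tend to misclassify a novel class as the base class most similar to it, which degrades detection performance. To address this problem, a few-shot object detection algorithm fusing multi-scale features and an attention mechanism (MA-FSOD) is proposed. First, grouped convolutions and large convolution kernels are used in the backbone network to extract more class-discriminative features, and a Convolutional Block Attention Module (CBAM) is added to enhance features adaptively. Next, an improved pyramid network performs multi-scale feature fusion, so that the Region Proposal Network (RPN) can accurately locate Regions of Interest (RoI) and supply the classification head with richer, high-quality positive samples from multiple scales. Finally, a cosine classification head is used in the fine-tuning stage to reduce intra-class variance. On the PASCAL-VOC 2007/2012 datasets, MA-FSOD improves AP50 on novel classes by 5.6 percentage points over the Few-Shot object detection via Contrastive proposal Encoding (FSCE) algorithm. On the more challenging MSCOCO dataset, it improves AP over Meta-Faster-RCNN by 0.1 percentage points in the 10-shot setting and 1.6 percentage points in the 30-shot setting. The experimental results show that, compared with several mainstream few-shot object detection algorithms, MA-FSOD alleviates misclassification more effectively and achieves higher-precision few-shot object detection.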
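The adaptive feature enhancement attributed to CBAM can be illustrated with its channel-attention branch. The sketch below is a minimal pure-Python illustration and not the authors' implementation: it follows the published CBAM design (average-pooled and max-pooled channel descriptors passed through a shared two-layer MLP, summed, then squashed by a sigmoid), omits the spatial-attention branch, and uses hypothetical toy weights `w1` and `w2` in place of learned parameters.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def shared_mlp(vec, w1, w2):
    """Two-layer MLP without biases: ReLU(vec @ w1) @ w2.

    w1 has shape C x r and w2 has shape r x C (nested lists).
    """
    hidden = [max(0.0, sum(v * w for v, w in zip(vec, col))) for col in zip(*w1)]
    return [sum(h * w for h, w in zip(hidden, col)) for col in zip(*w2)]

def channel_attention(fmap, w1, w2):
    """Return one weight in (0, 1) per channel of a C x H x W feature map."""
    flat = [[v for row in channel for v in row] for channel in fmap]
    avg_desc = [sum(ch) / len(ch) for ch in flat]   # global average pooling
    max_desc = [max(ch) for ch in flat]             # global max pooling
    avg_out = shared_mlp(avg_desc, w1, w2)
    max_out = shared_mlp(max_desc, w1, w2)
    return [sigmoid(a + m) for a, m in zip(avg_out, max_out)]

def reweight(fmap, weights):
    """Scale every channel of the feature map by its attention weight."""
    return [[[v * w for v in row] for row in channel]
            for channel, w in zip(fmap, weights)]

# Toy input: 2 channels of size 2 x 2, with hypothetical MLP weights (C=2, r=1).
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 1.0]]]
w1 = [[1.0], [-1.0]]
w2 = [[1.0, 0.5]]

weights = channel_attention(fmap, w1, w2)
enhanced = reweight(fmap, weights)
```

In a detector these weights would be learned jointly with the backbone; the point here is only that channels with stronger pooled responses receive larger multipliers.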
Hongtian LI, Xinhao SHI, Weiguo PAN, Cheng XU, Bingxin XU, Jiazheng YUAN. Few-shot object detection via fusing multi-scale and attention mechanism[J]. Journal of Computer Applications, 2024, 44(5): 1437-1444.
Split | K-shot | TFA w/cos | MPSR | FSCE | FSOD-SR | FSRW | Meta R-CNN | QA-FewDet | Meta-Faster-RCNN | MA-FSOD |
---|---|---|---|---|---|---|---|---|---|---|
split1 | 1-shot | 39.8 | 41.4 | 44.2 | 50.1 | 14.8 | 19.9 | 42.4 | 43.0 | 46.3 |
split1 | 2-shot | 36.1 | - | 43.8 | 54.4 | 15.5 | 25.5 | 51.9 | 54.5 | 52.4 |
split1 | 3-shot | 44.7 | 51.4 | 51.4 | 56.2 | 26.7 | 35.0 | 55.7 | 60.6 | 61.4 |
split1 | 5-shot | 55.7 | 55.6 | 61.9 | 60.0 | 33.9 | 45.7 | 62.6 | 66.1 | 64.8 |
split1 | 10-shot | 56.0 | 61.7 | 63.4 | 62.4 | 47.2 | 51.5 | 63.4 | 65.4 | 65.4 |
split2 | 1-shot | 23.5 | 24.3 | 27.3 | 29.5 | 15.7 | 10.4 | 25.9 | 27.7 | 33.7 |
split2 | 2-shot | 26.9 | - | 29.5 | 39.9 | 15.3 | 19.4 | 37.8 | 35.5 | 34.4 |
split2 | 3-shot | 34.1 | 39.0 | 43.5 | 43.5 | 22.7 | 29.6 | 46.6 | 46.1 | 45.1 |
split2 | 5-shot | 35.1 | 39.7 | 44.2 | 44.6 | 30.1 | 34.8 | 48.9 | 47.8 | 47.3 |
split2 | 10-shot | 39.1 | 47.2 | 50.2 | 48.1 | 40.5 | 45.4 | 51.1 | 51.2 | 50.7 |
split3 | 1-shot | 30.8 | 35.4 | 37.2 | 43.6 | 21.3 | 14.3 | 35.2 | 40.6 | 47.1 |
split3 | 2-shot | 34.8 | - | 41.9 | 46.6 | 25.6 | 18.2 | 42.9 | 46.4 | 54.3 |
split3 | 3-shot | 42.8 | 42.1 | 47.5 | 53.4 | 28.4 | 27.5 | 47.8 | 53.4 | 56.1 |
split3 | 5-shot | 49.5 | 48.1 | 54.6 | 53.4 | 42.8 | 41.2 | 54.8 | 59.9 | 61.6 |
split3 | 10-shot | 49.8 | 49.5 | 58.5 | 59.5 | 45.9 | 48.1 | 53.5 | 58.6 | 61.8 |
Mean AP50 | | 39.9 | 35.7 | 46.6 | 49.7 | 28.4 | 31.1 | 48.0 | 50.5 | 52.2 |
Tab. 1 Comparison of detection performance (AP50, %) under various few-shot conditions for PASCAL-VOC dataset
Transfer-learning-based methods: TFA w/cos, MPSR, FSCE, FSOD-SR. Meta-learning-based methods: FSRW, Meta R-CNN, QA-FewDet, Meta-Faster-RCNN.
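The 5.6-percentage-point gain over FSCE quoted in the abstract can be reproduced from the FSCE and MA-FSOD columns of Tab. 1. The following is a small arithmetic check, assuming the reported mean is the plain average over the 15 split/shot settings; the variable names are illustrative only.

```python
# AP50 (%) for split1, split2, split3 at 1/2/3/5/10-shot, copied from Tab. 1.
fsce = [44.2, 43.8, 51.4, 61.9, 63.4,
        27.3, 29.5, 43.5, 44.2, 50.2,
        37.2, 41.9, 47.5, 54.6, 58.5]
ma_fsod = [46.3, 52.4, 61.4, 64.8, 65.4,
           33.7, 34.4, 45.1, 47.3, 50.7,
           47.1, 54.3, 56.1, 61.6, 61.8]

def mean(values):
    return sum(values) / len(values)

fsce_mean = round(mean(fsce), 1)                 # mean AP50 of FSCE
ma_fsod_mean = round(mean(ma_fsod), 1)           # mean AP50 of MA-FSOD
gain = round(mean(ma_fsod) - mean(fsce), 1)      # gain in percentage points

print(fsce_mean, ma_fsod_mean, gain)  # 46.6 52.2 5.6
```

Both means match the last row of the table, so the headline figure is an average over all three splits rather than a result for a single setting.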
Few-shot setting | Metric | TFA w/cos | MPSR | FSCE | FSOD-SR | FSRW | Meta R-CNN | QA-FewDet | Meta-Faster-RCNN | MA-FSOD |
---|---|---|---|---|---|---|---|---|---|---|
10-shot | AP | 10.0 | 9.8 | 11.9 | 11.6 | 5.6 | 8.7 | 11.6 | 12.7 | 12.8 |
10-shot | AP50 | 19.1 | 17.9 | - | 21.7 | 12.3 | 19.1 | 23.9 | 25.7 | 25.6 |
10-shot | AP75 | 9.3 | 9.7 | 10.5 | 10.4 | 4.6 | 6.6 | 9.8 | 10.8 | 11.2 |
30-shot | AP | 13.7 | 14.1 | 16.4 | 15.2 | 9.1 | 12.4 | 16.5 | 16.6 | 18.2 |
30-shot | AP50 | 24.9 | 25.4 | - | 27.5 | 19.0 | 25.3 | 31.9 | 31.8 | 34.6 |
30-shot | AP75 | 13.4 | 14.2 | 16.2 | 14.6 | 7.6 | 10.8 | 15.5 | 15.8 | 17.4 |
Tab. 2 Comparison of detection performance (%) under various few-shot conditions for MSCOCO dataset
Transfer-learning-based methods: TFA w/cos, MPSR, FSCE, FSOD-SR. Meta-learning-based methods: FSRW, Meta R-CNN, QA-FewDet, Meta-Faster-RCNN.
ConvNeXt-tiny backbone | Improved multi-scale feature fusion module | Base-class AP50 /% | 1-shot AP50 /% | 2-shot AP50 /% | 3-shot AP50 /% | 5-shot AP50 /% | 10-shot AP50 /% | Params /10^6 | FLOPs |
---|---|---|---|---|---|---|---|---|---|
No | No | 71.4 | 29.9 | 33.2 | 39.3 | 47.1 | 48.4 | 60.08 | 40.38 |
No | Yes | 72.4 | 35.0 | 36.0 | 40.9 | 48.8 | 50.6 | 63.62 | 42.86 |
Yes | No | 77.6 | 36.0 | 50.0 | 46.5 | 61.9 | 63.4 | 44.88 | 33.45 |
Yes | Yes | 78.5 | 39.3 | 50.8 | 53.3 | 62.8 | 65.2 | 48.42 | 35.92 |
Tab. 3 Ablation experiment results of backbone and multi-scale pyramid networks in VOC07-split1
Classification head | 1-shot | 2-shot | 3-shot | 5-shot | 10-shot |
---|---|---|---|---|---|
Double-head classifier | 34.0 | 44.9 | 51.0 | 58.5 | 62.0 |
Shared FC classification head | 39.3 | 50.8 | 53.3 | 62.8 | 65.2 |
Cosine classification head | 41.2 | 49.6 | 55.4 | 64.0 | 64.9 |
Tab. 4 Ablation experiment results (AP50, %) with different classification heads in VOC07-split1
α | 1-shot | 2-shot | 3-shot | 5-shot | 10-shot |
---|---|---|---|---|---|
10 | 36.5 | 47.1 | 54.9 | 55.3 | 55.1 |
20 | 41.2 | 49.6 | 55.4 | 64.0 | 64.9 |
30 | 39.1 | 47.5 | 51.2 | 60.9 | 64.8 |
40 | 39.4 | 47.8 | 53.1 | 61.7 | 63.1 |
Tab. 5 Ablation experiment results (AP50, %) with different values of α in VOC07-split1
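To make the role of α concrete, the sketch below shows a cosine classification head in the style commonly used by fine-tuning-based detectors such as TFA. It assumes that α in Tab. 5 is the scaling factor applied to the cosine similarity before softmax; this page does not define α, and the function names and toy weights are hypothetical rather than taken from the authors' code.

```python
import math

def l2_normalize(vec, eps=1e-12):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def cosine_logits(feature, class_weights, alpha=20.0):
    """Logit for class j is alpha * cos(feature, class_weights[j])."""
    f = l2_normalize(feature)
    return [alpha * sum(a * b for a, b in zip(f, l2_normalize(w)))
            for w in class_weights]

def softmax(logits):
    m = max(logits)                       # subtract the max for stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy RoI feature and three hypothetical class weight vectors.
class_weights = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
logits = cosine_logits([3.0, 4.0], class_weights, alpha=20.0)
probs = softmax(logits)

# Rescaling the feature leaves the logits unchanged: only direction matters.
logits_scaled = cosine_logits([30.0, 40.0], class_weights, alpha=20.0)
```

Because both the feature and the class weights are normalized, every logit lies in [-α, α]. A small α flattens the softmax and a large α sharpens it, which is one plausible reading of why the results in Tab. 5 peak at an intermediate value.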
Exp. No. | CBAM in base training | CBAM in novel-class fine-tuning | Freeze CBAM | Freeze pyramid module | Freeze RPN | Freeze RoI extractor | Base-class AP50 | 1-shot | 2-shot | 3-shot | 5-shot | 10-shot |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Yes | Yes | No | No | No | No | 79.1 | 36.9 | 47.4 | 55.0 | 63.8 | 64.9 |
2 | Yes | Yes | Yes | No | No | No | 79.1 | 36.3 | 47.7 | 54.7 | 63.9 | 65.5 |
3 | Yes | Yes | Yes | Yes | No | No | 79.1 | 39.2 | 48.8 | 57.5 | 64.1 | 65.3 |
4 | Yes | Yes | Yes | Yes | Yes | No | 79.1 | 43.7 | 49.4 | 56.6 | 64.2 | 65.5 |
5 | Yes | Yes | Yes | Yes | Yes | Yes | 79.1 | 45.2 | 52.3 | 60.4 | 64.4 | 65.8 |
6 | Yes | No | No | Yes | Yes | Yes | 79.1 | 52.2 | 48.9 | 59.2 | 64.8 | 64.8 |
7 | No | No | No | Yes | Yes | Yes | 78.5 | 41.2 | 49.6 | 55.4 | 64.0 | 64.9 |
Tab. 6 Ablation experiment results (AP50, %) of CBAM with fine-tuning strategy in VOC07-split1
Fig. 7 Comparison of false detections, missed detections and uncertain detections of FSCE on novel classes with the correct detections of MA-FSOD
1 | FAN X Y, BAO H, PAN W G. Image instance segmentation method based on class-imbalanced dataset[J]. Computer Engineering, 2022, 48(12): 224-231. 10.19678/j.issn.1000-3428.0063741 |
2 | LIN R C, HUANG R, DONG A H. Few-shot object detection based on attention mechanism and secondary reweighting of meta-features[J]. Journal of Computer Applications, 2022, 42(10): 3025-3032. 10.11772/j.issn.1001-9081.2021091571 |
3 | FAN X Y, LIU T, BAO H, et al. Method for long-tailed instance segmentation based on memory bank and confidence calibration[J]. Application Research of Computers, 2023, 40(6): 1876-1881. |
4 | LI L F, FAN X Y. Few-shot object detection with meta-learning and multi-scale feature fusion[J/OL]. Journal of Chinese Computer Systems, 2023 [2023-06-18]. |
5 | REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. 10.1109/tpami.2016.2577031 |
6 | WANG X, HUANG T E, DARRELL T, et al. Frustratingly simple few-shot object detection[EB/OL]. [2023-06-18]. 10.48550/arXiv.2003.06957 |
7 | PAN X J, ZHANG X L, DONG W M, et al. A survey of few-shot object detection[J]. Journal of Nanjing University of Information Science & Technology (Natural Science Edition), 2019, 11(6): 698-705. |
8 | LIANG T, BAO H, PAN W, et al. DetectFormer: category-assisted transformer for traffic scene object detection[J]. Sensors, 2022, 22(13): 4833. 10.3390/s22134833 |
9 | WU J, LIU S, HUANG D, et al. Multi-scale positive sample refinement for few-shot object detection[C]// Proceedings of the 16th European Conference on Computer Vision. Cham: Springer, 2020:456-472. 10.1007/978-3-030-58517-4_27 |
10 | XU H, WANG X, SHAO F, et al. Few-shot object detection via sample processing[J]. IEEE Access, 2021, 9: 29207-29221. 10.1109/access.2021.3059446 |
11 | SUN B, LI B, CAI S, et al. FSCE: few-shot object detection via contrastive proposal encoding[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 7352-7362. 10.1109/cvpr46437.2021.00727 |
12 | KIM G, JUNG H-G, LEE S-W. Spatial reasoning for few-shot object detection[J]. Pattern Recognition, 2021, 120: 108118. 10.1016/j.patcog.2021.108118 |
13 | QIAO L, ZHAO Y, LI Z, et al. DeFRCN: decoupled Faster R‑CNN for few-shot object detection [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 8681-8690. 10.1109/iccv48922.2021.00856 |
14 | KAUL P, XIE W, ZISSERMAN A. Label, verify, correct: a simple few shot object detection method[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 14237-14247. 10.1109/cvpr52688.2022.01384 |
15 | KANG B, LIU Z, WANG X, et al. Few-shot object detection via feature reweighting[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8420-8429. 10.1109/iccv.2019.00851 |
16 | YAN X, CHEN Z, XU A, et al. Meta R-CNN: towards general solver for instance-level low-shot learning[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9577-9586. 10.1109/iccv.2019.00967 |
17 | HAN G, HE Y, HUANG S, et al. Query adaptive few-shot object detection with heterogeneous graph convolutional networks[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 3263-3272. 10.1109/iccv48922.2021.00325 |
18 | HAN G, MA J, HUANG S, et al. Few-shot object detection with fully cross-transformer[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 5321-5330. 10.1109/cvpr52688.2022.00525 |
19 | HAN G, HUANG S, MA J, et al. Meta Faster R-CNN: towards accurate few-shot object detection with attentive feature alignment[C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Menlo Park: AAAI Press, 2022: 780-789. 10.1609/aaai.v36i1.19959 |
20 | LIU C L, CHEN T E, WANG C, et al. Survey of few-shot object detection[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(1): 53-73. |
21 | LIU Z, MAO H, WU C-Y, et al. A ConvNet for the 2020s[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 11976-11986. 10.1109/cvpr52688.2022.01167 |
22 | WOO S, PARK J, LEE J-Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 3-19. 10.1007/978-3-030-01234-2_1 |
23 | LIN T-Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 2117-2125. 10.1109/cvpr.2017.106 |
24 | LI W, HUANG R, LI J, et al. A perspective survey on deep transfer learning for fault diagnosis in industrial scenarios: theories, applications and challenges[J]. Mechanical Systems and Signal Processing, 2022, 167: 108487. 10.1016/j.ymssp.2021.108487 |
25 | SHRIVASTAVA A, GUPTA A, GIRSHICK R. Training region-based object detectors with online hard example mining[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 761-769. 10.1109/cvpr.2016.89 |
26 | PANG J, CHEN K, SHI J, et al. Libra R-CNN: towards balanced learning for object detection[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 821-830. 10.1109/cvpr.2019.00091 |
27 | CAO Y, CHEN K, LOY C C, et al. Prime sample attention in object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11583-11591. 10.1109/cvpr42600.2020.01160 |
28 | WU Y, CHEN Y, YUAN L, et al. Rethinking classification and localization for object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10186-10195. 10.1109/cvpr42600.2020.01020 |
29 | EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL Visual Object Classes (VOC) challenge[J]. International Journal of Computer Vision, 2009, 88: 303-308. 10.1007/s11263-009-0275-4 |
30 | EVERINGHAM M, ESLAMI S M A, VAN GOOL L, et al. The PASCAL visual object classes challenge: a retrospective[J]. International Journal of Computer Vision, 2015, 111: 98-136. 10.1007/s11263-014-0733-5 |
31 | LIN T-Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 13th European Conference on Computer Vision. Cham: Springer, 2014: 740-755. 10.1007/978-3-319-10602-1_48 |
32 | GLOROT X, BENGIO Y. Understanding the difficulty of training deep feedforward neural networks[C]// Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. New York: JMLR, 2010: 249-256. |
33 | LIANG T, BAO H, PAN W, et al. Traffic sign detection via improved sparse R-CNN for autonomous vehicles[J]. Journal of Advanced Transportation, 2022, 2022: 3825532. 10.1155/2022/3825532 |
34 | CHATTOPADHAY A, SARKAR A, HOWLADER P, et al. Grad-CAM++: generalized gradient-based visual explanations for deep convolutional networks[C]// Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2018: 839-847. 10.1109/wacv.2018.00097 |