融合注意力和上下文信息的遥感图像小目标检测算法

doi:10.11772/j.issn.1001-9081.2024010125

Journal of Computer Applications (《计算机应用》), 2025, Vol. 45, Issue 1: 292-300.

• Multimedia computing and computer simulation •

刘赏¹, 周煜炜¹, 代娆¹, 董林芳¹, 刘猛²

^1.天津财经大学理工学院，天津 300222
^2.河北省水文工程地质勘查院（河北省遥感中心），石家庄 050021

收稿日期:2024-02-05 修回日期:2024-04-01 接受日期:2024-04-07 发布日期:2024-05-09 出版日期:2025-01-10
通讯作者: 刘赏
作者简介:周煜炜（1998—），女，天津人，硕士研究生，主要研究方向：图像处理、目标检测；
代娆（1998—），女，天津人，硕士研究生，主要研究方向：图像处理、姿态估计；
董林芳（1972—），女，河北张家口人，副教授，博士，主要研究方向：人工智能、自然语言处理；
刘猛（1986—），男，河北石家庄人，硕士，主要研究方向：图像处理、遥感地质。
基金资助:
河北省财政项目(13000023P00F2D410374D);天津市科技计划项目(22ZLZKZF00480)

Small target detection algorithm in remote sensing images integrating attention and contextual information

Shang LIU¹, Yuwei ZHOU¹, Rao DAI¹, Linfang DONG¹, Meng LIU²

^1.School of Science and Engineering, Tianjin University of Finance and Economics, Tianjin 300222, China
^2.Hydrogeological Engineering Geological Exploration Institute (Hebei Remote Sensing Center), Shijiazhuang Hebei 050021, China

Received: 2024-02-05 Revised: 2024-04-01 Accepted: 2024-04-07 Online: 2024-05-09 Published: 2025-01-10
Contact: Shang LIU
About author: ZHOU Yuwei, born in 1998, M. S. candidate. Her research interests include image processing and object detection.
DAI Rao, born in 1998, M. S. candidate. Her research interests include image processing and pose estimation.
DONG Linfang, born in 1972, Ph. D., associate professor. Her research interests include artificial intelligence and natural language processing.
LIU Meng, born in 1986, M. S. His research interests include image processing and remote sensing geology.
Supported by:
Financial Project of Hebei Province (13000023P00F2D410374D); Science and Technology Program of Tianjin (22ZLZKZF00480)

摘要/Abstract

摘要：

对多尺度的遥感图像进行小目标检测时，基于深度学习的目标检测算法容易出现误检和漏检的情况。这是因为此类算法的特征提取模块进行了多次的下采样操作；而且未能根据不同类别、不同尺度的目标关注所需的上下文信息。为了解决该问题，提出一种融合注意力和上下文信息的遥感图像小目标检测算法ACM-YOLO（Attention-Context-Multiscale YOLO）。首先，应用细粒度的查询感知稀疏注意力以减少小目标特征信息的丢失，从而避免漏检；其次，设计局部上下文增强（LCE）函数以更好地关注不同类别的遥感目标所需的上下文信息，从而避免误检；最后，使用加权双向特征金字塔网络（BiFPN）强化特征融合模块对遥感图像小目标的多尺度特征融合能力，从而改善算法检测效果。在DOTA数据集和NWPU VHR-10数据集上进行对比实验和消融实验，以验证所提算法的有效性和泛化性。实验结果表明，在2个数据集上所提算法的平均精确率均值（mAP）分别达到了77.33%和96.12%，而相较于YOLOv5算法，召回率分别提升了10.00和7.50个百分点。可见，所提算法能有效提升mAP和召回率，减少误检和漏检。

关键词: 遥感图像, 小目标检测, 稀疏采样, 局部上下文信息增强, 多尺度特征融合

Abstract:

When small targets are detected in multi-scale remote sensing images, deep learning-based target detection algorithms are prone to false detections and missed detections. There are two reasons: the feature extraction module performs multiple down-sampling operations, and the algorithms fail to attend to the contextual information required by targets of different categories and scales. To solve these problems, a small target detection algorithm for remote sensing images that integrates attention and contextual information, ACM-YOLO (Attention-Context-Multiscale YOLO), was proposed. Firstly, fine-grained query-aware sparse attention was applied to reduce the loss of small-target feature information, thereby avoiding missed detections. Secondly, a Local Contextual Enhancement (LCE) function was designed to better attend to the contextual information required by different categories of remote sensing targets, thereby avoiding false detections. Finally, the weighted Bi-directional Feature Pyramid Network (BiFPN) was adopted to strengthen the multi-scale feature fusion capability of the feature fusion module for small targets in remote sensing images, thereby improving the detection performance of the algorithm. Comparison experiments and ablation experiments were performed on the DOTA and NWPU VHR-10 datasets to verify the effectiveness and generalization of the proposed algorithm. Experimental results show that the mean Average Precision (mAP) of the proposed algorithm reaches 77.33% and 96.12% on the two datasets, respectively, and that the recall increases by 10.00 and 7.50 percentage points, respectively, compared with the YOLOv5 algorithm. It can be seen that the proposed algorithm effectively improves mAP and recall and reduces false detections and missed detections.

Key words: remote sensing image, small target detection, sparse sampling, local contextual information enhancement, multi-scale feature fusion
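
The weighted fusion step of BiFPN [23] can be illustrated with a small example. The sketch below implements the fast normalized fusion rule described for EfficientDet, O = Σ_i w_i·I_i / (ε + Σ_j w_j), with each weight clipped at zero. It is not the ACM-YOLO implementation: the function name `fast_normalized_fusion`, the representation of feature maps as flattened lists, and the sample values are assumptions made for illustration only.

```python
def fast_normalized_fusion(features, raw_weights, eps=1e-4):
    """Fuse equally sized feature maps (flattened to lists) with learnable weights."""
    if len(features) != len(raw_weights):
        raise ValueError("need exactly one weight per input feature map")
    if len({len(f) for f in features}) != 1:
        raise ValueError("all feature maps must have the same number of elements")
    # ReLU keeps every weight non-negative, so the normalized weights stay in [0, 1].
    weights = [max(w, 0.0) for w in raw_weights]
    total = sum(weights) + eps  # eps avoids division by zero
    normalized = [w / total for w in weights]
    return [
        sum(n * f[i] for n, f in zip(normalized, features))
        for i in range(len(features[0]))
    ]


# Hypothetical inputs: two 2x2 feature maps, already resized to the same shape.
shallow = [1.0, 1.0, 1.0, 1.0]
deep = [3.0, 3.0, 3.0, 3.0]
fused = fast_normalized_fusion([shallow, deep], raw_weights=[1.0, 3.0])
print(fused)  # every element is close to 0.25 * 1 + 0.75 * 3 = 2.5
```

In a trained network the raw weights would be learned per fusion node; here they are fixed so that the contribution of each input is easy to read off.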

CLC number: TP751

刘赏, 周煜炜, 代娆, 董林芳, 刘猛. 融合注意力和上下文信息的遥感图像小目标检测算法[J]. 计算机应用, 2025, 45(1): 292-300.

Shang LIU, Yuwei ZHOU, Rao DAI, Linfang DONG, Meng LIU. Small target detection algorithm in remote sensing images integrating attention and contextual information[J]. Journal of Computer Applications, 2025, 45(1): 292-300.

Figures/Tables: 13

References: 35

1	李坤亚，欧鸥，刘广滨，等.改进YOLOv5的遥感图像目标检测算法［J］.计算机工程与应用， 2023， 59（9）： 207-214.
	LI K Y， OU O， LIU G B， et al. Target detection algorithm of remote sensing image based on improved YOLOv5 ［J］. Computer Engineering and Applications， 2023， 59（9）： 207-214.
2	KRIZHEVSKY A， SUTSKEVER I， HINTON G E. ImageNet classification with deep convolutional neural networks ［C］// Proceedings of the 25th International Conference on Neural Information Processing Systems， Volume 1. Red Hook： Curran Associates Inc.， 2012： 1097-1105.
3	GIRSHICK R， DONAHUE J， DARRELL T， et al. Rich feature hierarchies for accurate object detection and semantic segmentation ［C］// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2014： 580-587.
4	GIRSHICK R. Fast R-CNN ［C］// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2015： 1440-1448.
5	REN S， HE K， GIRSHICK R， et al. Faster R-CNN： towards real-time object detection with region proposal networks ［J］. IEEE Transactions on Pattern Analysis and Machine Intelligence， 2017， 39（6）： 1137-1149.
6	LIU W， ANGUELOV D， ERHAN D， et al. SSD： single shot multibox detector ［C］// Proceedings of the 2016 European Conference on Computer Vision， LNCS 9905. Cham： Springer， 2016： 21-37.
7	LIN T Y， GOYAL P， GIRSHICK R， et al. Focal loss for dense object detection ［C］// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway： IEEE， 2017： 2999-300.
8	REDMON J， DIVVALA S K， GIRSHICK R， et al. You only look once： unified， real-time object detection ［C］// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2016： 779-788.
9	汪鹏，辛雪静，王利琴，等.基于YOLOv3的光学遥感图像目标检测算法［J］.激光与光电子学进展， 2021， 58（20）： No.2028006.
	WANG P， XIN X J， WANG L Q， et al. Object detection algorithm of optical remote sensing images based on YOLOv3 ［J］. Laser and Optoelectronics Progress， 2021， 58（20）： No.2028006.
10	REDMON J， FARHADI A. YOLOv3： an incremental improvement ［EB/OL］. ［2023-12-10］. .
11	林文龙，阿里甫·库尔班，陈一潇，等.面向遥感影像目标检测的ACFEM-RetinaNet算法［J］.计算机工程与应用， 2024， 60（1）： 245-253.
	LIN W L， ALIFU K， CHEN Y X， et al. ACFEM RetinaNet algorithm for remote sensing image target detection ［J］. Computer Engineering and Applications， 2024， 60（1）： 245-253.
12	GONG H， MU T， LI Q， et al. Swin-transformer-enabled YOLOv5 with attention mechanism for small object detection on satellite images ［J］. Remote Sensing， 2022， 14（12）： No.2861.
13	周华平，郭伟.改进YOLOv5网络在遥感图像目标检测中的应用［J］.遥感信息， 2022， 37（5）： 23-30.
	ZHOU H P， GUO W. Improved YOLOv5 network in application of remote sensing image object detection ［J］. Remote Sensing Information， 2022， 37（5）： 23-30.
14	李惠惠，范军芳，陈启丽.改进YOLOv5的遥感图像目标检测［J］.弹箭与制导学报， 2022， 42（4）： 17-23.
	LI H H， FAN J F， CHEN Q L. Improved YOLOv5 remote sensing image target detection ［J］. Journal of Projectiles， Rockets， Missiles and Guidance， 2022， 42（4）： 17-23.
15	万羽欣.基于YOLO改进算法的遥感图像小目标检测方法研究［D］.北京：北京交通大学， 2022： 72-72.
	WAN Y X. Research on small object detection method in remote sensing image based on improved YOLO algorithm ［D］. Beijing： Beijing Jiaotong University， 2022： 72-72.
16	WANG P， SUN X， DIAO W， et al. FMSSD： feature-merged single-shot detection for multiscale objects in large-scale remote sensing imagery ［J］. IEEE Transactions on Geoscience and Remote Sensing， 2020， 58（5）： 3377-3390.
17	SHAO J， YANG Q， LUO C， et al. Vessel detection from nighttime remote sensing imagery based on deep learning ［J］. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing， 2021， 14： 12536-12544.
18	赵文清，孔子旭，周震东，等.增强小目标特征的航空遥感目标检测［J］.中国图象图形学报， 2021， 26（3）： 644-653.
	ZHAO W Q， KONG Z X， ZHOU Z D， et al. Target detection algorithm of aerial remote sensing based on feature enhancement technology ［J］. Journal of Image and Graphics， 2021， 26（3）： 644-653.
19	范新南，严炜，史朋飞，等.多尺度深度特征融合网络的遥感图像目标检测［J］.遥感学报， 2022， 26（11）： 2292-2303.
	FAN X N， YAN W， SHI P F， et al. Remote sensing image target detection based on a multi-scale deep feature fusion network ［J］. National Remote Sensing Bulletin， 2022， 26（11）： 2292-2303.
20	王怀济，李广明，张红良，等.融合卷积通道注意力的遥感图像目标检测方法［J］.计算机工程与应用， 2024， 60（2）： 200-210.
	WANG H J， LI G M， ZHANG H L， et al. Rotating object detection method based on convolutional block channel attention in remote sensing images ［J］. Computer Engineering and Applications， 2024， 60（2）： 200-210.
21	赵文清，康怿瑾，赵振兵，等.改进YOLOv5s的遥感图像目标检测［J］.智能系统学报， 2023， 18（1）： 86-95.
	ZHAO W Q， KANG Y J， ZHAO Z B， et al. A remote sensing image object detection algorithm with improved YOLOv5s ［J］. CAAI Transactions on Intelligent Systems， 2023， 18（1）： 86-95.
22	ZHU L， WANG X， KE Z， et al. BiFormer： vision transformer with bi-level routing attention ［C］// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2023： 10323-10333.
23	TAN M， PANG R， LE Q V. EfficientDet： scalable and efficient object detection ［C］// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2020： 10778-10787.
24	REN S， ZHOU D， HE S， et al. Shunted self-attention via multi-scale token aggregation ［C］// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2022： 10843-10852.
25	CHENG G， HAN J， ZHOU P， et al. Multi-class geospatial object detection and geographic image classification based on collection of part detectors ［J］. ISPRS Journal of Photogrammetry and Remote Sensing， 2014， 98： 119-132.
26	LIN T Y， DOLLÁR P， GIRSHICK R， et al. Feature pyramid networks for object detection ［C］// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2017： 936-944.
27	LIU S， QI L， QIN H， et al. Path aggregation network for instance segmentation ［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 8759-8768.
28	张朝阳，张上，王恒涛，等.多尺度下遥感小目标多头注意力检测［J］.计算机工程与应用， 2023， 59（8）： 227-238.
	ZHANG Z Y， ZHANG S， WANG H T， et al. Multi-head attention detection of small targets in remote sensing at multiple scales ［J］. Computer Engineering and Applications， 2023， 59（8）： 227-238.
29	XIA G S， BAI X， DING J， et al. DOTA： a large-scale dataset for object detection in aerial images ［C］// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway： IEEE， 2018： 3974-3983.
30	MA J， SHAO W， YE H， et al. Arbitrary-oriented scene text detection via rotation proposals ［J］. IEEE Transactions on Multimedia， 2018， 20（11）： 3111-3122.
31	戴媛，易本顺，肖进胜，等.基于改进旋转区域生成网络的遥感图像目标检测［J］.光学学报， 2020， 40（1）： No.0111020.
	DAI Y， YI B S， XIAO J S， et al. Object detection of remote sensing image based on improved rotation region proposal network ［J］. Acta Optica Sinica， 2020， 40（1）： No.0111020.
32	YANG X， YAN J. Arbitrary-oriented object detection with circular smooth label ［C］// Proceedings of the 2020 European Conference on Computer Vision， LNCS 12353. Cham： Springer， 2020： 677-694.
33	肖进胜，张舒豪，陈云华，等.双向特征融合与特征选择的遥感影像目标检测［J］.电子学报， 2022， 50（2）： 267-272.
	XIAO J S， ZHANG S H， CHEN Y H， et al. Remote sensing image object detection based on bidirectional feature fusion and feature selection ［J］. Acta Electronica Sinica， 2022， 50（2）： 267-272.
34	LI C， LI L， JIANG H， et al. YOLOv6： a single-stage object detection framework for industrial applications ［EB/OL］. ［2023-03-03］. .
35	BOCHKOVSKIY A， WANG C Y， LIAO H M. YOLOv4： optimal speed and accuracy of object detection ［EB/OL］. ［2023-02-20］. .

LCE kernel size	mAP/%
5	94.43
15	94.48
20	95.05
30	95.16
33	95.24
35	95.37
40	95.14
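
The kernel-size sweep above can be read off programmatically. The following is a minimal sketch that only re-uses the values in the table; the dictionary name `lce_map` is an assumption, and the source does not state how the final kernel size was chosen.

```python
# mAP (%) for each LCE kernel size, copied from the table above.
lce_map = {5: 94.43, 15: 94.48, 20: 95.05, 30: 95.16, 33: 95.24, 35: 95.37, 40: 95.14}

# Kernel size with the highest mAP in the sweep.
best_kernel = max(lce_map, key=lce_map.get)
print(best_kernel, lce_map[best_kernel])  # 35 95.37

# mAP rises with kernel size up to the best setting and drops again at 40.
sizes = sorted(lce_map)
rising = all(lce_map[a] < lce_map[b] for a, b in zip(sizes, sizes[1:]) if b <= best_kernel)
print(rising, lce_map[40] < lce_map[best_kernel])  # True True
```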

Algorithm	Ship AP/%	Plane AP/%	Small vehicle AP/%	Large vehicle AP/%	mAP/%
RetinaNet [7]	62.2	83.4	65.7	48.3	64.90
YOLO-CLD [15]	57.6	67.6	37.9	60.1	55.80
FMSSD [16]	76.9	89.1	69.2	73.6	77.20
RRPN [30]	47.2	83.9	34.7	49.7	53.88
Dai [31]	65.0	78.0	37.0	59.0	59.75
CSL [32]	64.9	84.2	67.6	51.5	67.05
Xiao [33]	66.5	85.7	69.2	54.2	68.90
YOLOv6 [34]	71.0	58.5	23.2	37.9	47.65
YOLOv8	70.4	60.9	30.0	45.2	51.63
YOLOv5	78.7	78.7	54.2	60.3	67.98
ACM-YOLO	85.4	89.2	60.3	74.4	77.33
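
In this table the mAP column is consistent with the arithmetic mean of the four per-class AP values. The sketch below checks that for three rows; the helper name `mean_ap` and the 0.01 rounding tolerance are assumptions, and the check says nothing about how AP itself was computed.

```python
# Per-class AP (%) in the order ship, plane, small vehicle, large vehicle,
# copied from the table above together with the reported mAP (%).
per_class_ap = {
    "FMSSD": [76.9, 89.1, 69.2, 73.6],
    "YOLOv5": [78.7, 78.7, 54.2, 60.3],
    "ACM-YOLO": [85.4, 89.2, 60.3, 74.4],
}
reported_map = {"FMSSD": 77.20, "YOLOv5": 67.98, "ACM-YOLO": 77.33}


def mean_ap(ap_values):
    """Mean of per-class AP values."""
    return sum(ap_values) / len(ap_values)


for name, aps in per_class_ap.items():
    computed = mean_ap(aps)
    # Allow for rounding of the published figures to two decimals.
    agrees = abs(computed - reported_map[name]) <= 0.01
    print(f"{name}: computed {computed:.3f}, reported {reported_map[name]:.2f}, agrees={agrees}")
```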

Algorithm	Precision/%	Recall/%	mAP/%
Faster R-CNN [5]	63.5	90.8	76.47
SSD [6]	92.3	78.2	74.12
Fan [19]	—	—	93.36
YOLOv3 [10]	88.7	86.1	84.35
YOLOv4 [35]	87.6	89.7	86.76
YOLOv6 [34]	93.5	85.0	90.40
YOLOv8	93.9	84.8	90.90
YOLOv5	96.8	87.3	90.70
ACM-YOLO	95.6	94.8	96.12

Algorithm	Airplane AP/%	Ship AP/%	Storage tank AP/%	Baseball diamond AP/%	Tennis court AP/%	Basketball court AP/%	Ground track field AP/%	Harbor AP/%	Bridge AP/%	Vehicle AP/%	mAP/%	Small-target mAP/%
Faster R-CNN [5]	82.8	77.6	52.5	96.4	62.7	69.4	98.2	82.6	78.8	63.7	76.47	74.70
SSD [6]	90.3	72.5	60.3	87.5	58.9	65.2	90.3	80.5	77.9	57.8	74.12	73.53
Fan [19]	99.9	90.7	89.5	92.4	99.2	90.8	90.7	99.2	90.9	90.3	93.36	93.63
YOLOv3 [10]	92.5	75.8	86.1	89.3	82.7	75.5	88.4	90.2	84.4	78.6	84.35	82.30
YOLOv4 [35]	94.6	79.8	94.1	95.4	89.2	71.5	98.7	80.6	95.3	68.4	86.76	80.93
YOLOv6 [34]	99.5	89.6	98.7	98.0	89.9	67.9	99.4	98.8	77.0	84.9	90.40	91.30
YOLOv8	99.4	89.2	99.0	98.4	89.8	68.2	99.4	97.5	80.1	87.7	90.90	92.10
YOLOv5	99.4	88.6	98.5	98.8	82.1	75.8	99.5	98.2	80.2	86.0	90.70	91.33
ACM-YOLO	99.5	91.0	98.5	98.9	96.2	97.3	99.5	99.3	89.2	91.6	96.12	94.03

Algorithm	Precision/%	Recall/%	Weight file size/MB	Parameters/10⁶	GPU floating-point operations/GFLOPs	mAP/%
YOLOv5	77.8	64.5	14.5	7.05	15.9	67.98
YOLOv5_Attention&LCE	81.1	72.7	15.2	7.38	275.3	76.69
YOLOv5_BiFPN	80.7	74.3	14.7	7.11	16.1	77.00
ACM-YOLO	80.9	74.5	15.2	7.38	275.3	77.33
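
The percentage-point gains quoted in the abstract can be reproduced from the tables. The sketch below is only arithmetic on the published figures. It assumes that the ablation table above refers to DOTA, which is inferred from its mAP values matching the DOTA results; the names `results`, `gflops`, and `point_gains` are hypothetical.

```python
# (precision %, recall %, mAP %) for the baseline and the proposed algorithm.
# DOTA values come from the ablation table, NWPU VHR-10 values from the
# precision/recall/mAP comparison table.
results = {
    "DOTA": {"YOLOv5": (77.8, 64.5, 67.98), "ACM-YOLO": (80.9, 74.5, 77.33)},
    "NWPU VHR-10": {"YOLOv5": (96.8, 87.3, 90.70), "ACM-YOLO": (95.6, 94.8, 96.12)},
}
gflops = {"YOLOv5": 15.9, "ACM-YOLO": 275.3}


def point_gains(dataset):
    """Percentage-point change of ACM-YOLO over YOLOv5: (precision, recall, mAP)."""
    base = results[dataset]["YOLOv5"]
    ours = results[dataset]["ACM-YOLO"]
    return tuple(round(o - b, 2) for o, b in zip(ours, base))


print(point_gains("DOTA"))         # (3.1, 10.0, 9.35)
print(point_gains("NWPU VHR-10"))  # (-1.2, 7.5, 5.42)

# Computational cost relative to the baseline, from the ablation table.
print(round(gflops["ACM-YOLO"] / gflops["YOLOv5"], 1))  # 17.3
```

The recall gains of 10.0 and 7.5 points match the abstract. The same tables also show that precision on NWPU VHR-10 is 1.2 points lower than YOLOv5 and that the reported GFLOPs rise from 15.9 to 275.3.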
