Remote sensing image small target detection based on improved YOLOv3

doi:10.11772/j.issn.1001-9081.2021101802

Journal of Computer Applications, 2022, Vol. 42, Issue 12: 3723-3732.

Topic: Artificial intelligence

Hao FENG¹, Chaobing HUANG¹, Yuanqiao WEN²

¹ School of Information Engineering, Wuhan University of Technology, Wuhan Hubei 430070, China
² Intelligent Transportation Systems Research Center, Wuhan University of Technology, Wuhan Hubei 430063, China

Received: 2021-10-22 Revised: 2022-01-10 Accepted: 2022-01-14 Online: 2022-01-19 Published: 2022-12-10
Contact: Chaobing HUANG
About the authors: FENG Hao, born in 1996 in Chongqing, M.S. candidate. His research interests include information processing, image processing and recognition.
WEN Yuanqiao, born in 1975 in Songzi, Hubei, Ph.D., professor. His research interests include water traffic safety and intelligent ships.
Supported by: National Natural Science Foundation of China (52072287)

Abstract

The YOLOv3 (You Only Look Once version 3) algorithm is widely used in target detection tasks. Although several algorithms built on YOLOv3 have made progress, they still suffer from insufficient representation ability and low detection accuracy, and their performance on small targets in particular does not yet meet practical requirements. To address these problems, a small target detection algorithm for remote sensing images based on an improved YOLOv3 is proposed. First, the K-means Transformation (K-means-T) algorithm is used to optimize the anchor box sizes, which improves the match between the prior boxes and the ground-truth boxes. Second, the confidence loss function is optimized to address the imbalance between hard and easy samples. Finally, an attention mechanism is introduced to improve the algorithm's ability to perceive detailed information. Experiments on the RSOD dataset show that, compared with the original YOLOv3 and YOLOv4 algorithms, the proposed algorithm raises the Average Precision (AP) on the small target class "aircraft" by 7.3 and 5.9 percentage points respectively. This indicates that the proposed algorithm detects small targets in remote sensing images effectively and with higher accuracy.

Key words: small target detection, YOLOv3 (You Only Look Once version 3), K-means Transformation (K-means-T), confidence loss function, attention mechanism
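
The anchor optimization step can be illustrated with a small sketch. This page does not give the details of K-means-T, so the code below is not the authors' algorithm: it is the plain IoU-based k-means commonly used to derive YOLO anchors (distance d = 1 - IoU between width/height pairs), which K-means-T presumably builds on. The function names and the sample box sizes are assumptions made for illustration.

```python
import random

def wh_iou(box, center):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], center[0]) * min(box[1], center[1])
    return inter / (box[0] * box[1] + center[0] * center[1] - inter)

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs with d = 1 - IoU; return k anchors sorted by area."""
    rng = random.Random(seed)
    centers = [tuple(map(float, b)) for b in rng.sample(boxes, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Smallest 1 - IoU distance is the same as largest IoU.
            best = max(range(k), key=lambda j: wh_iou(box, centers[j]))
            clusters[best].append(box)
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                new_centers.append((sum(w for w, _ in members) / len(members),
                                    sum(h for _, h in members) / len(members)))
            else:
                new_centers.append(old)  # keep the center of an empty cluster unchanged
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])

# Hypothetical label sizes in pixels, not taken from RSOD.
boxes = [(8, 9), (10, 10), (9, 8), (30, 32), (28, 30), (33, 31),
         (90, 100), (95, 98), (88, 102)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

Like any k-means, the result depends on the initial centers, so in practice one would run it with several seeds and keep the set with the highest mean IoU against the labels.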

CLC Number: TP391.4

Cite this article: Hao FENG, Chaobing HUANG, Yuanqiao WEN. Remote sensing image small target detection based on improved YOLOv3[J]. Journal of Computer Applications, 2022, 42(12): 3723-3732.

Figures and tables (19)

Fig. 1 Model structure of YOLOv3

Fig. 2 Anchors in multi-scale training

Fig. 3 CA module

Fig. 4 Samples from the RSOD dataset

Fig. 5 Distribution of the ratio of labeled-box area to original image area for different classes in the RSOD dataset

Fig. 6 Distribution of the ratio of target size to original image size in the RSOD dataset

Fig. 7 Model structure of YOLOv3-AKT

Fig. 8 Detection accuracy comparison of different algorithms

Tab. 1 Comparison of detection accuracy among different algorithms

Algorithm	mAP@0.5	F1	AP (aircraft)	F1 (aircraft)
YOLOv3	0.833	0.825	0.850	0.850
YOLOv4 [7]	0.862	0.835	0.864	0.840
EfficientDet [18]	0.881	0.858	0.603	0.620
Algorithm of Ref. [19]	0.903	0.865	-	-
YOLOv3 [15]	0.827	0.835	0.833	0.830
YOLOv3-AKT	0.903	0.870	0.923	0.900
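
As a quick consistency check, the gains quoted in the abstract can be recomputed from the AP (aircraft) column of Table 1. The sketch below only does that arithmetic; the helper name `gain_in_points` is an assumption made for illustration.

```python
# AP (aircraft) values copied from Table 1.
ap_aircraft = {"YOLOv3": 0.850, "YOLOv4": 0.864, "YOLOv3-AKT": 0.923}

def gain_in_points(baseline, improved):
    """Absolute AP difference expressed in percentage points."""
    return round((improved - baseline) * 100, 1)

gains = {name: gain_in_points(ap, ap_aircraft["YOLOv3-AKT"])
         for name, ap in ap_aircraft.items() if name != "YOLOv3-AKT"}
print(gains)  # {'YOLOv3': 7.3, 'YOLOv4': 5.9}
```

The two values match the 7.3 and 5.9 percentage points reported for the "aircraft" class.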

Fig. 9 Comparison of prediction results of different algorithms

Fig. 10 Proportion of the loss of hard samples in the total confidence loss

Fig. 11 AP comparison of original YOLOv3 and YOLOv3-C
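
The quantity plotted in Fig. 10, the share of the confidence loss contributed by hard samples, can be illustrated with a toy example. This page does not give the authors' modified confidence loss, so the sketch below substitutes the focal loss of Ref. [8] (with its usual defaults gamma = 2 and alpha = 0.25) as an assumed stand-in; the two predicted confidences are hypothetical.

```python
import math

def bce(p, y):
    """Binary cross-entropy for one predicted confidence p and label y in {0, 1}."""
    p = min(max(p, 1e-7), 1 - 1e-7)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def focal_confidence_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal-style loss: BCE scaled by alpha_t * (1 - p_t) ** gamma."""
    p_t = p if y == 1 else 1 - p
    alpha_t = alpha if y == 1 else 1 - alpha
    return alpha_t * (1 - p_t) ** gamma * bce(p, y)

def hard_share(loss_fn, p_easy, p_hard):
    """Fraction of the summed loss of two positive samples that comes from the hard one."""
    easy, hard = loss_fn(p_easy, 1), loss_fn(p_hard, 1)
    return hard / (easy + hard)

# Hypothetical confidences: a well-classified object and a poorly classified one.
bce_share = hard_share(bce, p_easy=0.95, p_hard=0.30)
focal_share = hard_share(focal_confidence_loss, p_easy=0.95, p_hard=0.30)
print(f"hard-sample share: BCE {bce_share:.4f}, focal {focal_share:.4f}")
```

The modulating factor shrinks the loss of the easy sample far more than that of the hard one, so the hard sample accounts for a larger share of the total under the focal-style loss than under plain cross-entropy.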

Tab. 2 Anchor boxes obtained with different center boxes

Feature map	52×52	26×26	13×13
Anchor [15]	(8, 8), (11, 12), (15, 14)	(18, 19), (23, 24), (30, 32)	(40, 44), (51, 58), (145, 178)
Anchor-T (c=1)	(8, 8), (14, 15), (22, 20)	(28, 30), (38, 40), (53, 56)	(73, 81), (96, 109), (290, 356)
Anchor-T (c=3)	(4, 4), (8, 9), (15, 14)	(21, 22), (31, 33), (46, 49)	(67, 74), (91, 103), (290, 356)
Anchor-T (c=5)	(4, 4), (7, 8), (12, 12)	(16, 17), (23, 24), (38, 40)	(60, 66), (84, 95), (290, 356)
Anchor-T (c=7)	(4, 4), (7, 8), (11, 11)	(15, 16), (20, 21), (28, 30)	(40, 44), (66, 75), (290, 356)
Anchor-T (c=9)	(4, 4), (7, 7), (11, 10)	(14, 15), (19, 20), (26, 28)	(36, 40), (48, 54), (290, 356)
Anchor [13]	(4, 4), (10, 11), (18, 17)	(24, 26), (35, 36), (49, 53)	(70, 77), (93, 106), (290, 356)
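
Each row of Table 2 follows the usual YOLOv3 layout: the nine anchors are ordered by size, with the three smallest assigned to the finest 52×52 feature map and the three largest to the 13×13 map. A minimal sketch of that grouping, using the Anchor-T (c=5) row in shuffled order, is shown below; the function name `split_by_scale` is an assumption made for illustration.

```python
def split_by_scale(anchors, grids=(52, 26, 13)):
    """Sort anchors by area and give the smallest group to the finest grid."""
    ordered = sorted(anchors, key=lambda wh: wh[0] * wh[1])
    per_grid = len(ordered) // len(grids)
    return {g: ordered[i * per_grid:(i + 1) * per_grid] for i, g in enumerate(grids)}

# Anchor-T (c=5) values from Table 2, deliberately shuffled.
anchor_t_c5 = [(60, 66), (4, 4), (23, 24), (290, 356), (7, 8),
               (16, 17), (84, 95), (12, 12), (38, 40)]
groups = split_by_scale(anchor_t_c5)
for grid, group in groups.items():
    print(f"{grid}x{grid}: {group}")
```

Sorting by area reproduces the three columns of the Anchor-T (c=5) row exactly.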

Fig. 12 Comparison of AP (aircraft) when K-means-T takes different center boxes

Fig. 13 Average losses of YOLOv3 using different algorithms

Tab. 3 Accuracy comparison of different anchor box optimization algorithms

Algorithm	mAP@0.5	F1	AP (aircraft)
Original YOLOv3	0.840	0.850	0.852
YOLOv3 [15]	0.827	0.835	0.833
YOLOv3 [13]	0.853	0.855	0.870
YOLOv3-T	0.868	0.860	0.884

Fig. 14 Accuracy of YOLOv3 on each target class of the dataset before and after anchor optimization

Tab. 4 Accuracy comparison with the CA module inserted at different positions

CA module position	mAP@0.5	AP (aircraft)	F1 (aircraft)
YOLOv3	0.840	0.850	0.850
Detection head 1	0.849	0.882	0.860
Detection head 2	0.846	0.884	0.860
Detection head 3	0.869	0.901	0.880

Fig. 15 AP of YOLOv3 on each target class of the dataset before and after introducing CA

References (19)

[1] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3431-3440. DOI: 10.1109/cvpr.2015.7298965
[2] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. DOI: 10.1109/cvpr.2014.81
[3] GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448. DOI: 10.1109/iccv.2015.169
[4] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. DOI: 10.1109/cvpr.2016.91
[5] REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6517-6525. DOI: 10.1109/cvpr.2017.690
[6] REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08) [2021-09-10].
[7] BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23) [2021-09-14].
[8] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318-327. DOI: 10.1109/tpami.2018.2858826
[9] WANG F L, SU J Y. Based on the improved YOLOV3 small target detection algorithm[C]// Proceedings of the IEEE 4th Advanced Information Management, Communicates, Electronic and Automation Control Conference. Piscataway: IEEE, 2021: 2155-2159. DOI: 10.1109/imcec51613.2021.9482076
[10] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. DOI: 10.1109/cvpr.2017.106
[11] KISANTAL M, WOJNA Z, MURAWSKI J, et al. Augmentation for small object detection[EB/OL]. (2019-02-19) [2021-08-15]. DOI: 10.5121/csit.2019.91713
[12] LIU S T, HUANG D, WANG Y H. Receptive field block net for accurate and fast object detection[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11215. Cham: Springer, 2018: 404-419.
[13] SHAO H X, ZENG D. Classification and recognition of underwater small targets based on improved YOLOv3 algorithm[J]. Journal of Shanghai University (Natural Science Edition), 2021, 27(3): 481-491. (in Chinese) DOI: 10.12066/j.issn.1007-2861.2279
[14] YU Y, LI S J, CHEN L, et al. Ship target detection based on improved YOLO v2[J]. Computer Science, 2019, 46(8): 332-336. (in Chinese)
[15] YE K Q, FANG Z B, HUANG X J, et al. Research on small target detection algorithm based on improved YOLOv3[C]// Proceedings of the 5th International Conference on Mechanical, Control and Computer Engineering. Piscataway: IEEE, 2020: 1467-1470. DOI: 10.1109/icmcce51767.2020.00321
[16] REZAEE M, ZHANG Y, MISHRA R, et al. Using a VGG-16 network for individual tree species detection with an object-based approach[C]// Proceedings of the 10th IAPR Workshop on Pattern Recognition in Remote Sensing. Piscataway: IEEE, 2018: 1-7. DOI: 10.1109/prrs.2018.8486395
[17] LI B Q, HE Y Y. An improved ResNet based on the adjustable shortcut connections[J]. IEEE Access, 2018, 6: 18967-18974. DOI: 10.1109/access.2018.2814605
[18] TAN M X, PANG R M, LE Q V. EfficientDet: scalable and efficient object detection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 10778-10787. DOI: 10.1109/cvpr42600.2020.01079
[19] WANG J J, WEI J, MEI S H, et al. Improved YOLOv3 for small target detection in remote sensing images[J]. Computer Engineering and Applications, 2021, 57(20): 133-141. (in Chinese)
