Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (4): 1114-1120. DOI: 10.11772/j.issn.1001-9081.2023081042
• Artificial Intelligence •
Tianhua CHEN, Jiaxuan ZHU, Jie YIN
Received: 2023-08-08
Revised: 2023-12-04
Online: 2023-12-18
Published: 2024-04-10
Contact: Jiaxuan ZHU
About author: CHEN Tianhua, born in 1966 in Changsha, Hunan, male, M. S., professor. His research interests include image processing, pattern recognition, and measurement and control technology.
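Abstract:
To address the low accuracy of existing fine-grained bird recognition algorithms, a bird object detection algorithm named YOLOv5-Bird is proposed. First, a mixed-domain Coordinate Attention (CA) mechanism is introduced into the YOLOv5 backbone to increase the weights of informative channels, so that target features can be distinguished from redundant background features. Second, Bi-level Routing Attention (BRA) modules replace some of the C3 modules in the original backbone, filtering out low-relevance key-value pairs and capturing long-range dependencies efficiently. Finally, the WIoU (Wise-Intersection over Union) loss function is used to strengthen the algorithm's ability to localize targets. Experimental results show that YOLOv5-Bird achieves a precision of 82.8% and a recall of 77.0% on a self-built dataset, which are 4.3 and 7.6 percentage points higher than YOLOv5 respectively, and that it also outperforms YOLOv5 variants augmented with other attention mechanisms. These results indicate that YOLOv5-Bird performs well in bird object detection scenarios.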
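The localization loss named in the abstract can be illustrated with a small sketch. The abstract does not say which WIoU variant YOLOv5-Bird uses, so the code below assumes the v1 form from the Wise-IoU paper: the plain IoU loss `1 - IoU` multiplied by a penalty `exp(d² / (W_g² + H_g²))`, where `d` is the distance between box centres and `W_g`, `H_g` are the width and height of the smallest box enclosing both. The function names `iou` and `wiou_v1` and the `(x1, y1, x2, y2)` box format are illustrative choices, not taken from the paper.

```python
import math


def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred, target):
    """Assumed WIoU v1: (1 - IoU) scaled by a centre-distance penalty."""
    # Centre points of the predicted and ground-truth boxes.
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tcx, tcy = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    # (In a training framework these would be detached from the gradient.)
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    denom = wg ** 2 + hg ** 2
    dist2 = (pcx - tcx) ** 2 + (pcy - tcy) ** 2
    penalty = math.exp(dist2 / denom) if denom > 0 else 1.0
    return penalty * (1.0 - iou(pred, target))


if __name__ == "__main__":
    print(wiou_v1((0, 0, 2, 2), (0, 0, 2, 2)))  # perfect overlap
    print(wiou_v1((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
```

Because the penalty is at least 1, the loss is never smaller than the plain IoU loss, and it grows as the predicted centre drifts away from the ground-truth centre.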
Tianhua CHEN, Jiaxuan ZHU, Jie YIN. Bird recognition algorithm based on attention mechanism[J]. Journal of Computer Applications, 2024, 44(4): 1114-1120.
Item | Configuration |
---|---|
Operating system | Windows 10 |
CPU | Intel Core i5-10400F |
GPU | NVIDIA GeForce GTX 1070 |
Software | Anaconda, PyCharm 2021 |
Deep learning platform | Python 3.7 |
Deep learning framework | PyTorch 1.8.0 |
GPU acceleration library | CUDA 11.7 |
Tab. 1 Experimental environment
Algorithm | Precision (%) | Recall (%) |
---|---|---|
YOLOv5 | 78.5 | 69.4 |
Faster R-CNN | 77.7 | 70.2 |
SSD | 63.1 | 64.7 |
YOLOv7 | 78.8 | 70.3 |
EfficientDet | 78.3 | 67.6 |
YOLOv5+CBAM | 79.0 | 70.8 |
YOLOv5+SE | 79.5 | 70.9 |
YOLOv5+CA | 79.5 | 71.1 |
YOLOv5+BRA | 80.5 | 70.4 |
YOLOv5+Swin Transformer | 80.0 | 70.5 |
YOLOv5+CotNet | 79.9 | 69.8 |
YOLOv5+MobileViT | 79.5 | 69.9 |
YOLOv5-Bird | 82.8 | 77.0 |
Tab. 2 Experimental data comparison of different algorithms (%)
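As a quick check on how the figures in Tab. 2 relate to the improvement quoted in the abstract, the sketch below defines precision and recall from detection counts and computes percentage-point gains over the YOLOv5 baseline. The table values are copied from Tab. 2; the helper names and the example counts passed to `precision_recall` are hypothetical and only illustrate the definitions.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# (precision %, recall %) copied from Tab. 2 for a few algorithms.
RESULTS = {
    "YOLOv5": (78.5, 69.4),
    "YOLOv5+CA": (79.5, 71.1),
    "YOLOv5+BRA": (80.5, 70.4),
    "YOLOv5-Bird": (82.8, 77.0),
}


def gain_over(baseline, model, results=RESULTS):
    """Percentage-point gain of `model` over `baseline`, rounded to 0.1."""
    base_p, base_r = results[baseline]
    model_p, model_r = results[model]
    return round(model_p - base_p, 1), round(model_r - base_r, 1)


if __name__ == "__main__":
    # Hypothetical counts: 8 correct detections, 2 false alarms, 2 misses.
    print(precision_recall(tp=8, fp=2, fn=2))
    for name in RESULTS:
        print(name, gain_over("YOLOv5", name))
```

The gain computed for YOLOv5-Bird matches the 4.3 and 7.6 percentage points stated in the abstract.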
Modules added | Precision | Recall | mAP@0.5 | mAP@0.5:0.95 |
---|---|---|---|---|
None (YOLOv5 baseline) | 78.5 | 69.4 | 73.6 | 54.3 |
CA | 79.5 | 71.1 | 74.5 | 56.1 |
WIoU | 79.7 | 71.9 | 76.5 | 57.5 |
BRA | 80.5 | 70.4 | 75.5 | 54.7 |
Two modules (pairing lost in source) | 80.7 | 74.3 | 79.6 | 58.0 |
Two modules (pairing lost in source) | 80.0 | 72.3 | 76.7 | 57.1 |
Two modules (pairing lost in source) | 81.4 | 74.2 | 79.7 | 58.4 |
CA + WIoU + BRA | 82.8 | 77.0 | 80.7 | 59.4 |
Tab. 3 Ablation experimental results (%)
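The mAP@0.5:0.95 column in Tab. 3 is not defined on this page. Assuming it follows the COCO-style convention that YOLOv5's evaluation reports, it is the mean of the mAP values measured at ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, while mAP@0.5 uses the single threshold 0.50. The sketch below shows only that averaging step; the function names and the per-threshold values are hypothetical and are not results from the paper.

```python
def coco_iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 (ten values)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def map_50_95(map_by_threshold):
    """Average mAP over the ten thresholds; input maps threshold -> mAP."""
    thresholds = coco_iou_thresholds()
    return sum(map_by_threshold[t] for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    # Hypothetical per-threshold mAP values that fall as the IoU bar rises.
    example = {t: 1.0 - t for t in coco_iou_thresholds()}
    print(coco_iou_thresholds())
    print(map_50_95(example))
```

Since stricter thresholds demand tighter boxes, this averaged metric is lower than mAP@0.5 and is more sensitive to localization quality, which is consistent with the gap between the two columns in Tab. 3.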