Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (7): 2210-2218. DOI: 10.11772/j.issn.1001-9081.2021040648
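Shiwen ZHANG, Chunhua DENG, Junwen ZHANG
Received: 2021-04-25
Revised: 2021-06-25
Accepted: 2021-07-09
Online: 2022-07-15
Published: 2022-07-10
Contact: Chunhua DENG
About author: ZHANG Shiwen, born in 1997 in Jianshi, Hubei, male, M. S. candidate. His research interests include computer vision and machine learning.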
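Abstract:
In some fixed industrial application scenarios, the tolerance for missed detections by an object detection algorithm is very low. However, raising the recall tends to produce non-overlapping false-positive boxes that appear in a regular pattern around the target. Traditional Non-Maximum Suppression (NMS) mainly suppresses multiple duplicate detection boxes of the same target and cannot solve this problem. An anisotropic NMS method was therefore designed that applies different suppression strategies in different directions around the target, thereby eliminating the regular false-positive boxes. In fixed industrial scenes, the shape of the target and the regular false-positive boxes are often correlated. To help anisotropic NMS act precisely in each direction, a Ratio Intersection over Union (RIoU) loss function was designed to guide the model to fit the shape of the target. In addition, a dataset augmentation method based on automatic labeling was used for regular targets, which reduced the manual labeling workload while enlarging the dataset. Experimental results show that the proposed method is effective on the roll groove detection dataset: applied to YOLO series algorithms, it improves detection accuracy without reducing speed (for YOLOv5s, mAP@.5:.95 rises from 71.9% to 79.2% at an essentially unchanged frame rate, Tab. 3). The algorithm has been successfully applied in a production line for automatic roll grabbing at a cold rolling mill.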
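The abstract states the idea of anisotropic NMS but not its exact rule, so the following is only a minimal sketch under assumptions of ours, not the authors' implementation. It assumes boxes given as `(x1, y1, x2, y2)` and suppresses a lower-scoring box either when it overlaps a kept box (ordinary NMS) or when its center falls inside a zone that extends by a different amount in each direction around the kept box. The function name `anisotropic_nms` and the parameters `iou_thr` and `reach` are hypothetical.

```python
import numpy as np


def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def anisotropic_nms(boxes, scores, iou_thr=0.5, reach=(0.0, 0.0, 1.5, 1.5)):
    """Hypothetical direction-aware NMS; returns indices of kept boxes.

    reach = (left, right, up, down): how far the suppression zone extends
    beyond the kept box on each side, in multiples of its width or height.
    Image coordinates are assumed, so "down" means increasing y.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    left, right, up, down = reach
    cx = (boxes[:, 0] + boxes[:, 2]) / 2
    cy = (boxes[:, 1] + boxes[:, 3]) / 2
    alive = np.ones(len(boxes), dtype=bool)
    keep = []
    for i in np.argsort(-scores):  # highest score first
        if not alive[i]:
            continue
        keep.append(int(i))
        w = boxes[i, 2] - boxes[i, 0]
        h = boxes[i, 3] - boxes[i, 1]
        dx, dy = cx - cx[i], cy - cy[i]
        # Direction-dependent zone around the kept box.
        in_zone = ((dx >= -(0.5 + left) * w) & (dx <= (0.5 + right) * w) &
                   (dy >= -(0.5 + up) * h) & (dy <= (0.5 + down) * h))
        # Ordinary NMS criterion for duplicates of the same target.
        overlap = iou(boxes[i], boxes) > iou_thr
        alive &= ~(in_zone | overlap)
    return keep


# Synthetic example: A is the target, B sits directly below it without
# overlapping, C is a separate target to the right, D duplicates A.
boxes = [(100, 100, 200, 150), (100, 160, 200, 210),
         (260, 100, 360, 150), (105, 102, 205, 152)]
scores = [0.9, 0.6, 0.7, 0.8]
print(anisotropic_nms(boxes, scores))                    # [0, 2]
print(anisotropic_nms(boxes, scores, reach=(0, 0, 0, 0)))  # [0, 2, 1]
```

With a zero reach in every direction the rule behaves much like ordinary NMS and keeps the non-overlapping box B; extending the zone only vertically removes B while leaving the neighbouring target C untouched.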
Shiwen ZHANG, Chunhua DENG, Junwen ZHANG. Application of anisotropic non-maximum suppression in industrial target detection[J]. Journal of Computer Applications, 2022, 42(7): 2210-2218.
Scale | YOLOv5 default | Clustered (this paper) |
---|---|---|
19×19 | [116,90] [156,198] [373,326] | [54,200] [71,242] [102,314] |
38×38 | [ | [ |
76×76 | [ | [ |
Tab. 1 Allocation of anchor boxes at different scale detection layers
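Tab. 1 contrasts the YOLOv5 default anchors with anchors clustered from the dataset, but this page does not say how the clustering was done. As a rough illustration only, the sketch below uses the approach commonly associated with the YOLO family: k-means over box widths and heights with IoU as the similarity. The names `wh_iou` and `kmeans_anchors` are hypothetical and the sample box sizes are synthetic.

```python
import numpy as np


def wh_iou(wh, anchors):
    """IoU of (w, h) pairs against anchors, as if all shared one corner."""
    inter = (np.minimum(wh[:, None, 0], anchors[None, :, 0]) *
             np.minimum(wh[:, None, 1], anchors[None, :, 1]))
    union = ((wh[:, 0] * wh[:, 1])[:, None] +
             (anchors[:, 0] * anchors[:, 1])[None, :] - inter)
    return inter / union  # shape (n_boxes, n_anchors)


def kmeans_anchors(wh, k, iters=100, seed=0):
    """Cluster box sizes into k anchors; returns anchors sorted by area."""
    wh = np.asarray(wh, dtype=float)
    rng = np.random.default_rng(seed)
    anchors = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(iters):
        # Assign each box to the anchor with the highest IoU.
        assign = np.argmax(wh_iou(wh, anchors), axis=1)
        # Move each anchor to the mean size of its boxes (keep it if empty).
        new = np.array([wh[assign == j].mean(axis=0) if np.any(assign == j)
                        else anchors[j] for j in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]


# Synthetic tall, narrow boxes (width, height) in pixels.
sizes = [(50, 190), (58, 210), (70, 240), (72, 245), (100, 310), (105, 320)]
print(kmeans_anchors(sizes, k=3).round(1))
```

The result depends on the random initialisation, so in practice the clustering is usually repeated with several seeds and the anchor set with the best average IoU is kept.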
Method | RIoU | NMS_l | mAP/% | FPS |
---|---|---|---|---|
YOLOv5s | | | 71.9 | 64.6 |
YOLOv5s+CIoU | | | 72.1 | 63.2 |
YOLOv5s+RIoU | √ | | 72.8 | 66.0 |
YOLOv5s+NMS_l | | √ | 76.9 | 63.6 |
YOLOv5s+CIoU+NMS_l | | √ | 77.5 | 65.9 |
YOLOv5s+RIoU+NMS_l | √ | √ | 79.2 | 64.5 |
Tab. 2 Effect of different combinations of the proposed method with the original model
Method | mAP@.5/% | mAP@.5:.95/% | FPS |
---|---|---|---|
YOLOv3 | 92.4 | 66.6 | 47.3 |
YOLOv3+CIoU | 93.7 | 67.3 | 48.3 |
YOLOv4[ | 93.3 | 72.2 | 28.2 |
YOLOv4+CIoU | 94.2 | 73.3 | 30.8 |
YOLOv5s | 94.2 | 71.9 | 64.6 |
YOLOv3+RIoU+NMS_l | 94.0 | 68.0 | 55.4 |
YOLOv4+RIoU+NMS_l | 96.7 | 75.4 | 31.3 |
YOLOv5s+RIoU+NMS_l | 97.7 | 79.2 | 64.5 |
Tab. 3 Comparison of the original YOLO series algorithms with and without the proposed method
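To make the comparison in Tab. 3 easier to read, the short sketch below computes the change each detector sees when RIoU and NMS_l are added. The numbers are copied from the table; the helper name `gains` is ours.

```python
# (mAP@.5 in %, mAP@.5:.95 in %, FPS), copied from Tab. 3.
baseline = {
    "YOLOv3": (92.4, 66.6, 47.3),
    "YOLOv4": (93.3, 72.2, 28.2),
    "YOLOv5s": (94.2, 71.9, 64.6),
}
proposed = {  # same detectors with RIoU + NMS_l
    "YOLOv3": (94.0, 68.0, 55.4),
    "YOLOv4": (96.7, 75.4, 31.3),
    "YOLOv5s": (97.7, 79.2, 64.5),
}


def gains(before, after):
    """Per-detector difference (after - before), rounded to one decimal."""
    return {name: tuple(round(a - b, 1)
                        for b, a in zip(before[name], after[name]))
            for name in before}


for name, (d50, d5095, dfps) in gains(baseline, proposed).items():
    print(f"{name}: mAP@.5 {d50:+.1f}, mAP@.5:.95 {d5095:+.1f}, FPS {dfps:+.1f}")
# YOLOv3: mAP@.5 +1.6, mAP@.5:.95 +1.4, FPS +8.1
# YOLOv4: mAP@.5 +3.4, mAP@.5:.95 +3.2, FPS +3.1
# YOLOv5s: mAP@.5 +3.5, mAP@.5:.95 +7.3, FPS -0.1
```

The largest accuracy gain is on YOLOv5s, where the frame rate is essentially unchanged.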
1 | LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37. |
2 | REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. 10.1109/cvpr.2016.91 |
3 | REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6517-6525. 10.1109/cvpr.2017.690 |
4 | REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08) [2021-01-08]. |
5 | GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. 10.1109/cvpr.2014.81 |
6 | GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448. 10.1109/iccv.2015.169 |
7 | REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. 10.1109/tpami.2016.2577031 |
8 | HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988. 10.1109/iccv.2017.322 |
9 | LI Z M, PENG C, YU G, et al. Light-head R-CNN: in defense of two-stage object detector[EB/OL]. (2017-11-23) [2021-01-08]. |
10 | CAI Z W, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6154-6162. 10.1109/cvpr.2018.00644 |
11 | FU C Y, LIU W, RANGA A, et al. DSSD: deconvolutional single shot detector[EB/OL]. (2017-01-23) [2021-01-08]. |
12 | PENG J, SU Y. An improved algorithm for detection and pose estimation of texture-less objects[J]. Journal of Advanced Computational Intelligence and Intelligent Informatics, 2021, 25(2): 204-212. 10.20965/jaciii.2021.p0204 |
13 | LAVIE A, SAGAE K, JAYARAMAN S. The significance of recall in automatic metrics for MT evaluation[C]// Proceedings of the 2004 Conference of the Association for Machine Translation in the Americas, LNCS 3265/LNAI 3265. Berlin: Springer, 2004: 134-143. 10.1007/978-3-540-30194-3_16 |
14 | JUBA B, LE H S. Precision-recall versus accuracy and the role of large data sets[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 4039-4048. 10.1609/aaai.v33i01.33014039 |
15 | MUKHERJEE S. Object detection[M]// ML.NET Revealed. Berkeley: Apress, 2021: 159-170. 10.1007/978-1-4842-6543-7_10 |
16 | RAZAKARIVONY S, JURIE F. Vehicle detection in aerial imagery: a small target detection benchmark[J]. Journal of Visual Communication and Image Representation, 2016, 34: 187-203. 10.1016/j.jvcir.2015.11.002 |
17 | GUO Y L, BENNAMOUN M, SOHEL F, et al. An integrated framework for 3-D modeling, object detection, and pose estimation from point-clouds[J]. IEEE Transactions on Instrumentation and Measurement, 2015, 64(3): 683-693. 10.1109/tim.2014.2358131 |
18 | ZHUANG J F, YANG L J, LI J. An improved segmentation algorithm based on super pixel for typical industrial applications[C]// Proceedings of the 11th International Symposium on Computational Intelligence and Design. Piscataway: IEEE, 2018: 366-370. 10.1109/iscid.2018.10184 |
19 | CATENI S, COLLA V, VANNUCCI M. A method for resampling imbalanced datasets in binary classification tasks for real-world problems[J]. Neurocomputing, 2014, 135: 32-41. 10.1016/j.neucom.2013.05.059 |
20 | JACQUES J C S, Jr, LAPEDRIZA A, PALMERO C, et al. Person perception biases exposed: revisiting the first impressions dataset [C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision Workshops. Piscataway: IEEE, 2020: 13-21. 10.1109/wacvw52041.2021.00006 |
21 | ROSENFELD A, THURSTON M. Edge and curve detection for visual scene analysis[J]. IEEE Transactions on Computers, 1971, C-20(5): 562-569. 10.1109/t-c.1971.223290 |
22 | HARRIS C, STEPHENS M. A combined corner and edge detector[C]// Proceedings of the 1988 Alvey Vision Conference. [S.l.]: Alvey Vision Club, 1988: No.23. 10.5244/c.2.23 |
23 | VIOLA P, JONES M. Rapid object detection using a boosted cascade of simple features[C]// Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2001: Ⅰ-511-Ⅰ-518. |
24 | FELZENSZWALB P F, GIRSHICK R B, McALLESTER D, et al. Object detection with discriminatively trained part-based models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1627-1645. 10.1109/tpami.2009.167 |
25 | DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2005: 886-893. |
26 | GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. 10.1109/cvpr.2014.81 |
27 | BODLA N, SINGH B, CHELLAPPA R, et al. Soft-NMS — improving object detection with one line of code[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 5562-5570. 10.1109/iccv.2017.593 |
28 | YU J H, JIANG Y N, WANG Z Y, et al. UnitBox: an advanced object detection network[C]// Proceedings of the 24th ACM International Conference on Multimedia. New York: ACM, 2016: 516-520. 10.1145/2964284.2967274 |
29 | REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 658-666. 10.1109/cvpr.2019.00075 |
30 | SONG T, SUN L Y, XIE D, et al. Small-scale pedestrian detection based on topological line localization and temporal feature aggregation[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211/LNIP 11211. Cham: Springer, 2018: 554-569. 10.1007/978-3-030-01234-2_33 |
31 | LAW H, DENG J. CornerNet: detecting objects as paired keypoints[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11218/LNIP 11218. Cham: Springer, 2018: 765-781. 10.1007/978-3-030-01264-9_45 |
32 | YANG Z, LIU S H, HU H, et al. RepPoints: point set representation for object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9656-9665. 10.1109/iccv.2019.00975 |
33 | ZHU C C, HE Y H, SAVVIDES M. Feature selective anchor-free module for single-shot object detection[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 840-849. 10.1109/cvpr.2019.00093 |
34 | LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. 10.1109/iccv.2017.324 |
35 | CUI Y, JIA M L, LIN T Y, et al. Class-balanced loss based on effective number of samples[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 9260-9269. 10.1109/cvpr.2019.00949 |
36 | LI B Y, LIU Y, WANG X G. Gradient harmonized single-stage detector[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 8577-8584. 10.1609/aaai.v33i01.33018577 |
37 | TIAN Z, SHEN C H, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9626-9635. 10.1109/iccv.2019.00972 |
38 | ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 12993-13000. 10.1609/aaai.v34i07.6999 |
39 | YANG T, ZHANG X Y, LI Z M, et al. MetaAnchor: learning to detect objects with customized anchors[C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2018: 318-328. 10.1016/j.ipl.2018.03.004 |
40 | ZHAO Y Y, ZHU J, XIE Y K, et al. A real-time video flame detection algorithm based on improved Yolo-v3[J]. Geomatics and Information Science of Wuhan University, 2021, 46(3): 326-334. |
41 | CHEN J, MAO Y C, CHEN H, et al. Dam defect object detection method based on improved single shot multibox detector[J]. Journal of Computer Applications, 2021, 41(8): 2366-2372. |
42 | LU G Y, GU Z H. A dangerous goods detection algorithm based on improved YOLOv3[J]. Computer Applications and Software, 2021, 38(1): 197-204. 10.3969/j.issn.1000-386x.2021.01.033 |
43 | BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23) [2021-01-28]. |