Application of anisotropic non-maximum suppression in industrial target detection

Journal of Computer Applications, 2022, Vol. 42, Issue (7): 2210-2218. DOI: 10.11772/j.issn.1001-9081.2021040648

Section: Multimedia computing and computer simulation

Shiwen ZHANG¹,²,³, Chunhua DENG¹,²,³ (corresponding author), Junwen ZHANG¹,²,³

1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430065, China
2. Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan Hubei 430065, China
3. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan Hubei 430065, China

Received: 2021-04-25 Revised: 2021-06-25 Accepted: 2021-07-09 Online: 2022-07-15 Published: 2022-07-10
Contact: Chunhua DENG
About the authors: ZHANG Shiwen, born in 1997 in Jianshi, Hubei, M. S. candidate. His research interests include computer vision and machine learning.
ZHANG Junwen, born in 1997 in Jingmen, Hubei, M. S. candidate. Her research interests include computer vision and machine learning.
Supported by: National Natural Science Foundation of China (61806150)

Abstract

In some fixed industrial application scenarios, target detection algorithms have a very low tolerance for missed detections. However, raising the recall tends to produce non-overlapping spurious boxes (false detections) in a regular pattern around the target. Traditional Non-Maximum Suppression (NMS) mainly suppresses multiple duplicate detection boxes of the same target and cannot solve this problem. To this end, an anisotropic NMS method was designed that applies different suppression strategies in different directions around the target, effectively eliminating the regular spurious boxes. In a fixed industrial scene, the target shape and the regular spurious boxes are often correlated. To help anisotropic NMS act accurately in each direction, a ratio Intersection over Union (IoU) loss function was designed to guide the model to fit the shape of the target. In addition, an automatic-labeling dataset augmentation method was used for regular targets, which reduced the manual labeling workload while enlarging the dataset. Experimental results show that the proposed method is markedly effective on the roll groove detection dataset: applied to the YOLO (You Only Look Once) series of algorithms, it improves detection precision without reducing speed. The algorithm has been successfully applied to an automatic roll-grabbing production line at a cold rolling mill.

Key words: anisotropic, Non-Maximum Suppression (NMS), Intersection over Union (IoU), target detection, YOLO (You Only Look Once)
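
The abstract describes the idea of direction-dependent suppression but does not give the rule itself, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It combines ordinary greedy IoU-based NMS with one extra, hypothetical rule: a lower-scoring box is also dropped when its centre falls inside an axis-aligned ellipse around a kept box, with semi-axes `rx * width` and `ry * height`. The function name `anisotropic_nms` and the parameters `rx`, `ry`, and `iou_thr` are illustrative choices and do not come from the paper.

```python
import numpy as np


def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def anisotropic_nms(boxes, scores, iou_thr=0.5, rx=2.0, ry=0.5):
    """Greedy NMS with an extra direction-dependent suppression zone.

    A candidate is dropped if it overlaps a kept box (IoU > iou_thr) OR if its
    centre lies inside an ellipse around the kept box whose semi-axes are
    rx * width and ry * height (hypothetical parameters, not from the paper).
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores)  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        if rest.size == 0:
            break
        overlap = iou(boxes[i], boxes[rest]) > iou_thr
        # Centre and size of the box that was just kept.
        cx = (boxes[i, 0] + boxes[i, 2]) / 2
        cy = (boxes[i, 1] + boxes[i, 3]) / 2
        w = boxes[i, 2] - boxes[i, 0]
        h = boxes[i, 3] - boxes[i, 1]
        # Offsets of the remaining box centres from the kept centre.
        dx = (boxes[rest, 0] + boxes[rest, 2]) / 2 - cx
        dy = (boxes[rest, 1] + boxes[rest, 3]) / 2 - cy
        in_zone = (dx / (rx * w)) ** 2 + (dy / (ry * h)) ** 2 <= 1.0
        order = rest[~(overlap | in_zone)]
    return keep


# A tall target, a non-overlapping neighbour to its right, and one below it.
boxes = [[100, 100, 140, 300], [150, 100, 190, 300], [100, 320, 140, 520]]
scores = [0.9, 0.6, 0.7]
print(anisotropic_nms(boxes, scores))
```

With the default factors the suppression zone is wide horizontally and narrow vertically, so the side neighbour is removed even though its IoU with the target is zero, while the box below survives. Shrinking `rx` and `ry` towards zero recovers plain greedy NMS, which keeps all three boxes.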

CLC number: TP391.41

Shiwen ZHANG, Chunhua DENG, Junwen ZHANG. Application of anisotropic non-maximum suppression in industrial target detection[J]. Journal of Computer Applications, 2022, 42(7): 2210-2218.

Figures/Tables (15)

Fig. 1 NMS results under different confidence levels

Fig. 2 Distribution of target detection boxes in industrial scenes

Fig. 3 Schematic diagram of ellipse rotation angle

Fig. 4 NMS suppression results in a limited area

Fig. 5 Distribution of the groove dataset

Fig. 6 Curve of function νR

Fig. 7 Generation process of the dataset

Fig. 8 Comparison of different filling methods

Fig. 9 Some sample images

Fig. 10 Slice operation in the Focus structure
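
Only the caption of Fig. 10 is available here, so the following is a hedged sketch of the slice step as it is commonly implemented in the public YOLOv5 Focus module: every 2×2 pixel neighbourhood is split across four sub-images that are stacked along the channel axis. The function name `focus_slice` is illustrative, NumPy stands in for the deep-learning framework, and the convolution that normally follows the slice is omitted.

```python
import numpy as np


def focus_slice(x):
    """Rearrange a (C, H, W) array into (4C, H/2, W/2) by 2x2 pixel interleaving."""
    if x.shape[1] % 2 or x.shape[2] % 2:
        raise ValueError("H and W must be even")
    return np.concatenate(
        [
            x[:, ::2, ::2],    # even rows, even columns
            x[:, 1::2, ::2],   # odd rows, even columns
            x[:, ::2, 1::2],   # even rows, odd columns
            x[:, 1::2, 1::2],  # odd rows, odd columns
        ],
        axis=0,
    )


x = np.arange(16).reshape(1, 4, 4)
y = focus_slice(x)
print(y.shape)  # (4, 2, 2)
print(y[0])     # the even-row, even-column sub-image
```

No pixel is discarded: the spatial resolution is halved while the channel count is multiplied by four, so a 3-channel 640×640 input would become a 12-channel 320×320 tensor.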

Tab. 1 Allocation of anchor boxes at different scale detection layers

Scale	YOLOv5 default	Clustered in this paper
19×19	[116, 90] [156, 198] [373, 326]	[54, 200] [71, 242] [102, 314]
38×38	[30, 61] [62, 45] [59, 119]	[27, 102] [35, 121] [45, 155]
76×76	[10, 13] [16, 30] [33, 23]	[11, 42] [16, 60] [22, 79]
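
To make the difference between the two anchor sets in Tab. 1 concrete, the short sketch below computes width-to-height ratios. It assumes each pair is [width, height], as in the usual YOLO anchor convention; the page itself does not state the order. The names `DEFAULT`, `CLUSTERED`, and `aspect_ratios` are illustrative.

```python
# Anchor boxes copied from Tab. 1, keyed by detection-layer grid size.
# Each pair is assumed to be (width, height).
DEFAULT = {
    19: [(116, 90), (156, 198), (373, 326)],
    38: [(30, 61), (62, 45), (59, 119)],
    76: [(10, 13), (16, 30), (33, 23)],
}
CLUSTERED = {
    19: [(54, 200), (71, 242), (102, 314)],
    38: [(27, 102), (35, 121), (45, 155)],
    76: [(11, 42), (16, 60), (22, 79)],
}


def aspect_ratios(anchors):
    """Return the width/height ratio of every anchor, in table order."""
    return [w / h for boxes in anchors.values() for w, h in boxes]


for name, anchors in (("default", DEFAULT), ("clustered", CLUSTERED)):
    ratios = aspect_ratios(anchors)
    print(name, [round(r, 2) for r in ratios], "mean", round(sum(ratios) / len(ratios), 2))
```

Under that assumption, every clustered anchor has a ratio of roughly 0.26 to 0.33, that is, a tall and narrow box, whereas the default anchors mix wide and tall shapes. This fits the abstract's point that targets in a fixed industrial scene have a consistent shape.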

Fig. 11 Comparison of original NMS and anisotropic NMS

Fig. 12 Comparison of DIoU loss and ratio IoU loss
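
The definition of the ratio IoU loss is not reproduced on this page, so it is not sketched here. For orientation, the baseline it is compared against in Fig. 12, the DIoU loss of reference [38], can be written as 1 − IoU + ρ²/c², where ρ is the distance between the two box centres and c is the diagonal of the smallest box enclosing both. The following is a minimal single-pair sketch; the function name `diou_loss` and the (x1, y1, x2, y2) box layout are assumptions made for illustration.

```python
def diou_loss(pred, target):
    """DIoU loss for two boxes given as (x1, y1, x2, y2)."""
    # Intersection and union areas.
    iw = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    ih = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = iw * ih
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    iou = inter / (area_p + area_t - inter)

    # Squared distance between the two box centres.
    rho2 = ((pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2) ** 2 + (
        (pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2
    ) ** 2

    # Squared diagonal of the smallest enclosing box.
    cw = max(pred[2], target[2]) - min(pred[0], target[0])
    ch = max(pred[3], target[3]) - min(pred[1], target[1])
    c2 = cw ** 2 + ch ** 2

    return 1.0 - iou + rho2 / c2


print(diou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(diou_loss((0, 0, 2, 2), (4, 0, 6, 2)))  # disjoint boxes
```

Unlike a plain IoU loss, the centre-distance term still provides a gradient when the boxes do not overlap. The DIoU penalty does not depend on the aspect ratio of the boxes, which is presumably the gap a shape-aware loss is meant to address.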

Tab. 2 Effect of different combinations of the proposed method with the original model

Method	RIoU	NMS_l	mAP/%	FPS
YOLOv5s			71.9	64.6
YOLOv5s+CIoU			72.1	63.2
YOLOv5s+RIoU	√		72.8	66.0
YOLOv5s+NMS_l		√	76.9	63.6
YOLOv5s+CIoU+NMS_l		√	77.5	65.9
YOLOv5s+RIoU+NMS_l	√	√	79.2	64.5

Tab. 3 Comparison of the original YOLO series algorithms with and without the proposed method

Method	mAP@.5/%	mAP@.5:.95/%	FPS
YOLOv3	92.4	66.6	47.3
YOLOv3+CIoU	93.7	67.3	48.3
YOLOv4 [43]	93.3	72.2	28.2
YOLOv4+CIoU	94.2	73.3	30.8
YOLOv5s	94.2	71.9	64.6
YOLOv3+RIoU+NMS_l	94.0	68.0	55.4
YOLOv4+RIoU+NMS_l	96.7	75.4	31.3
YOLOv5s+RIoU+NMS_l	97.7	79.2	64.5
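
The gains implied by Tab. 3 can be read off with a few subtractions. The sketch below copies the baseline rows and the corresponding `+RIoU+NMS_l` rows from the table and prints the differences. The names `BASELINE`, `PROPOSED`, and `gains` are illustrative.

```python
# (mAP@.5 %, mAP@.5:.95 %, FPS) copied from Tab. 3.
BASELINE = {
    "YOLOv3": (92.4, 66.6, 47.3),
    "YOLOv4": (93.3, 72.2, 28.2),
    "YOLOv5s": (94.2, 71.9, 64.6),
}
PROPOSED = {  # the same detectors with RIoU and NMS_l added
    "YOLOv3": (94.0, 68.0, 55.4),
    "YOLOv4": (96.7, 75.4, 31.3),
    "YOLOv5s": (97.7, 79.2, 64.5),
}


def gains(base, new):
    """Per-model difference (new - base) for each metric, rounded to 0.1."""
    return {
        model: tuple(round(n - b, 1) for b, n in zip(base[model], new[model]))
        for model in base
    }


for model, (d_map50, d_map, d_fps) in gains(BASELINE, PROPOSED).items():
    print(f"{model}: mAP@.5 {d_map50:+.1f}, mAP@.5:.95 {d_map:+.1f}, FPS {d_fps:+.1f}")
```

For YOLOv5s this gives +3.5 points of mAP@.5 and +7.3 points of mAP@.5:.95 with FPS essentially unchanged (64.6 versus 64.5), which matches the abstract's statement that precision improves without reducing speed.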

References (43)

1	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37.
2	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. DOI: 10.1109/cvpr.2016.91
3	REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6517-6525. DOI: 10.1109/cvpr.2017.690
4	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08)[2021-01-08].
5	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. DOI: 10.1109/cvpr.2014.81
6	GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448. DOI: 10.1109/iccv.2015.169
7	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. DOI: 10.1109/tpami.2016.2577031
8	HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988. DOI: 10.1109/iccv.2017.322
9	LI Z M, PENG C, YU G, et al. Light-head R-CNN: in defense of two-stage object detector[EB/OL]. (2017-11-23)[2021-01-08].
10	CAI Z W, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6154-6162. DOI: 10.1109/cvpr.2018.00644
11	FU C Y, LIU W, RANGA A, et al. DSSD: deconvolutional single shot detector[EB/OL]. (2017-01-23)[2021-01-08].
12	PENG J, SU Y. An improved algorithm for detection and pose estimation of texture-less objects[J]. Journal of Advanced Computational Intelligence and Intelligent Informatics, 2021, 25(2): 204-212. DOI: 10.20965/jaciii.2021.p0204
13	LAVIE A, SAGAE K, JAYARAMAN S. The significance of recall in automatic metrics for MT evaluation[C]// Proceedings of the 2004 Conference of the Association for Machine Translation in the Americas, LNCS 3265/LNAI 3265. Berlin: Springer, 2004: 134-143. DOI: 10.1007/978-3-540-30194-3_16
14	JUBA B, LE H S. Precision-recall versus accuracy and the role of large data sets[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 4039-4048. DOI: 10.1609/aaai.v33i01.33014039
15	MUKHERJEE S. Object detection[M]// ML.NET Revealed. Berkeley: Apress, 2021: 159-170. DOI: 10.1007/978-1-4842-6543-7_10
16	RAZAKARIVONY S, JURIE F. Vehicle detection in aerial imagery: a small target detection benchmark[J]. Journal of Visual Communication and Image Representation, 2016, 34: 187-203. DOI: 10.1016/j.jvcir.2015.11.002
17	GUO Y L, BENNAMOUN M, SOHEL F, et al. An integrated framework for 3-D modeling, object detection, and pose estimation from point-clouds[J]. IEEE Transactions on Instrumentation and Measurement, 2015, 64(3): 683-693. DOI: 10.1109/tim.2014.2358131
18	ZHUANG J F, YANG L J, LI J. An improved segmentation algorithm based on super pixel for typical industrial applications[C]// Proceedings of the 11th International Symposium on Computational Intelligence and Design. Piscataway: IEEE, 2018: 366-370. DOI: 10.1109/iscid.2018.10184
19	CATENI S, COLLA V, VANNUCCI M. A method for resampling imbalanced datasets in binary classification tasks for real-world problems[J]. Neurocomputing, 2014, 135: 32-41. DOI: 10.1016/j.neucom.2013.05.059
20	JACQUES J C S, Jr, LAPEDRIZA A, PALMERO C, et al. Person perception biases exposed: revisiting the first impressions dataset[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision Workshops. Piscataway: IEEE, 2020: 13-21. DOI: 10.1109/wacvw52041.2021.00006
21	ROSENFELD A, THURSTON M. Edge and curve detection for visual scene analysis[J]. IEEE Transactions on Computers, 1971, C-20(5): 562-569. DOI: 10.1109/t-c.1971.223290
22	HARRIS C, STEPHENS M. A combined corner and edge detector[C]// Proceedings of the 1988 Alvey Vision Conference. [S.l.]: Alvey Vision Club, 1988: No.23. DOI: 10.5244/c.2.23
23	VIOLA P, JONES M. Rapid object detection using a boosted cascade of simple features[C]// Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2001: I-511-I-518.
24	FELZENSZWALB P F, GIRSHICK R B, McALLESTER D, et al. Object detection with discriminatively trained part-based models[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1627-1645. DOI: 10.1109/tpami.2009.167
25	DALAL N, TRIGGS B. Histograms of oriented gradients for human detection[C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2005: 886-893.
26	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. DOI: 10.1109/cvpr.2014.81
27	BODLA N, SINGH B, CHELLAPPA R, et al. Soft-NMS -- improving object detection with one line of code[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 5562-5570. DOI: 10.1109/iccv.2017.593
28	YU J H, JIANG Y N, WANG Z Y, et al. UnitBox: an advanced object detection network[C]// Proceedings of the 24th ACM International Conference on Multimedia. New York: ACM, 2016: 516-520. DOI: 10.1145/2964284.2967274
29	REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 658-666. DOI: 10.1109/cvpr.2019.00075
30	SONG T, SUN L Y, XIE D, et al. Small-scale pedestrian detection based on topological line localization and temporal feature aggregation[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211/LNIP 11211. Cham: Springer, 2018: 554-569. DOI: 10.1007/978-3-030-01234-2_33
31	LAW H, DENG J. CornerNet: detecting objects as paired keypoints[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11218/LNIP 11218. Cham: Springer, 2018: 765-781. DOI: 10.1007/978-3-030-01264-9_45
32	YANG Z, LIU S H, HU H, et al. RepPoints: point set representation for object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9656-9665. DOI: 10.1109/iccv.2019.00975
33	ZHU C C, HE Y H, SAVVIDES M. Feature selective anchor-free module for single-shot object detection[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 840-849. DOI: 10.1109/cvpr.2019.00093
34	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. DOI: 10.1109/iccv.2017.324
35	CUI Y, JIA M L, LIN T Y, et al. Class-balanced loss based on effective number of samples[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 9260-9269. DOI: 10.1109/cvpr.2019.00949
36	LI B Y, LIU Y, WANG X G. Gradient harmonized single-stage detector[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 8577-8584. DOI: 10.1609/aaai.v33i01.33018577
37	TIAN Z, SHEN C H, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9626-9635. DOI: 10.1109/iccv.2019.00972
38	ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 12993-13000. DOI: 10.1609/aaai.v34i07.6999
39	YANG T, ZHANG X Y, LI Z M, et al. MetaAnchor: learning to detect objects with customized anchors[C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2018: 318-328. DOI: 10.1016/j.ipl.2018.03.004
40	ZHAO Y Y, ZHU J, XIE Y K, et al. A real-time video flame detection algorithm based on improved Yolo-v3[J]. Geomatics and Information Science of Wuhan University, 2021, 46(3): 326-334. (in Chinese)
41	CHEN J, MAO Y C, CHEN H, et al. Dam defect object detection method based on improved single shot multibox detector[J]. Journal of Computer Applications, 2021, 41(8): 2366-2372. (in Chinese)
42	LU G Y, GU Z H. A dangerous goods detection algorithm based on improved YOLOv3[J]. Computer Applications and Software, 2021, 38(1): 197-204. DOI: 10.3969/j.issn.1000-386x.2021.01.033 (in Chinese)
43	BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23)[2021-01-28].
