Journal of Computer Applications, 2025, Vol. 45, Issue (7): 2361-2368. DOI: 10.11772/j.issn.1001-9081.2024070959
• Multimedia computing and computer simulation •
Zhenzhou WANG, Fangfang GUO, Jingfang SU, He SU, Jianchao WANG
Received: 2024-07-09
Revised: 2024-09-29
Accepted: 2024-10-09
Online: 2025-07-10
Published: 2025-07-10
Contact: Jingfang SU
About author: WANG Zhenzhou, born in 1978, native of Shijiazhuang, Hebei, Ph.D., professor. His research interests include image processing and pattern recognition.
Zhenzhou WANG, Fangfang GUO, Jingfang SU, He SU, Jianchao WANG. Robustness optimization method of visual model for intelligent inspection[J]. Journal of Computer Applications, 2025, 45(7): 2361-2368.
王震洲, 郭方方, 宿景芳, 苏鹤, 王建超. 面向智能巡检的视觉模型鲁棒性优化方法[J]. 计算机应用, 2025, 45(7): 2361-2368.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024070959
Category | Number of images | Number of detection targets |
---|---|---|
Insulator | 370 | 625 |
Pole tower | 366 | 549 |
Excavator | 355 | 400 |
Tower crane | 356 | 534 |
Crane | 355 | 375 |
Tanker truck | 354 | 388 |
Nest | 352 | 400 |
Forklift | 352 | 405 |
Big truck | 354 | 378 |
Bulldozer | 352 | 380 |
Tab. 1 Categories and numbers in transmission line dataset
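As a rough reading aid for Tab. 1, the sketch below computes the mean number of annotated targets per image for each category. The counts are copied from the table; the names `TLD_COUNTS` and `targets_per_image` are illustrative and do not come from the paper. Only the target column is summed, on the assumption that one image may contain several categories, in which case adding up the image column would double-count images.

```python
# Counts copied from Tab. 1: category -> (images, detection targets).
TLD_COUNTS = {
    "insulator": (370, 625),
    "pole tower": (366, 549),
    "excavator": (355, 400),
    "tower crane": (356, 534),
    "crane": (355, 375),
    "tanker truck": (354, 388),
    "nest": (352, 400),
    "forklift": (352, 405),
    "big truck": (354, 378),
    "bulldozer": (352, 380),
}


def targets_per_image(counts):
    """Mean number of annotated targets per image, per category."""
    return {name: targets / images for name, (images, targets) in counts.items()}


density = targets_per_image(TLD_COUNTS)
total_targets = sum(targets for _, targets in TLD_COUNTS.values())

# Print categories from densest to sparsest annotation.
for name, value in sorted(density.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {value:.2f} targets per image")
print("total annotated targets:", total_targets)
```

Under these counts the image totals per category are nearly balanced, while the number of targets per image varies noticeably between categories.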
Network model | AP (TLD) | mCP (TLD-C) | rCP (TLD-C) |
---|---|---|---|
AllConvNet | 83.80 | 52.70 | 62.89 |
DenseNet | 81.20 | 51.80 | 63.80 |
WideResNet | 80.70 | 53.10 | 65.80 |
ResNeXt | 82.40 | 54.30 | 65.90 |
ResNet | 81.40 | 53.20 | 65.36 |
Tab. 2 Robustness performance of different network models
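This page does not define the rCP column, but the values in Tab. 2 are numerically consistent with rCP being mCP on TLD-C expressed as a percentage of AP on TLD. The sketch below checks that reading against the tabulated rows. The formula in `relative_cp` is an assumption inferred from the numbers, not a definition taken from the paper, and the names `TAB2` and `relative_cp` are illustrative.

```python
# (AP on TLD, mCP on TLD-C, reported rCP) copied from Tab. 2.
TAB2 = {
    "AllConvNet": (83.80, 52.70, 62.89),
    "DenseNet": (81.20, 51.80, 63.80),
    "WideResNet": (80.70, 53.10, 65.80),
    "ResNeXt": (82.40, 54.30, 65.90),
    "ResNet": (81.40, 53.20, 65.36),
}


def relative_cp(ap, mcp):
    """Assumed definition: corrupted-set score as a percentage of the clean score."""
    return 100.0 * mcp / ap


max_gap = 0.0
for model, (ap, mcp, reported) in TAB2.items():
    computed = relative_cp(ap, mcp)
    # Track the largest disagreement between the assumed formula and the table.
    max_gap = max(max_gap, abs(computed - reported))
    print(f"{model:11s} computed={computed:.2f} reported={reported:.2f}")
print(f"largest absolute gap: {max_gap:.4f}")
```

Small disagreements in the second decimal place are expected if the published AP and mCP values were themselves rounded before printing.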
Data augmentation method | AllConvNet | DenseNet | WideResNet | ResNeXt | ResNet |
---|---|---|---|---|---|
Standard | 52.70 | 51.80 | 53.10 | 54.30 | 53.20 |
Cutout | 53.60 | 50.40 | 54.30 | 54.70 | 53.60 |
mixup | 57.50 | 57.40 | 58.50 | 61.20 | 59.70 |
AutoAugment | 54.30 | 55.10 | 57.40 | 58.70 | 57.10 |
AugMix | 62.50 | 61.60 | 63.30 | 64.20 | 63.50 |
Proposed method | 63.10 | 61.90 | 63.50 | 65.20 | 63.80 |
Tab. 3 mCP results of different data augmentation methods on TLD-C dataset
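To make the margins in Tab. 3 easier to read, the following sketch computes the per-network mCP difference between the proposed method and two reference rows, Standard and AugMix. The numbers are copied from the table; the names `NETWORKS`, `MCP`, and `gains` are illustrative only.

```python
NETWORKS = ["AllConvNet", "DenseNet", "WideResNet", "ResNeXt", "ResNet"]

# mCP on TLD-C per augmentation method, in the column order of NETWORKS (Tab. 3).
MCP = {
    "Standard": [52.70, 51.80, 53.10, 54.30, 53.20],
    "Cutout": [53.60, 50.40, 54.30, 54.70, 53.60],
    "mixup": [57.50, 57.40, 58.50, 61.20, 59.70],
    "AutoAugment": [54.30, 55.10, 57.40, 58.70, 57.10],
    "AugMix": [62.50, 61.60, 63.30, 64.20, 63.50],
    "Proposed": [63.10, 61.90, 63.50, 65.20, 63.80],
}


def gains(method, baseline):
    """Per-network mCP difference (method minus baseline), rounded to 2 decimals."""
    return [round(m - b, 2) for m, b in zip(MCP[method], MCP[baseline])]


over_standard = gains("Proposed", "Standard")
over_augmix = gains("Proposed", "AugMix")

for net, d_std, d_aug in zip(NETWORKS, over_standard, over_augmix):
    print(f"{net:11s} vs Standard: {d_std:+.2f}   vs AugMix: {d_aug:+.2f}")
```

On these figures the proposed method improves on standard training by roughly ten points for every network, while its margin over AugMix is at most one point.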
Training method | AP (TLD) | mCP (TLD-C) | rCP (TLD-C) |
---|---|---|---|
AugMix | 82.10 | 48.80 | 59.44 |
Proposed method | 83.40 | 52.20 | 62.59 |
Tab. 4 Robustness performance of different models under different training methods
Model | AP (TLD) | mCP (TLD-C) | rCP (TLD-C) |
---|---|---|---|
ResNet-50 | 81.40 | 57.00 | 70.02 |
ResNet-50NFD | 82.60 | 58.60 | 70.94 |
Tab. 5 Influence of NFD on model robustness
Method | LS | NFD | AP (TLD) | mCP (TLD-C) | rCP (TLD-C) |
---|---|---|---|---|---|
MD | — | √ | 82.60 | 55.70 | 67.43 |
ML | √ | — | 83.50 | 57.90 | 69.34 |
MO | — | — | 81.40 | 53.20 | 65.36 |
Proposed method | √ | √ | 84.80 | 59.40 | 70.05 |
Tab. 6 Results of ablation experiments
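The ablation in Tab. 6 can be read as gains over the MO row, which uses neither LS nor NFD. The sketch below computes those gains for AP and mCP. The values are copied from the table, and treating MO as the baseline is an interpretation of the check marks rather than a statement from the paper. The names `ABLATION` and `gain_over_baseline` are illustrative.

```python
# Rows copied from Tab. 6; the LS/NFD flags mirror the check marks in the table.
ABLATION = {
    "MO": {"LS": False, "NFD": False, "AP": 81.40, "mCP": 53.20},
    "MD": {"LS": False, "NFD": True, "AP": 82.60, "mCP": 55.70},
    "ML": {"LS": True, "NFD": False, "AP": 83.50, "mCP": 57.90},
    "Proposed": {"LS": True, "NFD": True, "AP": 84.80, "mCP": 59.40},
}


def gain_over_baseline(metric, baseline="MO"):
    """Difference in `metric` between each variant and the baseline row."""
    base = ABLATION[baseline][metric]
    return {
        name: round(row[metric] - base, 2)
        for name, row in ABLATION.items()
        if name != baseline
    }


ap_gain = gain_over_baseline("AP")
mcp_gain = gain_over_baseline("mCP")

for name in ap_gain:
    print(f"{name:9s} AP {ap_gain[name]:+.2f}   mCP {mcp_gain[name]:+.2f}")
```

With these numbers, the variant with LS alone gains more mCP than the variant with NFD alone, and the combined model gains less mCP than the sum of the two single-component gains, which suggests that the two components overlap in part.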