Robustness optimization method of visual model for intelligent inspection

doi:10.11772/j.issn.1001-9081.2024070959

Journal of Computer Applications, 2025, Vol. 45, Issue (7): 2361-2368.

• Multimedia computing and computer simulation •

Zhenzhou WANG¹, Fangfang GUO¹, Jingfang SU¹, He SU², Jianchao WANG¹

^1. School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China
^2. School of Electrical Engineering, Hebei University of Technology, Tianjin 300130, China

Received: 2024-07-09 Revised: 2024-09-29 Accepted: 2024-10-09 Online: 2025-07-10 Published: 2025-07-10
Contact: Jingfang SU
About the authors:
WANG Zhenzhou, born in 1978, male, from Shijiazhuang, Hebei, Ph.D., professor. His research interests include image processing and pattern recognition.
GUO Fangfang, born in 2000, female, from Anyang, Henan, M.S. candidate. Her research interests include computer vision and image processing.
SU He, born in 1993, male, from Hengshui, Hebei, Ph.D. candidate. His research interests include power system analysis and control, and the reliability theory and application of electrical equipment.
WANG Jianchao, born in 1990, male, from Shijiazhuang, Hebei, Ph.D., lecturer. His research interests include deep learning, artificial intelligence, and intelligent information processing.
Supported by:
Science and Technology Research Project of Colleges and Universities in Hebei Province (QN2023185)

Abstract

Vision tasks in the intelligent inspection of transmission lines are crucial to the safe and stable operation of power systems. Deep learning networks perform well when the training and test sets follow the same distribution, but in real-world applications deviations in the data distribution often degrade model performance. To address this problem, a Training Method based on Contrastive Learning (TMCL) is proposed to enhance model robustness. First, a benchmark test set designed specifically for transmission line scenes, TLD-C (Transmission Line Dataset-Corruption), is constructed to evaluate the model's robustness to image corruption. Second, positive and negative sample pairs that are sensitive to category features are constructed to improve the model's ability to distinguish the features of different categories. Third, a joint optimization strategy combining contrastive loss and cross-entropy loss imposes additional constraints on the feature extraction process, so as to optimize the representation of the feature vectors. Finally, a Non-local Feature Denoising network (NFD) is introduced to extract features closely related to the categories. Experimental results show that, compared with the original training method, the improved method raises average precision (AP) on the Transmission Line Dataset (TLD) by 3.40 percentage points and relative Corruption Precision (rCP) on the TLD-C dataset by 4.69 percentage points.

Key words: intelligent inspection, deep learning, robustness, contrastive learning, training method
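
The joint optimization described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact loss, so everything here is an assumption for illustration: the label-driven (supervised) contrastive form, the `temperature` value, the `weight` that balances the two terms, and all function names.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of class logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def supervised_contrastive(features, labels, temperature=0.1):
    """Contrastive loss with label-defined pairs (assumed form).

    Samples sharing the anchor's label are positives; every other sample
    in the batch is a negative. The batch must contain at least one
    same-class pair, otherwise the mean is taken over an empty set.
    """
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    not_self = ~np.eye(len(labels), dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self
    # Log-softmax over all other samples in the batch (self excluded).
    sim = sim - sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    per_anchor = -(log_prob * positives).sum(axis=1)[has_pos] / pos_count[has_pos]
    return per_anchor.mean()

def joint_loss(logits, features, labels, weight=0.5):
    """Cross-entropy plus a weighted contrastive term (weight is hypothetical)."""
    return cross_entropy(logits, labels) + weight * supervised_contrastive(features, labels)
```

Because the contrastive term is non-negative, it acts as an extra penalty on top of the classification loss: it is small when same-class feature vectors lie close together and large when they are scattered among other classes.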

CLC number: TP391.41

Cite this article:

Zhenzhou WANG, Fangfang GUO, Jingfang SU, He SU, Jianchao WANG. Robustness optimization method of visual model for intelligent inspection[J]. Journal of Computer Applications, 2025, 45(7): 2361-2368.

Figures/Tables (16)

Fig. 1 Schematic diagram for transmission line dataset collection

Fig. 2 Examples of transmission line targets

Tab. 1 Categories and numbers in transmission line dataset

Category	Total images	Detection targets
Insulator	370	625
Pole tower	366	549
Excavator	355	400
Tower crane	356	534
Crane	355	375
Tanker truck	354	388
Nest	352	400
Forklift	352	405
Big truck	354	378
Bulldozer	352	380

Fig. 3 Visualization of corruption classification in benchmark test set

Fig. 4 Visualization of different corruption levels under strong-light corruption
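
As a rough illustration of what a graded brightness corruption looks like in code, the sketch below applies one of five severity levels to an image. The page does not say how TLD-C generates its corruptions, so the additive RGB formulation, the offset values, and the names `BRIGHTNESS_OFFSETS` and `brighten` are all hypothetical.

```python
import numpy as np

# Hypothetical additive offsets for severity levels 1..5 (not taken from the paper).
BRIGHTNESS_OFFSETS = (0.1, 0.2, 0.3, 0.4, 0.5)

def brighten(image, severity):
    """Return a brightened copy of a float image with values in [0, 1]."""
    if severity not in range(1, 6):
        raise ValueError("severity must be an integer from 1 to 5")
    # Shift every pixel up, then clip so the result stays a valid image.
    return np.clip(image + BRIGHTNESS_OFFSETS[severity - 1], 0.0, 1.0)

# Build all five corrupted versions of one test image.
image = np.full((4, 4, 3), 0.5)
levels = [brighten(image, s) for s in range(1, 6)]
```

A benchmark built this way scores the model separately at each level, so the degradation curve can be read off as severity increases.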

Fig. 5 Overall framework of training method based on contrastive learning

Fig. 6 Non-local feature denoising network
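
The page gives only the name and caption of the NFD network, so the following is a generic sketch of a non-local operation on a feature map, not the paper's architecture. Each spatial position is replaced by a softmax-weighted mean over all positions and the result is added back to the input. The scaling by `sqrt(c)`, the residual connection, and the absence of any learned layers are assumptions.

```python
import numpy as np

def non_local_denoise(feature_map):
    """Apply a parameter-free non-local mean to a (C, H, W) feature map."""
    c, h, w = feature_map.shape
    x = feature_map.reshape(c, h * w)                # (C, N), N = H * W positions
    affinity = x.T @ x / np.sqrt(c)                  # (N, N) pairwise similarity
    affinity = affinity - affinity.max(axis=1, keepdims=True)
    weights = np.exp(affinity)
    weights = weights / weights.sum(axis=1, keepdims=True)  # softmax over positions
    denoised = x @ weights.T                         # weighted mean for each position
    # Residual connection: keep the original signal and add the smoothed one.
    return feature_map + denoised.reshape(c, h, w)
```

Positions with similar feature vectors reinforce one another, while isolated perturbations are averaged down, which is the usual motivation for placing such a block inside a feature extractor.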

Tab. 2 Robustness performance of different network models (%)

Network model	TLD AP	TLD-C mCP	TLD-C rCP
AllConvNet	83.80	52.70	62.89
DenseNet	81.20	51.80	63.80
WideResNet	80.70	53.10	65.80
ResNeXt	82.40	54.30	65.90
ResNet	81.40	53.20	65.36
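
This page does not define rCP, but the values in Tab. 2 are consistent with rCP being the ratio of mCP on TLD-C to AP on TLD, expressed as a percentage. The sketch below checks that reading against the table; the formula is inferred from the numbers rather than quoted from the paper, and the function name is illustrative.

```python
# (AP on TLD, mCP on TLD-C, reported rCP), all in percent, copied from Tab. 2.
TABLE_2 = {
    "AllConvNet": (83.80, 52.70, 62.89),
    "DenseNet": (81.20, 51.80, 63.80),
    "WideResNet": (80.70, 53.10, 65.80),
    "ResNeXt": (82.40, 54.30, 65.90),
    "ResNet": (81.40, 53.20, 65.36),
}

def relative_corruption_precision(ap, mcp):
    """Inferred definition: corrupted precision as a percentage of clean precision."""
    return 100.0 * mcp / ap

# Absolute difference between the recomputed and the reported rCP per model.
deviations = {
    name: abs(relative_corruption_precision(ap, mcp) - reported)
    for name, (ap, mcp, reported) in TABLE_2.items()
}
```

Every recomputed value lands within 0.01 of the reported one, which is what rounding of the published AP and mCP figures would produce.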

Fig. 7 Mean corruption precisions of different models under different corruptions

Fig. 8 Mean corruption precisions of different models under five corruption levels

Tab. 3 mCP results of different data augmentation methods on TLD-C dataset (%)

Data augmentation method	AllConvNet	DenseNet	WideResNet	ResNeXt	ResNet
Standard	52.70	51.80	53.10	54.30	53.20
Cutout	53.60	50.40	54.30	54.70	53.60
mixup	57.50	57.40	58.50	61.20	59.70
AutoAugment	54.30	55.10	57.40	58.70	57.10
AugMix	62.50	61.60	63.30	64.20	63.50
Proposed method	63.10	61.90	63.50	65.20	63.80

Fig. 9 Mean corruption precisions of different models under different training methods

Tab. 4 Robustness performance of different models under different training methods (%)

Training method	TLD AP	TLD-C mCP	TLD-C rCP
AugMix	82.10	48.80	59.44
Proposed method	83.40	52.20	62.59

Fig. 10 Visualization of normalized confusion matrices

Tab. 5 Influence of NFD on model robustness (%)

Model	TLD AP	TLD-C mCP	TLD-C rCP
ResNet-50	81.40	57.00	70.02
ResNet-50_NFD	82.60	58.60	70.94

Tab. 6 Results of ablation experiments (%)

Method	LS	NFD	TLD AP	TLD-C mCP	TLD-C rCP
M_D	-	√	82.60	55.70	67.43
M_L	√	-	83.50	57.90	69.34
M_O	-	-	81.40	53.20	65.36
Proposed method	√	√	84.80	59.40	70.05
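
The headline gains quoted in the abstract can be recovered from Tab. 6 by subtracting the baseline row M_O from the last row (the full method). The sketch below does that arithmetic for every variant; the dictionary keys and the helper name are illustrative only.

```python
# (AP on TLD, mCP on TLD-C, rCP on TLD-C), all in percent, copied from Tab. 6.
ABLATION = {
    "M_O": (81.40, 53.20, 65.36),       # neither LS nor NFD
    "M_D": (82.60, 55.70, 67.43),       # NFD only
    "M_L": (83.50, 57.90, 69.34),       # LS only
    "Proposed": (84.80, 59.40, 70.05),  # LS and NFD
}

def gains_over_baseline(variant, baseline="M_O"):
    """Percentage-point gains (AP, mCP, rCP) of a variant over the baseline."""
    return tuple(
        round(v - b, 2) for v, b in zip(ABLATION[variant], ABLATION[baseline])
    )

full_gain = gains_over_baseline("Proposed")
nfd_gain = gains_over_baseline("M_D")
ls_gain = gains_over_baseline("M_L")
```

Here `full_gain` gives 3.40 points of AP and 4.69 points of rCP, matching the abstract. The single-component rows show that LS alone contributes more than NFD alone on all three metrics.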

