Journal of Computer Applications, 2026, Vol. 46, Issue (4): 1334-1343. DOI: 10.11772/j.issn.1001-9081.2025040416
• Frontier and Comprehensive Applications •
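Xiang BAI¹, Juchuan LI¹,², Huimin WANG¹, Chao JING¹,², Jian NIU², Xingzhong ZHANG¹,², Yongqiang CHENG¹,³
Received: 2025-04-18
Revised: 2025-06-05
Accepted: 2025-06-09
Online: 2025-06-12
Published: 2026-04-10
Corresponding author: Chao JING
About the author: BAI Xiang, born in 1992 in Liulin, Shanxi, M. S., engineer. His research interests include artificial intelligence and energy internet.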
Abstract:
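Existing image retrieval methods struggle to distinguish and extract the similar structural information and fine texture features of power equipment, which leads to low retrieval accuracy and efficiency. To address this problem, a Power Image Retrieval method based on improved Swin Transformer (PIR-iSwinT) was proposed. Firstly, a Multi-Feature Structure Cross Enhancement (MFSCE) module was proposed, which strengthens the model's perception of equipment structure and edge features through a cross-attention mechanism that incorporates gradient magnitude maps. Secondly, an Adaptive Inter-class Difference Center Loss (AIDCL) module was designed to strengthen the model's ability to discriminate between same-class and different-class samples. Finally, a Hierarchical Clustering Retrieval (HCR) module was constructed to optimize the sample matching strategy during retrieval and reduce computational complexity, further improving retrieval accuracy and efficiency. Experimental results on a self-built power scene dataset and the NUS-WIDE dataset show that, with a hash code length of 32 bit, PIR-iSwinT reaches a mean Average Precision (mAP) of 96.76% and 92.68%, respectively. Compared with HRMPA (Hash image Retrieval based on Mixed attention and Polarization Asymmetric loss), these are improvements of 2.35% and 0.56%, respectively; measured against the HRMPA values in Tab. 2, these figures are relative gains, and the absolute differences are 2.22 and 0.52 percentage points. The results indicate that PIR-iSwinT effectively extracts and discriminates the detailed structural features of power equipment, improves retrieval efficiency, and generalizes well, which verifies the effectiveness of the proposed method.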
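To make the evaluation setting concrete, the following is a minimal sketch of how hash-based retrieval is commonly scored: database items are ranked by Hamming distance to the query's binary code, and the average precision (AP) of each query is averaged to give mAP. The function names, the toy 4-bit codes, and the single-label relevance rule are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def average_precision(query_code, query_label, db_codes, db_labels) -> float:
    """AP for one query: rank the database by Hamming distance, then
    average the precision at each rank where a relevant item appears."""
    # sorted() is stable, so ties keep their database order.
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming(query_code, db_codes[i]))
    hits, precision_sum = 0, 0.0
    for rank, idx in enumerate(order, start=1):
        if db_labels[idx] == query_label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(q_codes, q_labels, db_codes, db_labels) -> float:
    """Mean of the per-query AP values."""
    aps = [average_precision(c, l, db_codes, db_labels)
           for c, l in zip(q_codes, q_labels)]
    return sum(aps) / len(aps)


# Toy 4-bit example (hypothetical data, not from the paper).
db_codes = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1], [1, 1, 1, 0]]
db_labels = ["insulator", "insulator", "bushing", "bushing"]
q_codes = [[0, 0, 0, 0], [1, 1, 1, 1]]
q_labels = ["insulator", "bushing"]
print(mean_average_precision(q_codes, q_labels, db_codes, db_labels))
```

In this toy case every relevant item is ranked ahead of every irrelevant one, so both queries reach an AP of 1. A query whose code sits closer to the wrong class would score lower, which is the behavior mAP is meant to capture.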
Cite this article: Xiang BAI, Juchuan LI, Huimin WANG, Chao JING, Jian NIU, Xingzhong ZHANG, Yongqiang CHENG. Power image retrieval method based on improved Swin Transformer[J]. Journal of Computer Applications, 2026, 46(4): 1334-1343.
Tab. 1 Category statistics

| Category | Total samples | Training set | Validation set | Test set | Image database set |
|---|---|---|---|---|---|
| Busbar | 640 | 400 | 80 | 20 | 140 |
| Bushing | 750 | 400 | 80 | 20 | 250 |
| Capacitor Bank | 580 | 400 | 80 | 20 | 80 |
| Inductor | 610 | 400 | 80 | 20 | 110 |
| Insulator | 750 | 400 | 80 | 20 | 250 |
| Transformer | 715 | 400 | 80 | 20 | 215 |
| Power Tower | 730 | 400 | 80 | 20 | 230 |
Tab. 2 Comparison experimental results of different methods (mAP/%)

| Dataset | Hash code length/bit | DPN | PSLDH | VTS16-CSQ | LSCSH | HashFormer | TransHash | AE-ViT | DHST | HRMPA | PIR-iSwinT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Self-built power scene dataset | 16 | 85.65 | 88.41 | 89.48 | 77.43 | 83.23 | 83.48 | 88.51 | 92.66 | 94.28 | 95.83 |
| Self-built power scene dataset | 32 | 86.23 | 87.37 | 88.18 | 80.63 | 83.43 | 84.64 | 88.97 | 93.05 | 94.54 | 96.76 |
| Self-built power scene dataset | 48 | 86.67 | 89.11 | 88.86 | 79.94 | 82.19 | 83.68 | 88.23 | 93.74 | 94.87 | 96.23 |
| Self-built power scene dataset | 64 | 87.47 | 89.17 | 88.44 | 79.67 | 82.11 | 83.31 | 88.37 | 93.37 | 94.52 | 96.43 |
| NUS-WIDE | 16 | 80.81 | 81.37 | 82.64 | 70.42 | 73.45 | 72.67 | 82.24 | 90.66 | 90.94 | 91.11 |
| NUS-WIDE | 32 | 83.07 | 83.22 | 84.48 | 75.63 | 74.63 | 73.91 | 86.53 | 91.84 | 92.16 | 92.68 |
| NUS-WIDE | 48 | 84.51 | 84.47 | - | 69.94 | 75.49 | - | - | 92.86 | 92.65 | 91.93 |
| NUS-WIDE | 64 | 85.54 | 84.32 | 85.42 | 69.67 | 74.95 | 75.34 | 85.55 | 92.44 | 92.76 | 92.15 |
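The abstract quotes improvements of 2.35% and 0.56% over HRMPA at 32 bit. The short check below, using only the 32-bit values from Tab. 2, suggests these are relative gains rather than percentage-point differences. The helper name `gains` is an illustrative choice.

```python
def gains(ours: float, baseline: float) -> tuple:
    """Return (absolute gain in percentage points, relative gain in %)."""
    absolute = ours - baseline
    relative = absolute / baseline * 100.0
    return round(absolute, 2), round(relative, 2)


# mAP/% at 32 bit, taken from Tab. 2 (PIR-iSwinT vs. HRMPA).
print(gains(96.76, 94.54))  # self-built power scene dataset
print(gains(92.68, 92.16))  # NUS-WIDE
```

The absolute differences come out at 2.22 and 0.52 percentage points, while the relative gains round to the 2.35% and 0.56% stated in the abstract.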
Tab. 3 Ablation experimental results of MFSCE (mAP/%)

| Model | Self-built power scene dataset | NUS-WIDE |
|---|---|---|
| Base Model | 92.52 | 90.47 |
| Base Model+MFSCE | 94.31 | 91.21 |
| PIR-iSwinT | 96.76 | 92.68 |
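The abstract states that MFSCE feeds gradient magnitude maps into a cross-attention mechanism, but does not say which gradient operator is used. As a rough illustration of that input only, the sketch below computes a gradient magnitude map with the Sobel operator, which is an assumption here; the cross-attention stage itself is not shown.

```python
import math
from typing import List

Image = List[List[float]]

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_magnitude(img: Image) -> Image:
    """Sobel gradient magnitude of a grayscale image.

    Border pixels are handled by clamping coordinates to the image edge.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    gx += SOBEL_X[dr + 1][dc + 1] * img[rr][cc]
                    gy += SOBEL_Y[dr + 1][dc + 1] * img[rr][cc]
            out[r][c] = math.hypot(gx, gy)
    return out


# A vertical step edge: left half dark, right half bright (toy data).
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
for row in gradient_magnitude(step):
    print(row)
```

The response is zero in flat regions and peaks along the step, which is why such a map can serve as a structural and edge cue alongside the raw image.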
Tab. 4 Ablation experimental results of AIDCL (mAP/%)

| Model | Self-built power scene dataset | NUS-WIDE |
|---|---|---|
| Base Model | 92.52 | 90.47 |
| Base Model+AIDCL | 93.87 | 90.96 |
| PIR-iSwinT | 96.76 | 92.68 |
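The exact formulation of AIDCL is not given on this page. To illustrate the general idea of pulling same-class samples together while keeping different classes apart, the sketch below combines a batch-averaged center loss with a hypothetical hinge term that penalizes class centers closer than a margin. The separation term, `margin`, and `weight` are assumptions for illustration and should not be read as the authors' loss.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def center_loss(features: List[Vector], labels: List[int],
                centers: Dict[int, Vector]) -> float:
    """Batch mean of 0.5 * ||x_i - c_{y_i}||^2 (intra-class compactness)."""
    total = sum(0.5 * sq_dist(f, centers[y]) for f, y in zip(features, labels))
    return total / len(features)


def inter_class_penalty(centers: Dict[int, Vector], margin: float) -> float:
    """Hypothetical separation term: hinge on the squared distance between
    each pair of class centers, zero once the pair is `margin` apart."""
    keys = sorted(centers)
    total = 0.0
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            total += max(0.0, margin - sq_dist(centers[a], centers[b]))
    return total


def combined_loss(features, labels, centers, margin=4.0, weight=0.5) -> float:
    # `margin` and `weight` are illustrative values, not from the paper.
    return (center_loss(features, labels, centers)
            + weight * inter_class_penalty(centers, margin))


centers = {0: [0.0, 0.0], 1: [1.0, 0.0]}
features = [[0.0, 1.0], [1.0, 1.0]]
labels = [0, 1]
print(combined_loss(features, labels, centers))
```

With the toy values above, the compactness term contributes 0.5 and the separation term contributes 0.5 × 3, so moving the two centers further apart lowers the total loss.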
Tab. 5 Ablation experimental results of HCR (retrieval time/s)

| Model | Self-built power scene dataset | NUS-WIDE |
|---|---|---|
| Base Model | 13.358 | 1892.347 |
| Base Model+HCR | 4.125 | 557.957 |
| PIR-iSwinT | 3.842 | 561.125 |
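The retrieval-time reduction in Tab. 5 is attributed to HCR, whose details are not given on this page. One way clustering can cut retrieval cost is a two-level coarse-to-fine search: the query is first compared with a few cluster representatives, then only with the members of the nearest cluster. The sketch below illustrates that idea under stated assumptions; the representatives are fixed by hand, and the function names are hypothetical rather than the authors' HCR.

```python
from typing import Dict, List, Sequence, Tuple

Code = Sequence[int]


def hamming(a: Code, b: Code) -> int:
    """Number of positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def build_index(db: List[Code], reps: List[Code]) -> Dict[int, List[int]]:
    """Assign every database code to its nearest representative code."""
    index: Dict[int, List[int]] = {k: [] for k in range(len(reps))}
    for i, code in enumerate(db):
        nearest = min(range(len(reps)), key=lambda k: hamming(code, reps[k]))
        index[nearest].append(i)
    return index


def coarse_to_fine_search(query: Code, db: List[Code], reps: List[Code],
                          index: Dict[int, List[int]],
                          top_k: int = 2) -> Tuple[List[int], int]:
    """Return (ranked database indices, number of distance computations)."""
    # Coarse step: pick the cluster whose representative is closest.
    best = min(range(len(reps)), key=lambda k: hamming(query, reps[k]))
    # Fine step: rank only the members of that cluster.
    members = index[best]
    ranked = sorted(members, key=lambda i: hamming(query, db[i]))[:top_k]
    return ranked, len(reps) + len(members)


db = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0],
      [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1]]
reps = [[0, 0, 0, 0], [1, 1, 1, 1]]  # hypothetical cluster representatives
index = build_index(db, reps)
print(coarse_to_fine_search([1, 1, 1, 1], db, reps, index))
```

In this toy setup the query needs 5 distance computations instead of 6 for an exhaustive scan. The saving grows with database size, at the cost of missing neighbors that fall in a different cluster.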