Journal of Computer Applications, 2022, Vol. 42, Issue (6): 1876-1883. DOI: 10.11772/j.issn.1001-9081.2021040545
Special topic: Artificial intelligence
Guangkai LIAO1, Zheng ZHANG1, Zhiguo SONG2
Received:
2021-04-12
Revised:
2021-07-09
Accepted:
2021-07-09
Online:
2022-06-22
Published:
2022-06-10
Contact:
Zhiguo SONG
About author:
LIAO Guangkai, born in 1993 in Neijiang, Sichuan, M. S. candidate. His research interests include vehicle re-identification and image retrieval.
Supported by:
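Abstract:
To address the insufficient representational power of the features extracted by existing Convolutional Neural Network (CNN)-based vehicle re-identification methods, a vehicle re-identification method combining wavelet features with an attention mechanism is proposed. First, a single-level wavelet module is embedded in the convolutional module to replace the pooling layer for downsampling, which reduces the loss of fine-grained features. Second, a new local attention module, the Feature Extraction Module (FEM), is built by combining the Channel Attention (CA) and Pixel Attention (PA) mechanisms and is embedded in the convolutional network to weight and strengthen key information. The method is compared with the baseline residual networks ResNet-50 and ResNet-101 on the VeRi dataset. Experimental results show that increasing the number of wavelet transform layers in ResNet-50 improves the mean Average Precision (mAP). In the ablation experiments, although the mAP of ResNet-50 + Discrete Wavelet Transform (DWT) is 0.25 percentage points lower than that of ResNet-101, its parameter count and computational complexity are both lower than those of ResNet-101, and its mAP, Rank-1 and Rank-5 are all higher than those of ResNet-50 alone, which indicates that the model can effectively improve retrieval accuracy in vehicle re-identification.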
Guangkai LIAO, Zheng ZHANG, Zhiguo SONG. Convolutional network-based vehicle re-identification combining wavelet features and attention mechanism[J]. Journal of Computer Applications, 2022, 42(6): 1876-1883.
Structure | Kernel, channels | Output size |
---|---|---|
Conv1 | 7×7, 64 | 112×112 |
DWT | 4×64 | 56×56 |
FEM | 256 | 56×56 |
stage1 | | 56×56 |
DWT | 4×128 | 28×28 |
FEM | 512 | 28×28 |
stage2 | | 28×28 |
DWT | 4×256 | 14×14 |
FEM | 1024 | 14×14 |
stage3 | | 14×14 |
DWT | 4×512 | 7×7 |
FEM | 2048 | 7×7 |
stage4 | | 7×7 |
Tab. 1 Basic structure of proposed network and corresponding parameters of each module
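The DWT rows of Tab. 1 multiply the channel count by 4 while halving the spatial size (for example, 64 channels at 112×112 become 4×64 channels at 56×56). The following is a minimal sketch of how a single-level 2D wavelet transform can produce exactly this shape change. It assumes a Haar wavelet, which the page does not specify, and the function names `haar_dwt2` and `dwt_downsample` are hypothetical; it is not the authors' implementation.

```python
def haar_dwt2(x):
    """Single-level 2D Haar DWT of one channel (assumed wavelet: Haar).

    x is an H x W list of lists with even H and W. Returns four
    half-resolution subbands: one low-pass band and three detail bands.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    low, d_rows, d_cols, d_diag = [], [], [], []
    for i in range(0, h, 2):
        r_low, r_rows, r_cols, r_diag = [], [], [], []
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]          # top-left, top-right
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom-left, bottom-right
            r_low.append((a + b + c + d) / 2)    # average of the 2x2 block
            r_rows.append((a + b - c - d) / 2)   # difference between rows
            r_cols.append((a - b + c - d) / 2)   # difference between columns
            r_diag.append((a - b - c + d) / 2)   # diagonal difference
        low.append(r_low)
        d_rows.append(r_rows)
        d_cols.append(r_cols)
        d_diag.append(r_diag)
    return low, d_rows, d_cols, d_diag


def dwt_downsample(feature_map):
    """Apply haar_dwt2 to each of C channels and stack all subbands.

    Input: list of C channels, each H x W.
    Output: list of 4*C channels, each (H/2) x (W/2).
    """
    out = []
    for channel in feature_map:
        out.extend(haar_dwt2(channel))
    return out


# One 4x4 channel holding the values 1..16.
x = [[[float(4 * r + c + 1) for c in range(4)] for r in range(4)]]
y = dwt_downsample(x)
print(len(y), len(y[0]), len(y[0][0]))  # 4 2 2
```

Unlike max or average pooling, this downsampling discards nothing: the three detail bands keep the high-frequency content that pooling would drop, which is in line with the stated goal of reducing the loss of fine-grained features.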
Method | Rank-1 | Rank-5 | mAP |
---|---|---|---|
Baseline (ResNet-50) | 83.49 | 92.31 | 52.88 |
Proposed method | 88.70 | 94.60 | 63.90 |
Tab. 2 Comparison with ResNet-50 on VeRi dataset (%)
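The tables report Rank-1, Rank-5 and mAP in percent. As a minimal sketch of how these retrieval metrics are commonly computed, the code below takes, for each query, a best-first list of booleans marking whether each gallery result shares the query's vehicle ID. The function names are hypothetical, and any dataset-specific gallery filtering is assumed to have been applied beforehand; the exact evaluation protocol used in the paper is not given on this page.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches[i] is True if the i-th gallery result (best first)
    has the same vehicle ID as the query.
    """
    hits, precision_sum = 0, 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def rank_k(ranked_matches, k):
    """Return 1 if a correct match appears within the top k results, else 0."""
    return int(any(ranked_matches[:k]))


def evaluate(all_ranked_matches, ks=(1, 5)):
    """Return (mAP, {k: Rank-k}) in percent, averaged over all queries."""
    n = len(all_ranked_matches)
    m_ap = 100.0 * sum(average_precision(r) for r in all_ranked_matches) / n
    cmc = {k: 100.0 * sum(rank_k(r, k) for r in all_ranked_matches) / n
           for k in ks}
    return m_ap, cmc


# Two toy queries: the first is matched at rank 1, the second at rank 2.
queries = [[True, False, True], [False, True, False]]
m_ap, cmc = evaluate(queries)
print(round(m_ap, 2), cmc)  # 66.67 {1: 50.0, 5: 100.0}
```

Rank-k only asks whether the first correct match falls within the top k, whereas mAP rewards ranking all correct matches early, which is why the two can move by different amounts between methods.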
Test set | Proposed method | Baseline (ResNet-50) |
---|---|---|
Test800 | 69.30 | 67.27 |
Test1600 | 67.32 | 62.03 |
Test2400 | 63.94 | 55.12 |
Tab. 3 Comparison of Rank-1 on VehicleID dataset (%)
Method | Rank-1 | Rank-5 | mAP |
---|---|---|---|
ResNet-50 | 83.49 | 92.31 | 52.88 |
ResNet-101 | 84.74 | 94.34 | 55.75 |
ResNet-50+DWT | 85.20 | 93.70 | 55.50 |
ResNet-50+FEM | 85.60 | 93.10 | 56.90 |
Proposed method+Lc | 88.10 | 94.00 | 62.80 |
Proposed method+Lc+Lt | 88.70 | 94.60 | 63.90 |
Tab. 4 Ablation experimental results of the proposed method on VeRi dataset (%)
Method | Test800 Rank-1 | Test800 Rank-5 | Test1600 Rank-1 | Test1600 Rank-5 | Test2400 Rank-1 | Test2400 Rank-5 |
---|---|---|---|---|---|---|
BOW-SIFT | 2.81 | 4.23 | 3.11 | 5.22 | 2.11 | 3.76 |
LOMO | 19.74 | 32.14 | 18.95 | 29.46 | 15.26 | 25.63 |
BOW-CN | 13.14 | 22.69 | 12.94 | 21.09 | 10.20 | 17.89 |
GoogLeNet | 47.90 | 67.43 | 43.45 | 63.53 | 38.24 | 59.51 |
FACT | 49.53 | 67.96 | 44.63 | 64.19 | 39.91 | 60.49 |
NuFACT | 48.90 | 69.51 | 43.64 | 65.34 | 38.63 | 60.72 |
MLL+MLSR | 65.78 | 78.09 | 64.24 | 73.11 | 60.05 | 70.81 |
VAMI | 63.12 | 83.25 | 52.87 | 75.12 | 47.34 | 70.29 |
EALN | 67.19 | 78.20 | 63.23 | 77.12 | 59.98 | 74.20 |
Proposed method | 69.30 | 82.80 | 67.32 | 79.86 | 63.94 | 77.57 |
Tab. 5 Comparison of different methods on VehicleID dataset (%)
Method | mAP | Rank-1 | Rank-5 |
---|---|---|---|
LOMO | 9.64 | 25.33 | 46.48 |
VGGNet | 12.76 | 44.10 | 62.63 |
GoogLeNet | 17.89 | 52.32 | 72.17 |
FACT | 18.49 | 50.95 | 73.48 |
NuFACT+Plate-SNN | 50.87 | 81.11 | 92.79 |
PROVID | 53.42 | 81.56 | 95.11 |
MLL+MLSR | 57.03 | 85.94 | 94.16 |
VAMI | 50.10 | 77.00 | 90.90 |
EALN | 57.40 | 84.40 | 94.10 |
AAVER | 58.50 | 88.70 | 94.10 |
QD-DLF | 61.80 | 88.50 | 94.50 |
Proposed method | 63.90 | 88.70 | 94.60 |
Tab. 6 Comparison of different methods on VeRi dataset (%)
Method | Parameters (M) | Computational complexity (GFLOPs) |
---|---|---|
ResNet-50 | 25.56 | 4.14 |
ResNet-50+DWT | 29.51 | 7.73 |
ResNet-101 | 44.55 | 7.87 |
Tab. 7 Complexity analysis
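To make the complexity comparison in Tab. 7 concrete, the short sketch below computes how far ResNet-50+DWT falls below ResNet-101 in each column. The numbers are copied from Tab. 7; the dictionary layout and the helper name `relative_saving` are illustrative assumptions.

```python
# Values copied from Tab. 7: (parameter count, computational complexity).
TABLE7 = {
    "ResNet-50": (25.56, 4.14),
    "ResNet-50+DWT": (29.51, 7.73),
    "ResNet-101": (44.55, 7.87),
}


def relative_saving(model, reference, column):
    """Percentage by which `model` is below `reference`.

    column 0 is the parameter count, column 1 the computational complexity.
    """
    m = TABLE7[model][column]
    r = TABLE7[reference][column]
    return 100.0 * (r - m) / r


param_saving = relative_saving("ResNet-50+DWT", "ResNet-101", 0)
flops_saving = relative_saving("ResNet-50+DWT", "ResNet-101", 1)
print(f"parameters: {param_saving:.1f}% fewer than ResNet-101")
print(f"complexity: {flops_saving:.1f}% lower than ResNet-101")
```

By this calculation the saving is substantial in parameters (about one third) but small in computational complexity (under 2%), so the trade-off against the 0.25-point mAP gap rests mainly on model size.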
Method | Rank-1 | Rank-5 | mAP |
---|---|---|---|
ResNet-50+DWT1 | 82.80 | 92.00 | 53.40 |
ResNet-50+DWT2 | 82.40 | 93.10 | 54.50 |
ResNet-50+DWT3 | 85.00 | 93.90 | 54.90 |
ResNet-50+DWT4 | 85.20 | 93.70 | 55.50 |
Tab. 8 Effect of wavelet transform layers on performance on VeRi dataset (%)
1 | YANG L J, LUO P, LOY C C, et al. A large-scale car dataset for fine-grained categorization and verification[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3973-3981. 10.1109/cvpr.2015.7299023 |
2 | GUO J M, HSIA C H, WONG K, et al. Nighttime vehicle lamp detection and tracking with adaptive mask training[J]. IEEE Transactions on Vehicular Technology, 2016, 65(6): 4023-4032. 10.1109/tvt.2015.2508020 |
3 | CHEN X Y, XIANG S M, LIU C L, et al. Vehicle detection in satellite images by hybrid deep convolutional neural networks[J]. IEEE Geoscience and Remote Sensing Letters, 2014, 11(10): 1797-1801. 10.1109/lgrs.2014.2309695 |
4 | ZHAO R, OUYANG W L, WANG X G. Unsupervised salience learning for person re-identification[C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 3586-3593. 10.1109/cvpr.2013.460 |
5 | LIAO S C, HU Y, ZHU X Y, et al. Person re-identification by local maximal occurrence representation and metric learning[C]// Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 2197-2206. 10.1109/cvpr.2015.7298832 |
6 | ZHENG L, SHEN L Y, TIAN L, et al. Scalable person re-identification: a benchmark[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1116-1124. 10.1109/iccv.2015.133 |
7 | ZHENG L, WANG S J, ZHOU W G, et al. Bayes merging of multiple vocabularies for scalable image retrieval[C]// Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 1963-1970. 10.1109/cvpr.2014.252 |
8 | LIU X C, LIU W, MA H D, et al. Large-scale vehicle re-identification in urban surveillance videos[C]// Proceedings of the 2016 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2016: 1-6. 10.1109/icme.2016.7553002 |
9 | LIU H Y, TIAN Y H, WANG Y W, et al. Deep relative distance learning: tell the difference between similar vehicles[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2167-2175. 10.1109/cvpr.2016.238 |
10 | SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10) [2021-02-20].. |
11 | SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. 10.1109/cvpr.2015.7298594 |
12 | HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90 |
13 | WANG Z D, TANG L M, LIU X H, et al. Orientation invariant feature embedding and spatial temporal regularization for vehicle re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 379-387. 10.1109/iccv.2017.49 |
14 | ZHOU Y, LIU L, SHAO L. Vehicle re-identification by deep hidden multi-view inference[J]. IEEE Transactions on Image Processing, 2018, 27(7): 3275-3287. 10.1109/tip.2018.2819820 |
15 | ZHOU Y, SHAO L. Viewpoint-aware attentive multi-view inference for vehicle re-identification[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6489-6498. 10.1109/cvpr.2018.00679 |
16 | SHEN Y T, XIAO T, LI H S, et al. Learning deep neural networks for vehicle re-ID with visual-spatio-temporal path proposals[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1918-1927. 10.1109/iccv.2017.210 |
17 | LIU X C, LIU W, MEI T, et al. PROVID: progressive and multimodal vehicle reidentification for large-scale urban surveillance[J]. IEEE Transactions on Multimedia, 2018, 20(3): 645-658. 10.1109/tmm.2017.2751966 |
18 | TANG Y, WU D, JIN Z, et al. Multi-modal metric learning for vehicle re-identification in traffic surveillance environment[C]// Proceedings of the 2017 IEEE International Conference on Image Processing. Piscataway: IEEE, 2017: 2254-2258. 10.1109/icip.2017.8296683 |
19 | ZHAO L M, LI X, ZHUANG Y T, et al. Deeply-learned part-aligned representations for person re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3239-3248. 10.1109/iccv.2017.349 |
20 | LI D W, CHEN X T, ZHANG Z, et al. Learning deep context-aware features over body and latent parts for person re-identification[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 7398-7407. 10.1109/cvpr.2017.782 |
21 | QIU Y M, ZHOU Y. Wavelet transform stereoscopic images enhancement algorithms based on fog and haze[J]. Computer Engineering and Applications, 2015, 51(9): 30-33. 10.3778/j.issn.1002-8331.1409-0008 |
22 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19. |
23 | QIN X, WANG Z L, BAI Y C, et al. FFA-Net: feature fusion attention network for single image dehazing[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020:11908-11915. 10.1609/aaai.v34i07.6865 |
24 | HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-11-21) [2021-02-20].. |
25 | HOU J H, ZENG H Q, CAI L, et al. Multi-label learning with multi-label smoothing regularization for vehicle re-identification[J]. Neurocomputing, 2019, 345:15-22. 10.1016/j.neucom.2018.11.088 |
26 | CHU R H, SUN Y F, LI Y D, et al. Vehicle re-identification with viewpoint-aware metric learning[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8281-8290. 10.1109/iccv.2019.00837 |
27 | LOU Y H, BAI Y, LIU J, et al. Embedding adversarial learning for vehicle re-identification[J]. IEEE Transactions on Image Processing, 2019, 28(8):3794-3807. 10.1109/tip.2019.2902112 |
28 | KHORRAMSHAHI P, KUMAR A, PERI N, et al. A dual-path model with adaptive attention for vehicle re-identification[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6131-6140. 10.1109/iccv.2019.00623 |
29 | ZHU J Q, ZENG H Q, HUANG J C, et al. Vehicle re-identification using quadruple directional deep learning features[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(1): 410-420. 10.1109/tits.2019.2901312 |