Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (6): 1876-1883. DOI: 10.11772/j.issn.1001-9081.2021040545
Special topic: Artificial Intelligence
Received: 2021-04-12
Revised: 2021-07-09
Accepted: 2021-07-09
Online: 2022-06-22
Published: 2022-06-10
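Guangkai LIAO, Zheng ZHANG, Zhiguo SONG
Corresponding author: Zhiguo SONG
About the author: LIAO Guangkai, born in 1993 in Neijiang, Sichuan, M. S. candidate. His research interests include vehicle re-identification and image retrieval.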
Abstract:
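To address the insufficient representational power of features extracted by existing Convolutional Neural Network (CNN)-based vehicle re-identification methods, a vehicle re-identification method combining wavelet features with an attention mechanism is proposed. First, a single-level wavelet module is embedded in the convolutional block in place of the pooling layer for downsampling, which reduces the loss of fine-grained features. Second, the Channel Attention (CA) mechanism and the Pixel Attention (PA) mechanism are combined into a new local attention module, the Feature Extraction Module (FEM), which is embedded in the convolutional network to weight and reinforce key information. The method is compared with the baseline residual networks ResNet-50 and ResNet-101 on the VeRi dataset. Experimental results show that increasing the number of wavelet transform layers in ResNet-50 improves the mean Average Precision (mAP). In the ablation study, the mAP of ResNet-50 + Discrete Wavelet Transform (DWT) is 0.25 percentage points lower than that of ResNet-101, but its parameter count and computational complexity are both lower than those of ResNet-101, and its mAP, Rank-1, and Rank-5 are all higher than those of plain ResNet-50. This indicates that the model effectively improves vehicle retrieval accuracy in vehicle re-identification.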
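The wavelet downsampling step described in the abstract can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet (the abstract does not name the wavelet family), and the function names `haar_dwt2` and `dwt_downsample` are hypothetical. The point it shows is that, unlike pooling, the transform halves the spatial size while keeping all the information in four sub-bands per channel.

```python
from typing import List

Matrix = List[List[float]]  # one channel: height x width


def haar_dwt2(x: Matrix) -> List[Matrix]:
    """Single-level 2-D Haar transform of one channel (even height and width).

    Returns four half-resolution sub-bands: the low-pass average plus the
    left/right, top/bottom, and diagonal detail bands.
    Assumption: Haar is used here only for illustration.
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    bands: List[Matrix] = [[], [], [], []]
    for i in range(0, h, 2):
        rows: List[List[float]] = [[], [], [], []]
        for j in range(0, w, 2):
            a, b = x[i][j], x[i][j + 1]          # top-left, top-right
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom-left, bottom-right
            rows[0].append((a + b + c + d) / 2)  # low-pass approximation
            rows[1].append((a - b + c - d) / 2)  # left minus right
            rows[2].append((a + b - c - d) / 2)  # top minus bottom
            rows[3].append((a - b - c + d) / 2)  # diagonal difference
        for band, row in zip(bands, rows):
            band.append(row)
    return bands


def dwt_downsample(feature_map: List[Matrix]) -> List[Matrix]:
    """Map C channels of size HxW to 4C channels of size (H/2)x(W/2)."""
    out: List[Matrix] = []
    for channel in feature_map:
        out.extend(haar_dwt2(channel))
    return out
```

With the 1/2 scaling the transform is orthonormal, so the sum of squared values is the same before and after downsampling; nothing is discarded the way max or average pooling discards it.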
Guangkai LIAO, Zheng ZHANG, Zhiguo SONG. Convolutional network-based vehicle re-identification combining wavelet features and attention mechanism[J]. Journal of Computer Applications, 2022, 42(6): 1876-1883.
| Structure | Kernel, channels | Output |
|---|---|---|
| Conv1 | 7×7, 64 | 112×112 |
| DWT | 4×64 | 56×56 |
| FEM | 256 | 56×56 |
| stage1 | | 56×56 |
| DWT | 4×128 | 28×28 |
| FEM | 512 | 28×28 |
| stage2 | | 28×28 |
| DWT | 4×256 | 14×14 |
| FEM | 1024 | 14×14 |
| stage3 | | 14×14 |
| DWT | 4×512 | 7×7 |
| FEM | 2048 | 7×7 |
| stage4 | | 7×7 |
Tab. 1 Basic structure of the proposed network and corresponding parameters of each module
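The FEM rows in Tab. 1 keep the spatial size unchanged, which fits an attention module that only reweights features. The sketch below is one possible reading of "channel attention followed by pixel attention", not the authors' implementation: the ordering, the single-layer gates, and the weight arguments `w_ca` and `w_pa` are all assumptions introduced for illustration.

```python
import math
from typing import List

Tensor3 = List[List[List[float]]]  # channels x height x width


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x: Tensor3, w_ca: List[List[float]]) -> Tensor3:
    """Scale each channel by a gate computed from the global channel averages.

    w_ca is a hypothetical CxC weight matrix standing in for learned layers.
    """
    means = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    gates = [sigmoid(sum(wi * m for wi, m in zip(row, means))) for row in w_ca]
    return [[[v * g for v in r] for r in ch] for ch, g in zip(x, gates)]


def pixel_attention(x: Tensor3, w_pa: List[float]) -> Tensor3:
    """Scale each spatial position by a gate computed across channels.

    w_pa is a hypothetical length-C vector, i.e. a 1x1 convolution to one map.
    """
    h, w = len(x[0]), len(x[0][0])
    gate = [[sigmoid(sum(w_pa[c] * x[c][i][j] for c in range(len(x))))
             for j in range(w)] for i in range(h)]
    return [[[ch[i][j] * gate[i][j] for j in range(w)] for i in range(h)]
            for ch in x]


def fem(x: Tensor3, w_ca: List[List[float]], w_pa: List[float]) -> Tensor3:
    """Assumed composition: channel gating first, then per-pixel gating."""
    return pixel_attention(channel_attention(x, w_ca), w_pa)
```

Both gates lie in (0, 1), so the module can only attenuate or preserve a response, never change the shape of the feature map.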
| Method | Rank-1 | Rank-5 | mAP |
|---|---|---|---|
| Baseline (ResNet-50) | 83.49 | 92.31 | 52.88 |
| Proposed method | 88.70 | 94.60 | 63.90 |
Tab. 2 Comparison with ResNet-50 on VeRi dataset (%)
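For readers unfamiliar with the metrics in these tables, the following sketch shows how Rank-k and mAP are commonly computed from ranked retrieval results. It is a generic illustration, not the evaluation script used in the paper: each query is represented as a list of booleans in ranked order (True where the gallery image shares the query's vehicle identity), and dataset-specific rules such as excluding same-camera gallery images are not handled. The function names are hypothetical.

```python
from typing import Dict, Sequence


def rank_k_hit(ranked_matches: Sequence[bool], k: int) -> bool:
    """True if a correct match appears among the top-k retrieved items."""
    return any(ranked_matches[:k])


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """Mean of the precision values taken at each correct match position."""
    hits = 0
    precision_sum = 0.0
    for position, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def evaluate(queries: Sequence[Sequence[bool]],
             ks: Sequence[int] = (1, 5)) -> Dict[str, float]:
    """Return Rank-k accuracies and mAP as percentages over all queries."""
    n = len(queries)
    result = {f"Rank-{k}": 100.0 * sum(rank_k_hit(q, k) for q in queries) / n
              for k in ks}
    result["mAP"] = 100.0 * sum(average_precision(q) for q in queries) / n
    return result
```

Rank-k only asks whether any correct match is near the top, whereas mAP rewards ranking every correct match early, which is why the two can move by different amounts between methods.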
| Test set | Proposed method | Baseline (ResNet-50) |
|---|---|---|
| Test800 | 69.30 | 67.27 |
| Test1600 | 67.32 | 62.03 |
| Test2400 | 63.94 | 55.12 |
Tab. 3 Comparison of Rank-1 on VehicleID dataset (%)
| Method | Rank-1 | Rank-5 | mAP |
|---|---|---|---|
| ResNet-50 | 83.49 | 92.31 | 52.88 |
| ResNet-101 | 84.74 | 94.34 | 55.75 |
| ResNet-50+DWT | 85.20 | 93.70 | 55.50 |
| ResNet-50+FEM | 85.60 | 93.10 | 56.90 |
| Proposed method+Lc | 88.10 | 94.00 | 62.80 |
| Proposed method+Lc+Lt | 88.70 | 94.60 | 63.90 |
Tab. 4 Ablation experimental results of the proposed method on VeRi dataset (%)
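The last two rows of Tab. 4 differ only in the training loss. This page does not define Lc and Lt; the sketch below assumes they denote a classification (cross-entropy) loss and a triplet loss, which is a common pairing in re-identification work. The equal weighting of the two terms, the margin value 0.3, and all function names are hypothetical choices made for illustration.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one sample (assumed form of Lc)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 0.3) -> float:
    """Hinge on the distance gap between same-ID and different-ID pairs
    (assumed form of Lt; the margin is a hypothetical value)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def total_loss(logits: Sequence[float], label: int,
               anchor: Sequence[float], positive: Sequence[float],
               negative: Sequence[float], margin: float = 0.3) -> float:
    """Unweighted sum Lc + Lt (the weighting is an assumption)."""
    return cross_entropy(logits, label) + triplet_loss(
        anchor, positive, negative, margin)
```

Under this reading, the classification term separates identities in label space while the triplet term directly shapes distances in the embedding used for retrieval.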
| Method | Test800 Rank-1 | Test800 Rank-5 | Test1600 Rank-1 | Test1600 Rank-5 | Test2400 Rank-1 | Test2400 Rank-5 |
|---|---|---|---|---|---|---|
| BOW-SIFT | 2.81 | 4.23 | 3.11 | 5.22 | 2.11 | 3.76 |
| LOMO | 19.74 | 32.14 | 18.95 | 29.46 | 15.26 | 25.63 |
| BOW-CN | 13.14 | 22.69 | 12.94 | 21.09 | 10.20 | 17.89 |
| GoogLeNet | 47.90 | 67.43 | 43.45 | 63.53 | 38.24 | 59.51 |
| FACT | 49.53 | 67.96 | 44.63 | 64.19 | 39.91 | 60.49 |
| NuFACT | 48.90 | 69.51 | 43.64 | 65.34 | 38.63 | 60.72 |
| MLL+MLSR | 65.78 | 78.09 | 64.24 | 73.11 | 60.05 | 70.81 |
| VAMI | 63.12 | 83.25 | 52.87 | 75.12 | 47.34 | 70.29 |
| EALN | 67.19 | 78.20 | 63.23 | 77.12 | 59.98 | 74.20 |
| Proposed method | 69.30 | 82.80 | 67.32 | 79.86 | 63.94 | 77.57 |
Tab. 5 Comparison of different methods on VehicleID dataset (%)
| Method | mAP | Rank-1 | Rank-5 |
|---|---|---|---|
| LOMO | 9.64 | 25.33 | 46.48 |
| VGGNet | 12.76 | 44.10 | 62.63 |
| GoogLeNet | 17.89 | 52.32 | 72.17 |
| FACT | 18.49 | 50.95 | 73.48 |
| NuFACT+Plate-SNN | 50.87 | 81.11 | 92.79 |
| PROVID | 53.42 | 81.56 | 95.11 |
| MLL+MLSR | 57.03 | 85.94 | 94.16 |
| VAMI | 50.10 | 77.00 | 90.90 |
| EALN | 57.40 | 84.40 | 94.10 |
| AAVER | 58.50 | 88.70 | 94.10 |
| QD-DLF | 61.80 | 88.50 | 94.50 |
| Proposed method | 63.90 | 88.70 | 94.60 |
Tab. 6 Comparison of different methods on VeRi dataset (%)
| Method | Parameters (M) | Computational complexity (GFLOPs) |
|---|---|---|
| ResNet-50 | 25.56 | 4.14 |
| ResNet-50+DWT | 29.51 | 7.73 |
| ResNet-101 | 44.55 | 7.87 |
Tab. 7 Complexity analysis
| Method | Rank-1 | Rank-5 | mAP |
|---|---|---|---|
| ResNet-50+DWT1 | 82.80 | 92.00 | 53.40 |
| ResNet-50+DWT2 | 82.40 | 93.10 | 54.50 |
| ResNet-50+DWT3 | 85.00 | 93.90 | 54.90 |
| ResNet-50+DWT4 | 85.20 | 93.70 | 55.50 |
Tab. 8 Effect of the number of wavelet transform layers on performance on VeRi dataset (%)
| 1 | YANG L J, LUO P, LOY C C, et al. A large-scale car dataset for fine-grained categorization and verification[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3973-3981. 10.1109/cvpr.2015.7299023 | 
| 2 | GUO J M, HSIA C H, WONG K, et al. Nighttime vehicle lamp detection and tracking with adaptive mask training[J]. IEEE Transactions on Vehicular Technology, 2016, 65(6): 4023-4032. 10.1109/tvt.2015.2508020 | 
| 3 | CHEN X Y, XIANG S M, LIU C L, et al. Vehicle detection in satellite images by hybrid deep convolutional neural networks[J]. IEEE Geoscience and Remote Sensing Letters, 2014, 11(10): 1797-1801. 10.1109/lgrs.2014.2309695 | 
| 4 | ZHAO R, OUYANG W L, WANG X G. Unsupervised salience learning for person re-identification[C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 3586-3593. 10.1109/cvpr.2013.460 | 
| 5 | LIAO S C, HU Y, ZHU X Y, et al. Person re-identification by local maximal occurrence representation and metric learning[C]// Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 2197-2206. 10.1109/cvpr.2015.7298832 | 
| 6 | ZHENG L, SHEN L Y, TIAN L, et al. Scalable person re-identification: a benchmark[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1116-1124. 10.1109/iccv.2015.133 | 
| 7 | ZHENG L, WANG S J, ZHOU W G, et al. Bayes merging of multiple vocabularies for scalable image retrieval[C]// Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 1963-1970. 10.1109/cvpr.2014.252 | 
| 8 | LIU X C, LIU W, MA H D, et al. Large-scale vehicle re-identification in urban surveillance videos[C]// Proceedings of the 2016 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2016: 1-6. 10.1109/icme.2016.7553002 | 
| 9 | LIU H Y, TIAN Y H, WANG Y W, et al. Deep relative distance learning: tell the difference between similar vehicles[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2167-2175. 10.1109/cvpr.2016.238 | 
| 10 | SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10) [2021-02-20].. | 
| 11 | SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. 10.1109/cvpr.2015.7298594 | 
| 12 | HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90 | 
| 13 | WANG Z D, TANG L M, LIU X H, et al. Orientation invariant feature embedding and spatial temporal regularization for vehicle re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 379-387. 10.1109/iccv.2017.49 | 
| 14 | ZHOU Y, LIU L, SHAO L. Vehicle re-identification by deep hidden multi-view inference[J]. IEEE Transactions on Image Processing, 2018, 27(7): 3275-3287. 10.1109/tip.2018.2819820 | 
| 15 | ZHOU Y, SHAO L. Aware attentive multi-view inference for vehicle re-identification[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6489-6498. 10.1109/cvpr.2018.00679 | 
| 16 | SHEN Y T, XIAO T, LI H S, et al. Learning deep neural networks for vehicle re-ID with visual-spatio-temporal path proposals[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1918-1927. 10.1109/iccv.2017.210 | 
| 17 | LIU X C, LIU W, MEI T, et al. PROVID: progressive and multimodal vehicle reidentification for large-scale urban surveillance[J]. IEEE Transactions on Multimedia, 2018, 20(3): 645-658. 10.1109/tmm.2017.2751966 | 
| 18 | TANG Y, WU D, JIN Z, et al. Multi-modal metric learning for vehicle re-identification in traffic surveillance environment[C]// Proceedings of the 2017 IEEE International Conference on Image Processing. Piscataway: IEEE, 2017: 2254-2258. 10.1109/icip.2017.8296683 | 
| 19 | ZHAO L M, LI X, ZHUANG Y T, et al. Deeply-learned part-aligned representations for person re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3239-3248. 10.1109/iccv.2017.349 | 
| 20 | LI D W, CHEN X T, ZHANG Z, et al. Learning deep context-aware features over body and latent parts for person re-identification[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 7398-7407. 10.1109/cvpr.2017.782 | 
| 21 | QIU Y M, ZHOU Y. Wavelet transform stereoscopic images enhancement algorithms based on fog and haze[J]. Computer Engineering and Applications, 2015, 51(9): 30-33. 10.3778/j.issn.1002-8331.1409-0008 | 
| 22 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNIP 11211. Cham: Springer, 2018: 3-19. | 
| 23 | QIN X, WANG Z L, BAI Y C, et al. FFA-Net: feature fusion attention network for single image dehazing[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020:11908-11915. 10.1609/aaai.v34i07.6865 | 
| 24 | HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-11-21) [2021-02-20].. | 
| 25 | HOU J H, ZENG H Q, CAI L, et al. Multi-label learning with multi-label smoothing regularization for vehicle re-identification[J]. Neurocomputing, 2019, 345:15-22. 10.1016/j.neucom.2018.11.088 | 
| 26 | CHU R H, SUN Y F, LI Y D, et al. Vehicle re-identification with viewpoint-aware metric learning[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8281-8290. 10.1109/iccv.2019.00837 | 
| 27 | LOU Y H, BAI Y, LIU J, et al. Embedding adversarial learning for vehicle re-identification[J]. IEEE Transactions on Image Processing, 2019, 28(8):3794-3807. 10.1109/tip.2019.2902112 | 
| 28 | KHORRAMSHAHI P, KUMAR A, PERI N, et al. A dual-path model with adaptive attention for vehicle re-identification[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6131-6140. 10.1109/iccv.2019.00623 | 
| 29 | ZHU J Q, ZENG H Q, HUANG J C, et al. Vehicle re-identification using quadruple directional deep learning features[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(1): 410-420. 10.1109/tits.2019.2901312 | 