Convolutional network-based vehicle re-identification combining wavelet features and attention mechanism

doi:10.11772/j.issn.1001-9081.2021040545

Journal of Computer Applications, 2022, Vol. 42, Issue 6: 1876-1883.

Special Issue: Artificial intelligence

Guangkai LIAO¹, Zheng ZHANG¹, Zhiguo SONG²

¹ College of Information Science and Engineering, Jishou University, Jishou, Hunan 416000, China
² College of Physics and Mechanical and Electrical Engineering, Jishou University, Jishou, Hunan 416000, China

Received: 2021-04-12 Revised: 2021-07-09 Accepted: 2021-07-09 Online: 2022-06-22 Published: 2022-06-10
Contact: Zhiguo SONG
About author: LIAO Guangkai, born in 1993, M.S. candidate. His research interests include vehicle re-identification and image retrieval.
ZHANG Zheng, born in 1981, Ph.D., associate professor. His research interests include matrix computation.
Supported by:
National Natural Science Foundation of China (32060238)

Chinese title: 基于小波特征与注意力机制结合的卷积网络车辆重识别

Authors (Chinese names): 廖光锴¹, 张正¹, 宋治国² (corresponding author)

Author notes from the Chinese record: LIAO Guangkai is from Neijiang, Sichuan; ZHANG Zheng is from Jishou, Hunan.

Abstract:

To address the insufficient representation ability of features extracted by existing Convolutional Neural Network (CNN)-based vehicle re-identification methods, a vehicle re-identification method combining wavelet features with an attention mechanism was proposed. Firstly, a single-layer wavelet module was embedded in the convolution module to replace the pooling layer for subsampling, thereby reducing the loss of fine-grained features. Secondly, a new local attention module named Feature Extraction Module (FEM) was built by combining the Channel Attention (CA) and Pixel Attention (PA) mechanisms, and was embedded into the CNN to weight and strengthen key information. Comparison experiments with the baseline residual networks ResNet-50 and ResNet-101 were conducted on the VeRi dataset. Experimental results show that increasing the number of wavelet decomposition layers in ResNet-50 improves mean Average Precision (mAP). In the ablation experiment, although the mAP of ResNet-50+Discrete Wavelet Transform (DWT) is 0.25 percentage points lower than that of ResNet-101, its number of parameters and computational complexity are both lower than those of ResNet-101, and its mAP, Rank-1 and Rank-5 are all higher than those of ResNet-50 without DWT, which verifies that the proposed model can effectively improve vehicle retrieval accuracy in vehicle re-identification.

Key words: vehicle re-identification, Channel Attention (CA), Pixel Attention (PA), wavelet transform, Convolutional Neural Network (CNN)

CLC Number:

TP 391.41

Guangkai LIAO, Zheng ZHANG, Zhiguo SONG. Convolutional network-based vehicle re-identification combining wavelet features and attention mechanism[J]. Journal of Computer Applications, 2022, 42(6): 1876-1883.


Figures/Tables 15

Fig. 1 Network overall framework

Fig. 2 Two-dimensional discrete wavelet transform

Fig. 3 Residual unit

Tab. 1 Basic structure of proposed network and corresponding parameters of each module

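Structure	Kernel, channels	Output
Conv1	7×7, 64	112×112
DWT	4×64	56×56
FEM	256	56×56
stage1	$\begin{bmatrix} 1 \times 1, 256, 256 \\ 3 \times 3, 256, 256 \\ 1 \times 1, 256, 256 \end{bmatrix}, \begin{bmatrix} 1 \times 1, 256, 256 \\ 3 \times 3, 256, 256 \\ 1 \times 1, 256, 128 \end{bmatrix}$	56×56
DWT	4×128	28×28
FEM	512	28×28
stage2	$\begin{bmatrix} 1 \times 1, 512, 512 \\ 3 \times 3, 512, 512 \\ 1 \times 1, 512, 512 \end{bmatrix} \times 2, \begin{bmatrix} 1 \times 1, 512, 512 \\ 3 \times 3, 512, 512 \\ 1 \times 1, 512, 256 \end{bmatrix}$	28×28
DWT	4×256	14×14
FEM	1024	14×14
stage3	$\begin{bmatrix} 1 \times 1, 1024, 1024 \\ 3 \times 3, 1024, 1024 \\ 1 \times 1, 1024, 1024 \end{bmatrix} \times 2, \begin{bmatrix} 1 \times 1, 1024, 1024 \\ 3 \times 3, 1024, 1024 \\ 1 \times 1, 1024, 512 \end{bmatrix}$	14×14
DWT	4×512	7×7
FEM	2048	7×7
stage4	$\begin{bmatrix} 1 \times 1, 2048, 2048 \\ 3 \times 3, 2048, 2048 \\ 1 \times 1, 2048, 2048 \end{bmatrix} \times 2$	7×7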
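
Each DWT row in Tab. 1 multiplies the channel count by four while halving both spatial sides (for example, 64 channels at 112×112 become 4×64 channels at 56×56). The following NumPy sketch shows one way such a downsampling step could work. It assumes a single-level orthonormal Haar wavelet, which this page does not confirm, and the function name `haar_dwt_downsample` is hypothetical. Sub-band naming conventions (LH versus HL) also vary between libraries.

```python
import numpy as np

def haar_dwt_downsample(x):
    """Single-level 2D Haar DWT of a feature map.

    x: array of shape (C, H, W) with even H and W.
    Returns an array of shape (4C, H/2, W/2) holding the LL, LH, HL and HH
    sub-bands stacked along the channel axis.
    """
    if x.shape[1] % 2 or x.shape[2] % 2:
        raise ValueError("height and width must be even")
    # The four pixels of every non-overlapping 2x2 block.
    a = x[:, 0::2, 0::2]  # top-left
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2  # low-pass in both directions
    lh = (a + b - c - d) / 2  # difference between the two rows
    hl = (a - b + c - d) / 2  # difference between the two columns
    hh = (a - b - c + d) / 2  # diagonal detail
    return np.concatenate([ll, lh, hl, hh], axis=0)

# Shapes chosen to match the first DWT row of Tab. 1.
feature_map = np.random.default_rng(0).standard_normal((64, 112, 112))
subbands = haar_dwt_downsample(feature_map)
print(subbands.shape)  # (256, 56, 56)
```

In this sketch the LL band equals twice the 2×2 average-pooled map, so average pooling keeps only one of the four sub-bands. Keeping all four preserves the signal energy, which is consistent with the abstract's stated aim of reducing the loss of fine-grained features.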

Fig. 4 CA module

Fig. 5 PA module

Fig. 6 Feature extraction module
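
The figures for the CA, PA and feature extraction modules are not reproduced on this page, so the sketch below is only an illustration of the general idea stated in the abstract: channel-wise and pixel-wise weights are computed from the feature map and used to rescale it. It assumes the CA and PA layouts of FFA-Net [23], applied one after the other, with bias terms omitted and a reduction ratio of 8. The actual wiring of the authors' FEM may differ, and all function and parameter names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Scale each channel of x (C, H, W) by one weight in (0, 1)."""
    pooled = x.mean(axis=(1, 2))             # global average pooling -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)    # bottleneck + ReLU -> (C // r,)
    weights = sigmoid(w2 @ hidden)           # one weight per channel -> (C,)
    return x * weights[:, None, None]

def pixel_attention(x, w1, w2):
    """Scale each spatial position of x (C, H, W) by one weight in (0, 1)."""
    # 1x1 convolutions written as contractions over the channel axis.
    hidden = np.maximum(np.einsum("kc,chw->khw", w1, x), 0.0)  # (C // r, H, W)
    weights = sigmoid(np.einsum("ok,khw->ohw", w2, hidden))    # (1, H, W)
    return x * weights

def feature_extraction_module(x, params):
    """Assumed ordering: channel attention first, then pixel attention."""
    out = channel_attention(x, params["ca_w1"], params["ca_w2"])
    return pixel_attention(out, params["pa_w1"], params["pa_w2"])

rng = np.random.default_rng(0)
channels, reduction = 16, 8  # reduction ratio is an assumption
hidden = channels // reduction
params = {
    "ca_w1": rng.standard_normal((hidden, channels)),
    "ca_w2": rng.standard_normal((channels, hidden)),
    "pa_w1": rng.standard_normal((hidden, channels)),
    "pa_w2": rng.standard_normal((1, hidden)),
}
features = rng.standard_normal((channels, 8, 8))
weighted = feature_extraction_module(features, params)
print(weighted.shape)  # (16, 8, 8)
```

Because both attention maps pass through a sigmoid, the module never changes the shape of the feature map and only attenuates or preserves each activation, which matches the FEM rows in Tab. 1 keeping the same output size as the preceding DWT row.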

Tab. 2 Comparison with ResNet-50 on VeRi dataset

Method	Rank-1	Rank-5	mAP
Baseline (ResNet-50)	83.49	92.31	52.88
Proposed method	88.70	94.60	63.90
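
Rank-1, Rank-5 and mAP are the metrics reported throughout these tables. As a rough illustration of their standard definitions (not the authors' evaluation code), the sketch below scores ranked gallery lists in which each entry records whether the retrieved image shows the same vehicle as the query. The function names are hypothetical, and benchmark protocols may additionally filter the gallery (for instance by camera), which this sketch ignores.

```python
def rank_k_hit(ranked_matches, k):
    """Return 1 if a correct match appears among the top-k results, else 0."""
    return int(any(ranked_matches[:k]))

def average_precision(ranked_matches):
    """Mean of the precision values at every position holding a correct match."""
    hits, precisions = 0, []
    for position, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / position)
    return sum(precisions) / len(precisions) if precisions else 0.0

def evaluate(queries, ks=(1, 5)):
    """Average Rank-k hit rates and AP over all queries."""
    n = len(queries)
    scores = {f"Rank-{k}": sum(rank_k_hit(q, k) for q in queries) / n for k in ks}
    scores["mAP"] = sum(average_precision(q) for q in queries) / n
    return scores

# Two toy queries: True marks a gallery image of the same vehicle.
toy_queries = [
    [True, False, True, False, False],
    [False, False, True, False, False],
]
toy_scores = evaluate(toy_queries)
print(toy_scores)
```

For the toy data, the first query is a Rank-1 hit and the second is not, so Rank-1 is 0.5 while Rank-5 is 1.0. mAP also rewards placing every correct match early, which is why a method can tie on Rank-1 yet differ on mAP.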

Tab. 3 Comparison of Rank-1 on VehicleID dataset

Test set	Proposed method	Baseline (ResNet-50)
Test800	69.30	67.27
Test1600	67.32	62.03
Test2400	63.94	55.12

Tab. 4 Ablation experimental results of the proposed method on VeRi dataset

Method	Rank-1	Rank-5	mAP
ResNet-50	83.49	92.31	52.88
ResNet-101	84.74	94.34	55.75
ResNet-50+DWT	85.20	93.70	55.50
ResNet-50+FEM	85.60	93.10	56.90
Proposed method + L_c	88.10	94.00	62.80
Proposed method + L_c + L_t	88.70	94.60	63.90
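
The last two rows of Tab. 4 differ only in the training loss. This page does not define L_c and L_t. Assuming that L_c is a softmax cross-entropy over vehicle identities and L_t is a triplet margin loss in the sense of [24], a combined objective could be sketched as follows. The margin of 0.3, the unweighted sum and all function names are illustrative assumptions, not values taken from the paper.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (assumed form of L_c)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Triplet margin loss on embeddings (assumed form of L_t)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def joint_loss(logits, label, anchor, positive, negative, margin=0.3):
    """Unweighted sum of the two assumed terms."""
    return cross_entropy(logits, label) + triplet_loss(anchor, positive, negative, margin)

# The positive is closer to the anchor than the negative by more than the
# margin, so only the classification term contributes here.
example = joint_loss([2.0, 0.5, 0.1], 0, [0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
print(round(example, 4))
```

Under these assumptions the triplet term vanishes once same-vehicle embeddings are closer than different-vehicle embeddings by the margin, so it mainly shapes the ranking of hard cases.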

Tab. 5 Comparison of different methods on VehicleID dataset

Method	Test800 Rank-1	Test800 Rank-5	Test1600 Rank-1	Test1600 Rank-5	Test2400 Rank-1	Test2400 Rank-5
BOW-SIFT [7]	2.81	4.23	3.11	5.22	2.11	3.76
LOMO [5]	19.74	32.14	18.95	29.46	15.26	25.63
BOW-CN [6]	13.14	22.69	12.94	21.09	10.20	17.89
GoogLeNet [11]	47.90	67.43	43.45	63.53	38.24	59.51
FACT [8]	49.53	67.96	44.63	64.19	39.91	60.49
NuFACT [17]	48.90	69.51	43.64	65.34	38.63	60.72
MLL+MLSR [25]	65.78	78.09	64.24	73.11	60.05	70.81
VAMI [26]	63.12	83.25	52.87	75.12	47.34	70.29
EALN [27]	67.19	78.20	63.23	77.12	59.98	74.20
Proposed method	69.30	82.80	67.32	79.86	63.94	77.57

Tab. 6 Comparison of different methods on VeRi dataset

Method	mAP	Rank-1	Rank-5
LOMO [5]	9.64	25.33	46.48
VGGNet [10]	12.76	44.10	62.63
GoogLeNet [11]	17.89	52.32	72.17
FACT [8]	18.49	50.95	73.48
NuFACT+Plate-SNN [17]	50.87	81.11	92.79
PROVID [17]	53.42	81.56	95.11
MLL+MLSR [25]	57.03	85.94	94.16
VAMI [26]	50.10	77.00	90.90
EALN [27]	57.40	84.40	94.10
AAVER [28]	58.50	88.70	94.10
QD-DLF [29]	61.80	88.50	94.50
Proposed method	63.90	88.70	94.60

Tab. 7 Complexity analysis

Method	Parameters/MB	Computational complexity/GFLOPs
ResNet-50	25.56	4.14
ResNet-50+DWT	29.51	7.73
ResNet-101	44.55	7.87
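
The trade-off described in the abstract can be put in relative terms using only the numbers in Tab. 7 and Tab. 4. The short calculation below is a worked illustration rather than part of the paper, and the variable names are arbitrary.

```python
# Values copied from Tab. 7 (parameters, GFLOPs) and Tab. 4 (mAP on VeRi).
models = {
    "ResNet-50":     {"params": 25.56, "gflops": 4.14, "map": 52.88},
    "ResNet-50+DWT": {"params": 29.51, "gflops": 7.73, "map": 55.50},
    "ResNet-101":    {"params": 44.55, "gflops": 7.87, "map": 55.75},
}

def relative_change(new, old):
    """Percentage change of new with respect to old."""
    return 100.0 * (new - old) / old

dwt, r101 = models["ResNet-50+DWT"], models["ResNet-101"]
param_saving = -relative_change(dwt["params"], r101["params"])
flop_saving = -relative_change(dwt["gflops"], r101["gflops"])
map_gap = r101["map"] - dwt["map"]

print(f"{param_saving:.1f}% fewer parameters, "
      f"{flop_saving:.1f}% fewer FLOPs, "
      f"{map_gap:.2f} mAP points lower")
# 33.8% fewer parameters, 1.8% fewer FLOPs, 0.25 mAP points lower
```

Read this way, the saving relative to ResNet-101 is large in parameters but small in computation, while the mAP gap is the 0.25 percentage points quoted in the abstract.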

Tab. 8 Effect of the number of wavelet transform layers on performance on VeRi dataset

Method	Rank-1	Rank-5	mAP
ResNet-50+DWT1	82.80	92.00	53.40
ResNet-50+DWT2	82.40	93.10	54.50
ResNet-50+DWT3	85.00	93.90	54.90
ResNet-50+DWT4	85.20	93.70	55.50

Fig. 7 Query visualization of Rank-10 results

References 29

1	YANG L J, LUO P, LOY C C, et al. A large-scale car dataset for fine-grained categorization and verification [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3973-3981. DOI: 10.1109/cvpr.2015.7299023
2	GUO J M, HSIA C H, WONG K, et al. Nighttime vehicle lamp detection and tracking with adaptive mask training [J]. IEEE Transactions on Vehicular Technology, 2016, 65(6): 4023-4032. DOI: 10.1109/tvt.2015.2508020
3	CHEN X Y, XIANG S M, LIU C L, et al. Vehicle detection in satellite images by hybrid deep convolutional neural networks [J]. IEEE Geoscience and Remote Sensing Letters, 2014, 11(10): 1797-1801. DOI: 10.1109/lgrs.2014.2309695
4	ZHAO R, OUYANG W L, WANG X G. Unsupervised salience learning for person re-identification [C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 3586-3593. DOI: 10.1109/cvpr.2013.460
5	LIAO S C, HU Y, ZHU X Y, et al. Person re-identification by local maximal occurrence representation and metric learning [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 2197-2206. DOI: 10.1109/cvpr.2015.7298832
6	ZHENG L, SHEN L Y, TIAN L, et al. Scalable person re-identification: a benchmark [C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1116-1124. DOI: 10.1109/iccv.2015.133
7	ZHENG L, WANG S J, ZHOU W G, et al. Bayes merging of multiple vocabularies for scalable image retrieval [C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 1963-1970. DOI: 10.1109/cvpr.2014.252
8	LIU X C, LIU W, MA H D, et al. Large-scale vehicle re-identification in urban surveillance videos [C]// Proceedings of the 2016 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2016: 1-6. DOI: 10.1109/icme.2016.7553002
9	LIU H Y, TIAN Y H, WANG Y W, et al. Deep relative distance learning: tell the difference between similar vehicles [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2167-2175. DOI: 10.1109/cvpr.2016.238
10	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [EB/OL]. (2015-04-10) [2021-02-20].
11	SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. DOI: 10.1109/cvpr.2015.7298594
12	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. DOI: 10.1109/cvpr.2016.90
13	WANG Z D, TANG L M, LIU X H, et al. Orientation invariant feature embedding and spatial temporal regularization for vehicle re-identification [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 379-387. DOI: 10.1109/iccv.2017.49
14	ZHOU Y, LIU L, SHAO L. Vehicle re-identification by deep hidden multi-view inference [J]. IEEE Transactions on Image Processing, 2018, 27(7): 3275-3287. DOI: 10.1109/tip.2018.2819820
15	ZHOU Y, SHAO L. Aware attentive multi-view inference for vehicle re-identification [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6489-6498. DOI: 10.1109/cvpr.2018.00679
16	SHEN Y T, XIAO T, LI H S, et al. Learning deep neural networks for vehicle re-ID with visual-spatio-temporal path proposals [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1918-1927. DOI: 10.1109/iccv.2017.210
17	LIU X C, LIU W, MEI T, et al. PROVID: progressive and multimodal vehicle reidentification for large-scale urban surveillance [J]. IEEE Transactions on Multimedia, 2018, 20(3): 645-658. DOI: 10.1109/tmm.2017.2751966
18	TANG Y, WU D, JIN Z, et al. Multi-modal metric learning for vehicle re-identification in traffic surveillance environment [C]// Proceedings of the 2017 IEEE International Conference on Image Processing. Piscataway: IEEE, 2017: 2254-2258. DOI: 10.1109/icip.2017.8296683
19	ZHAO L M, LI X, ZHUANG Y T, et al. Deeply-learned part-aligned representations for person re-identification [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3239-3248. DOI: 10.1109/iccv.2017.349
20	LI D W, CHEN X T, ZHANG Z, et al. Learning deep context-aware features over body and latent parts for person re-identification [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 7398-7407. DOI: 10.1109/cvpr.2017.782
21	QIU Y M, ZHOU Y. Wavelet transform stereoscopic images enhancement algorithms based on fog and haze [J]. Computer Engineering and Applications, 2015, 51(9): 30-33. (in Chinese) DOI: 10.3778/j.issn.1002-8331.1409-0008
22	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// Proceedings of the 2018 European Conference on Computer Vision, LNIP 11211. Cham: Springer, 2018: 3-19.
23	QIN X, WANG Z L, BAI Y C, et al. FFA-Net: feature fusion attention network for single image dehazing [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 11908-11915. DOI: 10.1609/aaai.v34i07.6865
24	HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification [EB/OL]. (2017-11-21) [2021-02-20].
25	HOU J H, ZENG H Q, CAI L, et al. Multi-label learning with multi-label smoothing regularization for vehicle re-identification [J]. Neurocomputing, 2019, 345: 15-22. DOI: 10.1016/j.neucom.2018.11.088
26	CHU R H, SUN Y F, LI Y D, et al. Vehicle re-identification with viewpoint-aware metric learning [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8281-8290. DOI: 10.1109/iccv.2019.00837
27	LOU Y H, BAI Y, LIU J, et al. Embedding adversarial learning for vehicle re-identification [J]. IEEE Transactions on Image Processing, 2019, 28(8): 3794-3807. DOI: 10.1109/tip.2019.2902112
28	KHORRAMSHAHI P, KUMAR A, PERI N, et al. A dual-path model with adaptive attention for vehicle re-identification [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6131-6140. DOI: 10.1109/iccv.2019.00623
29	ZHU J Q, ZENG H Q, HUANG J C, et al. Vehicle re-identification using quadruple directional deep learning features [J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(1): 410-420. DOI: 10.1109/tits.2019.2901312
