Convolutional network-based vehicle re-identification combining wavelet features and attention mechanism

Journal of Computer Applications, 2022, Vol. 42, Issue (6): 1876-1883. DOI: 10.11772/j.issn.1001-9081.2021040545

Section: Artificial Intelligence

Guangkai LIAO¹, Zheng ZHANG¹, Zhiguo SONG²

1. College of Information Science and Engineering, Jishou University, Jishou, Hunan 416000, China
2. College of Physics and Mechanical and Electrical Engineering, Jishou University, Jishou, Hunan 416000, China

Received: 2021-04-12  Revised: 2021-07-09  Accepted: 2021-07-09  Online: 2022-06-22  Published: 2022-06-10
Corresponding author: Zhiguo SONG
About the authors: LIAO Guangkai, born in 1993, native of Neijiang, Sichuan, M.S. candidate. His research interests include vehicle re-identification and image retrieval.
ZHANG Zheng, born in 1981, native of Jishou, Hunan, Ph.D., associate professor. His research interests include matrix computation.
Supported by: National Natural Science Foundation of China (32060238)

Abstract

Existing vehicle re-identification methods based on Convolutional Neural Networks (CNNs) extract features with insufficient representational power. To address this problem, a vehicle re-identification method combining wavelet features with an attention mechanism is proposed. First, a single-level wavelet module is embedded in the convolution module to replace the pooling layer for downsampling, which reduces the loss of fine-grained features. Second, a new local attention module, the Feature Extraction Module (FEM), is built by combining a Channel Attention (CA) mechanism with a Pixel Attention (PA) mechanism and is embedded in the CNN to weight and strengthen key information. The method is compared with the baseline residual networks ResNet-50 and ResNet-101 on the VeRi dataset. Experimental results show that increasing the number of wavelet transform layers in ResNet-50 improves the mean Average Precision (mAP). In the ablation experiment, ResNet-50 with the Discrete Wavelet Transform (DWT) has an mAP 0.25 percentage points lower than that of ResNet-101, but it has fewer parameters and lower computational complexity than ResNet-101, and its mAP, Rank-1 and Rank-5 are all higher than those of ResNet-50 alone. This indicates that the proposed model can effectively improve vehicle retrieval accuracy in vehicle re-identification.

Key words: vehicle re-identification, Channel Attention (CA), Pixel Attention (PA), wavelet transform, Convolutional Neural Network (CNN)

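The abstract describes replacing the pooling layer with a single-level wavelet module for downsampling. A minimal sketch of that idea follows, assuming a Haar wavelet (the wavelet basis is not stated here) and plain nested lists in place of tensors. The names `haar_dwt2`, `haar_idwt2` and `dwt_downsample` are illustrative and are not the authors' code. Each channel is split into four half-resolution sub-bands, so the spatial size halves while the channel count quadruples, and the input can be rebuilt exactly from the sub-bands, which is not possible after pooling.

```python
def haar_dwt2(x):
    """Single-level 2D Haar DWT of one channel.

    x is an H x W list of lists with even H and W (assumed).
    Returns four H/2 x W/2 sub-bands: (LL, LH, HL, HH).
    Sub-band naming conventions vary between texts.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(x), 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]          # top row of the 2x2 block
            c, d = x[i + 1][j], x[i + 1][j + 1]  # bottom row of the 2x2 block
            r_ll.append((a + b + c + d) / 2)     # scaled block sum (approximation)
            r_lh.append((a + b - c - d) / 2)     # top-minus-bottom detail
            r_hl.append((a - b + c - d) / 2)     # left-minus-right detail
            r_hh.append((a - b - c + d) / 2)     # diagonal detail
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the H x W channel exactly."""
    out = []
    for i in range(len(ll)):
        top, bottom = [], []
        for j in range(len(ll[0])):
            s, t, u, v = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            top += [(s + t + u + v) / 2, (s + t - u - v) / 2]
            bottom += [(s - t + u - v) / 2, (s - t - u + v) / 2]
        out += [top, bottom]
    return out


def dwt_downsample(feature):
    """Apply haar_dwt2 to every channel of a C x H x W feature map and stack
    the sub-bands as channels, giving a 4C x H/2 x W/2 feature map."""
    out = []
    for channel in feature:
        out.extend(haar_dwt2(channel))
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
bands = haar_dwt2(image)          # four 2x2 sub-bands
restored = haar_idwt2(*bands)     # equals `image`
stacked = dwt_downsample([image, image])  # 2 channels in, 8 channels out
```

In a trained network the sub-bands would feed the next convolution; here they are only computed and inverted to show that no information is discarded by the downsampling step.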
CLC number: TP 391.41

Cite this article: Guangkai LIAO, Zheng ZHANG, Zhiguo SONG. Convolutional network-based vehicle re-identification combining wavelet features and attention mechanism[J]. Journal of Computer Applications, 2022, 42(6): 1876-1883.

Figures and tables (15)

Fig. 1 Overall network framework

Fig. 2 Two-dimensional discrete wavelet transform

Fig. 3 Residual unit

Tab. 1 Basic structure of proposed network and corresponding parameters of each module

Structure	Kernel, channels	Output size
Conv1	7×7, 64	112×112
DWT	4×64	56×56
FEM	256	56×56
stage1	[1×1, 256, 256; 3×3, 256, 256; 1×1, 256, 256], [1×1, 256, 256; 3×3, 256, 256; 1×1, 256, 128]	56×56
DWT	4×128	28×28
FEM	512	28×28
stage2	[1×1, 512, 512; 3×3, 512, 512; 1×1, 512, 512] × 2, [1×1, 512, 512; 3×3, 512, 512; 1×1, 512, 256]	28×28
DWT	4×256	14×14
FEM	1024	14×14
stage3	[1×1, 1024, 1024; 3×3, 1024, 1024; 1×1, 1024, 1024] × 2, [1×1, 1024, 1024; 3×3, 1024, 1024; 1×1, 1024, 512]	14×14
DWT	4×512	7×7
FEM	2048	7×7
stage4	[1×1, 2048, 2048; 3×3, 2048, 2048; 1×1, 2048, 2048] × 2	7×7

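The channel and resolution bookkeeping in Tab. 1 can be checked with a few lines of arithmetic. The sketch below assumes that an entry such as "4×64" means four wavelet sub-bands for each of 64 input channels, and that each DWT halves the height and width; `dwt_shape` is a hypothetical helper, and the channel counts are read from the last block of each stage in the table.

```python
def dwt_shape(channels, height, width):
    """Shape after a single-level 2D DWT used as a downsampling layer:
    four sub-bands per input channel, each at half resolution."""
    return 4 * channels, height // 2, width // 2


# (block preceding each DWT, its output channels), read from Tab. 1
stage_out_channels = [("Conv1", 64), ("stage1", 128), ("stage2", 256), ("stage3", 512)]

height = width = 112  # Conv1 output size in Tab. 1
trace = []
for name, channels in stage_out_channels:
    channels, height, width = dwt_shape(channels, height, width)
    trace.append((name, channels, height, width))

for row in trace:
    print("DWT after %s: %d channels, %dx%d" % row)
```

Under these assumptions the computed channel counts (256, 512, 1024, 2048) and sizes (56, 28, 14, 7) agree with the FEM rows of the table, which suggests each stage's final 1×1 convolution halves the channels so that the following DWT restores the usual ResNet width.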
Fig. 4 CA module

Fig. 5 PA module

Fig. 6 Feature extraction module

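Figs. 4 to 6 name the CA, PA and feature extraction modules, whose internals are not reproduced on this page. As a rough illustration of what channel and pixel attention usually compute, the sketch below uses a common formulation: global average pooling followed by two projections for CA, and two per-pixel projections for PA, each ending in a sigmoid. The wiring (CA followed by PA), the absence of biases, and all names and weights (`channel_attention`, `pixel_attention`, `fem`, `w1`, `w2`) are assumptions made for this example, not the authors' design.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def project(weights, vec, relu=False):
    """Bias-free linear projection (a 1x1 convolution applied to one vector)."""
    out = [sum(w * v for w, v in zip(row, vec)) for row in weights]
    return [max(0.0, o) for o in out] if relu else out


def channel_attention(x, w1, w2):
    """Scale each channel of a C x H x W feature map by one learned weight."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    pooled = [sum(map(sum, ch)) / (h * w) for ch in x]  # global average pooling
    gate = [sigmoid(v) for v in project(w2, project(w1, pooled, relu=True))]
    return [[[val * gate[k] for val in row] for row in x[k]] for k in range(c)]


def pixel_attention(x, w1, w2):
    """Scale every spatial position by one weight shared across channels."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for i in range(h):
        for j in range(w):
            pixel = [x[k][i][j] for k in range(c)]  # channel vector at (i, j)
            gate = sigmoid(project(w2, project(w1, pixel, relu=True))[0])
            for k in range(c):
                out[k][i][j] = x[k][i][j] * gate
    return out


def fem(x, ca_weights, pa_weights):
    """Assumed composition: channel attention first, then pixel attention."""
    return pixel_attention(channel_attention(x, *ca_weights), *pa_weights)


# Toy input: 2 channels, 1 x 2 spatial size, hidden width 1, all-zero weights.
x = [[[4.0, 8.0]], [[12.0, 16.0]]]
ca_weights = ([[0.0, 0.0]], [[0.0], [0.0]])  # C -> 1 -> C
pa_weights = ([[0.0, 0.0]], [[0.0]])         # C -> 1 -> 1
y = fem(x, ca_weights, pa_weights)
```

With all-zero weights both gates equal sigmoid(0) = 0.5, so the output is the input scaled by 0.25; trained weights would instead emphasize informative channels and positions while leaving the feature map shape unchanged.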
Tab. 2 Comparison with ResNet-50 on VeRi dataset (%)

Method	Rank-1	Rank-5	mAP
Baseline (ResNet-50)	83.49	92.31	52.88
Proposed method	88.70	94.60	63.90

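The tables report Rank-1, Rank-5 and mAP. For readers unfamiliar with these retrieval metrics, the following sketch shows how they are commonly computed from ranked gallery lists. It assumes each query has already been turned into a list of booleans (gallery sorted by ascending distance, `True` where the identity matches); the function names are illustrative, and dataset-specific rules such as excluding same-camera matches are not handled.

```python
def average_precision(ranked_matches):
    """Mean of the precision values at the rank of each correct match."""
    hits, precisions = 0, []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def evaluate(all_ranked, ks=(1, 5)):
    """Return ({k: Rank-k accuracy}, mAP) over all queries."""
    n = len(all_ranked)
    # Rank-k: fraction of queries with at least one match in the top k
    rank_k = {k: sum(any(r[:k]) for r in all_ranked) / n for k in ks}
    m_ap = sum(average_precision(r) for r in all_ranked) / n
    return rank_k, m_ap


queries = [
    [True, False, True],   # AP = (1/1 + 2/3) / 2
    [False, True, False],  # AP = 1/2
]
rank_k, m_ap = evaluate(queries)
```

Rank-k only asks whether any correct match appears early, while mAP rewards placing all correct matches near the top, which is why a method can tie on Rank-1 and still differ on mAP.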
Tab. 3 Comparison of Rank-1 on VehicleID dataset (%)

Test set	Proposed method	Baseline (ResNet-50)
Test800	69.30	67.27
Test1600	67.32	62.03
Test2400	63.94	55.12

Tab. 4 Ablation experimental results of the proposed method on VeRi dataset (%)

Method	Rank-1	Rank-5	mAP
ResNet-50	83.49	92.31	52.88
ResNet-101	84.74	94.34	55.75
ResNet-50+DWT	85.20	93.70	55.50
ResNet-50+FEM	85.60	93.10	56.90
Proposed method + L_c	88.10	94.00	62.80
Proposed method + L_c + L_t	88.70	94.60	63.90

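The last two rows of Tab. 4 add loss terms L_c and L_t, which are not defined on this page. A plausible reading, given that the reference list includes a triplet-loss paper [24], is that L_c is a classification (cross-entropy) loss and L_t is a triplet loss. The sketch below is written under that assumption only; the margin of 0.3, the unweighted sum, and the function names are hypothetical choices for illustration.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (assumed form of L_c)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin (assumed L_t)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def total_loss(logits, label, anchor, positive, negative, margin=0.3):
    """Unweighted sum L_c + L_t; the actual weighting is not given here."""
    return cross_entropy(logits, label) + triplet_loss(anchor, positive, negative, margin)


# Hard triplet: the positive is farther from the anchor than the negative.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
# Easy triplet: the positive is already much closer, so the loss is zero.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
loss = total_loss([0.0, 0.0], 0, [0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
```

The classification term separates identities in label space, while the triplet term acts directly on embedding distances, which is the quantity used for ranking at retrieval time.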
Tab. 5 Comparison of different methods on VehicleID dataset (%)

Method	Test800 Rank-1	Test800 Rank-5	Test1600 Rank-1	Test1600 Rank-5	Test2400 Rank-1	Test2400 Rank-5
BOW-SIFT [7]	2.81	4.23	3.11	5.22	2.11	3.76
LOMO [5]	19.74	32.14	18.95	29.46	15.26	25.63
BOW-CN [6]	13.14	22.69	12.94	21.09	10.20	17.89
GoogLeNet [11]	47.90	67.43	43.45	63.53	38.24	59.51
FACT [8]	49.53	67.96	44.63	64.19	39.91	60.49
NuFACT [17]	48.90	69.51	43.64	65.34	38.63	60.72
MLL+MLSR [25]	65.78	78.09	64.24	73.11	60.05	70.81
VAMI [26]	63.12	83.25	52.87	75.12	47.34	70.29
EALN [27]	67.19	78.20	63.23	77.12	59.98	74.20
Proposed method	69.30	82.80	67.32	79.86	63.94	77.57

Tab. 6 Comparison of different methods on VeRi dataset (%)

Method	mAP	Rank-1	Rank-5
LOMO [5]	9.64	25.33	46.48
VGGNet [10]	12.76	44.10	62.63
GoogLeNet [11]	17.89	52.32	72.17
FACT [8]	18.49	50.95	73.48
NuFACT+Plate-SNN [17]	50.87	81.11	92.79
PROVID [17]	53.42	81.56	95.11
MLL+MLSR [25]	57.03	85.94	94.16
VAMI [26]	50.10	77.00	90.90
EALN [27]	57.40	84.40	94.10
AAVER [28]	58.50	88.70	94.10
QD-DLF [29]	61.80	88.50	94.50
Proposed method	63.90	88.70	94.60

Tab. 7 Complexity analysis

Method	Parameters/MB	Computational complexity (GFLOPS)
ResNet-50	25.56	4.14
ResNet-50+DWT	29.51	7.73
ResNet-101	44.55	7.87

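The trade-off stated in the abstract can be reproduced from the reported numbers. The short calculation below copies the mAP values from Tab. 4 and the complexity figures from Tab. 7; the dictionary layout and variable names are only for illustration.

```python
# Values copied from Tab. 4 (mAP, %) and Tab. 7 (parameters, GFLOPS).
models = {
    "ResNet-50":     {"mAP": 52.88, "params": 25.56, "gflops": 4.14},
    "ResNet-50+DWT": {"mAP": 55.50, "params": 29.51, "gflops": 7.73},
    "ResNet-101":    {"mAP": 55.75, "params": 44.55, "gflops": 7.87},
}

r50 = models["ResNet-50"]
dwt = models["ResNet-50+DWT"]
r101 = models["ResNet-101"]

map_gap = round(r101["mAP"] - dwt["mAP"], 2)    # mAP lost relative to ResNet-101
map_gain = round(dwt["mAP"] - r50["mAP"], 2)    # mAP gained over plain ResNet-50
param_ratio = dwt["params"] / r101["params"]    # parameter count relative to ResNet-101
gflops_ratio = dwt["gflops"] / r101["gflops"]   # compute relative to ResNet-101

print("mAP gap to ResNet-101: %.2f points" % map_gap)
print("mAP gain over ResNet-50: %.2f points" % map_gain)
print("parameters vs ResNet-101: %.1f%%" % (100 * param_ratio))
print("GFLOPS vs ResNet-101: %.1f%%" % (100 * gflops_ratio))
```

The gap of 0.25 points matches the abstract, and the ratios show that the saving is mainly in parameters (about two thirds of ResNet-101), while the reported computational cost is only slightly lower.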
Tab. 8 Effect of wavelet transform layers on performance on VeRi dataset (%)

Method	Rank-1	Rank-5	mAP
ResNet-50+DWT1	82.80	92.00	53.40
ResNet-50+DWT2	82.40	93.10	54.50
ResNet-50+DWT3	85.00	93.90	54.90
ResNet-50+DWT4	85.20	93.70	55.50

Fig. 7 Query visualization of Rank-10 results

References (29)

[1]	YANG L J, LUO P, LOY C C, et al. A large-scale car dataset for fine-grained categorization and verification[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3973-3981. DOI: 10.1109/cvpr.2015.7299023
[2]	GUO J M, HSIA C H, WONG K, et al. Nighttime vehicle lamp detection and tracking with adaptive mask training[J]. IEEE Transactions on Vehicular Technology, 2016, 65(6): 4023-4032. DOI: 10.1109/tvt.2015.2508020
[3]	CHEN X Y, XIANG S M, LIU C L, et al. Vehicle detection in satellite images by hybrid deep convolutional neural networks[J]. IEEE Geoscience and Remote Sensing Letters, 2014, 11(10): 1797-1801. DOI: 10.1109/lgrs.2014.2309695
[4]	ZHAO R, OUYANG W L, WANG X G. Unsupervised salience learning for person re-identification[C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 3586-3593. DOI: 10.1109/cvpr.2013.460
[5]	LIAO S C, HU Y, ZHU X Y, et al. Person re-identification by local maximal occurrence representation and metric learning[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 2197-2206. DOI: 10.1109/cvpr.2015.7298832
[6]	ZHENG L, SHEN L Y, TIAN L, et al. Scalable person re-identification: a benchmark[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1116-1124. DOI: 10.1109/iccv.2015.133
[7]	ZHENG L, WANG S J, ZHOU W G, et al. Bayes merging of multiple vocabularies for scalable image retrieval[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 1963-1970. DOI: 10.1109/cvpr.2014.252
[8]	LIU X C, LIU W, MA H D, et al. Large-scale vehicle re-identification in urban surveillance videos[C]// Proceedings of the 2016 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2016: 1-6. DOI: 10.1109/icme.2016.7553002
[9]	LIU H Y, TIAN Y H, WANG Y W, et al. Deep relative distance learning: tell the difference between similar vehicles[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2167-2175. DOI: 10.1109/cvpr.2016.238
[10]	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10) [2021-02-20].
[11]	SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. DOI: 10.1109/cvpr.2015.7298594
[12]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. DOI: 10.1109/cvpr.2016.90
[13]	WANG Z D, TANG L M, LIU X H, et al. Orientation invariant feature embedding and spatial temporal regularization for vehicle re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 379-387. DOI: 10.1109/iccv.2017.49
[14]	ZHOU Y, LIU L, SHAO L. Vehicle re-identification by deep hidden multi-view inference[J]. IEEE Transactions on Image Processing, 2018, 27(7): 3275-3287. DOI: 10.1109/tip.2018.2819820
[15]	ZHOU Y, SHAO L. Aware attentive multi-view inference for vehicle re-identification[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6489-6498. DOI: 10.1109/cvpr.2018.00679
[16]	SHEN Y T, XIAO T, LI H S, et al. Learning deep neural networks for vehicle re-ID with visual-spatio-temporal path proposals[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1918-1927. DOI: 10.1109/iccv.2017.210
[17]	LIU X C, LIU W, MEI T, et al. PROVID: progressive and multimodal vehicle reidentification for large-scale urban surveillance[J]. IEEE Transactions on Multimedia, 2018, 20(3): 645-658. DOI: 10.1109/tmm.2017.2751966
[18]	TANG Y, WU D, JIN Z, et al. Multi-modal metric learning for vehicle re-identification in traffic surveillance environment[C]// Proceedings of the 2017 IEEE International Conference on Image Processing. Piscataway: IEEE, 2017: 2254-2258. DOI: 10.1109/icip.2017.8296683
[19]	ZHAO L M, LI X, ZHUANG Y T, et al. Deeply-learned part-aligned representations for person re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3239-3248. DOI: 10.1109/iccv.2017.349
[20]	LI D W, CHEN X T, ZHANG Z, et al. Learning deep context-aware features over body and latent parts for person re-identification[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 7398-7407. DOI: 10.1109/cvpr.2017.782
[21]	QIU Y M, ZHOU Y. Wavelet transform stereoscopic images enhancement algorithms based on fog and haze[J]. Computer Engineering and Applications, 2015, 51(9): 30-33 (in Chinese). DOI: 10.3778/j.issn.1002-8331.1409-0008
[22]	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19.
[23]	QIN X, WANG Z L, BAI Y C, et al. FFA-Net: feature fusion attention network for single image dehazing[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 11908-11915. DOI: 10.1609/aaai.v34i07.6865
[24]	HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-11-21) [2021-02-20].
[25]	HOU J H, ZENG H Q, CAI L, et al. Multi-label learning with multi-label smoothing regularization for vehicle re-identification[J]. Neurocomputing, 2019, 345: 15-22. DOI: 10.1016/j.neucom.2018.11.088
[26]	CHU R H, SUN Y F, LI Y D, et al. Vehicle re-identification with viewpoint-aware metric learning[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8281-8290. DOI: 10.1109/iccv.2019.00837
[27]	LOU Y H, BAI Y, LIU J, et al. Embedding adversarial learning for vehicle re-identification[J]. IEEE Transactions on Image Processing, 2019, 28(8): 3794-3807. DOI: 10.1109/tip.2019.2902112
[28]	KHORRAMSHAHI P, KUMAR A, PERI N, et al. A dual-path model with adaptive attention for vehicle re-identification[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6131-6140. DOI: 10.1109/iccv.2019.00623
[29]	ZHU J Q, ZENG H Q, HUANG J C, et al. Vehicle re-identification using quadruple directional deep learning features[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(1): 410-420. DOI: 10.1109/tits.2019.2901312
