Convolutional network-based vehicle re-identification combining wavelet features and attention mechanism

Journal of Computer Applications, 2022, Vol. 42, Issue (6): 1876-1883. DOI: 10.11772/j.issn.1001-9081.2021040545

Topic: Artificial intelligence

Guangkai LIAO¹, Zheng ZHANG¹, Zhiguo SONG²

¹ College of Information Science and Engineering, Jishou University, Jishou, Hunan 416000, China
² College of Physics and Mechanical and Electrical Engineering, Jishou University, Jishou, Hunan 416000, China

Received: 2021-04-12 Revised: 2021-07-09 Accepted: 2021-07-09 Online: 2022-06-22 Published: 2022-06-10
Corresponding author: Zhiguo SONG
About the authors: LIAO Guangkai, born in 1993 in Neijiang, Sichuan, M.S. candidate. His research interests include vehicle re-identification and image retrieval.
ZHANG Zheng, born in 1981 in Jishou, Hunan, Ph.D., associate professor. His research interests include matrix computation.
Supported by:
National Natural Science Foundation of China (32060238)

Abstract

To address the insufficient representational power of features extracted by existing vehicle re-identification methods based on Convolutional Neural Networks (CNN), a vehicle re-identification method combining wavelet features with an attention mechanism is proposed. First, a single-level wavelet module is embedded in the convolution module to replace the pooling layer for downsampling, which reduces the loss of fine-grained features. Second, a new local attention module, the Feature Extraction Module (FEM), is built by combining a Channel Attention (CA) mechanism with a Pixel Attention (PA) mechanism and is embedded in the convolutional network to weight and strengthen key information. The method is compared with the baseline residual networks ResNet-50 and ResNet-101 on the VeRi dataset. Experimental results show that increasing the number of wavelet transform layers in ResNet-50 improves the mean Average Precision (mAP). In the ablation experiment, ResNet-50 with Discrete Wavelet Transform (DWT) has an mAP 0.25 percentage points lower than that of ResNet-101, but it has fewer parameters and lower computational complexity than ResNet-101, and its mAP, Rank-1 and Rank-5 are all higher than those of ResNet-50 alone. This indicates that the proposed model effectively improves retrieval accuracy in vehicle re-identification.

Key words: vehicle re-identification, Channel Attention (CA), Pixel Attention (PA), wavelet transform, Convolutional Neural Network (CNN)

CLC number: TP 391.41

Guangkai LIAO, Zheng ZHANG, Zhiguo SONG. Convolutional network-based vehicle re-identification combining wavelet features and attention mechanism[J]. Journal of Computer Applications, 2022, 42(6): 1876-1883.

Figures and tables (15)

Fig. 1 Overall network framework

Fig. 2 Two-dimensional discrete wavelet transform

Fig. 3 Residual unit

Tab. 1 Basic structure of the proposed network and corresponding parameters of each module

Structure	Kernel, channels	Output size
Conv1	7×7, 64	112×112
DWT	4×64	56×56
FEM	256	56×56
stage1	[1×1, 256, 256; 3×3, 256, 256; 1×1, 256, 256], [1×1, 256, 256; 3×3, 256, 256; 1×1, 256, 128]	56×56
DWT	4×128	28×28
FEM	512	28×28
stage2	[1×1, 512, 512; 3×3, 512, 512; 1×1, 512, 512] × 2, [1×1, 512, 512; 3×3, 512, 512; 1×1, 512, 256]	28×28
DWT	4×256	14×14
FEM	1024	14×14
stage3	[1×1, 1024, 1024; 3×3, 1024, 1024; 1×1, 1024, 1024] × 2, [1×1, 1024, 1024; 3×3, 1024, 1024; 1×1, 1024, 512]	14×14
DWT	4×512	7×7
FEM	2048	7×7
stage4	[1×1, 2048, 2048; 3×3, 2048, 2048; 1×1, 2048, 2048] × 2	7×7
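The DWT rows of Tab. 1 list four times the incoming channel count at half the spatial size (for example 4×64 at 56×56 after the 112×112 Conv1 output). That is the shape a single-level 2D wavelet decomposition produces when its four sub-bands are stacked as channels. The following is a minimal pure-Python sketch of that shape behaviour, assuming the Haar wavelet (the wavelet basis is not stated on this page); the function names `haar_dwt2` and `dwt_downsample` are hypothetical and not taken from the paper.

```python
def haar_dwt2(channel):
    """Single-level 2D Haar DWT of one H x W channel (H and W must be even).

    Returns four H/2 x W/2 sub-bands: the low-pass average and three detail
    bands (row difference, column difference, diagonal difference).
    """
    h, w = len(channel), len(channel[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    bands = ([], [], [], [])  # low, row_diff, col_diff, diag
    for i in range(0, h, 2):
        rows = ([], [], [], [])
        for j in range(0, w, 2):
            # One non-overlapping 2x2 block:  a b
            #                                 c d
            a, b = channel[i][j], channel[i][j + 1]
            c, d = channel[i + 1][j], channel[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2)  # low-pass in both directions
            rows[1].append((a + b - c - d) / 2)  # top row minus bottom row
            rows[2].append((a - b + c - d) / 2)  # left column minus right column
            rows[3].append((a - b - c + d) / 2)  # diagonal difference
        for band, row in zip(bands, rows):
            band.append(row)
    return bands


def dwt_downsample(feature_map):
    """Apply haar_dwt2 to each channel and stack the sub-bands as channels.

    feature_map: list of C channels, each an H x W nested list.
    Returns a list of 4*C channels, each H/2 x W/2.
    """
    out = []
    for channel in feature_map:
        out.extend(haar_dwt2(channel))
    return out
```

Unlike max or average pooling, this transform is orthonormal: the sum of squared values is the same before and after, so the detail bands retain the information that pooling would discard.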

Fig. 4 CA module

Fig. 5 PA module

Fig. 6 Feature extraction module
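The abstract describes the FEM as a combination of channel attention and pixel attention that weights key information, but the internals shown in Fig. 4 to Fig. 6 are not reproduced on this page. The sketch below is therefore only an illustration of the general idea under stated assumptions: channel attention gates each channel with a weight computed from global averages, pixel attention gates each spatial position with a weight computed across channels, and the two are applied in that order. The layer shapes, the ordering and all names (`channel_attention`, `pixel_attention`, `fem`, `w1`, `w2`, `v`) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap, w1, w2):
    """Scale each channel by a gate in (0, 1) derived from global averages.

    fmap: list of C channels, each an H x W nested list.
    w1:   hidden x C weights (assumed squeeze layer, followed by ReLU).
    w2:   C x hidden weights (assumed excitation layer, followed by sigmoid).
    """
    means = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    hidden = [max(0.0, sum(w * m for w, m in zip(row, means))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    return [[[g * x for x in r] for r in ch] for g, ch in zip(gates, fmap)]


def pixel_attention(fmap, v):
    """Scale each spatial position by a gate in (0, 1).

    v: length-C weights of an assumed 1x1 convolution that produces a single
       attention map shared by all channels.
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    gate = [[sigmoid(sum(v[c] * fmap[c][i][j] for c in range(len(fmap))))
             for j in range(w)] for i in range(h)]
    return [[[ch[i][j] * gate[i][j] for j in range(w)] for i in range(h)]
            for ch in fmap]


def fem(fmap, w1, w2, v):
    """Channel attention followed by pixel attention (assumed ordering)."""
    return pixel_attention(channel_attention(fmap, w1, w2), v)
```

Because both gates lie strictly between 0 and 1, the module can only attenuate activations; learned weights decide which channels and positions are attenuated least.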

Tab. 2 Comparison with ResNet-50 on VeRi dataset (%)

Method	Rank-1	Rank-5	mAP
Baseline (ResNet-50)	83.49	92.31	52.88
Proposed method	88.70	94.60	63.90
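The Rank-1, Rank-5 and mAP columns are standard retrieval metrics. As a reminder of what they measure, here is a small pure-Python sketch using the usual definitions: Rank-k is the fraction of queries with at least one correct match among the top k results, and mAP is the mean over queries of average precision. It assumes each query's gallery ranking has already been computed and filtered according to the dataset protocol; the function names are hypothetical and the evaluation code used in the paper is not shown on this page.

```python
def rank_k_accuracy(ranked_matches, k):
    """Fraction of queries with at least one correct match in the top k.

    ranked_matches: one list per query; element i is True when the gallery
    image at rank i + 1 shows the same vehicle as the query.
    """
    hits = sum(1 for matches in ranked_matches if any(matches[:k]))
    return hits / len(ranked_matches)


def average_precision(matches):
    """Mean of precision@i over the ranks i at which a correct match occurs."""
    found, precisions = 0, []
    for i, is_match in enumerate(matches, start=1):
        if is_match:
            found += 1
            precisions.append(found / i)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(ranked_matches):
    """Average of per-query average precision."""
    return sum(average_precision(m) for m in ranked_matches) / len(ranked_matches)
```

Rank-k only asks whether the first correct match appears early, while mAP rewards ranking all correct matches high, which is why the two can move by different amounts between methods.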

Tab. 3 Comparison of Rank-1 on VehicleID dataset (%)

Test set	Proposed method	Baseline (ResNet-50)
Test800	69.30	67.27
Test1600	67.32	62.03
Test2400	63.94	55.12

Tab. 4 Ablation experimental results of the proposed method on VeRi dataset (%)

Method	Rank-1	Rank-5	mAP
ResNet-50	83.49	92.31	52.88
ResNet-101	84.74	94.34	55.75
ResNet-50+DWT	85.20	93.70	55.50
ResNet-50+FEM	85.60	93.10	56.90
Proposed method+L_c	88.10	94.00	62.80
Proposed method+L_c+L_t	88.70	94.60	63.90
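The last two rows of Tab. 4 differ only in the training loss. The symbols are not defined on this page; the sketch below assumes that L_c is a classification (softmax cross-entropy) loss over vehicle identities and that L_t is a triplet loss of the kind discussed in reference [24], summed with equal weight. The margin value of 0.3 and all function names are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed in a stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed value, not from the paper).
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def total_loss(logits, label, anchor, positive, negative, margin=0.3):
    """Assumed combination: L_c + L_t with equal weights."""
    return cross_entropy(logits, label) + triplet_loss(
        anchor, positive, negative, margin)
```

The classification term separates identities, while the triplet term acts directly on embedding distances, which is the quantity used for ranking at retrieval time.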

Tab. 5 Comparison of different methods on VehicleID dataset (%)

Method	Test800 Rank-1	Test800 Rank-5	Test1600 Rank-1	Test1600 Rank-5	Test2400 Rank-1	Test2400 Rank-5
BOW-SIFT [7]	2.81	4.23	3.11	5.22	2.11	3.76
LOMO [5]	19.74	32.14	18.95	29.46	15.26	25.63
BOW-CN [6]	13.14	22.69	12.94	21.09	10.20	17.89
GoogLeNet [11]	47.90	67.43	43.45	63.53	38.24	59.51
FACT [8]	49.53	67.96	44.63	64.19	39.91	60.49
NuFACT [17]	48.90	69.51	43.64	65.34	38.63	60.72
MLL+MLSR [25]	65.78	78.09	64.24	73.11	60.05	70.81
VAMI [26]	63.12	83.25	52.87	75.12	47.34	70.29
EALN [27]	67.19	78.20	63.23	77.12	59.98	74.20
Proposed method	69.30	82.80	67.32	79.86	63.94	77.57

Tab. 6 Comparison of different methods on VeRi dataset (%)

Method	mAP	Rank-1	Rank-5
LOMO [5]	9.64	25.33	46.48
VGGNet [10]	12.76	44.10	62.63
GoogLeNet [11]	17.89	52.32	72.17
FACT [8]	18.49	50.95	73.48
NuFACT+Plate-SNN [17]	50.87	81.11	92.79
PROVID [17]	53.42	81.56	95.11
MLL+MLSR [25]	57.03	85.94	94.16
VAMI [26]	50.10	77.00	90.90
EALN [27]	57.40	84.40	94.10
AAVER [28]	58.50	88.70	94.10
QD-DLF [29]	61.80	88.50	94.50
Proposed method	63.90	88.70	94.60

Tab. 7 Complexity analysis

Method	Parameters (M)	Computational complexity (GFLOPs)
ResNet-50	25.56	4.14
ResNet-50+DWT	29.51	7.73
ResNet-101	44.55	7.87
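The abstract's trade-off claim (slightly lower mAP than ResNet-101 at lower cost) can be read directly off Tab. 7. As a small worked calculation, the snippet below computes the fractional savings of ResNet-50+DWT relative to ResNet-101 from the table values; the dictionary and function names are hypothetical.

```python
# Values copied from Tab. 7: (parameters, computational complexity).
COMPLEXITY = {
    "ResNet-50": (25.56, 4.14),
    "ResNet-50+DWT": (29.51, 7.73),
    "ResNet-101": (44.55, 7.87),
}


def relative_saving(model, reference):
    """Fractional reduction of `model` relative to `reference`, per column."""
    (p_m, c_m), (p_r, c_r) = COMPLEXITY[model], COMPLEXITY[reference]
    return 1 - p_m / p_r, 1 - c_m / c_r


param_saving, compute_saving = relative_saving("ResNet-50+DWT", "ResNet-101")
print(f"parameters: {param_saving:.1%} fewer, "
      f"complexity: {compute_saving:.1%} fewer")
```

By these numbers the saving is mostly in parameter count (roughly one third fewer), while the computational complexity of the two models is close.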

Tab. 8 Effect of the number of wavelet transform layers on performance on VeRi dataset (%)

Method	Rank-1	Rank-5	mAP
ResNet-50+DWT1	82.80	92.00	53.40
ResNet-50+DWT2	82.40	93.10	54.50
ResNet-50+DWT3	85.00	93.90	54.90
ResNet-50+DWT4	85.20	93.70	55.50

Fig. 7 Query visualization of Rank-10 results

References (29)

1	YANG L J, LUO P, LOY C C, et al. A large-scale car dataset for fine-grained categorization and verification[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3973-3981. DOI: 10.1109/cvpr.2015.7299023
2	GUO J M, HSIA C H, WONG K, et al. Nighttime vehicle lamp detection and tracking with adaptive mask training[J]. IEEE Transactions on Vehicular Technology, 2016, 65(6): 4023-4032. DOI: 10.1109/tvt.2015.2508020
3	CHEN X Y, XIANG S M, LIU C L, et al. Vehicle detection in satellite images by hybrid deep convolutional neural networks[J]. IEEE Geoscience and Remote Sensing Letters, 2014, 11(10): 1797-1801. DOI: 10.1109/lgrs.2014.2309695
4	ZHAO R, OUYANG W L, WANG X G. Unsupervised salience learning for person re-identification[C]// Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2013: 3586-3593. DOI: 10.1109/cvpr.2013.460
5	LIAO S C, HU Y, ZHU X Y, et al. Person re-identification by local maximal occurrence representation and metric learning[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 2197-2206. DOI: 10.1109/cvpr.2015.7298832
6	ZHENG L, SHEN L Y, TIAN L, et al. Scalable person re-identification: a benchmark[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1116-1124. DOI: 10.1109/iccv.2015.133
7	ZHENG L, WANG S J, ZHOU W G, et al. Bayes merging of multiple vocabularies for scalable image retrieval[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 1963-1970. DOI: 10.1109/cvpr.2014.252
8	LIU X C, LIU W, MA H D, et al. Large-scale vehicle re-identification in urban surveillance videos[C]// Proceedings of the 2016 IEEE International Conference on Multimedia and Expo. Piscataway: IEEE, 2016: 1-6. DOI: 10.1109/icme.2016.7553002
9	LIU H Y, TIAN Y H, WANG Y W, et al. Deep relative distance learning: tell the difference between similar vehicles[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2167-2175. DOI: 10.1109/cvpr.2016.238
10	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10)[2021-02-20].
11	SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. DOI: 10.1109/cvpr.2015.7298594
12	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. DOI: 10.1109/cvpr.2016.90
13	WANG Z D, TANG L M, LIU X H, et al. Orientation invariant feature embedding and spatial temporal regularization for vehicle re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 379-387. DOI: 10.1109/iccv.2017.49
14	ZHOU Y, LIU L, SHAO L. Vehicle re-identification by deep hidden multi-view inference[J]. IEEE Transactions on Image Processing, 2018, 27(7): 3275-3287. DOI: 10.1109/tip.2018.2819820
15	ZHOU Y, SHAO L. Viewpoint-aware attentive multi-view inference for vehicle re-identification[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6489-6498. DOI: 10.1109/cvpr.2018.00679
16	SHEN Y T, XIAO T, LI H S, et al. Learning deep neural networks for vehicle re-ID with visual-spatio-temporal path proposals[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1918-1927. DOI: 10.1109/iccv.2017.210
17	LIU X C, LIU W, MEI T, et al. PROVID: progressive and multimodal vehicle reidentification for large-scale urban surveillance[J]. IEEE Transactions on Multimedia, 2018, 20(3): 645-658. DOI: 10.1109/tmm.2017.2751966
18	TANG Y, WU D, JIN Z, et al. Multi-modal metric learning for vehicle re-identification in traffic surveillance environment[C]// Proceedings of the 2017 IEEE International Conference on Image Processing. Piscataway: IEEE, 2017: 2254-2258. DOI: 10.1109/icip.2017.8296683
19	ZHAO L M, LI X, ZHUANG Y T, et al. Deeply-learned part-aligned representations for person re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3239-3248. DOI: 10.1109/iccv.2017.349
20	LI D W, CHEN X T, ZHANG Z, et al. Learning deep context-aware features over body and latent parts for person re-identification[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 7398-7407. DOI: 10.1109/cvpr.2017.782
21	QIU Y M, ZHOU Y. Wavelet transform stereoscopic images enhancement algorithms based on fog and haze[J]. Computer Engineering and Applications, 2015, 51(9): 30-33. (in Chinese) DOI: 10.3778/j.issn.1002-8331.1409-0008
22	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19.
23	QIN X, WANG Z L, BAI Y C, et al. FFA-Net: feature fusion attention network for single image dehazing[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 11908-11915. DOI: 10.1609/aaai.v34i07.6865
24	HERMANS A, BEYER L, LEIBE B. In defense of the triplet loss for person re-identification[EB/OL]. (2017-11-21)[2021-02-20].
25	HOU J H, ZENG H Q, CAI L, et al. Multi-label learning with multi-label smoothing regularization for vehicle re-identification[J]. Neurocomputing, 2019, 345: 15-22. DOI: 10.1016/j.neucom.2018.11.088
26	CHU R H, SUN Y F, LI Y D, et al. Vehicle re-identification with viewpoint-aware metric learning[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8281-8290. DOI: 10.1109/iccv.2019.00837
27	LOU Y H, BAI Y, LIU J, et al. Embedding adversarial learning for vehicle re-identification[J]. IEEE Transactions on Image Processing, 2019, 28(8): 3794-3807. DOI: 10.1109/tip.2019.2902112
28	KHORRAMSHAHI P, KUMAR A, PERI N, et al. A dual-path model with adaptive attention for vehicle re-identification[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6131-6140. DOI: 10.1109/iccv.2019.00623
29	ZHU J Q, ZENG H Q, HUANG J C, et al. Vehicle re-identification using quadruple directional deep learning features[J]. IEEE Transactions on Intelligent Transportation Systems, 2020, 21(1): 410-420. DOI: 10.1109/tits.2019.2901312
