Efficient person search algorithm and optimization with Sophon SC5+ chip architecture

Journal of Computer Applications, 2023, Vol. 43, Issue (3): 744-751. DOI: 10.11772/j.issn.1001-9081.2022020252

Special topic: Artificial intelligence

Jie SUN¹, Shaoxin WU¹, Xuejun WANG², Jing HUA¹

^1.School of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou Zhejiang 310018, China
^2.Shenzhen Xinghuo Electronic Engineering Company, Shenzhen Guangdong 518001, China

Received:2022-03-04 Revised:2022-05-27 Accepted:2022-05-30 Online:2022-08-16 Published:2023-03-10
Contact: Jing HUA
About author:SUN Jie, born in 1985 in Hangzhou, Zhejiang, M. S., intermediate experimentalist, CCF member. His research interests include image and video processing, visual analysis.
WU Shaoxin, born in 1997 in Wenzhou, Zhejiang, M. S. candidate. His research interests include image and video processing, machine learning.
WANG Xuejun, born in 1969 in Chifeng, Inner Mongolia. His research interests include machine learning.
HUA Jing, born in 1974 in Hangzhou, Zhejiang, Ph. D., professor. His research interests include computer graphics, data visualization, medical image analysis.
Supported by:
National Natural Science Foundation of China(61972353)

Abstract

Traditional person search algorithms based on deep neural networks are computationally expensive and search slowly in large-scale deployments, so deploying them on terminals with limited hardware and budgets is costly and slow. To address these problems, a person detection and person re-identification algorithm based on the high-performance inference chip Sophon SC5+ is proposed, optimizing the efficiency of deep learning top-down from the algorithm to the hardware. First, the backbone network of YOLOv5s is replaced with the lightweight Ghost module, which greatly reduces the parameters and computational cost of the model. Second, the Convolutional Block Attention Module (CBAM) attention mechanism is integrated to strengthen the feature learning capability of the algorithm and improve its detection precision. Third, a center loss constraint and the Non-local attention mechanism are added to the person re-identification module, and the model is optimized with a combination of the center-constrained triplet loss and the additive-margin cross-entropy loss to improve the performance of the person re-identification algorithm. Finally, the person detection model and the person re-identification model are quantized on Sophon SC5+ to generate the final inference models. Experimental results on the Market-1501 and DukeMTMC-ReID datasets show that, compared with YOLOv4-tiny, Attribute-Complementary Re-ID Net (ACRN), Singular Vector Decomposition Net (SVDNet) and other mainstream algorithms, the mean Average Precision (mAP) of the person detection and person re-identification algorithms is improved by at least 43.8 and 25.7 percentage points, respectively. After int8 quantization on the Sophon SC5+ chip, the mAP of the proposed algorithm decreases by 1.7 percentage points, but the model size is reduced by 74.4%, so the algorithm can be deployed in large-scale, city-level person search systems.

Key words: person re-identification, person search, Ghost module, center loss, Sophon SC5+, attention mechanism
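The parameter saving that motivates swapping standard convolutions for Ghost modules can be illustrated with a short count. The sketch below follows the Ghost module description in GhostNet [8]: a primary convolution produces a fraction of the output channels and cheap depthwise operations generate the rest. The ratio `s = 2`, the 3×3 cheap operation `d = 3`, and the layer shape are assumptions chosen for illustration, not values taken from the paper.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int, s: int = 2, d: int = 3) -> int:
    """Weights of a Ghost module (bias ignored).

    A primary k x k convolution produces c_out // s intrinsic channels;
    each intrinsic channel then yields (s - 1) extra "ghost" channels
    through a cheap d x d depthwise operation.
    """
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (s - 1) * d * d
    return primary + cheap


if __name__ == "__main__":
    c_in, c_out, k = 128, 256, 3  # hypothetical layer shape
    standard = conv_params(c_in, c_out, k)
    ghost = ghost_params(c_in, c_out, k)
    print(f"standard: {standard}, ghost: {ghost}, ratio: {standard / ghost:.2f}")
```

With `s = 2` the count comes out close to half of the standard convolution, which is in line with the roughly halved GFLOPs reported for the Ghost variants in Tab. 1.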

CLC number: TP391

Jie SUN, Shaoxin WU, Xuejun WANG, Jing HUA. Efficient person search algorithm and optimization with Sophon SC5+ chip architecture[J]. Journal of Computer Applications, 2023, 43(3): 744-751.

Figures and tables (16)

Fig. 1 CBS and C3 module structure in YOLOv5

Fig. 2 Feature maps generated by traditional convolution and by the Ghost module

Fig. 3 Structure of CBAM

Fig. 4 YOLOv5-GC network structure

Fig. 5 Distance between images in feature space

Fig. 6 Structure of ReID network

Fig. 7 Quantizing model as int8_bmodel

Fig. 8 Quantization error analysis diagram
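To make the idea of int8 quantization and its error concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a generic illustration only: the Sophon toolchain's actual calibration and quantization procedure is not described in this page, and the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to int8 codes."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; 127 is the largest int8 magnitude used.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


def max_abs_error(values):
    """Largest absolute difference between a value and its round trip."""
    codes, scale = quantize_int8(values)
    restored = dequantize(codes, scale)
    return max(abs(v - r) for v, r in zip(values, restored))


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.02, 1.0]  # hypothetical weights
    codes, scale = quantize_int8(weights)
    print(codes, scale, max_abs_error(weights))
```

For values inside the clipping range, the round-trip error is bounded by half the scale, which is why tensors with a few large outliers tend to lose more precision under this simple scheme.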

Tab. 1 Influence of each module on person detection algorithm

Algorithm	Computation/GFLOPs	mAP/%	Inference time/ms	Model size/MB
YOLOv5s	15.8	82.1	5.3	14.4
YOLOv5s-Ghost	7.9	80.2	4.8	10.5
YOLOv5s-CBAM	16.0	82.9	5.4	14.6
YOLOv5s-GC	8.0	81.3	4.9	10.7
YOLOv5l	107.9	84.3	15.9	92.8
YOLOv5l-Ghost	42.3	82.4	12.4	49.1
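The trade-off in this table is easier to read as relative changes. The short script below recomputes them for YOLOv5s versus YOLOv5s-GC from the reported values; the helper name `relative_reduction` and the dictionary layout are illustrative choices.

```python
# Values copied from the table: (GFLOPs, mAP %, inference time ms, model size MB)
TABLE_1 = {
    "YOLOv5s": (15.8, 82.1, 5.3, 14.4),
    "YOLOv5s-GC": (8.0, 81.3, 4.9, 10.7),
}


def relative_reduction(before: float, after: float) -> float:
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (before - after) / before


base = TABLE_1["YOLOv5s"]
gc = TABLE_1["YOLOv5s-GC"]

flops_cut = relative_reduction(base[0], gc[0])
time_cut = relative_reduction(base[2], gc[2])
size_cut = relative_reduction(base[3], gc[3])
map_drop = base[1] - gc[1]  # difference in percentage points

if __name__ == "__main__":
    print(f"GFLOPs reduced by {flops_cut:.1f}%")
    print(f"inference time reduced by {time_cut:.1f}%")
    print(f"model size reduced by {size_cut:.1f}%")
    print(f"mAP lower by {map_drop:.1f} percentage points")
```

On these numbers, YOLOv5s-GC roughly halves the computation and cuts the model size by about a quarter at a cost of under one percentage point of mAP.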

Fig. 9 Schematic diagram of detection results

Tab. 2 Comparison of experimental results between YOLOv5s-GC and other YOLO algorithms

Algorithm	Model size/MB	mAP/%	Frame rate/(frame·s^-1)
YOLOv3 [2]	246.3	62.8	58
YOLOv4 [19]	256.0	70.3	50
YOLOv4-CSP [20]	210.2	67.4	53
YOLOv3-tiny [2]	34.7	37.5	267
YOLOv4-tiny [21]	23.5	45.6	301
YOLOv5s	14.4	82.1	189
YOLOv5s-GC	10.7	81.3	204

Tab. 3 Results of ablation experiment (%)

Algorithm	Market-1501 mAP	Market-1501 Rank-1	DukeMTMC-ReID mAP	DukeMTMC-ReID Rank-1
Baseline	85.9	94.5	76.4	86.4
Baseline+$L_{AMS}$	86.3	94.6	76.8	86.7
Baseline+$L_{triplet\_center}$	86.1	94.5	76.7	86.6
Baseline+$L$	86.5	94.7	77.0	87.0
Baseline+Non-local	87.0	94.9	77.1	87.2
ReID	87.4	95.1	77.6	87.9
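The additive-margin term in this ablation is read here as the additive margin softmax loss of [11]. A minimal single-sample sketch follows; the scale `s = 30` and margin `m = 0.35` are assumed values, not settings reported on this page, and the function name is hypothetical.

```python
import math


def am_softmax_loss(cosines, target, s=30.0, m=0.35):
    """Additive-margin softmax loss for one sample.

    cosines: cosine similarity between the feature and each class weight.
    target:  index of the ground-truth class.
    s, m:    scale and margin (assumed values).
    """
    # The margin is subtracted only from the ground-truth class cosine.
    logits = [s * (c - m) if j == target else s * c
              for j, c in enumerate(cosines)]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    cosines = [0.8, 0.1, -0.2]  # hypothetical similarities, class 0 is correct
    print(am_softmax_loss(cosines, target=0))
    print(am_softmax_loss(cosines, target=0, m=0.0))  # plain scaled softmax
```

Because the margin lowers only the target logit, the loss stays larger than the plain softmax loss for the same features, which pushes the network toward a wider gap between classes.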

Tab. 4 Comparison of experimental results of different person re-identification algorithms (%)

Algorithm	Market-1501 mAP	Market-1501 Rank-1	DukeMTMC-ReID mAP	DukeMTMC-ReID Rank-1
ACRN	62.6	83.6	51.9	72.6
SVDNet	62.1	82.3	56.8	76.7
MMT-500	71.2	87.7	53.4	73.0
GPR	71.5	88.1	65.2	79.5
ConsAtt	84.7	96.1	73.1	86.3
PCB-RPP	81.6	93.8	69.2	83.3
PPS	85.3	94.3	75.9	88.2
BagTricks	85.9	94.5	76.4	86.4
Proposed algorithm	87.4	95.1	77.6	87.9
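For readers unfamiliar with the two metrics in this table, the sketch below shows how Rank-1 and mAP are commonly computed from ranked retrieval results. It is a simplified illustration: each query is assumed to be a list of booleans marking whether each gallery image, in ranked order, shares the query identity, and dataset-specific rules such as excluding same-camera matches are omitted.

```python
def rank1(ranked_matches):
    """1.0 if the top-ranked gallery image shares the query identity."""
    return 1.0 if ranked_matches and ranked_matches[0] else 0.0


def average_precision(ranked_matches):
    """Mean of the precision values measured at each correct match."""
    hits = 0
    precisions = []
    for position, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / position)
    return sum(precisions) / len(precisions) if precisions else 0.0


def evaluate(queries):
    """Return (mAP, Rank-1), each averaged over all queries."""
    n = len(queries)
    mean_ap = sum(average_precision(q) for q in queries) / n
    mean_rank1 = sum(rank1(q) for q in queries) / n
    return mean_ap, mean_rank1


if __name__ == "__main__":
    # Two hypothetical queries with ranked match flags.
    queries = [[True, False, True], [False, True]]
    print(evaluate(queries))
```

Rank-1 only looks at the first result, while mAP rewards placing every correct match near the top, so the two can move differently between algorithms.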

Tab. 5 Performance comparison of different parameters (%)

$α$	$β$	Market-1501 mAP	Market-1501 Rank-1	DukeMTMC-ReID mAP	DukeMTMC-ReID Rank-1
10	30	86.1	94.6	76.3	87.1
30	50	87.1	94.9	77.0	87.4
30	70	87.4	95.1	77.6	87.9
50	70	87.2	95.2	77.2	87.8
30	100	85.9	94.3	75.8	86.8
50	100	86.0	94.7	76.1	87.0
70	100	86.4	94.6	76.7	87.3

Tab. 6 Comparison of results of different algorithms after quantization

Module	Algorithm	mAP/% (fp32)	mAP/% (int8)	Model size/MB (2080Ti)	Model size/MB (fp32)	Model size/MB (int8)	Inference time/ms (2080Ti)	Inference time/ms (fp32)	Inference time/ms (int8)
Person detection	YOLOv5s	82.1	80.5	28.3	28.8	7.7	5.3	4.5	3.6
Person detection	YOLOv5s-GC	81.3	79.3	22.2	23.0	6.2	4.9	4.1	3.2
Person re-identification	MMT	71.2	70.4	94.4	94.9	24.3	9.4	8.1	4.6
Person re-identification	MEB	72.7	71.0	28.9	29.8	8.7	14.3	6.9	4.1
Person re-identification	Fast-ReID	88.6	86.9	102.6	103.1	56.9	13.7	11.2	10.7
Person re-identification	MLCReID	45.5	43.1	102.7	103.6	24.3	13.3	10.6	5.9
Person re-identification	ReID	87.4	85.7	100.7	94.9	24.3	8.6	7.3	4.1
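The quantization figures quoted in the abstract can be checked against the ReID row of this table. The script below does so, assuming the column order is mAP (fp32, int8), model size (2080Ti, fp32, int8) and inference time (2080Ti, fp32, int8); the variable names are illustrative.

```python
# ReID row of the table, fp32 and int8 columns only.
REID_FP32 = {"map": 87.4, "size_mb": 94.9, "time_ms": 7.3}
REID_INT8 = {"map": 85.7, "size_mb": 24.3, "time_ms": 4.1}

# mAP change in percentage points.
map_drop = REID_FP32["map"] - REID_INT8["map"]

# Model size reduction as a percentage of the fp32 size.
size_cut = 100.0 * (REID_FP32["size_mb"] - REID_INT8["size_mb"]) / REID_FP32["size_mb"]

# How many times faster int8 inference is than fp32 inference.
speedup = REID_FP32["time_ms"] / REID_INT8["time_ms"]

if __name__ == "__main__":
    print(f"mAP drop: {map_drop:.1f} percentage points")
    print(f"model size reduction: {size_cut:.1f}%")
    print(f"inference speedup: {speedup:.2f}x")
```

Under that reading, the ReID model loses 1.7 percentage points of mAP and shrinks by 74.4% after int8 quantization, consistent with the abstract.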

Fig. 10 Schematic diagram of person search results

References (29)

1	LUO H, JIANG W, FAN X, et al. A survey on deep learning based person re-identification[J]. Acta Automatica Sinica, 2019, 45(11): 2032-2049. DOI: 10.16383/j.aas.c180154
2	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. [2022-01-10]. DOI: 10.1109/cvpr.2017.690
3	WANG C Y, YEH I H, LIAO H Y M. You only learn one representation: unified network for multiple tasks[EB/OL]. (2021-05-10)[2022-09-29]. DOI: 10.48550/arXiv.2105.04206
4	GE Z, LIU S T, WANG F, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. [2021-07-11].
5	ZENG K W, NING M N, WANG Y H, et al. Hierarchical clustering with hard-batch triplet loss for person re-identification[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 13654-13662. DOI: 10.1109/cvpr42600.2020.01367
6	HE S T, LUO H, WANG P C, et al. TransReID: transformer-based object re-identification[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 14993-15002. DOI: 10.1109/iccv48922.2021.01474
7	ZHANG G Q, CHEN Y H, LIN W S, et al. Low resolution information also matters: learning multi-resolution representations for person re-identification[C]// Proceedings of the 30th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2021: 1295-1301. DOI: 10.24963/ijcai.2021/179
8	HAN K, WANG Y H, TIAN Q, et al. GhostNet: more features from cheap operations[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1577-1586. DOI: 10.1109/cvpr42600.2020.00165
9	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision. Cham: Springer, 2018: 3-19. DOI: 10.1007/978-3-030-01234-2_1
10	LAYNE R, HOSPEDALES T M, GONG S G. Person re-identification by attributes[C]// Proceedings of the 2012 British Machine Vision Conference. Durham: BMVA Press, 2012: No.24. DOI: 10.5244/c.26.24
11	WANG F, CHENG J, LIU W Y, et al. Additive margin softmax for face verification[J]. IEEE Signal Processing Letters, 2018, 25(7): 926-930. DOI: 10.1109/lsp.2018.2822810
12	SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 815-823. DOI: 10.1109/cvpr.2015.7298682
13	WEN Y D, ZHANG K P, LI Z F, et al. A discriminative feature learning approach for deep face recognition[C]// Proceedings of the 2016 European Conference on Computer Vision. Cham: Springer, 2016: 499-515. DOI: 10.1007/978-3-319-46478-7_31
14	LUO H, JIANG W, GU Y Z, et al. A strong baseline and batch normalization neck for deep person re-identification[J]. IEEE Transactions on Multimedia, 2020, 22(10): 2597-2609. DOI: 10.1109/tmm.2019.2958756
15	WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7794-7803. DOI: 10.1109/cvpr.2018.00813
16	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision. Cham: Springer, 2014: 740-755. DOI: 10.1007/978-3-319-10602-1_48
17	ROTH P M, HIRZER M, KÖSTINGER M, et al. Mahalanobis distance learning for person re-identification[M]// GONG S G, CRISTANI M, YAN S C, et al. Person Re-Identification, ACVPR. London: Springer, 2014: 247-267. DOI: 10.1007/978-1-4471-6296-4_12
18	ZHENG Z D, ZHENG L, YANG Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3774-3782. DOI: 10.1109/iccv.2017.405
19	BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23)[2022-02-20].
20	WANG C Y, BOCHKOVSKIY A, LIAO H Y M. Scaled-YOLOv4: scaling cross stage partial network[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13024-13033. DOI: 10.1109/cvpr46437.2021.01283
21	JIANG Z C, ZHAO L Q, LI S Y, et al. Real-time object detection method based on improved YOLOv4-tiny[EB/OL]. [2021-09-23].
22	SCHUMANN A, STIEFELHAGEN R. Person re-identification by deep learning attribute-complementary information[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2017: 1435-1443. DOI: 10.1109/cvprw.2017.186
23	SUN Y F, ZHENG L, DENG W J, et al. SVDNet for pedestrian retrieval[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3820-3828. DOI: 10.1109/iccv.2017.410
24	GE Y X, CHEN D P, LI H S. Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification[EB/OL]. [2022-01-17].
25	LUO C C, SONG C F, ZHANG Z X. Generalizing person re-identification by camera-aware invariance learning and cross-domain mixup[C]// Proceedings of the 2020 European Conference on Computer Vision. Cham: Springer, 2020: 224-241. DOI: 10.1007/978-3-030-58555-6_14
26	ZHOU S, WANG F, HUANG Z, et al. Discriminative feature learning with consistent attention regularization for person re-identification[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8040-8049. DOI: 10.1109/iccv.2019.00813
27	SUN Y F, ZHENG L, YANG Y, et al. Beyond part models: person retrieval with refined part pooling (and a strong convolutional baseline)[C]// Proceedings of the 2018 European Conference on Computer Vision. Cham: Springer, 2018: 501-518. DOI: 10.1007/978-3-030-01225-0_30
28	SHEN Y H, JI R R, HONG X P, et al. A part power set model for scale-free person retrieval[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2019: 3397-3403. DOI: 10.24963/ijcai.2019/471
29	LUO H. Study on person re-identification based on deep learning - from non-occlusion to occlusion[D]. Hangzhou: Zhejiang University, 2020: 45-66. DOI: 10.1109/cac51589.2020.9326599
