Efficient person search algorithm and optimization with Sophon SC5+ chip architecture

Journal of Computer Applications, 2023, Vol. 43, Issue (3): 744-751. DOI: 10.11772/j.issn.1001-9081.2022020252

Special Issue: Artificial Intelligence

• Artificial intelligence •

Jie SUN¹, Shaoxin WU¹, Xuejun WANG², Jing HUA¹

¹ School of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou Zhejiang 310018, China
² Shenzhen Xinghuo Electronic Engineering Company, Shenzhen Guangdong 518001, China

Received: 2022-03-04; Revised: 2022-05-27; Accepted: 2022-05-30; Online: 2022-08-16; Published: 2023-03-10
Contact: Jing HUA
About the authors: SUN Jie (孙杰), born in 1985 in Hangzhou, Zhejiang, M. S., intermediate experimentalist, CCF member. His research interests include image and video processing and visual analysis.
WU Shaoxin (吴绍鑫), born in 1997 in Wenzhou, Zhejiang, M. S. candidate. His research interests include image and video processing and machine learning.
WANG Xuejun (王学军), born in 1969 in Chifeng, Inner Mongolia. His research interests include machine learning.
HUA Jing (华璟), born in 1974 in Hangzhou, Zhejiang, Ph. D., professor. His research interests include computer graphics, data visualization and medical image analysis.
Supported by: National Natural Science Foundation of China (61972353)

Original title (Chinese): 基于Sophon SC5+芯片构架的行人搜索算法与优化

Abstract:

Traditional person search algorithms based on deep neural networks are computationally expensive, which makes them costly and slow to deploy on devices with limited hardware resources and budgets. To address these problems, a person detection and person re-identification algorithm based on the high-performance inference chip Sophon SC5+ was proposed to optimize the efficiency of deep learning top-down, from the algorithm to the hardware. Firstly, the backbone network of YOLOv5s was replaced with lightweight Ghost modules, which greatly reduced the parameters and computational cost of the model. Secondly, the Convolutional Block Attention Module (CBAM) was integrated to enhance the feature learning capability and improve the detection precision of the algorithm. Thirdly, a center loss constraint and the Non-local attention mechanism were added to the person re-identification module, and the center-constrained triplet loss and the additive margin cross-entropy loss were combined to optimize the model and improve the performance of the person re-identification algorithm. Finally, the person detection model and the person re-identification model were quantized on Sophon SC5+ and the final inference model was generated. Experimental results on the Market-1501 and DukeMTMC-ReID datasets show that the mean Average Precisions (mAPs) of the person detection and person re-identification algorithms are improved by at least 43.8 and 25.7 percentage points, respectively, compared with YOLOv4-tiny, Attribute-Complementary Re-ID Net (ACRN), Singular Vector Decomposition Net (SVDNet) and other mainstream algorithms. After int8 quantization on the Sophon SC5+ chip, the mAP of the proposed algorithm decreases by 1.7 percentage points, while the model size is reduced by 74.4%. The proposed algorithm can therefore be used in large-scale, city-level person search systems.
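
To make the int8 quantization step mentioned in the abstract concrete, the following is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is a generic illustration and not the Sophon SC5+ toolchain: the function names `quantize_int8` and `dequantize`, the max-abs calibration rule and the sample weights are all assumptions made for this example.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # Assumption: the scale is calibrated from the largest magnitude (max-abs rule).
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]  # hypothetical fp32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Storing each weight in one byte instead of four caps the achievable size reduction at 75%, which is in line with the 74.4% reported above; the rounding error per weight is at most half a quantization step, which is one plausible source of the small mAP loss.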

Key words: person re-identification, person search, Ghost module, center loss, Sophon SC5+, attention mechanism

CLC Number: TP391

Jie SUN, Shaoxin WU, Xuejun WANG, Jing HUA. Efficient person search algorithm and optimization with Sophon SC5+ chip architecture[J]. Journal of Computer Applications, 2023, 43(3): 744-751.

Figures/Tables 16

Fig. 1 CBS and C3 module structure in YOLOv5

Fig. 2 Traditional convolution to generate feature maps and Ghost module to generate feature maps

Fig. 3 Structure of CBAM

Fig. 4 YOLOv5-GC network structure

Fig. 5 Distance of image in feature space

Fig. 6 Structure of ReID network

Fig. 7 Quantizing model as int8_bmodel

Fig. 8 Quantitative error analysis diagram

Tab. 1 Influence of each module on person detection algorithm

Algorithm	Computation/GFLOPs	mAP/%	Inference time/ms	Model size/MB
YOLOv5s	15.8	82.1	5.3	14.4
YOLOv5s-Ghost	7.9	80.2	4.8	10.5
YOLOv5s-CBAM	16.0	82.9	5.4	14.6
YOLOv5s-GC	8.0	81.3	4.9	10.7
YOLOv5l	107.9	84.3	15.9	92.8
YOLOv5l-Ghost	42.3	82.4	12.4	49.1
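
The reduction in computation shown in Tab. 1 can be related to the structure of the Ghost module with a short back-of-the-envelope sketch. It assumes the standard GhostNet formulation (a primary convolution that produces 1/s of the output maps, followed by cheap d x d depthwise operations for the rest); the layer shape, the helper names `conv_params` and `ghost_params`, and the choice s = 2, d = 3 are hypothetical and are not taken from the paper's network definition.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int, s: int = 2, d: int = 3) -> int:
    """Weights in a Ghost module: a primary convolution producing c_out / s
    intrinsic maps, plus (s - 1) cheap d x d depthwise operations per map."""
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k
    cheap = (s - 1) * intrinsic * d * d
    return primary + cheap


# Hypothetical layer shape, chosen only for illustration.
c_in, c_out, k = 128, 256, 3
standard = conv_params(c_in, c_out, k)
ghost = ghost_params(c_in, c_out, k)
theoretical_ratio = standard / ghost

# Measured values from Tab. 1 (whole networks, not single layers).
gflops_ratio = 15.8 / 7.9  # YOLOv5s vs. YOLOv5s-Ghost
size_ratio = 14.4 / 10.5   # model size in MB
print(standard, ghost, round(theoretical_ratio, 2), round(gflops_ratio, 2), round(size_ratio, 2))
```

For a single layer the saving approaches a factor of s; the measured whole-network ratios differ because only part of the network is replaced and the two quantities (weights and GFLOPs) are not the same measure.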

Fig. 9 Schematic diagram of detection results

Tab. 2 Comparison of experimental results between YOLOv5s-GC and other YOLO algorithms

Algorithm	Model size/MB	mAP/%	Frame rate/(frame·s^-1)
YOLOv3 [2]	246.3	62.8	58
YOLOv4 [19]	256.0	70.3	50
YOLOv4-CSP [20]	210.2	67.4	53
YOLOv3-tiny [2]	34.7	37.5	267
YOLOv4-tiny [21]	23.5	45.6	301
YOLOv5s	14.4	82.1	189
YOLOv5s-GC	10.7	81.3	204

Tab. 3 Results of ablation experiment

Algorithm	Market-1501 mAP	Market-1501 Rank-1	DukeMTMC-ReID mAP	DukeMTMC-ReID Rank-1
Baseline	85.9	94.5	76.4	86.4
Baseline+$L_{AMS}$	86.3	94.6	76.8	86.7
Baseline+$L_{triplet\_center}$	86.1	94.5	76.7	86.6
Baseline+$L$	86.5	94.7	77.0	87.0
Baseline+Non-local	87.0	94.9	77.1	87.2
ReID	87.4	95.1	77.6	87.9
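
The additive-margin loss term in the ablation above can be illustrated with a small sketch. It assumes that this term follows the additive margin softmax of reference 11, computed on cosine similarities between a normalized feature and normalized class weights; the scale `s`, the margin `m`, the function name and the sample similarities are illustrative choices, not the settings used in the paper.

```python
import math
from typing import List


def am_softmax_loss(cosines: List[float], target: int, s: float = 30.0, m: float = 0.35) -> float:
    """Additive margin softmax loss for one sample.

    cosines[j] is the cosine similarity between the normalized feature and
    the normalized weight vector of class j.
    """
    # The margin m is subtracted only from the target-class similarity.
    logits = [s * (c - m) if j == target else s * c for j, c in enumerate(cosines)]
    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


cosines = [0.8, 0.1, -0.2]  # hypothetical similarities to three identities
with_margin = am_softmax_loss(cosines, target=0)
without_margin = am_softmax_loss(cosines, target=0, m=0.0)
print(with_margin, without_margin)
```

With `m = 0` the function reduces to ordinary softmax cross-entropy on scaled cosines; a positive margin raises the loss for the same sample, so the network has to separate identities by a wider angular gap before the loss vanishes.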

Tab. 4 Comparison of experimental results of different person re-identification algorithms

Algorithm	Market-1501 mAP	Market-1501 Rank-1	DukeMTMC-ReID mAP	DukeMTMC-ReID Rank-1
ACRN	62.6	83.6	51.9	72.6
SVDNet	62.1	82.3	56.8	76.7
MMT-500	71.2	87.7	53.4	73.0
GPR	71.5	88.1	65.2	79.5
ConsAtt	84.7	96.1	73.1	86.3
PCB-RPP	81.6	93.8	69.2	83.3
PPS	85.3	94.3	75.9	88.2
BagTricks	85.9	94.5	76.4	86.4
Proposed algorithm	87.4	95.1	77.6	87.9
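
Since the re-identification tables report mAP and Rank-1, a brief sketch of how these two metrics are conventionally computed from ranked retrieval results may help when reading them. This is a simplified illustration: it ignores the same-camera and junk-image filtering of the benchmark protocols, and the function names and sample rankings are hypothetical.

```python
from typing import List


def rank1(ranked_matches: List[bool]) -> float:
    """1.0 if the top-ranked gallery image shares the query identity."""
    return 1.0 if ranked_matches and ranked_matches[0] else 0.0


def average_precision(ranked_matches: List[bool]) -> float:
    """Mean of the precision values at each rank where a correct match occurs."""
    hits = 0
    precisions = []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical ranked gallery results for two queries (True = same identity).
queries = [
    [True, False, True, False],
    [False, True, False, False],
]
mean_ap = sum(average_precision(q) for q in queries) / len(queries)
rank1_rate = sum(rank1(q) for q in queries) / len(queries)
print(mean_ap, rank1_rate)
```

Rank-1 only looks at the first retrieved image, whereas mAP rewards placing every correct match early in the ranking, which is why the two columns can move by different amounts between algorithms.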

Tab. 5 Performance comparison of different parameters

$α$	$β$	Market-1501 mAP	Market-1501 Rank-1	DukeMTMC-ReID mAP	DukeMTMC-ReID Rank-1
10	30	86.1	94.6	76.3	87.1
30	50	87.1	94.9	77.0	87.4
30	70	87.4	95.1	77.6	87.9
50	70	87.2	95.2	77.2	87.8
30	100	85.9	94.3	75.8	86.8
50	100	86.0	94.7	76.1	87.0
70	100	86.4	94.6	76.7	87.3

Tab. 6 Comparison of results after quantization by different algorithms

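Module	Algorithm	mAP/% (fp32)	mAP/% (int8)	Model size/MB (2080Ti)	Model size/MB (fp32)	Model size/MB (int8)	Inference time/ms (2080Ti)	Inference time/ms (fp32)	Inference time/ms (int8)
Person detection	YOLOv5s	82.1	80.5	28.3	28.8	7.7	5.3	4.5	3.6
Person detection	YOLOv5s-GC	81.3	79.3	22.2	23.0	6.2	4.9	4.1	3.2
Person re-identification	MMT	71.2	70.4	94.4	94.9	24.3	9.4	8.1	4.6
Person re-identification	MEB	72.7	71.0	28.9	29.8	8.7	14.3	6.9	4.1
Person re-identification	Fast-ReID	88.6	86.9	102.6	103.1	56.9	13.7	11.2	10.7
Person re-identification	MLCReID	45.5	43.1	102.7	103.6	24.3	13.3	10.6	5.9
Person re-identification	ReID	87.4	85.7	100.7	94.9	24.3	8.6	7.3	4.1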
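
The accuracy and size trade-off of quantization can be read directly off Tab. 6. The short calculation below, using only the fp32 and int8 values from the table for the two proposed models, shows how the mAP drop (in percentage points) and the relative model size reduction are obtained; the variable names are illustrative.

```python
# (algorithm, mAP fp32, mAP int8, size fp32 in MB, size int8 in MB), from Tab. 6
rows = [
    ("YOLOv5s-GC", 81.3, 79.3, 23.0, 6.2),
    ("ReID", 87.4, 85.7, 94.9, 24.3),
]

summary = {}
for name, map_fp32, map_int8, size_fp32, size_int8 in rows:
    map_drop = round(map_fp32 - map_int8, 1)  # percentage points
    size_cut = round(100.0 * (size_fp32 - size_int8) / size_fp32, 1)  # percent
    summary[name] = (map_drop, size_cut)
    print(f"{name}: mAP -{map_drop} points, size -{size_cut}%")
```

The ReID row reproduces the 1.7 percentage point mAP decrease and the 74.4% size reduction quoted in the abstract.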

Fig. 10 Schematic diagram of person search results

References 29

1	LUO H, JIANG W, FAN X, et al. A survey on deep learning based person re-identification[J]. Acta Automatica Sinica, 2019, 45(11): 2032-2049. DOI: 10.16383/j.aas.c180154 (in Chinese)
2	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. [2022-01-10]. DOI: 10.1109/cvpr.2017.690
3	WANG C Y, YEH I H, LIAO H Y M. You only learn one representation: unified network for multiple tasks[EB/OL]. (2021-05-10)[2022-09-29]. DOI: 10.48550/arXiv.2105.04206
4	GE Z, LIU S T, WANG F, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. [2021-07-11].
5	ZENG K W, NING M N, WANG Y H, et al. Hierarchical clustering with hard-batch triplet loss for person re-identification[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 13654-13662. DOI: 10.1109/cvpr42600.2020.01367
6	HE S T, LUO H, WANG P C, et al. TransReID: transformer-based object re-identification[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 14993-15002. DOI: 10.1109/iccv48922.2021.01474
7	ZHANG G Q, CHEN Y H, LIN W S, et al. Low resolution information also matters: learning multi-resolution representations for person re-identification[C]// Proceedings of the 30th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2021: 1295-1301. DOI: 10.24963/ijcai.2021/179
8	HAN K, WANG Y H, TIAN Q, et al. GhostNet: more features from cheap operations[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1577-1586. DOI: 10.1109/cvpr42600.2020.00165
9	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision. Cham: Springer, 2018: 3-19. DOI: 10.1007/978-3-030-01234-2_1
10	LAYNE R, HOSPEDALES T M, GONG S G. Person re-identification by attributes[C]// Proceedings of the 2012 British Machine Vision Conference. Durham: BMVA Press, 2012: No.24. DOI: 10.5244/c.26.24
11	WANG F, CHENG J, LIU W Y, et al. Additive margin softmax for face verification[J]. IEEE Signal Processing Letters, 2018, 25(7): 926-930. DOI: 10.1109/lsp.2018.2822810
12	SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 815-823. DOI: 10.1109/cvpr.2015.7298682
13	WEN Y D, ZHANG K P, LI Z F, et al. A discriminative feature learning approach for deep face recognition[C]// Proceedings of the 2016 European Conference on Computer Vision. Cham: Springer, 2016: 499-515. DOI: 10.1007/978-3-319-46478-7_31
14	LUO H, JIANG W, GU Y Z, et al. A strong baseline and batch normalization neck for deep person re-identification[J]. IEEE Transactions on Multimedia, 2020, 22(10): 2597-2609. DOI: 10.1109/tmm.2019.2958756
15	WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7794-7803. DOI: 10.1109/cvpr.2018.00813
16	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision. Cham: Springer, 2014: 740-755. DOI: 10.1007/978-3-319-10602-1_48
17	ROTH P M, HIRZER M, KÖSTINGER M, et al. Mahalanobis distance learning for person re-identification[M]// GONG S G, CRISTANI M, YAN S C, et al. Person Re-Identification, ACVPR. London: Springer, 2014: 247-267. DOI: 10.1007/978-1-4471-6296-4_12
18	ZHENG Z D, ZHENG L, YANG Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3774-3782. DOI: 10.1109/iccv.2017.405
19	BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23)[2022-02-20].
20	WANG C Y, BOCHKOVSKIY A, LIAO H Y M. Scaled-YOLOv4: Scaling cross stage partial network[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13024-13033. DOI: 10.1109/cvpr46437.2021.01283
21	JIANG Z C, ZHAO L Q, LI S Y, et al. Real-time object detection method based on improved YOLOv4-tiny[EB/OL]. [2021-09-23].
22	SCHUMANN A, STIEFELHAGEN R. Person re-identification by deep learning attribute-complementary information[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2017: 1435-1443. DOI: 10.1109/cvprw.2017.186
23	SUN Y F, ZHENG L, DENG W J, et al. SVDNet for pedestrian retrieval[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3820-3828. DOI: 10.1109/iccv.2017.410
24	GE Y X, CHEN D P, LI H S. Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification[EB/OL]. [2022-01-17].
25	LUO C C, SONG C F, ZHANG Z X. Generalizing person re-identification by camera-aware invariance learning and cross-domain mixup[C]// Proceedings of the 2020 European Conference on Computer Vision. Cham: Springer, 2020: 224-241. DOI: 10.1007/978-3-030-58555-6_14
26	ZHOU S, WANG F, HUANG Z, et al. Discriminative feature learning with consistent attention regularization for person re-identification[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8040-8049. DOI: 10.1109/iccv.2019.00813
27	SUN Y F, ZHENG L, YANG Y, et al. Beyond part models: person retrieval with refined part pooling (and a strong convolutional baseline)[C]// Proceedings of the 2018 European Conference on Computer Vision. Cham: Springer, 2018: 501-518. DOI: 10.1007/978-3-030-01225-0_30
28	SHEN Y H, JI R R, HONG X P, et al. A part power set model for scale-free person retrieval[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2019: 3397-3403. DOI: 10.24963/ijcai.2019/471
29	LUO H. Study on person re-identification based on deep learning - from non-occlusion to occlusion[D]. Hangzhou: Zhejiang University, 2020: 45-66. DOI: 10.1109/cac51589.2020.9326599 (in Chinese)
