Journal of Computer Applications, 2023, Vol. 43, Issue (3): 744-751. DOI: 10.11772/j.issn.1001-9081.2022020252
Special topic: Artificial Intelligence
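Received: 2022-03-04
Revised: 2022-05-27
Accepted: 2022-05-30
Online: 2022-08-16
Published: 2023-03-10
Corresponding author: Jing HUA
About the author: SUN Jie (born 1985), male, from Hangzhou, Zhejiang; intermediate experimentalist, M. S., CCF member. His research interests include image and video processing and visual analysis.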
Jie SUN¹, Shaoxin WU¹, Xuejun WANG², Jing HUA¹
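Abstract:
Traditional person search algorithms based on deep neural networks are computationally expensive and deliver low search performance in large-scale deployments, so applying them on terminals with limited hardware and budgets is costly and slow. To address these problems, a person detection and re-identification algorithm based on the Sophon SC5+ high-performance inference chip was proposed, optimizing deep learning efficiency from the top down, from algorithm to hardware. Firstly, the backbone network of YOLOv5s was replaced with lightweight Ghost modules, which greatly reduced the model's parameters and computation. Secondly, the CBAM (Convolutional Block Attention Module) attention mechanism was integrated to strengthen feature learning and improve detection accuracy. Thirdly, a center loss constraint and the Non-local attention mechanism were added to the person re-identification module, and the model was optimized by combining a center-constrained triplet loss with an additive-margin cross-entropy loss to improve re-identification performance. Finally, the person detection model and the person re-identification model were quantized on Sophon SC5+ to generate the final inference models. Experimental results on the Market-1501 and DukeMTMC-ReID datasets show that, compared with mainstream algorithms such as YOLOv4-tiny, ACRN and SVDNet, the mean Average Precision (mAP) of the person detection algorithm and of the person re-identification algorithm is improved by at least 43.8 and 25.7 percentage points respectively. After int8 quantization on the Sophon SC5+ chip, the mAP of the proposed algorithm decreases by 1.7 percentage points, but the model size is reduced by 74.4%, so the algorithm can be deployed in large-scale, city-level person search systems.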
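The additive-margin cross-entropy loss named in the abstract can be illustrated with the additive margin softmax formulation of reference 11. The following is a minimal single-sample sketch in plain Python, not the paper's training code; the function name `am_softmax_loss` and the default `scale` and `margin` values are assumptions chosen for illustration.

```python
import math

def am_softmax_loss(cosines, target, scale=30.0, margin=0.35):
    """Additive-margin softmax loss for one sample.

    cosines: cosine similarity between the normalized feature and each
             normalized class weight vector.
    target:  index of the true class.
    scale and margin are illustrative defaults, not values from the paper.
    """
    # Subtract the margin from the true-class cosine only, then rescale.
    logits = [scale * (c - margin) if j == target else scale * c
              for j, c in enumerate(cosines)]
    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]

cosines = [0.8, 0.1, -0.2]
plain = am_softmax_loss(cosines, target=0, margin=0.0)
with_margin = am_softmax_loss(cosines, target=0, margin=0.35)
```

With `margin=0.0` the function reduces to ordinary cross-entropy on scaled cosines. A positive margin lowers the true-class logit, so the same sample incurs a larger loss and the model is pushed toward a wider gap between identities.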
Cite this article: Jie SUN, Shaoxin WU, Xuejun WANG, Jing HUA. Efficient person search algorithm and optimization with Sophon SC5+ chip architecture[J]. Journal of Computer Applications, 2023, 43(3): 744-751.
Algorithm | Floating-point operations/GFLOPs | mAP/% | Inference time/ms | Model size/MB |
---|---|---|---|---|
YOLOv5s | 15.8 | 82.1 | 5.3 | 14.4 |
YOLOv5s-Ghost | 7.9 | 80.2 | 4.8 | 10.5 |
YOLOv5s-CBAM | 16.0 | 82.9 | 5.4 | 14.6 |
YOLOv5s-GC | 8.0 | 81.3 | 4.9 | 10.7 |
YOLOv5l | 107.9 | 84.3 | 15.9 | 92.8 |
YOLOv5l-Ghost | 42.3 | 82.4 | 12.4 | 49.1 |
Tab. 1 Influence of each module on person detection algorithm
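In Tab. 1, swapping in Ghost modules roughly halves the floating-point operations of YOLOv5s (15.8 to 7.9 GFLOPs). The sketch below shows why a Ghost module is cheaper, using the weight-count reasoning from the GhostNet paper (reference 8). The layer sizes and the ratio `s=2` are hypothetical values for illustration; the configuration actually used in the paper is not given on this page.

```python
def conv_params(c_in, c_out, k):
    """Weights in an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, s=2, d=3):
    """Weights in a Ghost module producing c_out channels.

    A primary k x k convolution produces c_out / s intrinsic feature maps;
    cheap d x d depthwise operations derive the other (s - 1) * c_out / s.
    Assumes c_out is divisible by s.
    """
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k
    cheap = (s - 1) * intrinsic * d * d
    return primary + cheap

# Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
ratio = conv_params(128, 256, 3) / ghost_params(128, 256, 3)
```

For this layer the ratio comes out just under 2, that is, close to `s`, because the depthwise part adds very few weights compared with the primary convolution.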
Algorithm | Model size/MB | mAP/% | Frame rate/(frame·s⁻¹) |
---|---|---|---|
YOLOv3 | 246.3 | 62.8 | 58 |
YOLOv4 | 256.0 | 70.3 | 50 |
YOLOv4-CSP | 210.2 | 67.4 | 53 |
YOLOv3-tiny | 34.7 | 37.5 | 267 |
YOLOv4-tiny | 23.5 | 45.6 | 301 |
YOLOv5s | 14.4 | 82.1 | 189 |
YOLOv5s-GC | 10.7 | 81.3 | 204 |
Tab. 2 Comparison of experimental results between YOLOv5s-GC and other YOLO algorithms
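The frame rates in Tab. 2 can be cross-checked against the inference times in Tab. 1 by converting frames per second into milliseconds per frame. Whether the paper derived one figure from the other is an assumption; the sketch only shows that the two tables agree for YOLOv5s and YOLOv5s-GC.

```python
def latency_ms(fps):
    """Average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / fps

# Frame rates copied from Tab. 2.
yolov5s_ms = round(latency_ms(189), 1)
yolov5s_gc_ms = round(latency_ms(204), 1)
```

The rounded values are 5.3 ms and 4.9 ms, the same inference times that Tab. 1 lists for these two models.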
Algorithm | Market-1501 mAP | Market-1501 Rank-1 | DukeMTMC-ReID mAP | DukeMTMC-ReID Rank-1 |
---|---|---|---|---|
Baseline | 85.9 | 94.5 | 76.4 | 86.4 |
Baseline+ | 86.3 | 94.6 | 76.8 | 86.7 |
Baseline+ | 86.1 | 94.5 | 76.7 | 86.6 |
Baseline+ | 86.5 | 94.7 | 77.0 | 87.0 |
Baseline+Non-local | 87.0 | 94.9 | 77.1 | 87.2 |
ReID | 87.4 | 95.1 | 77.6 | 87.9 |
Tab. 3 Results of ablation experiment (%)
Algorithm | Market-1501 mAP | Market-1501 Rank-1 | DukeMTMC-ReID mAP | DukeMTMC-ReID Rank-1 |
---|---|---|---|---|
ACRN | 62.6 | 83.6 | 51.9 | 72.6 |
SVDNet | 62.1 | 82.3 | 56.8 | 76.7 |
MMT-500 | 71.2 | 87.7 | 53.4 | 73.0 |
GPR | 71.5 | 88.1 | 65.2 | 79.5 |
ConsAtt | 84.7 | 96.1 | 73.1 | 86.3 |
PCB-RPP | 81.6 | 93.8 | 69.2 | 83.3 |
PPS | 85.3 | 94.3 | 75.9 | 88.2 |
BagTricks | 85.9 | 94.5 | 76.4 | 86.4 |
Proposed algorithm | 87.4 | 95.1 | 77.6 | 87.9 |
Tab. 4 Comparison of experimental results of different person re-identification algorithms (%)
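Tab. 4 reports two retrieval metrics, mAP and Rank-1. The following is a minimal sketch of how they are commonly computed for re-identification, not the paper's evaluation code. It assumes each query already has a gallery ranking expressed as a list of booleans (match or no match), and it omits the same-camera filtering that the standard Market-1501 protocol applies. The function names and the toy data are illustrative.

```python
def average_precision(ranked_matches):
    """AP for one query; ranked_matches[i] is True if the i-th gallery
    result (sorted by ascending distance) shares the query identity."""
    hits, precisions = 0, []
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def evaluate(queries):
    """Return (mAP, Rank-1) over a list of ranked match lists."""
    m_ap = sum(average_precision(q) for q in queries) / len(queries)
    rank1 = sum(1 for q in queries if q and q[0]) / len(queries)
    return m_ap, rank1

# Toy data: two queries, three gallery results each.
toy = [[True, False, True], [False, True, False]]
m_ap, rank1 = evaluate(toy)
```

Rank-1 only asks whether the top result is correct, while mAP rewards placing every correct gallery image near the top, which is why the two columns in Tab. 4 can rank algorithms differently (for example ConsAtt and PPS).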
 |  | Market-1501 mAP | Market-1501 Rank-1 | DukeMTMC-ReID mAP | DukeMTMC-ReID Rank-1 |
---|---|---|---|---|---|
10 | 30 | 86.1 | 94.6 | 76.3 | 87.1 |
30 | 50 | 87.1 | 94.9 | 77.0 | 87.4 |
30 | 70 | 87.4 | 95.1 | 77.6 | 87.9 |
50 | 70 | 87.2 | 95.2 | 77.2 | 87.8 |
30 | 100 | 85.9 | 94.3 | 75.8 | 86.8 |
50 | 100 | 86.0 | 94.7 | 76.1 | 87.0 |
70 | 100 | 86.4 | 94.6 | 76.7 | 87.3 |
Tab. 5 Performance comparison of different parameters (%)
Module | Algorithm | mAP/% (fp32) | mAP/% (int8) | Model size/MB (2080Ti) | Model size/MB (fp32) | Model size/MB (int8) | Inference time/ms (2080Ti) | Inference time/ms (fp32) | Inference time/ms (int8) |
---|---|---|---|---|---|---|---|---|---|
Person detection | YOLOv5s | 82.1 | 80.5 | 28.3 | 28.8 | 7.7 | 5.3 | 4.5 | 3.6 |
Person detection | YOLOv5s-GC | 81.3 | 79.3 | 22.2 | 23.0 | 6.2 | 4.9 | 4.1 | 3.2 |
Person re-identification | MMT | 71.2 | 70.4 | 94.4 | 94.9 | 24.3 | 9.4 | 8.1 | 4.6 |
Person re-identification | MEB | 72.7 | 71.0 | 28.9 | 29.8 | 8.7 | 14.3 | 6.9 | 4.1 |
Person re-identification | Fast-ReID | 88.6 | 86.9 | 102.6 | 103.1 | 56.9 | 13.7 | 11.2 | 10.7 |
Person re-identification | MLCReID | 45.5 | 43.1 | 102.7 | 103.6 | 24.3 | 13.3 | 10.6 | 5.9 |
Person re-identification | ReID | 87.4 | 85.7 | 100.7 | 94.9 | 24.3 | 8.6 | 7.3 | 4.1 |
Tab. 6 Comparison of results of different algorithms after quantization
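The quantization figures quoted in the abstract (a 1.7 percentage point mAP drop and a 74.4% smaller model) can be reproduced from the ReID row of Tab. 6. Reading the abstract's figures as the fp32-to-int8 difference of that row is an inference from the numbers, not something the page states. The variable names are illustrative.

```python
# fp32 and int8 values for the proposed ReID model, copied from Tab. 6.
map_fp32, map_int8 = 87.4, 85.7      # mAP/%
size_fp32, size_int8 = 94.9, 24.3    # model size/MB

# Accuracy cost of quantization, in percentage points.
map_drop = round(map_fp32 - map_int8, 1)

# Relative reduction in model size, in percent.
size_cut = round((size_fp32 - size_int8) / size_fp32 * 100, 1)
```

Both values match the abstract: `map_drop` is 1.7 and `size_cut` is 74.4.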
1 | LUO H, JIANG W, FAN X, et al. A survey on deep learning based person re-identification[J]. Acta Automatica Sinica, 2019, 45(11): 2032-2049. 10.16383/j.aas.c180154 |
2 | REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. [2022-01-10].. 10.1109/cvpr.2017.690 |
3 | WANG C Y, YEH I H, LIAO H Y M. You only learn one representation: unified network for multiple tasks[EB/OL]. (2021-05-10) [2022-09-29].. 10.48550/arXiv.2105.04206 |
4 | GE Z, LIU S T, WANG F, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. [2021-07-11]. . |
5 | ZENG K W, NING M N, WANG Y H, et al. Hierarchical clustering with hard-batch triplet loss for person re-identification[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 13654-13662. 10.1109/cvpr42600.2020.01367 |
6 | HE S T, LUO H, WANG P C, et al. TransReID: transformer-based object re-identification[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 14993-15002. 10.1109/iccv48922.2021.01474 |
7 | ZHANG G Q, CHEN Y H, LIN W S, et al. Low resolution information also matters: learning multi-resolution representations for person re-identification[C]// Proceedings of the 30th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2021: 1295-1301. 10.24963/ijcai.2021/179 |
8 | HAN K, WANG Y H, TIAN Q, et al. GhostNet: more features from cheap operations[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1577-1586. 10.1109/cvpr42600.2020.00165 |
9 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision. Cham: Springer, 2018: 3-19. 10.1007/978-3-030-01234-2_1 |
10 | LAYNE R, HOSPEDALES T M, GONG S G. Person re-identification by attributes[C]// Proceedings of the 2012 British Machine Vision Conference. Durham: BMVA Press, 2012: No.24. 10.5244/c.26.24 |
11 | WANG F, CHENG J, LIU W Y, et al. Additive margin softmax for face verification[J]. IEEE Signal Processing Letters, 2018, 25(7): 926-930. 10.1109/lsp.2018.2822810 |
12 | SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 815-823. 10.1109/cvpr.2015.7298682 |
13 | WEN Y D, ZHANG K P, LI Z F, et al. A discriminative feature learning approach for deep face recognition[C]// Proceedings of the 2016 European Conference on Computer Vision. Cham: Springer, 2016: 499-515. 10.1007/978-3-319-46478-7_31 |
14 | LUO H, JIANG W, GU Y Z, et al. A strong baseline and batch normalization neck for deep person re-identification[J]. IEEE Transactions on Multimedia, 2020, 22(10):2597-2609. 10.1109/tmm.2019.2958756 |
15 | WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7794-7803. 10.1109/cvpr.2018.00813 |
16 | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision. Cham: Springer, 2014: 740-755. 10.1007/978-3-319-10602-1_48 |
17 | ROTH P M, HIRZER M, KÖSTINGER M, et al. Mahalanobis distance learning for person re-identification[M]// GONG S G, CRISTANI M, YAN S C, et al. Person Re-Identification, ACVPR. London: Springer, 2014: 247-267. 10.1007/978-1-4471-6296-4_12 |
18 | ZHENG Z D, ZHENG L, YANG Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3774-3782. 10.1109/iccv.2017.405 |
19 | BOCHKOVSKIY A, WANG C Y, LIAO H Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23) [2022-02-20].. |
20 | WANG C Y, BOCHKOVSKIY A, LIAO H Y M. Scaled-YOLOv4: Scaling cross stage partial network[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13024-13033. 10.1109/cvpr46437.2021.01283 |
21 | JIANG Z C, ZHAO L Q, LI S Y, et al. Real-time object detection method based on improved YOLOv4-tiny[EB/OL]. [2021-09-23].. |
22 | SCHUMANN A, STIEFELHAGEN R. Person re-identification by deep learning attribute-complementary information[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2017: 1435-1443. 10.1109/cvprw.2017.186 |
23 | SUN Y F, ZHENG L, DENG W J, et al. SVDNet for pedestrian retrieval[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 3820-3828. 10.1109/iccv.2017.410 |
24 | GE Y X, CHEN D P, LI H S. Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification[EB/OL]. [2022-01-17].. |
25 | LUO C C, SONG C F, ZHANG Z X. Generalizing person re-identification by camera-aware invariance learning and cross-domain mixup[C]// Proceedings of the 2020 European Conference on Computer Vision. Cham: Springer, 2020: 224-241. 10.1007/978-3-030-58555-6_14 |
26 | ZHOU S, WANG F, HUANG Z, et al. Discriminative feature learning with consistent attention regularization for person re-identification [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 8040-8049. 10.1109/iccv.2019.00813 |
27 | SUN Y F, ZHENG L, YANG Y, et al. Beyond part models: person retrieval with refined part pooling (and a strong convolutional baseline)[C]// Proceedings of the 2018 European Conference on Computer Vision. Cham: Springer, 2018: 501-518. 10.1007/978-3-030-01225-0_30 |
28 | SHEN Y H, JI R R, HONG X P, et al. A part power set model for scale-free person retrieval[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2019: 3397-3403. 10.24963/ijcai.2019/471 |
29 | LUO H. Study on person re-identification based on deep learning - from non-occlusion to occlusion[D]. Hangzhou: Zhejiang University, 2020: 45-66. 10.1109/cac51589.2020.9326599 |