Journal of Computer Applications, 2022, Vol. 42, Issue 3: 764-769. DOI: 10.11772/j.issn.1001-9081.2021040788
Special topics: Artificial Intelligence; 2021 China Computer Federation Conference on Artificial Intelligence (CCFAI 2021)
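Authors: Yuchang YIN¹, Hongyuan WANG¹, Li CHEN¹, Zundeng FENG¹, Yu XIAO²
Received: 2021-05-17
Revised: 2021-06-03
Accepted: 2021-06-15
Published online: 2021-11-09
Published: 2022-03-10
Corresponding author: Hongyuan WANG
About the author: YIN Yuchang (born 1996), male, from Yancheng, Jiangsu; M.S. candidate. His main research interest is computer vision.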
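Abstract:
To reduce the heavy annotation cost of person re-identification, a one-shot video-based person re-identification method with multi-loss learning and a joint metric is proposed. First, because a model trained on very few labeled samples is not robust enough, a Multi-Loss Learning (MLL) strategy is proposed: in each training round, different loss functions are used to optimize different kinds of data, which improves the discriminative power of the model. Second, for label estimation, a Joint Distance Metric (JDM) is proposed that combines the sample distance with the nearest-neighbor distance to further improve the accuracy of pseudo-label prediction. JDM mitigates two problems: the low accuracy of label estimation on unlabeled data, and the unstable training that results when unlabeled data are not fully used. Experimental results show that, compared with the one-shot progressive learning method PL, when the ratio of pseudo-labeled samples added in each iteration is …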
Cite this article: Yuchang YIN, Hongyuan WANG, Li CHEN, Zundeng FENG, Yu XIAO. One-shot video-based person re-identification with multi-loss learning and joint metric[J]. Journal of Computer Applications, 2022, 42(3): 764-769.
Method | MARS rank-1 | MARS rank-5 | MARS rank-20 | MARS mAP | DukeMTMC-VideoReID rank-1 | DukeMTMC-VideoReID rank-5 | DukeMTMC-VideoReID rank-20 | DukeMTMC-VideoReID mAP
---|---|---|---|---|---|---|---|---
Baseline (one-shot) | 36.20 | 50.20 | 61.90 | 15.50 | 39.60 | 56.80 | 67.00 | 33.30
DGM+IDE | 36.80 | 54.00 | 68.50 | 16.90 | 42.40 | 57.90 | 69.30 | 33.60
Stepwise | 41.20 | 55.60 | 66.80 | 19.70 | 56.30 | 70.40 | 79.20 | 46.80
EUG | 57.62 | 69.64 | 78.08 | 34.68 | 70.79 | 83.61 | 89.60 | 61.76
EUG | 62.67 | 74.94 | 82.57 | 42.45 | 72.79 | 84.18 | 91.45 | 63.23
BUC | 55.10 | 68.30 | - | 29.40 | 74.80 | 86.80 | - | 66.70
LGF | 58.80 | 69.00 | 78.50 | 36.20 | 86.30 | 96.00 | 98.60 | 82.70
SCLU | 61.97 | 76.52 | 84.34 | 41.47 | 72.79 | 84.19 | 91.03 | 62.99
SCLU | 63.74 | 78.44 | 85.51 | 42.74 | 72.79 | 85.04 | 90.31 | 63.15
PL | 57.90 | 70.30 | 79.30 | 34.90 | 71.00 | 83.80 | 90.30 | 61.90
PL | 62.80 | 75.20 | 83.80 | 42.60 | 72.90 | 84.30 | 91.40 | 63.30
MLL+JDM | 65.50 | 78.50 | 86.60 | 44.20 | 76.20 | 87.20 | 93.30 | 67.50
MLL+JDM | 68.50 | 80.80 | 88.60 | 47.80 | 76.50 | 88.70 | 93.20 | 68.70
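Tab. 1 Performance comparison of different methods on two large-scale datasets (%). Where a method has two consecutive rows, the second row reports another setting of the same method; for PL and MLL+JDM, the two rows match the p = 0.10 and p = 0.05 results in Tab. 3.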
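Tab. 1 reports rank-k accuracy and mean average precision (mAP). As a reminder of how these retrieval metrics are typically computed, the following is a minimal Python sketch that works from a query-by-gallery distance matrix. The function name `rank_k_and_map` and its arguments are illustrative and do not come from the paper, and the sketch leaves out the same-camera filtering that re-identification benchmarks usually apply.

```python
from typing import Dict, Sequence, Tuple


def rank_k_and_map(dist: Sequence[Sequence[float]],
                   query_ids: Sequence[int],
                   gallery_ids: Sequence[int],
                   ks: Sequence[int] = (1, 5, 20)) -> Tuple[Dict[int, float], float]:
    """Return ({k: rank-k accuracy in %}, mAP in %) for a distance matrix.

    dist[q][g] is the distance between query q and gallery item g.
    Simplification (assumption): no same-camera filtering is applied.
    """
    hits = {k: 0 for k in ks}
    ap_sum = 0.0
    valid = 0
    for q, row in enumerate(dist):
        # Rank gallery items by ascending distance to this query.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # a query without any true match is skipped
        valid += 1
        first_hit = matches.index(True)
        for k in ks:
            if first_hit < k:
                hits[k] += 1
        # Average precision: mean of the precision at each true-match rank.
        found = 0
        precisions = []
        for pos, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precisions.append(found / pos)
        ap_sum += sum(precisions) / len(precisions)
    if valid == 0:
        raise ValueError("no query has a matching gallery item")
    cmc = {k: 100.0 * hits[k] / valid for k in ks}
    return cmc, 100.0 * ap_sum / valid
```

Rank-1 only rewards the single best-ranked match, while mAP also reflects how early the remaining true matches appear, which is why the two columns in Tab. 1 can move by different amounts.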
Method | MARS | DukeMTMC-VideoReID
---|---|---
PUL | 37.29 | 61.24
EUG | 36.40 | 43.78
EUG | 55.56 | 69.75
SCLU | 58.76 | 70.41
SCLU | 62.92 | 76.80
PL | 55.70 | 71.20
MLL+JDM | 66.30 | 76.80
Tab. 2 Comparison of label estimation precision among different methods (%)
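The page does not spell out how label estimation precision is measured. One common reading, assumed here, is the share of selected pseudo-labels that equal the true identity. The sketch below pairs that measure with plain nearest-labeled-neighbor label estimation in feature space; it is a generic baseline for illustration, not the paper's JDM, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimate_labels(labeled_feats: Sequence[Sequence[float]],
                    labeled_ids: Sequence[int],
                    unlabeled_feats: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Assign each unlabeled sample the identity of its nearest labeled sample.

    Returns one (pseudo_label, distance) pair per unlabeled sample.
    """
    estimates = []
    for feat in unlabeled_feats:
        dists = [euclidean(feat, ref) for ref in labeled_feats]
        nearest = min(range(len(dists)), key=dists.__getitem__)
        estimates.append((labeled_ids[nearest], dists[nearest]))
    return estimates


def label_estimation_precision(estimates: Sequence[Tuple[int, float]],
                               true_ids: Sequence[int],
                               num_selected: int) -> float:
    """Precision (%) over the num_selected most confident pseudo-labels.

    Confidence is assumed to be inversely related to distance.
    """
    order = sorted(range(len(estimates)), key=lambda i: estimates[i][1])
    chosen = order[:num_selected]
    if not chosen:
        raise ValueError("num_selected must select at least one sample")
    correct = sum(estimates[i][0] == true_ids[i] for i in chosen)
    return 100.0 * correct / len(chosen)
```

Under this reading, precision usually drops as more, less confident samples are selected, which is the kind of error a better distance metric for label estimation aims to reduce.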
p | Method | MARS rank-1 | MARS mAP | DukeMTMC-VideoReID rank-1 | DukeMTMC-VideoReID mAP
---|---|---|---|---|---
0.30 | PL | 44.5 | 22.1 | 66.1 | 56.3
0.30 | MLL | 49.2 | 25.9 | 68.7 | 59.9
0.30 | JDM | 48.3 | 25.6 | 67.1 | 58.0
0.30 | MLL+JDM | 48.5 | 26.8 | 69.5 | 60.2
0.20 | PL | 49.6 | 27.2 | 69.1 | 59.6
0.20 | MLL | 55.1 | 30.7 | 69.9 | 60.8
0.20 | JDM | 54.7 | 31.0 | 70.1 | 60.5
0.20 | MLL+JDM | 58.0 | 34.7 | 71.1 | 61.8
0.10 | PL | 57.9 | 34.9 | 71.0 | 61.9
0.10 | MLL | 61.9 | 39.5 | 73.4 | 65.2
0.10 | JDM | 61.2 | 38.2 | 72.2 | 63.4
0.10 | MLL+JDM | 65.5 | 44.2 | 76.2 | 67.5
0.05 | PL | 62.8 | 42.6 | 72.9 | 63.3
0.05 | MLL | 64.5 | 43.3 | 73.5 | 66.0
0.05 | JDM | 63.8 | 42.6 | 73.1 | 64.0
0.05 | MLL+JDM | 68.5 | 47.8 | 76.5 | 68.7
Tab. 3 Ablation experiment results on MARS and DukeMTMC-VideoReID datasets with different p values (%)
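In Tab. 3, accuracy rises as p shrinks from 0.30 to 0.05. The abstract describes p as the ratio of pseudo-labeled samples added in each iteration. Assuming the selected set grows by a fixed p times the number of unlabeled samples per iteration until every sample is included (an assumption in the spirit of progressive learning, not a detail stated on this page), the cost of a small p can be sketched as follows. `enlargement_schedule` is a hypothetical helper.

```python
from typing import List


def enlargement_schedule(num_unlabeled: int, p: float) -> List[int]:
    """Sizes of the selected pseudo-labeled set after each iteration.

    Assumption: the set grows by round(p * num_unlabeled) samples per
    iteration and is capped at num_unlabeled.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if num_unlabeled <= 0:
        raise ValueError("num_unlabeled must be positive")
    step = max(1, round(p * num_unlabeled))
    sizes = []
    selected = 0
    while selected < num_unlabeled:
        selected = min(num_unlabeled, selected + step)
        sizes.append(selected)
    return sizes
```

With 1000 unlabeled samples, p = 0.30 finishes in 4 iterations while p = 0.05 needs 20, so under this assumption the better accuracy at small p is paid for with more training rounds.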
α | rank-1 | rank-5 | rank-20 | mAP
---|---|---|---|---
0.3 | 73.4 | 88.8 | 92.2 | 65.3
0.4 | 73.8 | 86.0 | 93.0 | 65.5
0.5 | 76.2 | 87.2 | 93.3 | 67.5
0.6 | 73.5 | 86.3 | 92.9 | 65.2
Tab. 4 Performance comparison of JDM with different α on DukeMTMC-VideoReID (%)
K | rank-1 | rank-5 | rank-20 | mAP
---|---|---|---|---
2 | 75.2 | 86.3 | 91.9 | 66.9
3 | 76.2 | 87.2 | 93.3 | 67.5
4 | 73.4 | 85.6 | 91.7 | 65.4
5 | 73.2 | 85.9 | 91.3 | 65.0
Tab. 5 Performance comparison of JDM with different K on DukeMTMC-VideoReID (%)
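The abstract says only that JDM combines the sample distance with the nearest-neighbor distance; the exact formula is not given on this page. The sketch below shows one plausible form, purely as an assumption: a weight α on the direct distance from an unlabeled sample to a labeled sample, and 1 - α on the mean distance from that sample's K nearest unlabeled neighbors to the same labeled sample. The function `joint_distance` and this weighting are hypothetical; the defaults α = 0.5 and K = 3 simply mirror the best rows of Tabs. 4 and 5.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def joint_distance(u_idx: int,
                   labeled_feat: Sequence[float],
                   unlabeled_feats: Sequence[Sequence[float]],
                   alpha: float = 0.5,
                   k: int = 3) -> float:
    """Hypothetical joint distance between unlabeled sample u_idx and a labeled sample.

    joint = alpha * d(u, l) + (1 - alpha) * mean(d(n, l) for n in kNN(u)),
    where kNN(u) are the k nearest unlabeled neighbors of u (u excluded).
    This form is an illustrative assumption, not the paper's definition.
    """
    u = unlabeled_feats[u_idx]
    sample_dist = euclidean(u, labeled_feat)
    # Rank the other unlabeled samples by their distance to u.
    others = [i for i in range(len(unlabeled_feats)) if i != u_idx]
    others.sort(key=lambda i: euclidean(u, unlabeled_feats[i]))
    neighbors = others[:k]
    if not neighbors:
        return sample_dist  # no neighbors available: fall back to d(u, l)
    neighbor_dist = sum(euclidean(unlabeled_feats[i], labeled_feat)
                        for i in neighbors) / len(neighbors)
    return alpha * sample_dist + (1 - alpha) * neighbor_dist
```

In this form, α = 1 reduces to the plain sample distance, and a larger K averages over neighbors that are less likely to share the identity, which is at least consistent with the drop at K = 4 and K = 5 in Tab. 5.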
References:
[1] DAI C C, WANG H Y, NI T G, et al. Person re-identification based on deep convolutional generative adversarial network and expanded neighbor reranking[J]. Journal of Computer Research and Development, 2019, 56(8): 1632-1641. DOI: 10.7544/issn1000-1239.2019.20190195
[2] WANG H, DING Z, ZHANG J, et al. Person reidentification by semisupervised dictionary rectification learning with retraining module[J]. Journal of Electronic Imaging, 2018, 27(4): 043043. DOI: 10.1117/1.jei.27.4.043043
[3] GU X, CHANG H, MA B, et al. Appearance-preserving 3D convolution for video-based person re-identification[C]// Proceedings of the 16th European Conference on Computer Vision, LNCS 12347. Cham: Springer, 2020: 228-243. DOI: 10.1007/978-3-030-58536-5_14
[4] CHEN D, XU D, LI H, et al. Group consistent similarity learning via deep CRF for person re-identification[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2018: 8649-8658. DOI: 10.1109/cvpr.2018.00902
[5] ZHENG Z, ZHENG L, YANG Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Washington, DC: IEEE Computer Society, 2017: 3754-3762. DOI: 10.1109/ICCV.2017.405
[6] LIU X, SONG M, TAO D, et al. Semi-supervised coupled dictionary learning for person re-identification[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2014: 3550-3557. DOI: 10.1109/cvpr.2014.454
[7] MA A J, LI P. Semi-supervised ranking for re-identification with few labeled image pairs[C]// Proceedings of the 2014 Asian Conference on Computer Vision, LNCS 9006. Cham: Springer, 2014: 598-613.
[8] BAK S, CARR P. One-shot metric learning for person re-identification[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2017: 2990-2999. DOI: 10.1109/cvpr.2017.171
[9] WU Y, LIN Y, DONG X, et al. Exploit the unknown gradually: one-shot video-based person re-identification by stepwise learning[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2018: 5177-5186. DOI: 10.1109/cvpr.2018.00543
[10] WU Y, LIN Y, DONG X, et al. Progressive learning for person re-identification with one example[J]. IEEE Transactions on Image Processing, 2019, 28(6): 2872-2881. DOI: 10.1109/tip.2019.2891895
[11] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2016: 2818-2826. DOI: 10.1109/cvpr.2016.308
[12] CHEN L, YANG H, GAO Z. Joint attentive spatial-temporal feature aggregation for video-based person re-identification[J]. IEEE Access, 2019, 7: 41230-41240. DOI: 10.1109/access.2019.2907274
[13] HOU R, MA B, CHANG H, et al. VRSTC: occlusion-free video person re-identification[C]// Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC: IEEE Computer Society, 2019: 7183-7192. DOI: 10.1109/cvpr.2019.00735
[14] SUBRAMANIAM A, NAMBIAR A, MITTAL A. Co-segmentation inspired attention networks for video-based person re-identification[C]// Proceedings of the 2019 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2019: 562-572. DOI: 10.1109/iccv.2019.00065
[15] WU Y, BOURAHLA O E F, LI X, et al. Adaptive graph representation learning for video person re-identification[J]. IEEE Transactions on Image Processing, 2020, 29: 8821-8830. DOI: 10.1109/tip.2020.3001693
[16] YAN Y, QIN J, CHEN J, et al. Learning multi-granular hypergraphs for video-based person re-identification[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 2899-2908. DOI: 10.1109/cvpr42600.2020.00297
[17] YE M, MA A J, ZHENG L, et al. Dynamic label graph matching for unsupervised video re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Washington, DC: IEEE Computer Society, 2017: 5142-5150. DOI: 10.1109/iccv.2017.550
[18] XU T I, LI J, WU H, et al. Feature space regularization for person re-identification with one sample[C]// Proceedings of the IEEE 31st International Conference on Tools with Artificial Intelligence. Piscataway: IEEE, 2019: 1463-1470. DOI: 10.1109/ictai.2019.00208
[19] YIN J, LI B, WAN F, et al. A new data selection strategy for one-shot video-based person re-identification[C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 1227-1231. DOI: 10.1109/icip.2019.8803723
[20] ZHAO C, ZHANG Z, YAN J, et al. Local-global feature for video-based one-shot person re-identification[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2020: 3662-3666. DOI: 10.1109/icassp40776.2020.9053134
[21] LI H, XIAO J, SUN M, et al. Progressive sample mining and representation learning for one-shot person re-identification with adversarial samples[EB/OL]. [2021-05-10]. DOI: 10.1016/j.patcog.2020.107614
[22] XIN X, WANG J, XIE R, et al. Semi-supervised person re-identification using multi-view clustering[J]. Pattern Recognition, 2019, 88: 285-297. DOI: 10.1016/j.patcog.2018.11.025
[23] LIU C T, LI Y J, CHIEN S Y, et al. Semantics-guided clustering with deep progressive learning for semi-supervised person re-identification[EB/OL]. [2021-05-10]. DOI: 10.48550/arXiv.2010.01148
[24] ZHENG L, BIE Z, SUN Y, et al. MARS: a video benchmark for large-scale person re-identification[C]// Proceedings of the 2016 European Conference on Computer Vision. Cham: Springer, 2016: 868-884. DOI: 10.1007/978-3-319-46466-4_52
[25] LIU Z, WANG D, LU H. Stepwise metric promotion for unsupervised video person re-identification[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Washington, DC: IEEE Computer Society, 2017: 2429-2438. DOI: 10.1109/iccv.2017.266
[26] FAN H, ZHENG L, YAN C, et al. Unsupervised person re-identification: clustering and fine-tuning[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2018, 14(4): 1-18. DOI: 10.1145/3243316
[27] LIN Y, DONG X, ZHENG L, et al. A bottom-up clustering approach to unsupervised person re-identification[C]// Proceedings of the 2019 AAAI Conference on Artificial Intelligence. Menlo Park, CA: AAAI, 2019: 8738-8745. DOI: 10.1609/aaai.v33i01.33018738