Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (6): 1919-1929. DOI: 10.11772/j.issn.1001-9081.2022050753

• Multimedia computing and computer simulation •

Multi-object tracking method based on dual-decoder Transformer

Li WANG1, Shibin XUAN1,2, Xuyang QIN1, Ziwei LI1

  1. School of Artificial Intelligence, Guangxi Minzu University, Nanning Guangxi 530006, China
  2. Guangxi Key Laboratory of Hybrid Computation and IC Design and Analysis (Guangxi Minzu University), Nanning Guangxi 530006, China
  • Received: 2022-05-25 Revised: 2022-12-22 Accepted: 2022-12-29 Online: 2023-06-08 Published: 2023-06-10
  • Contact: Shibin XUAN
  • About author: WANG Li, born in 1995, native of Chengdu, Sichuan, M. S. candidate. Her research interests include multi-object tracking, computer vision.
    XUAN Shibin, born in 1964, native of Wuwei, Anhui, Ph. D., professor. His research interests include image processing and recognition. Email: xuanshibin@gxmzu.edu.cn
    QIN Xuyang, born in 1995, native of Yuncheng, Shanxi, M. S. candidate. His research interests include object tracking, deep learning.
    LI Ziwei, born in 1997, native of Huaibei, Anhui, M. S. candidate. Her research interests include image segmentation, computer vision.
  • Supported by:
    National Natural Science Foundation of China (61866003)

Abstract:

Multi-Object Tracking (MOT) requires tracking multiple objects simultaneously while preserving the continuity of each object's identity. To address object occlusion, object ID Switch (IDSW), and object loss in current MOT pipelines, a Transformer-based MOT model was improved, and a multi-object tracking method based on a dual-decoder Transformer was proposed. Firstly, a set of trajectories was generated by model initialization in the first frame, and in every subsequent frame attention was used to establish the association between frames. Secondly, a dual decoder was used to correct the tracked object information: one decoder detected objects and the other tracked them. Thirdly, after tracking was completed, histogram template matching was applied to recover lost objects. Finally, a Kalman filter was used to track and predict occluded objects, and the occlusion results were associated with newly detected objects to ensure the continuity of the tracking results. In addition, modeling of appearance statistics and motion features was added on top of TrackFormer to fuse the different structures. Experimental results on the MOT17 dataset show that, compared with TrackFormer, the proposed model increased the IDentity F1 score (IDF1) by 0.87 percentage points, increased the Multiple Object Tracking Accuracy (MOTA) by 0.41 percentage points, and reduced the number of IDSWs by 16.3%. The proposed method also achieved good results on the MOT16 and MOT20 datasets. These results indicate that the proposed method can effectively handle object occlusion, maintain object identity information, and reduce object identity loss.

Key words: Multi-Object Tracking (MOT), attention, Transformer, histogram, template matching, Kalman filter
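
The last two steps named in the abstract, recovering lost objects by histogram template matching and carrying occluded objects forward with a Kalman filter, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the histogram type, the similarity measure, or the motion model. The sketch assumes grayscale intensity histograms compared with the Bhattacharyya coefficient and a constant-velocity Kalman filter applied to one coordinate at a time. All names (`recover_lost_track`, `ConstantVelocityKalman1D`) and parameter values (`bins`, `threshold`, `process_var`, `measurement_var`) are hypothetical.

```python
import math


def normalized_histogram(pixels, bins=8, max_value=256):
    """Return an intensity histogram of `pixels` (ints in 0..255) that sums to 1."""
    counts = [0] * bins
    for value in pixels:
        counts[min(value * bins // max_value, bins - 1)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def bhattacharyya(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(h1, h2))


def recover_lost_track(template_pixels, candidates, threshold=0.9):
    """Return the index of the candidate patch most similar to the template,
    or None if no candidate reaches `threshold` (an assumed cut-off)."""
    template = normalized_histogram(template_pixels)
    best_index, best_score = None, threshold
    for index, patch in enumerate(candidates):
        score = bhattacharyya(template, normalized_histogram(patch))
        if score >= best_score:
            best_index, best_score = index, score
    return best_index


class ConstantVelocityKalman1D:
    """Kalman filter over the state (position, velocity) of one coordinate."""

    def __init__(self, position, process_var=1e-2, measurement_var=1.0):
        self.p, self.v = float(position), 0.0
        # Covariance entries [[a, b], [c, d]]; the velocity starts out uncertain.
        self.a, self.b, self.c, self.d = 1.0, 0.0, 0.0, 10.0
        self.q, self.r = process_var, measurement_var

    def predict(self, dt=1.0):
        """Advance the state by dt; called alone while the object is occluded."""
        self.p += self.v * dt
        a, b, c, d = self.a, self.b, self.c, self.d
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I.
        self.a = a + dt * (b + c) + dt * dt * d + self.q
        self.b = b + dt * d
        self.c = c + dt * d
        self.d = d + self.q
        return self.p

    def update(self, measured_position):
        """Correct the state with a detected position (H = [1, 0])."""
        s = self.a + self.r
        k0, k1 = self.a / s, self.c / s
        residual = measured_position - self.p
        self.p += k0 * residual
        self.v += k1 * residual
        a, b = self.a, self.b
        # P <- (I - K H) P
        self.a, self.b = a - k0 * a, b - k0 * b
        self.c, self.d = self.c - k1 * a, self.d - k1 * b
        return self.p
```

In use, each track would hold one filter per box coordinate: `predict()` followed by `update()` while the detector sees the object, and `predict()` alone during occlusion. When a new detection appears near the predicted position, `recover_lost_track` could decide whether its appearance matches the stored template before the old identity is reassigned.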

CLC Number: