Journal of Computer Applications, 2024, Vol. 44, Issue (9): 2878-2885. DOI: 10.11772/j.issn.1001-9081.2023081223

• Multimedia computing and computer simulation •

Siamese mixed information fusion algorithm for RGBT tracking

Ying HUANG, Jiayu YANG, Jiahao JIN, Bangrui WAN

  1. School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2023-09-08 Revised: 2023-10-30 Accepted: 2023-11-10 Online: 2024-09-14 Published: 2024-09-10
  • Contact: Ying HUANG
  • About the authors: YANG Jiayu, born in 1999, M. S. candidate. His research interests include multi-modal object tracking.
    JIN Jiahao, born in 2001, M. S. candidate. His research interests include single object tracking.
    WAN Bangrui, born in 1981, senior engineer. His research interests include big data analysis and processing.

用于RGBT跟踪的孪生混合信息融合算法

黄颖, 杨佳宇, 金家昊, 万邦睿

  1. 重庆邮电大学 软件工程学院,重庆 400065
  • 通讯作者: 黄颖
  • 作者简介:杨佳宇(1999—),男,山西长治人,硕士研究生,主要研究方向:多模态目标跟踪
    金家昊(2001—),男,安徽六安人,硕士研究生,主要研究方向:单目标跟踪
    万邦睿(1981—),男,重庆人,高级工程师,CCF会员,主要研究方向:大数据分析处理。

Abstract:

The core of visible light and thermal infrared tracking (RGB-Thermal tracking, RGBT tracking for short) lies in the effective utilization of information from different modalities. To address the problem that low-quality results produced by a single branch in decision-level fusion degrade the algorithm’s decision about the object, a Siamese mixed information fusion algorithm for RGBT tracking, SiamMIF, was proposed. Firstly, a Siamese Backbone Network (SBN) was used for multi-modal feature extraction. Secondly, the effect of low-quality images on dual-branch parallel decision-making was analyzed from the perspective of signal-to-noise ratio, and a Signal-to-Noise Ratio (SNR)-driven Information Interaction Module (IIM) was designed to complement features that have a low signal-to-noise ratio with information from the other modality. Thirdly, a Dual-stream Anchor-free Head (DAH) was employed for the classification and regression of the compensated features. Finally, an Adaptive Lightweight Decision Module (ALDM) was used to fuse the tracking results and determine the object’s position quickly. Experimental results on four RGBT benchmark datasets, GTOT, RGBT234, VOT-RGBT2019 and LasHeR, show that the success rate and precision of the proposed method on the LasHeR dataset are 0.396 and 0.518 respectively; compared to APFNet (Attribute-based Progressive Fusion Network), this is a 9.4% improvement in success rate and a 3.6% improvement in precision. SiamMIF also achieves good results on the other three datasets, and its frame rate on GPU can reach 40 frames/s.
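The abstract does not say whether the 9.4% and 3.6% gains over APFNet are relative improvements or absolute percentage points. As a small worked example, the sketch below computes the baseline scores implied by each reading from the LasHeR figures quoted above. The function names are hypothetical, and the resulting baselines are derived values under those two assumptions, not numbers reported in the abstract.

```python
def implied_baseline_relative(value: float, gain_pct: float) -> float:
    """Baseline b such that value = b * (1 + gain_pct / 100)."""
    return value / (1.0 + gain_pct / 100.0)


def implied_baseline_absolute(value: float, gain_points: float) -> float:
    """Baseline b such that value = b + gain_points / 100 (percentage points)."""
    return value - gain_points / 100.0


# Figures quoted in the abstract for SiamMIF on LasHeR.
success_rate, precision = 0.396, 0.518
success_gain, precision_gain = 9.4, 3.6

# Reading 1 (assumption): the gains are relative to the baseline score.
rel_success = implied_baseline_relative(success_rate, success_gain)
rel_precision = implied_baseline_relative(precision, precision_gain)

# Reading 2 (assumption): the gains are absolute percentage points.
abs_success = implied_baseline_absolute(success_rate, success_gain)
abs_precision = implied_baseline_absolute(precision, precision_gain)

print(f"relative reading: success {rel_success:.3f}, precision {rel_precision:.3f}")
print(f"absolute reading: success {abs_success:.3f}, precision {abs_precision:.3f}")
```

Under the relative reading the implied baselines are about 0.362 and 0.500; under the absolute reading they are about 0.302 and 0.482. Comparing these with the baseline scores in the full paper's results table would show which reading is intended.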

Key words: RGB-Thermal (RGBT) tracking, Siamese neural network, multi-modal fusion strategy, information interaction, anchor-free head

摘要:

可见光与热红外跟踪(又称RGBT(RGB-Thermal)跟踪)的核心是有效地利用不同模态的信息,针对决策级融合中单分支产生低质结果影响算法判定目标的问题,提出一个用于RGBT跟踪的孪生混合信息融合算法SiamMIF。首先,使用孪生主干网络(SBN)进行多模态特征提取;其次,从信噪比的角度分析低质图像对双分支并行决策产生的影响,进而设计了一个信噪比驱动的信息交互模块(IIM)对低信噪比特征进行信息互补;再次,利用双流无锚跟踪头(ADH)对补偿后的特征进行分类回归;最后,采用自适应轻量决策模块(ALDM)对跟踪结果进行融合,并快速判定目标位置。在4个RGBT基准数据集GTOT、RGBT234、VOT-RGBT2019和LasHeR上的实验结果表明,所提算法在LasHeR数据集上的成功率和精确度分别为0.396和0.518,相较于APFNet(Attribute-based Progressive Fusion Network)提升9.4%和3.6%,在其他3个数据集上也能取得较好结果,且在GPU上的帧率能达到40 frame/s。

关键词: RGBT跟踪, 孪生神经网络, 多模态融合策略, 信息交互, 无锚跟踪头

CLC Number: