
Siamese mixed information interaction algorithm for RGBT tracking

Huang Ying, Yang Jiayu

  1. Chongqing University of Posts and Telecommunications
  • Received: 2023-09-07 Revised: 2023-10-30 Online: 2023-12-18
  • Corresponding author: Huang Ying

Abstract: The core of visible-light and thermal-infrared tracking (RGBT tracking) is the effective use of information from different modalities. In decision-level fusion, a low-quality result from a single branch can mislead the algorithm's judgment of the target. To address this problem, a siamese mixed information fusion algorithm (SiamMIF) for RGBT tracking was proposed. Firstly, a siamese backbone network (SBN) was used for multi-modal feature extraction. Secondly, the impact of low-quality images on dual-branch parallel decision-making was analyzed from the perspective of signal-to-noise ratio (SNR), and an SNR-driven information interaction module (IIM) was designed to provide complementary information for low-SNR features. Thirdly, a dual-stream anchor-free tracking head was employed to perform classification and regression on the compensated features. Finally, an adaptive lightweight decision-making module (ALDM) was used to fuse the tracking results and quickly determine the target position. The proposed tracker was evaluated on four RGBT tracking benchmark datasets: GTOT, RGBT234, VOT-RGBT2019, and LasHeR. On LasHeR, its success rate and precision were 0.396 and 0.518, improvements of 9.4% and 3.6%, respectively, over APFNet. SiamMIF also achieved highly competitive results on the other datasets and ran at up to 40 fps on a GPU.

Key words: RGBT tracking, siamese neural network, multimodal fusion strategy, information interaction, anchor-free head
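
The abstract does not say whether the 9.4% and 3.6% gains over APFNet are relative or absolute. The short sketch below assumes they are relative improvements and back-computes the baseline scores that this reading would imply. The function name `implied_baseline` is hypothetical, and the resulting baselines are derived values, not figures reported in the abstract.

```python
def implied_baseline(score: float, relative_gain_percent: float) -> float:
    """Return the baseline score implied by `score`, ASSUMING the reported
    gain is a relative improvement: score = baseline * (1 + gain / 100)."""
    return score / (1.0 + relative_gain_percent / 100.0)


if __name__ == "__main__":
    # Values taken from the abstract (LasHeR dataset).
    success_rate, success_gain = 0.396, 9.4
    precision, precision_gain = 0.518, 3.6

    # Baselines implied under the relative-improvement assumption.
    print(round(implied_baseline(success_rate, success_gain), 3))
    print(round(implied_baseline(precision, precision_gain), 3))
```

Under this assumption the implied baseline success rate and precision are about 0.362 and 0.500. If the gains were instead absolute percentage points, the implied baselines would be 0.302 and 0.482.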

CLC number: