Journal of Computer Applications, 2023, Vol. 43, Issue (12): 3733-3739. DOI: 10.11772/j.issn.1001-9081.2022111790
Haitao GONG, Zhihua CHEN, Bin SHENG, Bingyan ZHU
Received: 2022-12-06
Revised: 2023-02-23
Accepted: 2023-02-27
Online: 2023-03-13
Published: 2023-12-10
Contact: Zhihua CHEN
About author: GONG Haitao, born in 1998 in Linyi, Shandong, M. S. candidate. His research interests include computer vision and deep learning.
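Abstract:
To address the poor robustness and the low precision and success rate of existing tiny object tracking algorithms, a tiny object tracking algorithm based on Siamese network and Transformer, named SiamTrans, is proposed. First, a similarity response map calculation module is designed based on the Transformer mechanism. The module stacks several layers of a feature encoding-decoding structure and uses multi-head self-attention and multi-head cross-attention to query template feature map information in search-region feature maps at different levels, which avoids falling into a local optimum and yields a high-quality similarity response map. Second, a Transformer-based Prediction Module (PM) is designed in the prediction subnetwork, and self-attention is used to process the redundant feature information in the feature maps of the prediction branches, improving the prediction precision of the different branches. On the Small90 dataset, compared with the TransT (Transformer Tracking) algorithm, the tracking precision and tracking success rate of the proposed algorithm are 8.0 and 9.5 percentage points higher, respectively. These results indicate that the proposed algorithm achieves better tiny object tracking performance.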
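The cross-attention step described in the abstract, in which search-region features query template features, can be illustrated with a minimal sketch. This is not the SiamTrans implementation: it uses a single head, omits the learned query/key/value projections, and does not stack layers, whereas the paper describes stacked multi-head encoding-decoding layers. The function names and the toy token values are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(search_tokens, template_tokens):
    """Single-head scaled dot-product cross-attention (illustrative only).

    Queries are the search-region tokens; keys and values are the template
    tokens. Learned projections are omitted (assumption: identity weights).
    Returns the fused search tokens and the attention weights.
    """
    d = len(template_tokens[0])
    scale = 1.0 / math.sqrt(d)
    fused, weights = [], []
    for q in search_tokens:
        # Similarity of this search token to every template token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in template_tokens]
        w = softmax(scores)
        # Weighted sum of template tokens (values).
        out = [sum(w[j] * template_tokens[j][c]
                   for j in range(len(template_tokens)))
               for c in range(d)]
        fused.append(out)
        weights.append(w)
    return fused, weights


# Toy example: two template tokens, three search-region tokens.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[4.0, 0.0], [0.0, 4.0], [1.0, 1.0]]
fused, attn = cross_attention(search, template)
```

Each row of `attn` sums to 1, and a search token that resembles one template token more strongly places most of its weight on that token, which is the intuition behind using attention weights as a similarity response.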
Haitao GONG, Zhihua CHEN, Bin SHENG, Bingyan ZHU. SiamTrans: tiny object tracking algorithm based on Siamese network and Transformer[J]. Journal of Computer Applications, 2023, 43(12): 3733-3739.
Algorithm | Occlusion precision | Occlusion success rate | Deformation precision | Deformation success rate | Motion blur precision | Motion blur success rate | Fast motion precision | Fast motion success rate | Low resolution precision | Low resolution success rate |
---|---|---|---|---|---|---|---|---|---|---|
SCT | 0.726 | 0.460 | 0.676 | 0.425 | 0.421 | 0.260 | 0.500 | 0.317 | 0.666 | 0.414 |
KCF_AST | 0.772 | 0.469 | 0.805 | 0.416 | 0.582 | 0.368 | 0.645 | 0.421 | 0.783 | 0.475 |
MDNet_AST | 0.803 | 0.507 | 0.794 | 0.519 | 0.717 | 0.464 | 0.809 | 0.537 | 0.805 | 0.527 |
ECO | 0.757 | 0.480 | 0.777 | 0.508 | 0.696 | 0.453 | 0.770 | 0.514 | 0.900 | 0.587 |
SiamTrans | 0.796 | 0.571 | 0.826 | 0.534 | 0.763 | 0.525 | 0.836 | 0.570 | 0.844 | 0.599 |
Tab.1 Comparison of tracking precision and success rate for different algorithms under different attributes on the Small90 dataset
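The page does not define the two metrics reported in the tables. The sketch below assumes the convention common to OTB-style benchmarks: precision is the fraction of frames whose predicted box center lies within a pixel threshold of the ground-truth center (20 pixels is an assumed default), and success rate is the area under the curve of the fraction of frames whose IoU exceeds each overlap threshold. The paper's exact evaluation protocol may differ, and all function names here are hypothetical.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2.0, a[1] + a[3] / 2.0
    bx, by = b[0] + b[2] / 2.0, b[1] + b[3] / 2.0
    return math.hypot(ax - bx, ay - by)


def precision(preds, gts, threshold=20.0):
    """Fraction of frames with center error at or below the threshold."""
    hits = sum(1 for p, g in zip(preds, gts) if center_error(p, g) <= threshold)
    return hits / len(gts)


def success_auc(preds, gts, steps=21):
    """Mean success rate over evenly spaced IoU thresholds in [0, 1]."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    rates = [sum(1 for o in overlaps if o > t) / len(overlaps)
             for t in thresholds]
    return sum(rates) / len(rates)


# Toy sequence: one perfect prediction and one complete miss.
gts = [(0, 0, 10, 10), (0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (100, 100, 10, 10)]
prec = precision(preds, gts)
succ = success_auc(preds, gts)
```

Under these assumptions, a tracker can score a high precision and a lower success rate at the same time when its boxes are well centered but poorly sized, which is consistent with the precision columns in the tables being larger than the success-rate columns.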
Algorithm | Small112 success rate | Small112 precision | UAV20L success rate | UAV20L precision |
---|---|---|---|---|
SAMF | - | - | 0.380 | 0.457 |
KCF | 0.416 | 0.580 | 0.202 | 0.321 |
KCF_AST | 0.492 | 0.710 | 0.204 | 0.345 |
ECO | 0.629 | 0.779 | - | - |
CSK | 0.429 | 0.585 | 0.177 | 0.309 |
DaSiamRPN_AST | 0.693 | 0.805 | 0.705 | 0.717 |
SiamTrans | 0.687 | 0.809 | 0.710 | 0.721 |
Tab.2 Comparison of tracking success rate and precision for different algorithms on the Small112 and UAV20L datasets
Algorithm | Overall | Scale variation | Fast motion | Target disappearance | Illumination variation | Camera motion | Motion blur |
---|---|---|---|---|---|---|---|
ECO | 0.3260 | 0.3200 | 0.1870 | 0.1120 | 0.3820 | 0.2950 | 0.0802 |
Ocean | 0.3430 | 0.3570 | 0.2090 | 0.1250 | 0.3950 | 0.2720 | 0.0936 |
SiamRPN++ | 0.3590 | 0.3680 | 0.1960 | 0.1160 | 0.4010 | 0.2880 | 0.0828 |
SiamBAN | 0.3490 | 0.3730 | 0.2120 | 0.1110 | 0.4200 | 0.2860 | 0.0976 |
MKDNet | 0.4130 | 0.4410 | 0.2640 | 0.1710 | 0.4740 | 0.3780 | 0.1410 |
SiamTrans | 0.4190 | 0.4388 | 0.2703 | 0.1705 | 0.4803 | 0.3806 | 0.1478 |
Tab.3 Comparison of tracking success rate for different algorithms under different attributes on the LaTOT dataset
Control group | Small90 success rate | Small90 precision | UAV123_10fps success rate | UAV123_10fps precision |
---|---|---|---|---|
Cross-correlation operation | 0.502 | 0.756 | 0.596 | 0.815 |
2-layer FEM-FDM | 0.469 | 0.738 | 0.600 | 0.812 |
4-layer FEM-FDM | 0.485 | 0.747 | 0.583 | 0.784 |
6-layer FEM-FDM | 0.507 | 0.768 | 0.603 | 0.811 |
8-layer FEM-FDM | 0.476 | 0.740 | 0.594 | 0.807 |
Tab.4 Ablation experimental results of the similarity response map calculation module
Number of PM layers | Small90 success rate | Small90 precision | UAV123_10fps success rate | UAV123_10fps precision |
---|---|---|---|---|
0 | 0.487 | 0.745 | 0.605 | 0.802 |
2 | 0.487 | 0.749 | 0.611 | 0.814 |
4 | 0.492 | 0.745 | 0.621 | 0.819 |
6 | 0.505 | 0.772 | 0.621 | 0.822 |
8 | 0.025 | 0.042 | 0.017 | 0.033 |
Tab.5 Ablation experimental results of the prediction module
1 | HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596. 10.1109/tpami.2014.2345390 |
2 | LI B, YAN J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8971-8980. 10.1109/cvpr.2018.00935 |
3 | GUO D, WANG J, CUI Y, et al. SiamCAR: Siamese fully convolutional classification and regression for visual tracking [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6268-6276. 10.1109/cvpr42600.2020.00630 |
4 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010. |
5 | WANG M T, YANG W Z, WU Y Z. Survey of single target tracking algorithms based on Siamese network [J]. Journal of Computer Applications, 2023, 43(3): 661-673. |
6 | LIU C, DING W, YANG J, et al. Aggregation signature for small object tracking [J]. IEEE Transactions on Image Processing, 2020, 29: 1738-1747. 10.1109/tip.2019.2940477 |
7 | ZHU Y, LI C, LIU Y, et al. Tiny object tracking: a large-scale dataset and a baseline[EB/OL]. (2022-02-11) [2022-09-16]. 10.1109/tnnls.2023.3239529 |
8 | MUELLER M, SMITH N, GHANEM B. A benchmark and simulator for UAV tracking [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 445-461. |
9 | ZHU W Q, ZOU G, ZENG Z G. Object tracking algorithm with hierarchical features and hybrid attention[J]. Journal of Computer Applications, 2022, 42(3): 833-843. |
10 | AHMADI K, SALARI E. Small dim object tracking using frequency and spatial domain information[J]. Pattern Recognition, 2016, 58: 227-234. 10.1016/j.patcog.2016.04.001 |
11 | AHMADI K, SALARI E. Small dim object tracking using a multi objective particle swarm optimisation technique[J]. IET Image Processing, 2015, 9(9): 820-826. 10.1049/iet-ipr.2014.0927 |
12 | MARVASTI-ZADEH S M, KHAGHANI J, CHANEI-YAKHDAN H, et al. COMET: context-aware IoU-guided network for small object tracking [C]// Proceedings of the 2020 Asian Conference on Computer Vision, LNCS 12623. Cham: Springer, 2021: 594-611. |
13 | HENRIQUES J F, CASEIRO R, MARTINS P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]// Proceedings of the 2012 European Conference on Computer Vision, LNCS 7575. Berlin: Springer, 2012: 702-715. |
14 | LI Y, ZHU J. A scale adaptive kernel correlation filter tracker with feature integration[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8926. Cham: Springer, 2015: 254-265. |
15 | DANELLJAN M, BHAT G, SHAHBAZ KHAN F, et al. ECO: efficient convolution operators for tracking [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6931-6939. 10.1109/cvpr.2017.733 |
16 | BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9914. Cham: Springer, 2016: 850-865. |
17 | LI B, WU W, WANG Q, et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 4282-4291. 10.1109/cvpr.2019.00441 |
18 | ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11213. Cham: Springer, 2018: 103-119. |
19 | WANG Q, ZHANG L, BERTINETTO L, et al. Fast online object tracking and segmentation: a unifying approach [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1328-1338. 10.1109/cvpr.2019.00142 |
20 | CHEN Z, ZHONG B, LI G, et al. Siamese box adaptive network for visual tracking[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6667-6676. 10.1109/cvpr42600.2020.00670 |
21 | YAN B, PENG H, FU J, et al. Learning spatio-temporal Transformer for visual tracking [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10428-10437. 10.1109/iccv48922.2021.01028 |
22 | WANG N, ZHOU W, WANG J, et al. Transformer meets tracker: exploiting temporal context for robust visual tracking [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 1571-1580. 10.1109/cvpr46437.2021.00162 |
23 | BLATTER P, KANAKIS M, DANELLJAN M, et al. Efficient visual tracking with Exemplar Transformers [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 1571-1581. 10.1109/wacv56688.2023.00162 |
24 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90 |
25 | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755. |
26 | RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge [J]. International Journal of Computer Vision, 2015, 115(3): 211-252. 10.1007/s11263-015-0816-y |
27 | HUANG L, ZHAO X, HUANG K. GOT-10k: a large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1562-1577. 10.1109/tpami.2019.2957464 |
28 | CHEN X, YAN B, ZHU J, et al. Transformer tracking[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8122-8131. 10.1109/cvpr46437.2021.00803 |
29 | CHOI J, CHANG H J, JEONG J, et al. Visual tracking using attention-modulated disintegration and integration[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4321-4330. 10.1109/cvpr.2016.468 |
NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4293-4302. 10.1109/cvpr.2016.468 | |
30 | GRABNER H, GRABNER M, BISCHOF H. Real-time tracking via on-line boosting [EB/OL]. [2022-11-20]. 10.5244/c.20.6 |
31 | ZHANG J, MA S, SCLAROFF S. MEEM: robust tracking via multiple experts using entropy minimization [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8694. Cham: Springer, 2014: 188-203. |
32 | ZHANG Z, PENG H, FU J, et al. Ocean: object-aware anchor-free tracking [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12366. Cham: Springer, 2020: 771-787. |
33 | CHEN X, YAN B, ZHU J, et al. Transformer tracking[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8122-8131. 10.1109/cvpr46437.2021.00803 |