Journal of Computer Applications, 2023, Vol. 43, Issue (12): 3733-3739. DOI: 10.11772/j.issn.1001-9081.2022111790
Special Issue: Artificial Intelligence
• Artificial intelligence •
Haitao GONG¹, Zhihua CHEN¹, Bin SHENG², Bingyan ZHU¹
Received: 2022-12-06
Revised: 2023-02-23
Accepted: 2023-02-27
Online: 2023-03-13
Published: 2023-12-10
Contact: Zhihua CHEN
About author: GONG Haitao, born in 1998, from Linyi, Shandong, M. S. candidate. His research interests include computer vision and deep learning.
Haitao GONG, Zhihua CHEN, Bin SHENG, Bingyan ZHU. SiamTrans: tiny object tracking algorithm based on Siamese network and Transformer[J]. Journal of Computer Applications, 2023, 43(12): 3733-3739.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022111790
Algorithm | Occlusion: precision | Occlusion: success rate | Deformation: precision | Deformation: success rate | Motion blur: precision | Motion blur: success rate | Fast motion: precision | Fast motion: success rate | Low resolution: precision | Low resolution: success rate |
---|---|---|---|---|---|---|---|---|---|---|
SCT | 0.726 | 0.460 | 0.676 | 0.425 | 0.421 | 0.260 | 0.500 | 0.317 | 0.666 | 0.414 |
KCF_AST | 0.772 | 0.469 | 0.805 | 0.416 | 0.582 | 0.368 | 0.645 | 0.421 | 0.783 | 0.475 |
MDNet_AST | 0.803 | 0.507 | 0.794 | 0.519 | 0.717 | 0.464 | 0.809 | 0.537 | 0.805 | 0.527 |
ECO | 0.757 | 0.480 | 0.777 | 0.508 | 0.696 | 0.453 | 0.770 | 0.514 | 0.900 | 0.587 |
SiamTrans | 0.796 | 0.571 | 0.826 | 0.534 | 0.763 | 0.525 | 0.836 | 0.570 | 0.844 | 0.599 |
Tab.1 Comparison results of tracking precision and success rate for different algorithms in different attributes on Small90 dataset
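The tables on this page report precision and success rate without defining them. The following is a minimal sketch of how these two metrics are often computed in single-object tracking benchmarks, assuming the common OTB-style definitions: precision as the fraction of frames whose center location error is within 20 pixels, and success rate as the area under the success plot over IoU thresholds from 0 to 1. These definitions, the 20-pixel threshold, the `(x, y, w, h)` box layout, and all function names are assumptions for illustration and may differ from the evaluation protocol actually used in the paper.

```python
from math import hypot


def center_error(pred, gt):
    """Distance in pixels between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2, pred[1] + pred[3] / 2
    gx, gy = gt[0] + gt[2] / 2, gt[1] + gt[3] / 2
    return hypot(px - gx, py - gy)


def iou(pred, gt):
    """Intersection over union of two (x, y, w, h) boxes."""
    iw = min(pred[0] + pred[2], gt[0] + gt[2]) - max(pred[0], gt[0])
    ih = min(pred[1] + pred[3], gt[1] + gt[3]) - max(pred[1], gt[1])
    inter = max(iw, 0.0) * max(ih, 0.0)
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    return inter / union if union > 0 else 0.0


def precision(preds, gts, threshold=20.0):
    """Fraction of frames whose center error is at most `threshold` pixels.

    The 20-pixel default is an assumed convention, not taken from the paper.
    """
    hits = sum(center_error(p, g) <= threshold for p, g in zip(preds, gts))
    return hits / len(gts)


def success_rate(preds, gts, steps=21):
    """Area under the success plot (assumed definition).

    For each IoU threshold in [0, 1], compute the fraction of frames whose
    IoU exceeds it, then average those fractions over all thresholds.
    """
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    curve = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(curve) / len(curve)


# Toy sequence: one perfectly tracked frame and one completely lost frame.
preds = [(0, 0, 10, 10), (100, 100, 10, 10)]
gts = [(0, 0, 10, 10), (0, 0, 10, 10)]
print(precision(preds, gts), success_rate(preds, gts))
```

Under these assumptions, a tracker can score high on precision while scoring lower on success rate when it localizes the target center well but estimates the box size poorly, which matters for tiny targets where a few pixels change the IoU substantially.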
Algorithm | Small112: success rate | Small112: precision | UAV20L: success rate | UAV20L: precision |
---|---|---|---|---|
SAMF | — | — | 0.380 | 0.457 |
KCF | 0.416 | 0.580 | 0.202 | 0.321 |
KCF_AST | 0.492 | 0.710 | 0.204 | 0.345 |
ECO | 0.629 | 0.779 | — | — |
CSK | 0.429 | 0.585 | 0.177 | 0.309 |
DaSiamRPN_AST | 0.693 | 0.805 | 0.705 | 0.717 |
SiamTrans | 0.687 | 0.809 | 0.710 | 0.721 |
Tab.2 Comparison results of tracking success rate and precision for different algorithms on Small112 and UAV20L datasets
Algorithm | Overall | Scale variation | Fast motion | Target disappearance | Illumination variation | Camera motion | Motion blur |
---|---|---|---|---|---|---|---|
ECO | 0.3260 | 0.3200 | 0.1870 | 0.1120 | 0.3820 | 0.2950 | 0.0802 |
Ocean | 0.3430 | 0.3570 | 0.2090 | 0.1250 | 0.3950 | 0.2720 | 0.0936 |
SiamRPN++ | 0.3590 | 0.3680 | 0.1960 | 0.1160 | 0.4010 | 0.2880 | 0.0828 |
SiamBAN | 0.3490 | 0.3730 | 0.2120 | 0.1110 | 0.4200 | 0.2860 | 0.0976 |
MKDNet | 0.4130 | 0.4410 | 0.2640 | 0.1710 | 0.4740 | 0.3780 | 0.1410 |
SiamTrans | 0.4190 | 0.4388 | 0.2703 | 0.1705 | 0.4803 | 0.3806 | 0.1478 |
Tab.3 Comparison of tracking success rate for different algorithms in different attributes on LaTOT dataset
Configuration | Small90: success rate | Small90: precision | UAV123_10fps: success rate | UAV123_10fps: precision |
---|---|---|---|---|
Cross-correlation operation | 0.502 | 0.756 | 0.596 | 0.815 |
2-layer FEM-FDM | 0.469 | 0.738 | 0.600 | 0.812 |
4-layer FEM-FDM | 0.485 | 0.747 | 0.583 | 0.784 |
6-layer FEM-FDM | 0.507 | 0.768 | 0.603 | 0.811 |
8-layer FEM-FDM | 0.476 | 0.740 | 0.594 | 0.807 |
Tab.4 Ablation experimental results of similarity response map calculation module
Number of PM layers | Small90: success rate | Small90: precision | UAV123_10fps: success rate | UAV123_10fps: precision |
---|---|---|---|---|
0 | 0.487 | 0.745 | 0.605 | 0.802 |
2 | 0.487 | 0.749 | 0.611 | 0.814 |
4 | 0.492 | 0.745 | 0.621 | 0.819 |
6 | 0.505 | 0.772 | 0.621 | 0.822 |
8 | 0.025 | 0.042 | 0.017 | 0.033 |
Tab.5 Ablation experiment results of prediction module
1 | HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596. 10.1109/tpami.2014.2345390 |
2 | LI B, YAN J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8971-8980. 10.1109/cvpr.2018.00935 |
3 | GUO D, WANG J, CUI Y, et al. SiamCAR: Siamese fully convolutional classification and regression for visual tracking [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6268-6276. 10.1109/cvpr42600.2020.00630 |
4 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010. |
5 | WANG M T, YANG W Z, WU Y Z. Survey of single target tracking algorithms based on Siamese network [J]. Journal of Computer Applications, 2023, 43(3): 661-673. (in Chinese) |
6 | LIU C, DING W, YANG J, et al. Aggregation signature for small object tracking [J]. IEEE Transactions on Image Processing, 2020, 29: 1738-1747. 10.1109/tip.2019.2940477 |
7 | ZHU Y, LI C, LIU Y, et al. Tiny object tracking: a large-scale dataset and a baseline[EB/OL]. (2022-02-11) [2022-09-16]. 10.1109/tnnls.2023.3239529 |
8 | MUELLER M, SMITH N, GHANEM B. A benchmark and simulator for UAV tracking [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 445-461. |
9 | ZHU W Q, ZOU G, ZENG Z G. Object tracking algorithm with hierarchical features and hybrid attention[J]. Journal of Computer Applications, 2022, 42(3): 833-843. (in Chinese) |
10 | AHMADI K, SALARI E. Small dim object tracking using frequency and spatial domain information[J]. Pattern Recognition, 2016, 58: 227-234. 10.1016/j.patcog.2016.04.001 |
11 | AHMADI K, SALARI E. Small dim object tracking using a multi objective particle swarm optimisation technique[J]. IET Image Processing, 2015, 9(9): 820-826. 10.1049/iet-ipr.2014.0927 |
12 | MARVASTI-ZADEH S M, KHAGHANI J, CHANEI-YAKHDAN H, et al. COMET: context-aware IoU-guided network for small object tracking [C]// Proceedings of the 2020 Asian Conference on Computer Vision, LNCS 12623. Cham: Springer, 2021: 594-611. |
13 | HENRIQUES J F, CASEIRO R, MARTINS P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]// Proceedings of the 2012 European Conference on Computer Vision, LNCS 7575. Berlin: Springer, 2012: 702-715. |
14 | LI Y, ZHU J. A scale adaptive kernel correlation filter tracker with feature integration[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8926. Cham: Springer, 2015: 254-265. |
15 | DANELLJAN M, BHAT G, SHAHBAZ KHAN F, et al. ECO: efficient convolution operators for tracking [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6931-6939. 10.1109/cvpr.2017.733 |
16 | BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9914. Cham: Springer, 2016: 850-865. |
17 | LI B, WU W, WANG Q, et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 4282-4291. 10.1109/cvpr.2019.00441 |
18 | ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11213. Cham: Springer, 2018: 103-119. |
19 | WANG Q, ZHANG L, BERTINETTO L, et al. Fast online object tracking and segmentation: a unifying approach [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1328-1338. 10.1109/cvpr.2019.00142 |
20 | CHEN Z, ZHONG B, LI G, et al. Siamese box adaptive network for visual tracking[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6667-6676. 10.1109/cvpr42600.2020.00670 |
21 | YAN B, PENG H, FU J, et al. Learning spatio-temporal Transformer for visual tracking [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10428-10437. 10.1109/iccv48922.2021.01028 |
22 | WANG N, ZHOU W, WANG J, et al. Transformer meets tracker: exploiting temporal context for robust visual tracking [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 1571-1580. 10.1109/cvpr46437.2021.00162 |
23 | BLATTER P, KANAKIS M, DANELLJAN M, et al. Efficient visual tracking with Exemplar Transformers [C]// Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2023: 1571-1581. 10.1109/wacv56688.2023.00162 |
24 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90 |
25 | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755. |
26 | RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge [J]. International Journal of Computer Vision, 2015, 115(3): 211-252. 10.1007/s11263-015-0816-y |
27 | HUANG L, ZHAO X, HUANG K. GOT-10k: a large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1562-1577. 10.1109/tpami.2019.2957464 |
28 | CHEN X, YAN B, ZHU J,et al. Transformer tracking[C]//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway:IEEE,2021:8122-8131. 10.1109/cvpr46437.2021.00803 |
29 | CHOI J, CHANG H J, JEONG J, et al. Visual tracking using attention-modulated disintegration and integration[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4321-4330. 10.1109/cvpr.2016.468 |
NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4293-4302. 10.1109/cvpr.2016.468 | |
30 | GRABNER H, GRABNER M, BISCHOF H. Real-time tracking via on-line boosting [EB/OL]. [2022-11-20]. 10.5244/c.20.6 |
31 | ZHANG J, MA S, SCLAROFF S. MEEM: robust tracking via multiple experts using entropy minimization [C]// Proceedings of the 2014 European Conference on Computer Vision,LNCS 8694. Cham:Springer, 2014:188-203. |
32 | ZHANG Z, PENG H, FU J, et al. Ocean: object-aware anchor-free tracking [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12366. Cham: Springer, 2020: 771-787. |
33 | CHEN X, YAN B, ZHU J, et al. Transformer tracking[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8122-8131. 10.1109/cvpr46437.2021.00803 |