Journal of Computer Applications, 2023, Vol. 43, Issue (12): 3733-3739. DOI: 10.11772/j.issn.1001-9081.2022111790
Special Issue: Artificial intelligence
Haitao GONG¹, Zhihua CHEN¹, Bin SHENG², Bingyan ZHU¹
Received: 2022-12-06
Revised: 2023-02-23
Accepted: 2023-02-27
Online: 2023-03-13
Published: 2023-12-10
Contact: Zhihua CHEN
About author: GONG Haitao, born in 1998 in Linyi, Shandong, M. S. candidate. His research interests include computer vision and deep learning.
Haitao GONG, Zhihua CHEN, Bin SHENG, Bingyan ZHU. SiamTrans: tiny object tracking algorithm based on Siamese network and Transformer[J]. Journal of Computer Applications, 2023, 43(12): 3733-3739.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022111790
| Algorithm | Occlusion: precision | Occlusion: success rate | Deformation: precision | Deformation: success rate | Motion blur: precision | Motion blur: success rate | Fast motion: precision | Fast motion: success rate | Low resolution: precision | Low resolution: success rate |
|---|---|---|---|---|---|---|---|---|---|---|
| SCT | 0.726 | 0.460 | 0.676 | 0.425 | 0.421 | 0.260 | 0.500 | 0.317 | 0.666 | 0.414 |
| KCF_AST | 0.772 | 0.469 | 0.805 | 0.416 | 0.582 | 0.368 | 0.645 | 0.421 | 0.783 | 0.475 |
| MDNet_AST | 0.803 | 0.507 | 0.794 | 0.519 | 0.717 | 0.464 | 0.809 | 0.537 | 0.805 | 0.527 |
| ECO | 0.757 | 0.480 | 0.777 | 0.508 | 0.696 | 0.453 | 0.770 | 0.514 | 0.900 | 0.587 |
| SiamTrans | 0.796 | 0.571 | 0.826 | 0.534 | 0.763 | 0.525 | 0.836 | 0.570 | 0.844 | 0.599 |
Tab. 1 Comparison of tracking precision and success rate of different algorithms under different attributes on the Small90 dataset
| Algorithm | Small112: success rate | Small112: precision | UAV20L: success rate | UAV20L: precision |
|---|---|---|---|---|
| SAMF | — | — | 0.380 | 0.457 |
| KCF | 0.416 | 0.580 | 0.202 | 0.321 |
| KCF_AST | 0.492 | 0.710 | 0.204 | 0.345 |
| ECO | 0.629 | 0.779 | — | — |
| CSK | 0.429 | 0.585 | 0.177 | 0.309 |
| DaSiamRPN_AST | 0.693 | 0.805 | 0.705 | 0.717 |
| SiamTrans | 0.687 | 0.809 | 0.710 | 0.721 |
Tab. 2 Comparison of tracking success rate and precision of different algorithms on the Small112 and UAV20L datasets
| Algorithm | Overall | Scale variation | Fast motion | Target disappearance | Illumination variation | Camera motion | Motion blur |
|---|---|---|---|---|---|---|---|
| ECO | 0.3260 | 0.3200 | 0.1870 | 0.1120 | 0.3820 | 0.2950 | 0.0802 |
| Ocean | 0.3430 | 0.3570 | 0.2090 | 0.1250 | 0.3950 | 0.2720 | 0.0936 |
| SiamRPN++ | 0.3590 | 0.3680 | 0.1960 | 0.1160 | 0.4010 | 0.2880 | 0.0828 |
| SiamBAN | 0.3490 | 0.3730 | 0.2120 | 0.1110 | 0.4200 | 0.2860 | 0.0976 |
| MKDNet | 0.4130 | 0.4410 | 0.2640 | 0.1710 | 0.4740 | 0.3780 | 0.1410 |
| SiamTrans | 0.4190 | 0.4388 | 0.2703 | 0.1705 | 0.4803 | 0.3806 | 0.1478 |
Tab. 3 Comparison of tracking success rate of different algorithms under different attributes on the LaTOT dataset
| Control group | Small90: success rate | Small90: precision | UAV123_10fps: success rate | UAV123_10fps: precision |
|---|---|---|---|---|
| Cross-correlation operation | 0.502 | 0.756 | 0.596 | 0.815 |
| 2-layer FEM-FDM | 0.469 | 0.738 | 0.600 | 0.812 |
| 4-layer FEM-FDM | 0.485 | 0.747 | 0.583 | 0.784 |
| 6-layer FEM-FDM | 0.507 | 0.768 | 0.603 | 0.811 |
| 8-layer FEM-FDM | 0.476 | 0.740 | 0.594 | 0.807 |
Tab. 4 Ablation experiment results for the similarity response map calculation module
| Number of PM layers | Small90: success rate | Small90: precision | UAV123_10fps: success rate | UAV123_10fps: precision |
|---|---|---|---|---|
| 0 | 0.487 | 0.745 | 0.605 | 0.802 |
| 2 | 0.487 | 0.749 | 0.611 | 0.814 |
| 4 | 0.492 | 0.745 | 0.621 | 0.819 |
| 6 | 0.505 | 0.772 | 0.621 | 0.822 |
| 8 | 0.025 | 0.042 | 0.017 | 0.033 |
Tab. 5 Ablation experiment results for the prediction module
| 1 | HENRIQUES J F, CASEIRO R, MARTINS P, et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596. 10.1109/tpami.2014.2345390 | 
| 2 | LI B, YAN J, WU W, et al. High performance visual tracking with Siamese region proposal network[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8971-8980. 10.1109/cvpr.2018.00935 | 
| 3 | GUO D, WANG J, CUI Y, et al. SiamCAR: Siamese fully convolutional classification and regression for visual tracking [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6268-6276. 10.1109/cvpr42600.2020.00630 | 
| 4 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010. | 
| 5 | WANG M T, YANG W Z, WU Y Z. Survey of single target tracking algorithms based on Siamese network [J]. Journal of Computer Applications, 2023, 43(3): 661-673. | 
| 6 | LIU C, DING W, YANG J, et al. Aggregation signature for small object tracking [J]. IEEE Transactions on Image Processing, 2020, 29: 1738-1747. 10.1109/tip.2019.2940477 | 
| 7 | ZHU Y, LI C, LIU Y, et al. Tiny object tracking: a large-scale dataset and a baseline[EB/OL]. (2022-02-11) [2022-09-16].. 10.1109/tnnls.2023.3239529 | 
| 8 | MUELLER M, SMITH N, GHANEM B. A benchmark and simulator for UAV tracking [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 445-461. | 
| 9 | ZHU W Q, ZOU G, ZENG Z G. Object tracking algorithm with hierarchical features and hybrid attention[J]. Journal of Computer Applications, 2022, 42(3): 833-843. | 
| 10 | AHMADI K, SALARI E. Small dim object tracking using frequency and spatial domain information[J]. Pattern Recognition, 2016, 58: 227-234. 10.1016/j.patcog.2016.04.001 | 
| 11 | AHMADI K, SALARI E. Small dim object tracking using a multi objective particle swarm optimisation technique[J]. IET Image Processing, 2015, 9(9): 820-826. 10.1049/iet-ipr.2014.0927 | 
| 12 | MARVASTI-ZADEH S M, KHAGHANI J, CHANEI-YAKHDAN H, et al. COMET: context-aware IoU-guided network for small object tracking [C]// Proceedings of the 2020 Asian Conference on Computer Vision, LNCS 12623. Cham: Springer, 2021: 594-611. | 
| 13 | HENRIQUES J F, CASEIRO R, MARTINS P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]// Proceedings of the 2012 European Conference on Computer Vision, LNCS 7575. Berlin: Springer, 2012: 702-715. | 
| 14 | LI Y, ZHU J. A scale adaptive kernel correlation filter tracker with feature integration[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8926. Cham: Springer, 2015: 254-265. | 
| 15 | DANELLJAN M, BHAT G, SHAHBAZ KHAN F, et al. ECO: efficient convolution operators for tracking [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6931-6939. 10.1109/cvpr.2017.733 | 
| 16 | BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional Siamese networks for object tracking[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9914. Cham: Springer, 2016: 850-865. | 
| 17 | LI B, WU W, WANG Q, et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 4282-4291. 10.1109/cvpr.2019.00441 | 
| 18 | ZHU Z, WANG Q, LI B, et al. Distractor-aware Siamese networks for visual object tracking[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11213. Cham: Springer, 2018: 103-119. | 
| 19 | WANG Q, ZHANG L, BERTINETTO L, et al. Fast online object tracking and segmentation: a unifying approach [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 1328-1338. 10.1109/cvpr.2019.00142 | 
| 20 | CHEN Z, ZHONG B, LI G, et al. Siamese box adaptive network for visual tracking[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 6667-6676. 10.1109/cvpr42600.2020.00670 | 
| 21 | YAN B, PENG H, FU J, et al. Learning spatio-temporal Transformer for visual tracking [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10428-10437. 10.1109/iccv48922.2021.01028 | 
| 22 | WANG N, ZHOU W, WANG J, et al. Transformer meets tracker: exploiting temporal context for robust visual tracking [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 1571-1580. 10.1109/cvpr46437.2021.00162 | 
| 23 | BLATTER P, KANAKIS M, DANELLJAN M, et al. Efficient visual tracking with Exemplar Transformers [C]// Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2023: 1571-1581. 10.1109/wacv56688.2023.00162 | 
| 24 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90 | 
| 25 | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755. | 
| 26 | RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge [J]. International Journal of Computer Vision, 2015, 115(3): 211-252. 10.1007/s11263-015-0816-y | 
| 27 | HUANG L, ZHAO X, HUANG K. GOT-10k: a large high-diversity benchmark for generic object tracking in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1562-1577. 10.1109/tpami.2019.2957464 | 
| 28 | CHEN X, YAN B, ZHU J, et al. Transformer tracking[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8122-8131. 10.1109/cvpr46437.2021.00803 | 
| 29 | CHOI J, CHANG H J, JEONG J, et al. Visual tracking using attention-modulated disintegration and integration[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4321-4330. 10.1109/cvpr.2016.468 | 
| NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4293-4302. 10.1109/cvpr.2016.465 | |
| 30 | GRABNER H, GRABNER M, BISCHOF H. Real-time tracking via on-line boosting [EB/OL]. [2022-11-20]. . 10.5244/c.20.6 | 
| 31 | ZHANG J, MA S, SCLAROFF S. MEEM: robust tracking via multiple experts using entropy minimization [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8694. Cham: Springer, 2014: 188-203. | 
| 32 | ZHANG Z, PENG H, FU J, et al. Ocean: object-aware anchor-free tracking [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12366. Cham: Springer, 2020: 771-787. | 
| 33 | CHEN X, YAN B, ZHU J, et al. Transformer tracking[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 8122-8131. 10.1109/cvpr46437.2021.00803 | 