Journal of Computer Applications, 2023, Vol. 43, Issue (3): 661-673. DOI: 10.11772/j.issn.1001-9081.2022010150
• Artificial intelligence •
Mengting WANG, Wenzhong YANG, Yongzhi WU
Received: 2022-02-11
Revised: 2022-04-28
Accepted: 2022-05-05
Online: 2022-05-24
Published: 2023-03-10
Contact: Wenzhong YANG
About author: WANG Mengting, born in 1995 in Zhoukou, Henan, M. S. candidate. Her research interests include single object tracking and computer vision.
Mengting WANG, Wenzhong YANG, Yongzhi WU. Survey of single target tracking algorithms based on Siamese network[J]. Journal of Computer Applications, 2023, 43(3): 661-673.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022010150
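Type | Name | Features used | AUC | Frame rate on CPU/(frame·s⁻¹) | Frame rate on GPU/(frame·s⁻¹) |
---|---|---|---|---|---|
Traditional correlation filter algorithms | KCF | Raw pixels, HOG | 0.477 | 172 | — |
Traditional correlation filter algorithms | FDSST | HOG | 0.551 | 54.3 | — |
Correlation filter algorithms combined with deep features | DeepSRDCF | HOG, deep appearance features | 0.635 | — | 0.2 |
Correlation filter algorithms combined with deep features | ECO | HOG, CN, deep appearance features | 0.691 | — | 8 |
Other deep algorithms | MDNet | Deep appearance features | 0.678 | — | 1 |
Siamese network-based algorithms | SiamFC | Deep appearance features | 0.582 | — | 86 |
Siamese network-based algorithms | SiamRPN | Deep appearance features | 0.637 | — | 160 |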
Tab. 1 Comparison of different types of tracking algorithms
Category | Dataset | Training videos | Test videos | Total frames/10⁶ | Min frames | Mean frames | Max frames | Frame rate/(frame·s⁻¹) | Overlapping datasets | Attributes |
---|---|---|---|---|---|---|---|---|---|---|
Early datasets | OTB2013 | — | 51 | 0.029 | 71 | 578 | 3 872 | 30 | VOT, OTB2015, TColor-128 | IV, SV, OCC, DEF, MB, FM, IPR, OPR, OV, BC, LR |
Early datasets | OTB2015 | — | 100 | 0.059 | 71 | 590 | 3 872 | 30 | VOT, OTB2013, TColor-128 | IV, SV, OCC, DEF, MB, FM, IPR, OPR, OV, BC, LR |
Early datasets | TColor-128 | — | 128 | 0.055 | 71 | 429 | 3 872 | 30 | OTB, VOT | IV, SV, OCC, DEF, MB, FM, IPR, OPR, OV, BC, LR |
Early datasets | UAV123 | — | 123 | 0.113 | 109 | 915 | 3 085 | 30 | VOT, UAV20L | ARC, BC, CM, FM, FOC, IV, LR, OV, POC, SOB, SV, VC |
Early datasets | UAV20L | — | 20 | 0.059 | 1 717 | 2 934 | 5 527 | 30 | VOT, UAV123 | ARC, BC, CM, FM, FOC, IV, LR, OV, POC, SOB, SV, VC |
Early datasets | NfS | — | 100 | 0.383 | 169 | 3 830 | 20 665 | 240 | YouTube | IV, SV, OCC, DEF, FM, OV, BC, LR, VC |
Early datasets | VOT2018 | — | 60 | 0.021 36 | 41 | 356 | 1 500 | 30 | NUS-PRO, OTB, TColor-128, UAV123 | OCO, SCO, ARC, CM, MOC, DEF, AM, MB, BC, IV, SV, OCC |
Early datasets | VOT2020 | — | 60 | — | — | — | — | 30 | OTB, VOT, ALOV++, UAV123, NUS-PRO, TColor-128, RGBT234 | OCC, IV, MOC, SZ, CM |
Large-scale datasets | TrackingNet | 30 132 | 511 | 14.43 | — | 480 | — | 30 | YouTube-BB | IV, SV, DEF, MB, FM, IPR, OPR, OV, BC, LR, ARC, CM, FOC, POC, SOB |
Large-scale datasets | GOT-10K | 9 335 | 180 | 1.5 | 29 | 149 | 1 418 | 10 | VOT, WordNet, ImageNet | IV, SV, OCC, FM, ARC, LR |
Large-scale datasets | LaSOT | 1 120 | 280 | 3.87 | 1 000 | 2 502 | 11 397 | 30 | YouTube, ImageNet | IV, SV, DEF, MB, FM, OV, BC, LR, ARC, CM, FOC, POC, VC, ROT |
Large-scale datasets | TNL2K | 1 300 | 700 | 1.24 | 21 | 622 | 18 488 | 30 | YouTube | CM, ROT, DEF, FOC, IV, OV, POC, VC, SV, BC, MB, ARC, LR, FM, AS, TC, MS |
Tab. 2 Details of commonly used tracking datasets
Algorithm | Source | OTB2015 AUC (FM) | OTB2015 AUC (MB) | OTB2015 AUC (OV) | OTB2015 AUC (IPR) | OTB2015 AUC (OPR) | LaSOT AUC | GOT-10K AO | Frame rate/(frame·s⁻¹) |
---|---|---|---|---|---|---|---|---|---|
SAOT | ICCV2021 | 0.703 | 0.716 | 0.663 | 0.726 | 0.702 | 0.616 | 0.640 | 29.00 |
AutoMatch | ICCV2021 | 0.721 | 0.732 | 0.687 | 0.714 | 0.705 | 0.583 | 0.652 | 50.00 |
STARK-ST50 | ICCV2021 | 0.709 | 0.733 | 0.666 | 0.675 | 0.667 | 0.664 | 0.680 | 40.00 |
STMTrack | CVPR2021 | 0.729 | 0.740 | 0.667 | 0.730 | 0.707 | 0.606 | 0.642 | 37.00 |
TrDiMP | CVPR2021 | 0.715 | 0.736 | 0.691 | 0.721 | 0.701 | 0.639 | 0.671 | 26.00 |
TrSiam | CVPR2021 | 0.706 | 0.729 | 0.679 | 0.709 | 0.687 | 0.624 | 0.660 | 35.00 |
SiamGAT | CVPR2021 | 0.695 | 0.715 | 0.631 | 0.712 | 0.707 | 0.539 | 0.627 | 70.00 |
SiamBAN-ACM | CVPR2021 | 0.734 | 0.744 | 0.700 | 0.729 | 0.715 | 0.572 | — | 41.00 |
TransT | CVPR2021 | 0.720 | 0.744 | 0.684 | 0.694 | 0.674 | 0.649 | 0.671 | 50.00 |
CGACD | CVPR2020 | 0.702 | 0.713 | 0.636 | 0.722 | 0.704 | 0.518 | — | 70.00 |
SiamBAN | CVPR2020 | 0.687 | 0.698 | 0.640 | 0.717 | 0.687 | 0.514 | — | 40.00 |
SiamCAR | CVPR2020 | 0.703 | 0.715 | 0.661 | 0.703 | 0.679 | — | 0.569* | 52.27 |
ROAM++ | CVPR2020 | 0.679 | 0.660 | 0.622 | 0.664 | 0.659 | 0.447 | 0.465 | 20.00 |
PrDiMP-50 | CVPR2020 | 0.699 | 0.728 | 0.656 | 0.705 | 0.686 | 0.598 | 0.634 | 30.00 |
Siam R-CNN | CVPR2020 | 0.702 | 0.735 | 0.677 | 0.699 | 0.684 | 0.648 | 0.649 | 4.70 |
Ocean | ECCV2020 | 0.668 | 0.681 | 0.613 | 0.697 | 0.677 | 0.560 | 0.611 | 25.00 |
SiamRPN++ | CVPR2019 | 0.686 | 0.703 | 0.646 | 0.694 | 0.680 | 0.496 | 0.517* | 35.00 |
SiamDW | CVPR2019 | 0.665 | 0.696 | 0.641 | 0.648 | 0.658 | 0.384 | 0.416 | 35.00 |
ATOM | CVPR2019 | 0.657 | 0.653 | 0.613 | 0.637 | 0.618 | 0.514 | 0.556* | 30.00 |
DiMP-50 | ICCV2019 | 0.682 | 0.699 | 0.620 | 0.689 | 0.660 | 0.569 | 0.611 | 43.00 |
GradNet | ICCV2019 | 0.624 | 0.645 | 0.583 | 0.627 | 0.628 | 0.365 | — | 80.00 |
SiamRPN | CVPR2018 | 0.606 | 0.627 | 0.550 | 0.636 | 0.631 | 0.433* | — | 160.00 |
SiamFC | ECCVW2016 | 0.579 | 0.586 | 0.469 | 0.565 | 0.558 | 0.336* | 0.348 | 86.00 |
Tab. 3 Performance comparison of Siamese trackers