Abstract: To improve the tracking accuracy of the fast online object tracking and segmentation algorithm, a dynamically weighted Siamese network tracking algorithm is proposed. First, the template features extracted from the initial frame and the template features extracted from each frame are learned and fused to improve the generalization ability of the tracker. Second, when the mask branch produces the target mask, its features are fused by weighting, which reduces the interference caused by redundant features and improves tracking accuracy. The algorithm was evaluated on the VOT2016 and VOT2018 datasets. On these two datasets respectively, the proposed algorithm achieves an expected average overlap (EAO) of 0.450 and 0.390, an accuracy of 0.649 and 0.618, and a robustness of 0.205 and 0.267, all of which are better than those of the baseline algorithm (for the VOT robustness measure, a lower value is better). The tracking speed of the proposed algorithm is 34 frames/s, which meets the requirement of real-time tracking. The proposed algorithm effectively improves tracking accuracy and completes the tracking task well in complex tracking environments.
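The abstract does not say how the fusion weights are obtained, so the following is only a minimal sketch of the general idea of weighted feature fusion, not the authors' implementation. It assumes each feature is a flat list of numbers and that one scalar score per feature is available (in a real tracker these scores would be learned); the names `softmax`, `weighted_fuse`, `initial_template`, and `current_template` are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fuse(features, scores):
    """Fuse equally sized feature vectors as a softmax-weighted sum.

    Assumption: one scalar score per feature vector; the paper's actual
    weighting scheme is not given in the abstract.
    """
    if len(features) != len(scores):
        raise ValueError("need exactly one score per feature vector")
    weights = softmax(scores)
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]


# Toy stand-ins for the initial-frame template and a current-frame template.
initial_template = [1.0, 0.0, 2.0]
current_template = [0.0, 1.0, 2.0]

# Equal scores give equal weights, so the result is the plain average.
fused = weighted_fuse([initial_template, current_template], [0.0, 0.0])
```

Raising the score of one feature shifts the fused result toward it, which is the sense in which a weighting can suppress less useful (redundant) features.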