Object tracking algorithm based on parallel tracking and detection framework and deep learning
YAN Ruoyi1, XIONG Dan2, YU Qinghua1, XIAO Junhao1, LU Huimin1
1. College of Intelligence Science and Technology, National University of Defense Technology, Changsha, Hunan 410073, China; 2. Unmanned Systems Research Center, National Innovation Institute of Defense Technology, Beijing 100071, China
Abstract: In the context of air-ground robot collaboration, the appearance of a moving ground object changes greatly from the perspective of a drone, and traditional object tracking algorithms can hardly accomplish target tracking in such scenarios. To solve this problem, an object detecting and tracking algorithm was proposed based on the Parallel Tracking And Detection (PTAD) framework and deep learning. Firstly, the Single Shot MultiBox Detector (SSD) object detection algorithm based on Convolutional Neural Network (CNN) was used as the detector in the PTAD framework to process keyframes, obtaining object information and providing it to the tracker. Secondly, the detector and the tracker processed image frames in parallel, and the overlap between the detection and tracking results as well as the confidence of the tracking results were calculated. Finally, the proposed algorithm determined whether the tracker or the detector needed to be updated according to the tracking or detection status, realizing real-time tracking of the object in image frames. Compared with the original PTAD algorithms on video sequences captured from the perspective of a drone, the experimental results show that the proposed algorithm outperforms the best algorithm under the PTAD framework, with real-time performance improved by 13%, verifying the effectiveness of the proposed algorithm.
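The update decision described in the abstract — comparing the overlap (IoU) between detection and tracking boxes together with the tracker's confidence — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the threshold values (`iou_thresh`, `conf_thresh`) and the function names are hypothetical assumptions, since the paper decides updates from the full tracking/detection status.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def should_update_tracker(det_box, trk_box, trk_conf,
                          iou_thresh=0.5, conf_thresh=0.6):
    """Decide whether the keyframe detection should re-initialize the tracker.

    Hypothetical policy: adopt the detection when the tracker's confidence
    is low or when tracker and detector disagree strongly on location.
    """
    if det_box is None:
        return False  # no detection on this keyframe: keep tracking as-is
    if trk_conf < conf_thresh:
        return True   # tracker unreliable: re-initialize from detection
    return iou(det_box, trk_box) < iou_thresh  # large drift: re-initialize
```

In a PTAD-style loop, the detector would run on keyframes in a separate thread while the tracker processes every frame; `should_update_tracker` would then be evaluated whenever a fresh detection arrives.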
YAN Ruoyi, XIONG Dan, YU Qinghua, XIAO Junhao, LU Huimin. Object tracking algorithm based on parallel tracking and detection framework and deep learning. Journal of Computer Applications, 2019, 39(2): 343-347.