[1] 单言虎,张彰,黄凯奇.人的视觉行为识别研究回顾、现状及展望[J].计算机研究与发展,2016,53(1):93-112.(SHAN Y H, ZHANG Z, HUANG K Q. Review, current situation and prospect of human visual behavior recognition[J]. Journal of Computer Research and Development, 2016, 53(1):93-112.)
[2] FORSYTH D A, PONCE J. Computer Vision:A Modern Approach[M]. 2nd ed. Englewood Cliffs, NJ:Prentice Hall, 2011:1-2.
[3] CAI Z, WANG L, PENG X, et al. Multi-view super vector for action recognition[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2014:596-603.
[4] WANG H, SCHMID C. Action recognition with improved trajectories[C]//Proceedings of the 2013 IEEE International Conference on Computer Vision. Washington, DC:IEEE Computer Society, 2013:3551-3558.
[5] PENG X, WANG L, WANG X, et al. Bag of visual words and fusion methods for action recognition:comprehensive study and good practice[J]. Computer Vision and Image Understanding, 2016, 150:109-125.
[6] WANG L, QIAO Y, TANG X. MoFAP:a multi-level representation for action recognition[J]. International Journal of Computer Vision, 2016, 119(3):254-271.
[7] KARPATHY A, TODERICI G, SHETTY S, et al. Large-scale video classification with convolutional neural networks[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2014:1725-1732.
[8] TRAN D, BOURDEV L, FERGUS R, et al. Learning spatiotemporal features with 3D convolutional networks[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision. Washington, DC:IEEE Computer Society, 2015:4489-4497.
[9] VAROL G, LAPTEV I, SCHMID C. Long-term temporal convolutions for action recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(6):1510-1517.
[10] SIMONYAN K, ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[C]//Proceedings of the 2014 Conference on Neural Information Processing Systems. New York:Curran Associates, 2014:568-576.
[11] NG Y H, HAUSKNECHT M, VIJAYANARASIMHAN S, et al. Beyond short snippets:deep networks for video classification[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2015:4694-4702.
[12] WANG L M, XIONG Y J, WANG Z, et al. Temporal segment networks:towards good practices for deep action recognition[C]//Proceedings of the 2016 European Conference on Computer Vision. Berlin:Springer, 2016:22-36.
[13] WANG L, QIAO Y, TANG X. Action recognition with trajectory-pooled deep-convolutional descriptors[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2015:4305-4314.
[14] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2016:2818-2826.
[15] MURPHY K P. Machine Learning:A Probabilistic Perspective[M]. Cambridge:MIT Press, 2012:22.
[16] HORN B K P, SCHUNCK B G. Determining optical flow[J]. Artificial Intelligence, 1981, 17(1/2/3):185-203.
[17] 周志华.机器学习[M].北京:清华大学出版社,2016:171-173.(ZHOU Z H. Machine Learning[M]. Beijing:Tsinghua University Press, 2016:171-173.)
[18] JIANG Y G, LIU J, ZAMIR A, et al. Competition track evaluation setup, the first international workshop on action recognition with a large number of classes[EB/OL].[2018-05-20]. http://www.crcv.ucf.edu/ICCV13-Action-Workshop/index.files/Competition_Track_Evaluation.pdf.