Three-dimensional spatio-temporal feature extraction method for action recognition

doi:10.11772/j.issn.1001-9081.2016.02.0568

Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (2): 568-573. DOI: 10.11772/j.issn.1001-9081.2016.02.0568

XU Haining¹, CHEN Enqing¹, LIANG Chengwu¹,²

1. College of Information Engineering, Zhengzhou University, Zhengzhou Henan 450001, China;
2. College of Electrical and Information Engineering, Henan University of Urban Construction, Pingdingshan Henan 467001, China

Received: 2015-07-13 Revised: 2015-09-09 Online: 2016-02-03 Published: 2016-02-10
Corresponding author: XU Haining (born 1989), male, from Hebi, Henan; M.S. candidate; research interests: pattern recognition, image processing.
About the authors: CHEN Enqing (born 1977), male, from Zhengzhou, Henan; associate professor, Ph.D.; research interests: image processing, signal processing. LIANG Chengwu (born 1982), male, from Pingdingshan, Henan; Ph.D. candidate; research interests: pattern recognition, image processing.
Supported by:
Major International Cooperation Project of the National Natural Science Foundation of China (61210005); Key Program of the National Natural Science Foundation of China (61331021).

Abstract

Abstract: To address the high cost of traditional action recognition algorithms for color video and the poor recognition performance caused by insufficient two-dimensional information, a new human action recognition method based on three-dimensional depth image sequences is proposed. In the temporal dimension, a Temporal Depth Model (TDM) is proposed to describe the action. Specifically, on three orthogonal Cartesian planes, the depth image sequence is divided into several sub-actions, and for each sub-action the absolute differences between consecutive projected maps are accumulated to form a depth motion map that describes the dynamic features of the action. In the spatial dimension, the Spatial Pyramid Histogram of Oriented Gradient (SPHOG) is computed from the TDM to obtain the final descriptor. Finally, a Support Vector Machine (SVM) is used to classify the descriptors. The method was tested on two authoritative datasets, MSR Action3D and MSRGesture3D, and the recognition rates were 94.90% (cross-subject test) and 94.86%, respectively. The experimental results show that the method processes depth image sequences quickly, achieves a high recognition rate, and largely meets the real-time requirement of depth video sequence systems.

Key words: action recognition, three-dimensional depth image, Histogram of Oriented Gradient (HOG), spatio-temporal pyramid, depth motion map
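The temporal step described in the abstract (splitting the sequence into sub-actions and accumulating absolute frame differences on three orthogonal planes) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names `project_views` and `depth_motion_maps`, the binary side and top projections, the depth quantization (`z_bins`, `z_max`), and the equal-length temporal split are all assumptions made for the example, since the abstract does not specify them.

```python
import numpy as np

def project_views(depth, z_bins=32, z_max=255.0):
    """Project one depth frame (H x W, 0 = background) onto three planes.

    Assumption: the front view is the depth frame itself; the side and
    top views are binary occupancy maps over quantized depth.
    """
    h, w = depth.shape
    front = depth.astype(np.float64)
    side = np.zeros((h, z_bins))   # rows: y, columns: quantized depth
    top = np.zeros((z_bins, w))    # rows: quantized depth, columns: x
    ys, xs = np.nonzero(depth)
    zs = np.minimum((depth[ys, xs] / z_max * z_bins).astype(int), z_bins - 1)
    side[ys, zs] = 1.0
    top[zs, xs] = 1.0
    return front, side, top

def depth_motion_maps(frames, n_segments=2):
    """Return, for each sub-action, a (front, side, top) tuple of motion maps.

    Each map accumulates the absolute difference between consecutive
    projected frames inside that sub-action. Assumes every segment
    contains at least one frame (n_segments <= len(frames)).
    """
    views = [project_views(f) for f in frames]
    # Hypothetical choice: split the sequence into equal-length segments.
    bounds = np.linspace(0, len(frames), n_segments + 1).astype(int)
    result = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        segment = views[start:stop]
        maps = []
        for v in range(3):
            acc = np.zeros_like(segment[0][v])
            for prev, curr in zip(segment[:-1], segment[1:]):
                acc += np.abs(curr[v] - prev[v])
            maps.append(acc)
        result.append(tuple(maps))
    return result
```

In a full pipeline, each resulting map would then be encoded by a spatial pyramid of HOG descriptors and the concatenated vector passed to an SVM, as the abstract outlines.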

CLC number: TP391.413

XU Haining, CHEN Enqing, LIANG Chengwu. Three-dimensional spatio-temporal feature extraction method for action recognition[J]. Journal of Computer Applications, 2016, 36(2): 568-573.

