Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (2): 568-573. DOI: 10.11772/j.issn.1001-9081.2016.02.0568

• Artificial Intelligence •

Three-dimensional spatio-temporal feature extraction method for action recognition

XU Haining1, CHEN Enqing1, LIANG Chengwu1,2

  1. College of Information Engineering, Zhengzhou University, Zhengzhou Henan 450001, China;
  2. College of Electrical and Information Engineering, Henan University of Urban Construction, Pingdingshan Henan 467001, China
  • Received: 2015-07-13 Revised: 2015-09-09 Online: 2016-02-10 Published: 2016-02-03
  • Corresponding author: XU Haining (born 1989), male, from Hebi, Henan; M.S. candidate; research interests: pattern recognition, image processing.
  • About the authors: CHEN Enqing (born 1977), male, from Zhengzhou, Henan; associate professor, Ph.D.; research interests: image processing, signal processing. LIANG Chengwu (born 1982), male, from Pingdingshan, Henan; Ph.D. candidate; research interests: pattern recognition, image processing.
  • Supported by:
    Major International Cooperation Project of the National Natural Science Foundation of China (61210005); Key Program of the National Natural Science Foundation of China (61331021).

Abstract: Traditional action recognition algorithms for color video are costly, and the limited information in two-dimensional images leads to poor recognition performance. To address these problems, a new human action recognition method based on three-dimensional depth image sequences is proposed. In the temporal dimension, a Temporal Depth Model (TDM) is proposed to describe the action: the depth image sequence is projected onto three orthogonal Cartesian planes and divided into several sub-actions, and within each sub-action the absolute differences between consecutive projected maps are accumulated to form a depth motion map that describes the dynamic features of the action. In the spatial dimension, the TDM is encoded with the Spatial Pyramid Histogram of Oriented Gradient (SPHOG) to obtain the final descriptor. Finally, a Support Vector Machine (SVM) classifies the descriptors. The method was evaluated on two authoritative datasets, MSR Action3D and MSRGesture3D, and achieved recognition rates of 94.90% (cross-subject test) and 94.86%, respectively. The experimental results show that the method processes depth image sequences quickly, achieves a high recognition rate, and basically meets the real-time requirements of depth video sequences.

Key words: action recognition, three-dimensional depth image, Histogram of Oriented Gradient (HOG), spatio-temporal pyramid, depth motion map
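
The temporal step described in the abstract (projection onto three orthogonal planes, division into sub-actions, accumulation of absolute frame differences) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `project`, `motion_map` and `temporal_depth_model`, and the parameters `num_segments` and `depth_bins`, are hypothetical. The sketch further assumes that depth values are already quantized to integers in `[0, depth_bins)` with 0 as background, that the side and top views are simple occupancy maps, and that sub-actions are equal-length contiguous segments; the abstract does not specify these details.

```python
def project(frame, depth_bins):
    """Project one depth frame (2D list of ints, 0 = background) onto three planes.

    Assumption: the front view is the depth map itself; the side and top views
    are binary occupancy maps over quantized depth.
    """
    h, w = len(frame), len(frame[0])
    front = [row[:] for row in frame]             # x-y plane
    side = [[0] * depth_bins for _ in range(h)]   # y-z plane
    top = [[0] * w for _ in range(depth_bins)]    # z-x plane
    for y in range(h):
        for x in range(w):
            z = frame[y][x]
            if z > 0:
                zb = min(z, depth_bins - 1)       # clamp to the last depth bin
                side[y][zb] = 1
                top[zb][x] = 1
    return front, side, top


def motion_map(maps):
    """Accumulate absolute differences between consecutive projected maps."""
    h, w = len(maps[0]), len(maps[0][0])
    acc = [[0] * w for _ in range(h)]
    for prev, cur in zip(maps, maps[1:]):
        for y in range(h):
            for x in range(w):
                acc[y][x] += abs(cur[y][x] - prev[y][x])
    return acc


def temporal_depth_model(frames, num_segments, depth_bins):
    """Return one (front, side, top) triple of motion maps per sub-action.

    Assumption: sub-actions are contiguous, near-equal-length segments.
    """
    n = len(frames)
    if n < num_segments:
        raise ValueError("need at least one frame per sub-action")
    projected = [project(f, depth_bins) for f in frames]
    bounds = [i * n // num_segments for i in range(num_segments + 1)]
    model = []
    for start, end in zip(bounds, bounds[1:]):
        segment = projected[start:end]
        # Build one motion map for each of the three projection planes.
        model.append(tuple(motion_map([views[p] for views in segment])
                           for p in range(3)))
    return model
```

In a full pipeline, each motion map would then be encoded with a spatial-pyramid HOG descriptor and the concatenated descriptors passed to an SVM. Those stages are omitted here because the abstract gives no parameters for them.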

CLC Number: