Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (10): 2875-2879. DOI: 10.11772/j.issn.1001-9081.2016.10.2875

• Virtual Reality and Digital Media •

Two-person interaction recognition based on improved spatio-temporal interest points

WANG Peiyao1, CAO Jiangtao1, JI Xiaofei2

  1. School of Information and Control Engineering, Liaoning Shihua University, Fushun Liaoning 113001, China;
    2. School of Automation, Shenyang Aerospace University, Shenyang Liaoning 110136, China
  • Received: 2016-03-14    Revised: 2016-07-04    Published: 2016-10-10
  • Corresponding author: JI Xiaofei, E-mail: jixiaofei7804@126.com
  • About the authors: WANG Peiyao, born in 1991 in Shenyang, Liaoning, is an M.S. candidate whose research interests include video analysis and pattern recognition. CAO Jiangtao, born in 1978 in Yuncheng, Shandong, is a professor with a Ph.D. whose research interests include intelligent control and video analysis. JI Xiaofei, born in 1978 in Anshan, Liaoning, is an associate professor with a Ph.D. whose research interests include video analysis and pattern recognition.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61103123) and the Program for Liaoning Excellent Talents in University (LJQ2014018, LR2015034).


Abstract: To address the problems that interest point features of two-person interactions are poorly selected in practical surveillance video and that redundant words in the clustering dictionary lower the recognition rate, an interaction recognition method based on improved Spatio-Temporal Interest Point (STIP) features was proposed. Firstly, an untrackability detection method based on information entropy was introduced: the image sequence was tracked to obtain the foreground motion region of the interaction, and spatio-temporal interest points were extracted only within this region, which improves the accuracy of interest point detection. Secondly, the detected interest points were described with the 3-Dimensional Scale-Invariant Feature Transform (3D-SIFT) descriptor, and an improved Fuzzy C-Means (FCM) clustering method was used to build the visual dictionary and improve its distribution; on this basis, a Bag of Words (BOW) model was established, that is, the training samples were projected onto the dictionary to obtain a histogram feature representation for each frame. Finally, a frame-by-frame nearest neighbor classification method was adopted for two-person interaction recognition. Compared with recent STIP-feature-based algorithms, the proposed method achieved a correct recognition rate of 91.7% on the UT-Interaction dataset. The experimental results show that the improved bag-of-words algorithm, built on spatio-temporal interest points obtained through untrackability detection, can considerably improve the accuracy of interaction recognition and is suitable for two-person interaction recognition against dynamic backgrounds.
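The untrackability detection and restricted interest-point extraction step can be illustrated with a rough sketch. The paper's exact entropy formulation, tracker, and spatio-temporal detector are not reproduced here; the block size, thresholds, use of frame differencing, and the Shi-Tomasi corner detector standing in for a true spatio-temporal detector are all illustrative assumptions.

```python
# Illustrative sketch only: approximates "information-entropy-based untrackability
# detection" with block-wise intensity entropy plus frame differencing, then detects
# interest points only inside the resulting foreground mask.
import cv2
import numpy as np

def block_entropy(gray, block=16, bins=32):
    """Shannon entropy of the intensity histogram of each non-overlapping block."""
    h, w = gray.shape
    ent = np.zeros((h // block, w // block))
    for i in range(ent.shape[0]):
        for j in range(ent.shape[1]):
            patch = gray[i * block:(i + 1) * block, j * block:(j + 1) * block]
            counts, _ = np.histogram(patch, bins=bins, range=(0, 256))
            p = counts[counts > 0] / counts.sum()
            ent[i, j] = -np.sum(p * np.log2(p))
    return ent

def foreground_mask(prev_gray, gray, block=16, diff_thr=15.0, ent_thr=2.0):
    """Keep blocks that both move (frame difference) and carry enough texture
    (entropy) to be trackable; static or textureless blocks are discarded."""
    diff = cv2.absdiff(prev_gray, gray)
    ent = block_entropy(gray, block)
    mask = np.zeros_like(gray)
    for i in range(ent.shape[0]):
        for j in range(ent.shape[1]):
            sl = np.s_[i * block:(i + 1) * block, j * block:(j + 1) * block]
            if diff[sl].mean() > diff_thr and ent[i, j] > ent_thr:
                mask[sl] = 255
    return mask

# Usage: restrict interest-point detection to the foreground motion region.
# prev_gray, gray = two consecutive grayscale frames (uint8 arrays)
# mask = foreground_mask(prev_gray, gray)
# pts = cv2.goodFeaturesToTrack(gray, maxCorners=200, qualityLevel=0.01,
#                               minDistance=5, mask=mask)
```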

Key words: Spatio-Temporal Interest Point (STIP), information entropy, two-person interaction recognition, Bag of Words (BOW) model, Fuzzy C-Means (FCM), 3-Dimensional Scale-Invariant Feature Transform (3D-SIFT), nearest neighbor classifier

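The dictionary-building and matching stages can also be sketched. The paper uses an improved FCM and frame-by-frame nearest neighbor matching; the sketch below instead uses plain fuzzy C-means, hard assignment of descriptors to visual words, and whole-video 1-nearest-neighbor matching, with descriptor dimensionality, word count, and distance metric chosen only for illustration.

```python
# Illustrative sketch: baseline fuzzy C-means dictionary, BOW histograms,
# and 1-NN classification (not the paper's improved FCM or frame-level matching).
import numpy as np

def fcm_dictionary(X, n_words, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Baseline fuzzy C-means; the returned cluster centers serve as visual words."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_words))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
        U_new = 1.0 / dist ** (2.0 / (m - 1.0))
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers

def bow_histogram(descriptors, centers):
    """Project descriptors onto the dictionary: hard-assign each descriptor to its
    nearest visual word and return the normalized word histogram."""
    dist = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    hist = np.bincount(dist.argmin(axis=1), minlength=len(centers)).astype(float)
    return hist / max(hist.sum(), 1.0)

def classify_1nn(test_hist, train_hists, train_labels):
    """1-nearest-neighbor on BOW histograms (Euclidean distance as a placeholder)."""
    d = np.linalg.norm(np.asarray(train_hists) - test_hist, axis=1)
    return train_labels[int(np.argmin(d))]

# Usage sketch with random stand-ins for 3D-SIFT descriptors (one array per video):
train_desc = [np.random.rand(300, 640) for _ in range(6)]
train_labels = ["hug", "kick", "point", "punch", "push", "shake-hands"]
dictionary = fcm_dictionary(np.vstack(train_desc), n_words=64)
train_hists = np.array([bow_histogram(d, dictionary) for d in train_desc])
test_desc = np.random.rand(280, 640)
print(classify_1nn(bow_histogram(test_desc, dictionary), train_hists, train_labels))
```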

CLC number: