Visual dictionary construction for human actions recognition based on improved information gain

doi:10.11772/j.issn.1001-9081.2017.08.2240

Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (8): 2240-2243. DOI: 10.11772/j.issn.1001-9081.2017.08.2240

WU Feng, WANG Ying

College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029

Received: 2017-02-24 Revised: 2017-04-12 Online: 2017-08-12 Published: 2017-08-10
Corresponding author: WANG Ying
About the authors: WU Feng (born 1992), male, from Suihua, Heilongjiang, is a master's student whose research interests are digital image processing and human action recognition. WANG Ying (born 1969), female, from Tianjin, is an associate professor whose research interests are photoelectric detection, machine vision inspection, and artificial intelligence based detection.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61340056).

Abstract

Abstract: Traditional information gain, as used to build visual dictionaries in the Bag-of-Words (BoW) model, does not consider the effect of term frequency on action recognition. To improve recognition accuracy, a visual dictionary construction method based on improved information gain was proposed. Firstly, spatio-temporal interest points were extracted from human action videos by 3D Harris and clustered by K-means to construct an initial visual dictionary. Secondly, the within-class concentration and between-class dispersion of term frequency were introduced to improve the information gain; the improved information gain of each word in the initial dictionary was computed, and the visual words with the largest improved information gain were selected to build a new visual dictionary. Finally, human actions were recognized with a Support Vector Machine (SVM) using the visual dictionary built by improved information gain. The method was verified on the KTH and Weizmann human action databases. Compared with traditional information gain, the dictionary built by improved information gain increased recognition accuracy on the two databases by 1.67% and 3.45%, respectively. Experimental results show that the proposed method selects visual words with stronger discriminative power and thereby improves the accuracy of human action recognition.

Key words: human action recognition, Bag-of-Words (BoW) model, information gain, term frequency
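The baseline that the abstract sets out to improve can be sketched as follows. This is a minimal illustration, not the authors' implementation: it scores each visual word by traditional information gain computed from presence or absence in each video, then keeps the highest-scoring words. The function names `entropy`, `information_gain`, and `select_words` and the toy histograms are hypothetical. The within-class concentration and between-class dispersion factors are not defined in the abstract, so they are deliberately not implemented here.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(histograms, labels, word):
    """Traditional IG of one visual word: H(C) - H(C | word present/absent).

    Only presence or absence of the word in a video is used, so how often
    the word occurs (its term frequency) has no effect on the score.
    """
    n = len(labels)
    present = [lab for h, lab in zip(histograms, labels) if h[word] > 0]
    absent = [lab for h, lab in zip(histograms, labels) if h[word] == 0]
    h_class = entropy(list(Counter(labels).values()))
    h_cond = sum(
        (len(part) / n) * entropy(list(Counter(part).values()))
        for part in (present, absent)
        if part
    )
    return h_class - h_cond


def select_words(histograms, labels, keep):
    """Indices of the `keep` highest-scoring words, best first."""
    n_words = len(histograms[0])
    scores = [information_gain(histograms, labels, w) for w in range(n_words)]
    return sorted(range(n_words), key=lambda w: scores[w], reverse=True)[:keep]


# Toy data (made up): 4 videos, 3 visual words, 2 action classes.
# Each row is the BoW histogram of one video over the initial dictionary.
histograms = [[5, 1, 0], [3, 2, 0], [0, 1, 4], [0, 3, 1]]
labels = ["walk", "walk", "wave", "wave"]
selected = select_words(histograms, labels, keep=2)
```

In this toy set, words 0 and 2 receive the same score even though their counts differ across videos, which is the insensitivity to term frequency that motivates the improved measure.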


WU Feng, WANG Ying. Visual dictionary construction for human actions recognition based on improved information gain[J]. Journal of Computer Applications, 2017, 37(8): 2240-2243.

