Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (12): 3517-3522. DOI: 10.11772/j.issn.1001-9081.2017.12.3517

• Computer Vision and Virtual Reality •

Image classification algorithm based on multi-scale feature fusion and Hessian sparse coding

LIU Shengqing, SUN Jifeng, YU Jialin, SONG Zhiguo

  1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou Guangdong 510641, China
  • Received: 2017-06-05 Revised: 2017-08-05 Online: 2017-12-10 Published: 2017-12-18
  • Corresponding author: LIU Shengqing
  • About the authors: LIU Shengqing (born 1991), male, from Ji'an, Jiangxi, master's student; research interests: machine learning, image classification. SUN Jifeng (born 1962), male, from Guangzhou, Guangdong, professor, Ph.D.; research interests: machine learning, pattern recognition, computer vision. YU Jialin (born 1989), male, from Zhenyuan, Guizhou, Ph.D. candidate; research interests: machine learning, human pose estimation. SONG Zhiguo (born 1988), male, from Xiangxi, Hunan, Ph.D. candidate; research interests: machine learning, object tracking.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61202292) and the Natural Science Foundation of Guangdong Province (9151064101000037).

Abstract: Traditional sparse coding algorithms for image classification extract only a single type of feature, ignore the spatial structure of the image, and cannot fully exploit the topological structure of the features during feature coding. To address these problems, an image classification algorithm based on multi-scale feature fusion and Hessian Sparse Coding (HSC) was proposed. Firstly, the image was divided into sub-regions at multiple scales with a spatial pyramid. Secondly, Histogram of Oriented Gradient (HOG) and Scale-Invariant Feature Transform (SIFT) features were fused in each layer of sub-regions. Then, to make full use of the topological structure of the features, a second-order Hessian energy function was introduced into the traditional sparse coding objective function as a regularization term. Finally, a Support Vector Machine (SVM) was used for classification. Experimental results on the Scene15 dataset show that the accuracy of HSC is 3 to 5 percentage points higher than that of Locality-constrained Linear Coding (LLC) and 1 to 3 percentage points higher than that of Support Discrimination Dictionary Learning (SDDL) and other compared methods. Timing experiments on the Caltech101 dataset show that HSC takes about 40% less time than Multiple Kernel Learning Sparse Coding (MKLSC). The proposed HSC effectively improves image classification accuracy and is also more efficient than the compared algorithms.
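
The abstract does not state the objective function itself. As a rough illustration only, a Hessian-regularized sparse coding objective is often written in the following form; the symbols $X$, $B$, $S$, $H$, $\lambda$, $\gamma$ and the unit-norm constraint on dictionary atoms are assumptions introduced here, not notation taken from the paper.

```latex
% Assumed notation (not from the source):
%   X \in \mathbb{R}^{d \times N} : N fused local descriptors of dimension d
%   B \in \mathbb{R}^{d \times K} : dictionary with K atoms b_j
%   S \in \mathbb{R}^{K \times N} : sparse codes, one column s_i per descriptor
%   H \in \mathbb{R}^{N \times N} : second-order Hessian energy matrix
%                                   estimated from local neighborhoods of the descriptors
\min_{B,\,S}\;
  \underbrace{\lVert X - BS \rVert_F^2}_{\text{reconstruction}}
  \;+\; \lambda \sum_{i=1}^{N} \lVert s_i \rVert_1
  \;+\; \gamma \,\operatorname{tr}\!\left( S H S^{\top} \right)
\qquad \text{s.t. } \lVert b_j \rVert_2 \le 1,\quad j = 1,\dots,K
```

Setting $\gamma = 0$ recovers the traditional sparse coding objective, which is the sense in which the Hessian energy acts as an added regularization term.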

Key words: image classification, feature fusion, spatial pyramid, sparse coding, Support Vector Machine (SVM)
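
The first two steps of the abstract (spatial pyramid division and per-region feature fusion) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the fusion rule (L2-normalize, then concatenate), the pyramid levels (1, 2, 4), max pooling, and all function names are hypothetical, and the toy two-dimensional vectors merely stand in for real HOG and SIFT descriptors. In the full method the fused descriptors would first be sparse-coded and the codes pooled; here the fused descriptors are pooled directly to keep the sketch short.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse(hog_vec, sift_vec):
    """Hypothetical fusion rule: L2-normalize each descriptor, then concatenate."""
    return l2_normalize(hog_vec) + l2_normalize(sift_vec)


def pyramid_pool(points, codes, width, height, levels=(1, 2, 4)):
    """Max-pool per-patch vectors over the cells of a spatial pyramid.

    points: (x, y) patch centers; codes: one vector per patch.
    Returns one pooled vector per cell, concatenated level by level.
    """
    dim = len(codes[0])
    pooled = []
    for n in levels:  # level with an n x n grid of cells
        cells = [[0.0] * dim for _ in range(n * n)]
        for (x, y), code in zip(points, codes):
            col = min(int(x * n / width), n - 1)
            row = min(int(y * n / height), n - 1)
            cell = cells[row * n + col]
            for k, value in enumerate(code):
                cell[k] = max(cell[k], abs(value))
        for cell in cells:
            pooled.extend(cell)
    return pooled


# Toy data: four patches, one near each corner of a 100 x 100 image.
points = [(10, 10), (90, 10), (10, 90), (90, 90)]
hog = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]   # placeholder HOG vectors
sift = [[0.0, 5.0], [2.0, 0.0], [1.0, 1.0], [0.0, 1.0]]  # placeholder SIFT vectors

fused = [fuse(h, s) for h, s in zip(hog, sift)]
feature = pyramid_pool(points, fused, width=100, height=100)
print(len(feature))  # (1 + 4 + 16) cells * 4 dimensions = 84
```

The resulting image-level vector is what would be passed to the SVM in the final step; normalizing each descriptor before concatenation is one simple way to keep either feature type from dominating the fused vector.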

CLC number: