Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (4): 1033-1038. DOI: 10.11772/j.issn.1001-9081.2016.04.1033


Image target recognition method based on multi-scale block convolutional neural network

ZHANG Wenda, XU Yuelei, NI Jiacheng, MA Shiping, SHI Hehuan   

  1. Institute of Aeronautics and Astronautics Engineering, Air Force Engineering University, Xi'an Shaanxi 710038, China
  • Received: 2015-09-29 Revised: 2015-12-03 Online: 2016-04-10 Published: 2016-04-08
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61372167, 61379104).

Image target recognition algorithm based on multi-scale block convolutional neural network

ZHANG Wenda, XU Yuelei, NI Jiacheng, MA Shiping, SHI Hehuan

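  1. Institute of Aeronautics and Astronautics Engineering, Air Force Engineering University, Xi'an 710038
  • Corresponding author: ZHANG Wenda
  • About the authors: ZHANG Wenda (born 1991), male, from Zibo, Shandong, M.S. candidate; research interests: pattern recognition, artificial intelligence. XU Yuelei (born 1975), male, from Xinji, Hebei, professor, Ph.D.; research interests: image processing, pattern recognition. NI Jiacheng (born 1990), male, from Xi'an, Shaanxi, M.S. candidate; research interests: pattern recognition, artificial intelligence.
  • Funding:
    National Natural Science Foundation of China (61372167, 61379104).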

Abstract: Deformations such as translation, rotation and random scaling of local image regions complicate image recognition. To address them, an algorithm based on pre-trained convolutional filters and a Multi-Scale block Convolutional Neural Network (MS-CNN) is proposed. First, the unlabeled training data are used to train a sparse autoencoder, which yields a set of convolutional filters that match the characteristics of the dataset and provide good initial values. To enhance robustness and reduce the impact of the pooling layer on feature extraction, a new multi-channel Convolutional Neural Network (CNN) structure is proposed. A multi-scale block operation is applied to the input image to form several channels, and each channel is convolved with filters of the corresponding size. A convolutional layer, a local contrast normalization layer and a pooling layer are then used to obtain invariance. The resulting feature maps are fed into the fully connected layer, which outputs the final features for target recognition. Compared with a traditional CNN, the proposed method improves the recognition rate on both the STL-10 dataset and remote sensing airplane images. The experimental results show that the method is robust to deformations such as translation, rotation and scaling.
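
The multi-channel forward pass described in the abstract can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation: the abstract does not specify how the multi-scale blocks are formed, so the code assumes each channel simply slides a filter of a different size over the same image. The function names, the 12x12 image, the filter sizes 3 and 5, the random filters (standing in for pre-trained ones) and the global normalization (a simplification of local contrast normalization) are all assumptions made for illustration.

```python
import math
import random

def conv2d_valid(img, kernel):
    """Valid 2-D cross-correlation of a square image with a square kernel."""
    n, k = len(img), len(kernel)
    out = n - k + 1
    return [[sum(img[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out)] for i in range(out)]

def contrast_normalize(fmap, eps=1e-8):
    """Simplified normalization: zero mean, unit variance over the whole map.
    (Assumption: the paper's layer is local; this global version is a stand-in.)"""
    vals = [v for row in fmap for v in row]
    mean = sum(vals) / len(vals)
    std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return [[(v - mean) / (std + eps) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; edge rows/columns that do not fill a window are dropped."""
    n = len(fmap) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(n)] for i in range(n)]

def multi_channel_features(img, filters_by_scale):
    """One channel per filter size: convolve, normalize, pool, then concatenate."""
    features = []
    for scale, kernel in sorted(filters_by_scale.items()):
        assert len(kernel) == scale  # each channel uses a filter of its own size
        fmap = max_pool(contrast_normalize(conv2d_valid(img, kernel)))
        features.extend(v for row in fmap for v in row)
    return features

rng = random.Random(0)
image = [[rng.random() for _ in range(12)] for _ in range(12)]
# Hypothetical filters; in the described method they would come from pre-training.
filters = {k: [[rng.gauss(0, 1) for _ in range(k)] for _ in range(k)] for k in (3, 5)}
feature_vector = multi_channel_features(image, filters)
```

In this sketch the concatenated `feature_vector` plays the role of the input to the fully connected layer; a real network would use many filters per channel and learn the fully connected weights.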

Key words: Convolutional Neural Network (CNN), autoencoder, unsupervised pre-training, multi-scale blocking, target recognition

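Abstract: To address image recognition under complex conditions such as translation, rotation or local deformation, a Convolutional Neural Network (CNN) target recognition algorithm based on unsupervised pre-training and multi-scale blocking is proposed. The algorithm first trains a sparse autoencoder on unlabeled images to obtain a set of filters that match the characteristics of the dataset and have good initial values. To enhance robustness while reducing the effect of downsampling on feature extraction, a multi-channel CNN is proposed: the input image is divided into blocks at multiple scales to form several channels, each channel is convolved with filters of the corresponding size, and the features from the different channels, after local contrast normalization and downsampling, are fused in the fully connected layer. The fused features form the final representation for image classification and are fed into a classifier to complete target recognition. In simulation experiments, the proposed algorithm achieves higher recognition rates than a traditional CNN on the STL-10 dataset and on remote sensing airplane images, and is robust to various image deformations.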
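
The unsupervised pre-training step can be made concrete with the objective of a sparse autoencoder. The abstract does not state which sparsity penalty is used, so the sketch below assumes a common textbook formulation: a sigmoid hidden layer, a linear decoder, squared reconstruction error, and a Kullback-Leibler penalty that pushes the mean activation of each hidden unit toward a small target `rho`. The function names and the default values of `rho` and `beta` are illustrative assumptions, not values from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, x):
    """Return W x + b for a matrix W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def kl_bernoulli(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat), with 0 < rho_hat < 1."""
    return (rho * math.log(rho / rho_hat)
            + (1 - rho) * math.log((1 - rho) / (1 - rho_hat)))

def sparse_autoencoder_loss(patches, W1, b1, W2, b2, rho=0.05, beta=3.0):
    """Reconstruction error plus sparsity penalty over a batch of flattened patches."""
    m = len(patches)
    hidden = [[sigmoid(z) for z in affine(W1, b1, x)] for x in patches]
    recon = [affine(W2, b2, h) for h in hidden]  # linear decoder (assumption)
    mse = sum(sum((r - v) ** 2 for r, v in zip(rx, x))
              for rx, x in zip(recon, patches)) / (2 * m)
    # Mean activation of each hidden unit over the batch.
    rho_hat = [sum(h[j] for h in hidden) / m for j in range(len(W1))]
    sparsity = sum(kl_bernoulli(rho, r) for r in rho_hat)
    return mse + beta * sparsity
```

Under these assumptions, minimizing the loss over unlabeled patches of size k x k and reshaping each row of `W1` to k x k would give one candidate filter per hidden unit for the channel of the matching scale.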

Key words: convolutional neural network, autoencoder, unsupervised training, multi-scale blocking, target recognition
