Review of image classification algorithms based on convolutional neural network

Journal of Computer Applications, 2022, Vol. 42, Issue (4): 1044-1049. DOI: 10.11772/j.issn.1001-9081.2021071273

Special Issue: The 36th CCF National Conference of Computer Applications (CCF NCCA 2021)

Changqing JI¹,², Zhiyong GAO², Jing QIN³, Zumin WANG²

1. College of Physical Science and Technology, Dalian University, Dalian Liaoning 116622, China
2. College of Information Engineering, Dalian University, Dalian Liaoning 116622, China
3. College of Software Engineering, Dalian University, Dalian Liaoning 116622, China

Received: 2021-07-14; Revised: 2021-08-18; Accepted: 2021-08-27; Online: 2022-04-15; Published: 2022-04-10
Contact: Zumin WANG
About the authors: JI Changqing, born in 1980, Ph.D., associate professor. His research interests include artificial intelligence, big data analysis, spatial databases, and smart healthcare.
GAO Zhiyong, born in 1996, M.S. candidate. His research interests include artificial intelligence.
QIN Jing, born in 1981, Ph.D., associate professor. Her research interests include signal processing and big data analysis.
Supported by:
National Natural Science Foundation of China (62002038)

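基于卷积神经网络的图像分类算法综述 (Review of image classification algorithms based on convolutional neural network)

季长清 (JI Changqing)¹,², 高志勇 (GAO Zhiyong)², 秦静 (QIN Jing)³, 汪祖民 (WANG Zumin)²

1. College of Physical Science and Technology, Dalian University, Dalian, Liaoning 116622
2. College of Information Engineering, Dalian University, Dalian, Liaoning 116622
3. College of Software Engineering, Dalian University, Dalian, Liaoning 116622

Corresponding author: 汪祖民 (WANG Zumin)
About the authors: JI Changqing (季长清, born 1980), male, from Zhuanghe, Liaoning; associate professor, Ph.D., CCF member. Main research interests: artificial intelligence, big data analysis, spatial databases, smart healthcare.
GAO Zhiyong (高志勇, born 1996), male, from Liaocheng, Shandong; master's student, CCF member. Main research interest: artificial intelligence.
QIN Jing (秦静, born 1981), female, from Zhangye, Gansu; associate professor, Ph.D., CCF member. Main research interests: signal processing, big data analysis.
Funding:
National Natural Science Foundation of China (62002038)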

Abstract

The Convolutional Neural Network (CNN) is currently one of the important research directions in deep-learning-based computer vision. It performs well in applications such as image classification, image segmentation, and object detection, and its powerful feature learning and feature representation capabilities are increasingly valued by researchers. However, CNNs still suffer from problems such as incomplete feature extraction and overfitting during training. To address these issues, this review introduces the development of CNNs, the classical CNN models, and their components, and presents methods for solving the problems above. By surveying the current state of research on CNN models for image classification, it offers suggestions for the further development and future research directions of CNNs.

Key words: deep learning, Convolutional Neural Network (CNN), image classification, feature extraction, overfitting

CLC Number: TP181

Changqing JI, Zhiyong GAO, Jing QIN, Zumin WANG. Review of image classification algorithms based on convolutional neural network[J]. Journal of Computer Applications, 2022, 42(4): 1044-1049.

Figures/Tables 7

Fig. 1 Development history of deep neural network model

Fig. 2 ReLU function

Fig. 3 Structure of LeNet-5 model

Fig. 4 Principle model of fully connected layer and Dropout layer

Fig. 5 Inception module conceptual model

Fig. 6 Residual module structure

Fig. 7 Depthwise separable convolution layer structure
