Abstract: To address the large parameter counts and high computational complexity of traditional convolutional neural networks, a lightweight convolutional neural network architecture named C-Net, based on cross-channel fusion and cross-module connection, is proposed. First, a cross-channel fusion method is proposed. It mitigates the lack of information flow between the groups of a grouped convolution and enables efficient, simple communication between groups. Second, a cross-module connection method is proposed. It overcomes the mutual independence of the basic building blocks in traditional lightweight architectures by fusing information between modules whose feature maps share the same resolution within the same stage, which enhances feature extraction capability. Finally, a novel lightweight convolutional neural network architecture, C-Net, is designed based on the two proposed methods. C-Net achieves an accuracy of 69.41% on the Food_101 dataset and 63.93% on the Caltech_256 dataset. Experimental results show that C-Net reduces memory cost and computational complexity compared with state-of-the-art lightweight convolutional neural network models. Ablation experiments on the Cifar_10 dataset verify the effectiveness of the two proposed methods.
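The trade-off behind grouped convolution, which motivates the cross-channel fusion idea, can be illustrated with a little arithmetic. The sketch below is not the C-Net implementation. It only counts the weights of a standard versus a grouped convolution and lists which input channels each output channel can read, under assumed layer sizes (64 input channels, 128 output channels, 3x3 kernels, 4 groups). The function names `conv_params` and `receptive_channels` are hypothetical helpers introduced here for illustration.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Number of weights in a k x k convolution (bias omitted).

    With `groups` > 1, each output channel only connects to
    c_in / groups input channels, so the count shrinks by that factor.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def receptive_channels(c_in, c_out, groups):
    """Map each output channel to the input channels it can read."""
    in_per_group = c_in // groups
    out_per_group = c_out // groups
    mapping = {}
    for out_ch in range(c_out):
        g = out_ch // out_per_group  # group this output channel belongs to
        start = g * in_per_group
        mapping[out_ch] = list(range(start, start + in_per_group))
    return mapping


# Assumed layer sizes, chosen only for illustration.
standard = conv_params(64, 128, 3)            # 64 * 128 * 9
grouped = conv_params(64, 128, 3, groups=4)   # 16 * 128 * 9
print(standard, grouped, standard // grouped)

# Small case: 8 -> 8 channels in 2 groups. Output channels 0-3 never
# see input channels 4-7, which is the isolation between groups that
# the abstract describes.
print(receptive_channels(8, 8, 2))
```

With these assumed sizes, the grouped layer needs a quarter of the weights, but the channel map shows that the two halves of the network never exchange information within the layer. Any fusion step between groups has to restore that exchange without giving back the savings.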