Multi-scale grape image recognition method based on convolutional neural network

doi:10.11772/j.issn.1001-9081.2019040594

Journal of Computer Applications, 2019, Vol. 39, Issue (10): 2930-2936. DOI: 10.11772/j.issn.1001-9081.2019040594

QIU Jinyi^1,2, LUO Jun^1,2, LI Xiu^3, JIA Wei^1, NI Fuchuan^1,2, FENG Hui^1

1. College of Informatics, Huazhong Agricultural University, Wuhan Hubei 430070, China;
2. Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan Hubei 430070, China;
3. College of Engineering, Huazhong Agricultural University, Wuhan Hubei 430070, China

Received: 2019-04-10 Revised: 2019-06-22 Online: 2019-08-21 Published: 2019-10-10
Corresponding author: LUO Jun
About the authors: QIU Jinyi (born 1995), female, from Tianjin, M.S. candidate, CCF member; research interests: computer vision, image processing. LUO Jun (born 1981), male, from Wuhan, Hubei, associate professor, Ph.D., CCF member; research interests: machine learning, big data. LI Xiu (born 1995), female, from Jinan, Shandong, M.S. candidate, CCF member; research interests: computer vision, image processing. JIA Wei (born 1994), male, from Deyang, Sichuan, M.S. candidate; research interests: computer vision, deep learning. NI Fuchuan (born 1974), male, from Huanggang, Hubei, lecturer, Ph.D.; research interests: machine learning, big data. FENG Hui (born 1987), female, from Xishui, Hubei, lecturer, Ph.D.; research interests: computer vision, plant phenotype detection.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (21800305), the National Key R&D Program of China (2018YFC1604000), and the Fundamental Research Funds for the Central Universities (2662017PY059).

Abstract

Abstract: Grape variety quality inspection requires recognizing multiple categories of grapes, but grape images contain varied scenes such as changes in depth of field and multiple clusters, and the limitations of a single preprocessing method lead to poor recognition results. The research objects were natural-scene images of 15 categories of grapes collected in greenhouses, from which the image dataset Vitis-15 was built. To address the large intra-class differences and small inter-class differences in grape images, a multi-scale grape image recognition method based on Convolutional Neural Network (CNN) was proposed. Firstly, the data in the Vitis-15 dataset were preprocessed by three methods: a data augmentation method based on image rotation, a multi-scale image method based on central cropping, and a data fusion method combining the two. Then, a transfer learning method and a convolutional neural network method were adopted for classification and recognition. The Inception V3 network model pre-trained on ImageNet was selected for transfer learning, and three types of models (AlexNet, ResNet and Inception V3) were selected for the convolutional neural network method. Finally, MS-EAlexNet, a classification model for multi-scale image data fusion suited to Vitis-15, was proposed. Experimental results show that, with the same learning rate and on the same test set, the data fusion method reaches a test accuracy of 99.92% on MS-EAlexNet, nearly 1 percentage point higher than the augmentation and multi-scale image methods, and that the proposed method is relatively efficient in classifying small-sample datasets.

Key words: image recognition, natural scene, transfer learning, Convolutional Neural Network (CNN), multi-scale image, data fusion
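
The three preprocessing routes named in the abstract (rotation-based augmentation, central-crop multi-scale images, and the fusion of the two) can be sketched roughly as follows. This is only an illustration and not the authors' pipeline: the abstract gives no rotation angles, crop ratios, or fusion rule, so the 90-degree rotation steps, the crop ratios `(0.75, 0.5)`, the reading of "fusion" as the union of both generated sets, and all function names (`rotate90`, `augment_by_rotation`, `center_crop`, `multi_scale_crops`, `fuse`) are assumptions. Images are plain nested lists so that no third-party library is needed.

```python
from typing import List, Sequence

Image = List[List[int]]  # toy grayscale image: rows of pixel values


def rotate90(img: Image) -> Image:
    """Rotate an image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment_by_rotation(img: Image) -> List[Image]:
    """Return the image at 0, 90, 180 and 270 degrees (angles are assumed)."""
    out = [img]
    for _ in range(3):
        out.append(rotate90(out[-1]))
    return out


def center_crop(img: Image, ratio: float) -> Image:
    """Keep the central `ratio` fraction of the height and of the width."""
    h, w = len(img), len(img[0])
    ch, cw = max(1, round(h * ratio)), max(1, round(w * ratio))
    top, left = (h - ch) // 2, (w - cw) // 2
    return [row[left:left + cw] for row in img[top:top + ch]]


def multi_scale_crops(img: Image,
                      ratios: Sequence[float] = (0.75, 0.5)) -> List[Image]:
    """One central crop per ratio (ratios are assumed, not from the paper)."""
    return [center_crop(img, r) for r in ratios]


def fuse(img: Image, ratios: Sequence[float] = (0.75, 0.5)) -> List[Image]:
    """Assumed fusion rule: union of the rotated set and the multi-scale set."""
    return augment_by_rotation(img) + multi_scale_crops(img, ratios)


if __name__ == "__main__":
    sample = [[r * 4 + c for c in range(4)] for r in range(4)]
    for variant in fuse(sample):
        print(len(variant), "x", len(variant[0]))
```

In a real training setup the crops would normally be resized to the network's input resolution before batching; that step is left out here to keep the sketch dependency-free.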

CLC Number: TP183

QIU Jinyi, LUO Jun, LI Xiu, JIA Wei, NI Fuchuan, FENG Hui. Multi-scale grape image recognition method based on convolutional neural network[J]. Journal of Computer Applications, 2019, 39(10): 2930-2936.

