Multi-branch neural network model based weakly supervised fine-grained image classification method

doi:10.11772/j.issn.1001-9081.2019111883

Journal of Computer Applications, 2020, Vol. 40, Issue (5): 1295-1300.

• Artificial intelligence •

BIAN Xiaoyong^1,2,3, JIANG Peiling^1,2,3, ZHAO Min^4,5, DING Sheng^1,2,3, ZHANG Xiaolong^1,2,3

1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, Hubei 430065, China
2. Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei 430065, China
3. Key Laboratory of Hubei Province for Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan, Hubei 430065, China
4. School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
5. Engineering Research Center for Metallurgical Automation and Measurement Technology, Ministry of Education (Wuhan University of Science and Technology), Wuhan, Hubei 430081, China

Received: 2019-11-05 Revised: 2019-12-17 Online: 2020-05-15 Published: 2020-05-10
Contact: BIAN Xiaoyong
About the authors: BIAN Xiaoyong, born in 1976 in Ji'an, Jiangxi, Ph. D., associate professor. His research interests include remote sensing scene classification and feature learning. JIANG Peiling, born in 1993 in Wuhan, Hubei, M. S. candidate. His research interests include fine-grained image classification and deep learning. ZHAO Min, born in 1978 in Xianning, Hubei, M. S., lecturer. His research interests include fault diagnosis. DING Sheng, born in 1975 in Wuhan, Hubei, Ph. D., associate professor. His research interests include object detection and deep learning. ZHANG Xiaolong, born in 1963 in Ji'an, Jiangxi, Ph. D., professor. His research interests include artificial intelligence, machine learning, data mining, and bioinformatics.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61572381, 61501337, 61972299), the Natural Science Foundation of Hubei Province (2018CFB575), and the Open Fund of the Engineering Research Center for Metallurgical Automation and Measurement Technology of the Ministry of Education (MADT201707).

Abstract

Traditional attention-based neural networks cannot attend to local features and rotation-invariant features jointly. To address this problem, a weakly supervised fine-grained image classification method based on a multi-branch neural network model was proposed. Firstly, a lightweight Class Activation Map (CAM) network was used to localize local regions with potential semantic information, and a ResNet-50 residual network with deformable convolution and an Oriented Response Network (ORN) with rotation-invariant coding were designed. Secondly, pre-trained models were used to initialize the feature networks, and the original image and the localized regions were input to fine-tune the model. Finally, the three intra-branch losses and the between-branch losses were combined to optimize the entire network, and classification prediction was performed on the test set. The proposed method achieved classification accuracies of 87.7% and 90.8% on the CUB-200-2011 and FGVC_Aircraft datasets respectively, which are 1.2 and 0.9 percentage points higher than those of the Multi-Attention Convolutional Neural Network (MA-CNN) method. On the Aircraft_2 dataset, the proposed method reached a classification accuracy of 91.8%, which is 4.1 percentage points higher than that of ResNet-50. The experimental results show that the proposed method effectively improves the accuracy of weakly supervised fine-grained image classification.

Key words: fine-grained image classification, deep learning, weakly supervised, deformable convolution, Class Activation Map (CAM), Oriented Response Network (ORN)
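
The localization step named in the abstract can be illustrated with the standard class activation map computation: the channels of the last convolutional layer are summed with the classifier weights of one class, and the strongest responses mark a candidate region. The sketch below is a minimal pure-Python illustration of that general technique, not the authors' lightweight CAM network. The function names, the 0.5 threshold ratio, and the toy feature maps are all assumptions made for the example.

```python
from typing import List, Tuple

Grid = List[List[float]]  # one H x W map (a feature channel or a CAM)


def class_activation_map(features: List[Grid], class_weights: List[float]) -> Grid:
    """Weighted sum of feature channels using one class's classifier weights."""
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for channel, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    return cam


def localize(cam: Grid, ratio: float = 0.5) -> Tuple[int, int, int, int]:
    """Bounding box (x_min, y_min, x_max, y_max) around strongly activated cells.

    The map is min-max scaled implicitly: a cell is kept when its value lies in
    the top (1 - ratio) share of the map's value range. The ratio is an
    illustrative assumption, not a value taken from the paper.
    """
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    if hi == lo:
        # Flat map: no region stands out, so fall back to the whole grid.
        return 0, 0, len(cam[0]) - 1, len(cam) - 1
    threshold = lo + ratio * (hi - lo)
    cells = [(x, y) for y, row in enumerate(cam)
             for x, value in enumerate(row) if value >= threshold]
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return min(xs), min(ys), max(xs), max(ys)


# Toy input (hypothetical): two 3 x 3 channels and the weights of one class.
toy_features = [
    [[0.0, 0.0, 0.0], [0.0, 1.0, 2.0], [0.0, 2.0, 4.0]],
    [[4.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
]
toy_weights = [1.0, -0.5]
toy_cam = class_activation_map(toy_features, toy_weights)
toy_box = localize(toy_cam)  # box in feature-map coordinates
```

In a real pipeline the box would be scaled from feature-map coordinates back to image coordinates before cropping the region that is fed to the other branches; how the paper performs that step is not stated in the abstract.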

CLC Number: TP391.4

BIAN Xiaoyong, JIANG Peiling, ZHAO Min, DING Sheng, ZHANG Xiaolong. Multi-branch neural network model based weakly supervised fine-grained image classification method[J]. Journal of Computer Applications, 2020, 40(5): 1295-1300.
