Journal of Computer Applications, 2023, Vol. 43, Issue (8): 2556-2563. DOI: 10.11772/j.issn.1001-9081.2022071090
• Multimedia computing and computer simulation •
Received: 2022-07-27
Revised: 2022-11-03
Accepted: 2022-11-07
Online: 2023-01-15
Published: 2023-08-10
Contact: Xuanlin WANG
About author: QI Ailing, born in 1972 in Xi'an, Shaanxi, Ph. D., associate professor. Her research interests include artificial intelligence and digital image processing.
Ailing QI, Xuanlin WANG. Fine-grained image recognition based on mid-level subtle feature extraction and multi-scale feature fusion[J]. Journal of Computer Applications, 2023, 43(8): 2556-2563.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022071090
Dataset | Object type | Classes | Training samples | Test samples |
---|---|---|---|---|
CUB-200-2011 | Bird | 200 | 5 994 | 5 794 |
Stanford Cars | Car | 196 | 8 144 | 8 041 |
FGVC-Aircraft | Aircraft | 100 | 6 667 | 3 333 |
Tab. 1 Statistics of three fine-grained datasets
Parameter value | CUB-200-2011 | Stanford Cars | FGVC-Aircraft |
---|---|---|---|
0.5 | 87.64 | 91.10 | 90.94 |
0.6 | 88.10 | 92.54 | 91.46 |
0.7 | 88.94 | 93.36 | 93.20 |
0.8 | 89.52 | 94.64 | 92.79 |
0.9 | 89.18 | 93.87 | 92.56 |
Tab. 2 Top-1 Accuracy of different ? values on datasets
Algorithm | CUB-200-2011 Top-1 | CUB-200-2011 Top-5 | Stanford Cars Top-1 | Stanford Cars Top-5 | FGVC-Aircraft Top-1 | FGVC-Aircraft Top-5 |
---|---|---|---|---|---|---|
ResNet | 85.50 | 92.54 | 89.80 | 94.63 | 90.30 | 94.41 |
ResNet-CPFDEN | 88.94 | 96.44 | 93.40 | 97.82 | 92.60 | 96.83 |
ResNet-CPFDEN-CSMFN | 89.52 | 98.46 | 94.64 | 98.62 | 93.20 | 97.98 |
Tab. 3 Results of ablation experiments on three datasets
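The Top-1 and Top-5 columns report the share of test images whose true class is among the one or five highest-scoring predictions. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `top_k_accuracy` and the toy scores are assumptions made for illustration.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int = 1) -> float:
    """Percentage of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered by descending score; ties keep the lower index first.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Toy example (hypothetical values): 4 samples, 3 classes.
toy_scores = [
    [0.7, 0.2, 0.1],  # highest score: class 0
    [0.1, 0.3, 0.6],  # highest score: class 2
    [0.4, 0.5, 0.1],  # highest score: class 1
    [0.2, 0.3, 0.5],  # highest score: class 2
]
toy_labels = [0, 1, 0, 2]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
print(f"Top-1: {top1:.2f}%  Top-2: {top2:.2f}%")
```

Because the top-k set always contains the top-1 prediction, Top-5 accuracy can never be lower than Top-1 accuracy, which matches every row of the table.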
Algorithm | CUB-200-2011 | Stanford Cars | FGVC-Aircraft |
---|---|---|---|
DCL-Net | 87.40 | 93.10 | 91.70 |
TPA-CNN | 88.00 | 94.00 | 91.70 |
ACB-Net | 88.10 | 94.60 | 92.40 |
Proposed algorithm | 89.52 | 94.64 | 93.20 |
Tab. 4 Comparison of Top-1 classification accuracy of different algorithms on three datasets
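To make the size of the margins in Tab. 4 easier to read, the sketch below computes, for each dataset, the gap in percentage points between the proposed algorithm and the strongest of the three compared methods. The accuracy values are copied from the table; the dictionary layout and the function name `margins_over_best_baseline` are assumptions made for illustration.

```python
# Top-1 accuracy (%) copied from Tab. 4; the data layout is illustrative only.
DATASETS = ("CUB-200-2011", "Stanford Cars", "FGVC-Aircraft")
BASELINES = {
    "DCL-Net": (87.40, 93.10, 91.70),
    "TPA-CNN": (88.00, 94.00, 91.70),
    "ACB-Net": (88.10, 94.60, 92.40),
}
PROPOSED = (89.52, 94.64, 93.20)


def margins_over_best_baseline():
    """Map each dataset to (best baseline name, gain of the proposed algorithm in points)."""
    result = {}
    for i, dataset in enumerate(DATASETS):
        # Pick the baseline with the highest Top-1 accuracy on this dataset.
        best_name, best_row = max(BASELINES.items(), key=lambda item: item[1][i])
        # Round to two decimals to hide floating-point noise in the subtraction.
        result[dataset] = (best_name, round(PROPOSED[i] - best_row[i], 2))
    return result


margins = margins_over_best_baseline()
for dataset, (name, gain) in margins.items():
    print(f"{dataset}: +{gain:.2f} points over {name}")
```

On these figures the gap is largest on CUB-200-2011 and smallest on Stanford Cars, where the proposed algorithm and ACB-Net are nearly tied.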
Algorithm | CUB-200-2011 Top-1 | CUB-200-2011 Top-5 | Stanford Cars Top-1 | Stanford Cars Top-5 | FGVC-Aircraft Top-1 | FGVC-Aircraft Top-5 |
---|---|---|---|---|---|---|
PPL-Net | 88.30 | — | 94.00 | — | 92.60 | — |
PCA-Net | 88.30 | 97.43 | 94.30 | 97.74 | 92.40 | 96.86 |
Proposed algorithm | 89.52 | 98.46 | 94.64 | 98.62 | 93.20 | 97.98 |
Tab. 5 Comparison of accuracy of the proposed algorithm with PPL-Net and PCA-Net algorithms on three datasets