Journal of Computer Applications, 2023, Vol. 43, Issue (1): 51-60. DOI: 10.11772/j.issn.1001-9081.2021122090
• Artificial intelligence •
SHEN Zhijun1,2, MU Lina2, GAO Jing2, SHI Yuanhang2, LIU Zhiqiang2
Received: 2021-12-14
Revised: 2022-02-12
Online: 2022-08-02
Contact:
SHEN Zhijun, born in 1976 in Xinyang, Henan, Ph.D., professor. His research interests include intelligent computing and data mining. E-mail: shensljx@sina.com
About author:
SHEN Zhijun, born in 1976 in Xinyang, Henan, Ph.D., professor. His research interests include intelligent computing and data mining; MU Lina, born in 1996 in Datong, Shanxi, M.S. candidate. Her research interests include computer vision and image recognition; GAO Jing, born in 1970 in Hohhot, Inner Mongolia, Ph.D., professor, doctoral supervisor. Her research interests include big data intelligence and knowledge discovery, analysis of animal and plant phenotype and omics big data, and intelligent systems for agriculture and animal husbandry; SHI Yuanhang, born in 1997 in Xinxiang, Henan, M.S. candidate. His research interests include artificial intelligence; LIU Zhiqiang, born in 1996 in Fuzhou, Jiangxi, M.S. candidate. His research interests include artificial intelligence.
SHEN Zhijun, MU Lina, GAO Jing, SHI Yuanhang, LIU Zhiqiang. Review of fine-grained image categorization[J]. Journal of Computer Applications, 2023, 43(1): 51-60.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021122090