Journal of Computer Applications (《计算机应用》), 2023, Vol. 43, Issue (1): 51-60. DOI: 10.11772/j.issn.1001-9081.2021122090

Special topics: Artificial Intelligence; Review


Review of fine-grained image categorization

SHEN Zhijun (申志军)1,2, MU Lina (穆丽娜)2, GAO Jing (高静)2, SHI Yuanhang (史远航)2, LIU Zhiqiang (刘志强)2

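  1. School of Computer and Information Engineering, Fuyang Normal University, Fuyang, Anhui 236037, China
  2. College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot 010011, China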
  • Received: 2021-12-14 Revised: 2022-02-12 Published online: 2022-08-02
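  • Corresponding author: SHEN Zhijun (born 1976), male, from Xinyang, Henan; professor, Ph.D.; research interests: intelligent computing and data mining. E-mail: shensljx@sina.com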
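  • About the authors: SHEN Zhijun (born 1976), male, from Xinyang, Henan; professor, Ph.D.; research interests: intelligent computing and data mining. MU Lina (born 1996), female, from Datong, Shanxi; M.S. candidate; research interests: computer vision and image recognition. GAO Jing (born 1970), female, from Hohhot, Inner Mongolia; professor, doctoral supervisor, Ph.D.; research interests: big data intelligence and knowledge discovery, big data analysis of animal and plant phenotypes and omics, and intelligent systems for agriculture and animal husbandry. SHI Yuanhang (born 1997), male, from Xinxiang, Henan; M.S. candidate; research interests: artificial intelligence. LIU Zhiqiang (born 1996), male, from Fuzhou, Jiangxi; M.S. candidate; research interests: artificial intelligence.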
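  • Funding:
    Scientific Research Project of Fuyang Normal University (2021KYQD0028); Science and Technology Research Project of Inner Mongolia Autonomous Region (2021GG0090); Doctoral Research Start-up Fund of Inner Mongolia Agricultural University (BJ2013B-1); Open Project of Inner Mongolia Discipline Inspection and Supervision Big Data Laboratory (IMDBD2020015).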

Review of fine-grained image categorization

SHEN Zhijun1,2, MU Lina2, GAO Jing2, SHI Yuanhang2, LIU Zhiqiang2   

  1. School of Computer and Information Engineering, Fuyang Normal University, Fuyang, Anhui 236037, China
  2. College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia 010011, China
  • Received: 2021-12-14 Revised: 2022-02-12 Online: 2022-08-02
  • Contact: SHEN Zhijun, born in 1976, Ph. D., professor. His research interests include intelligent computing and data mining.
  • About the authors: SHEN Zhijun, born in 1976, Ph. D., professor. His research interests include intelligent computing and data mining; MU Lina, born in 1996, M. S. candidate. Her research interests include computer vision and image recognition; GAO Jing, born in 1970, Ph. D., professor. Her research interests include big data intelligence and knowledge discovery, big data analysis of animal and plant phenotypes and omics, and intelligent systems for agriculture and animal husbandry; SHI Yuanhang, born in 1997, M. S. candidate. His research interests include artificial intelligence; LIU Zhiqiang, born in 1996, M. S. candidate. His research interests include artificial intelligence.
  • Supported by:
    This work is partially supported by Scientific Research Project of Fuyang Normal University (2021KYQD0028), Science and Technology Research Project of Inner Mongolia Autonomous Region (2021GG0090), Doctoral Research Start-up Fund of Inner Mongolia Agricultural University (BJ2013B-1), Open Project of Inner Mongolia Discipline Inspection and Supervision Big Data Laboratory (IMDBD2020015).

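Abstract: Fine-grained images are characterized by large intra-class variance and small inter-class variance, which makes fine-grained image categorization (FGIC) far harder than conventional image classification. The application scenarios, task difficulties, algorithmic development, and commonly used datasets of FGIC are introduced, with the focus on an overview of the related algorithms. Classification methods based on local detection usually rely on operations such as concatenation, summation, and pooling; model training is relatively complex, and the methods face many limitations in practice. Classification methods based on bilinear features imitate the two neural pathways of human vision, performing recognition and localization separately, and achieve relatively better classification results. Classification methods based on attention mechanisms imitate how humans observe the external world, first scanning the whole scene and then locking onto the regions of interest to form a focus of attention, which improves classification results further. Finally, in view of the shortcomings of current research, future research directions for FGIC are outlined.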

Keywords: fine-grained image categorization, deep learning, convolutional neural network, attention mechanism, computer vision

Abstract: Fine-grained images exhibit large intra-class variance and small inter-class variance, which makes Fine-Grained Image Categorization (FGIC) much more difficult than traditional image classification tasks. The application scenarios, task difficulties, algorithm development history, and commonly used datasets of FGIC are described, and the related algorithms are then surveyed. Classification methods based on local detection usually use operations such as concatenation, summation, and pooling; their model training is complex, and they have many limitations in practical applications. Classification methods based on bilinear features imitate the two neural pathways of human vision, one for recognition and one for localization, and achieve relatively better classification results. Classification methods based on attention mechanisms imitate the way humans observe external things, scanning the whole scene first and then locking onto the key regions to form a focus of attention, which further improves classification results. Finally, in view of the shortcomings of current research, future research directions for FGIC are proposed.

Key words: Fine-Grained Image Categorization (FGIC), deep learning, Convolutional Neural Network (CNN), attention mechanism, computer vision
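
As an illustration of the two-pathway feature idea mentioned in the abstract, the following is a minimal NumPy sketch of bilinear pooling as it is commonly described in the literature. It is not taken from the article: the function name `bilinear_pool`, the feature-map shapes, and the random inputs standing in for the outputs of two CNN streams are all assumptions made for this example.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b, eps=1e-12):
    """Combine two feature maps of shape (H, W, Ca) and (H, W, Cb)
    into a single descriptor of length Ca * Cb (hypothetical helper)."""
    h, w, ca = feat_a.shape
    cb = feat_b.shape[-1]
    a = feat_a.reshape(h * w, ca)
    b = feat_b.reshape(h * w, cb)
    # Outer product of the two streams at each location, averaged over locations.
    pooled = a.T @ b / (h * w)
    x = pooled.reshape(-1)
    # Signed square root followed by L2 normalisation, a common post-processing step.
    x = np.sign(x) * np.sqrt(np.abs(x))
    return x / (np.linalg.norm(x) + eps)

# Random arrays stand in for the two CNN streams (assumption for illustration).
rng = np.random.default_rng(0)
fa = rng.standard_normal((7, 7, 8))
fb = rng.standard_normal((7, 7, 4))
desc = bilinear_pool(fa, fb)
print(desc.shape)
```

In a real model the descriptor would feed a linear classifier, and the two inputs would come from trained convolutional networks rather than random data.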
