Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (5): 1383-1390. DOI: 10.11772/j.issn.1001-9081.2021071240

• Artificial Intelligence •

Multi-label image classification method based on global and local label relationship

Wei REN, Hexiang BAI

  1. School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China
  • Received: 2021-07-16 Revised: 2021-08-31 Accepted: 2021-09-14 Online: 2021-09-28 Published: 2022-05-10
  • Contact: Wei REN
  • About the authors: REN Wei, born in 1996 in Xiangfen, Shanxi, M.S. candidate. His research interests include deep learning and computer vision. 2783800599@qq.com
    BAI Hexiang, born in 1980 in Yuci, Shanxi, Ph.D., associate professor. His research interests include machine learning and data mining.
  • Supported by:
    National Natural Science Foundation of China (41871286)

Abstract:

Multi-label image classification faces two problems: interactions between labels are difficult to model, and global label relationships become rigid. To address them, a Multi-Label image classification method based on Global and Local Label Relationship (ML-GLLR) is proposed, combining the self-attention mechanism with Knowledge Distillation (KD). First, the Local Label Relationship (LLR) model uses a Convolutional Neural Network (CNN), a semantic module, and a Dual-Layer Self-Attention (DLSA) module to model local label relationships. Then, KD is used to make LLR learn global label relationships. In experiments on the public datasets Microsoft Common Objects in Context 2014 (MSCOCO2014) and PASCAL VOC challenge 2007 (VOC2007), LLR improves the mean Average Precision (mAP) over Multi-Label image classification based on Graph Convolutional Network (ML-GCN) by 0.8 and 0.6 percentage points, respectively, and ML-GLLR further improves the mAP over LLR by 0.2 and 1.3 percentage points, respectively. The results show that ML-GLLR can not only model the interactions between labels but also avoid the problem of rigid global label relationships.

Key words: image classification, self-attention mechanism, deep learning, knowledge distillation, multi-label classification
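
The abstract reports its results as differences in mean Average Precision (mAP), measured in percentage points. As a reading aid, the following minimal sketch shows one common way to compute mAP for multi-label classification: a non-interpolated Average Precision per label, averaged over all labels. It is an illustration under stated assumptions, not the evaluation code used in the paper. The function names, the toy scores, and the handling of labels without positive images are all hypothetical, and the benchmark protocols for MSCOCO2014 and VOC2007 may define AP differently (for example, with interpolation).

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], targets: Sequence[int]) -> float:
    """Non-interpolated AP for one label: mean precision at each positive's rank."""
    # Rank images from the highest to the lowest score for this label.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, idx in enumerate(order, start=1):
        if targets[idx] == 1:
            hits += 1
            precisions.append(hits / rank)
    # Assumption: a label with no positive image contributes an AP of 0.
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    score_matrix: Sequence[Sequence[float]],
    target_matrix: Sequence[Sequence[int]],
) -> float:
    """mAP in percent: AP averaged over labels (one column per label)."""
    num_labels = len(score_matrix[0])
    aps = []
    for c in range(num_labels):
        col_scores = [row[c] for row in score_matrix]
        col_targets = [row[c] for row in target_matrix]
        aps.append(average_precision(col_scores, col_targets))
    return 100.0 * sum(aps) / num_labels


# Toy data (made up for illustration): 4 images, 2 labels.
toy_scores = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
toy_targets = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_map = mean_average_precision(toy_scores, toy_targets)
```

Because mAP is already expressed in percent, a gain such as "0.8 percentage points" is the plain difference between two mAP values, not a relative change.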

CLC Number: