Journal of Computer Applications

• Artificial Intelligence and Simulation •

Multi-label image classification method based on global and local label relationship

REN Wei1, BAI Hexiang2

  1. School of Computer and Information Technology, Shanxi University
  2. Shanxi University
  • Received: 2021-07-16 Revised: 2021-08-31 Online: 2021-09-28 Published: 2021-09-28
  • Corresponding author: REN Wei

Abstract: Multi-label image classification faces two problems: the interactions between labels are difficult to model, and global label relationships become solidified. To address both, a multi-label image classification method based on Global and Local Label Relationship (ML-GLLR) is proposed, combining the self-attention mechanism with knowledge distillation. First, the Local Label Relationship (LLR) model uses a Convolutional Neural Network (CNN), a semantic module, and a Dual-Layer Self-Attention (DLSA) model to model local label relationships. Then, the Knowledge Distillation (KD) method is used so that LLR also learns global label relationships. In comparisons with other methods on the public datasets Microsoft Common Objects in Context (MSCOCO) and PASCAL VOC Challenge 2007 (VOC2007), LLR improves mean average precision over the multi-label image classification method based on Graph Convolutional Network (ML-GCN) by 0.8% and 0.6%, respectively, and ML-GLLR improves mean average precision over LLR by a further 0.2% and 1.3%, respectively. The experimental results indicate that ML-GLLR can model the relationships between labels while also avoiding the solidification of global label relationships.
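The abstract does not give the distillation objective, so the following is only a minimal sketch of one common way a student model (here standing in for LLR) could learn from a teacher in a multi-label setting. The function names, the `alpha` weighting, the `temperature` parameter, and the use of per-label binary cross-entropy are all assumptions for illustration, not the authors' formulation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a per-label probability."""
    return 1.0 / (1.0 + math.exp(-x))


def bce(p: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy between a predicted probability and a target in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def distillation_loss(student_logits, teacher_logits, labels,
                      alpha: float = 0.5, temperature: float = 2.0) -> float:
    """Hypothetical multi-label KD loss (an assumption, not the paper's objective).

    hard term: student probabilities vs. ground-truth 0/1 labels
    soft term: temperature-softened student vs. temperature-softened teacher
    """
    if not (len(student_logits) == len(teacher_logits) == len(labels)):
        raise ValueError("all inputs must have one entry per label")
    n = len(labels)
    hard = sum(bce(sigmoid(s), y)
               for s, y in zip(student_logits, labels)) / n
    soft = sum(bce(sigmoid(s / temperature), sigmoid(t / temperature))
               for s, t in zip(student_logits, teacher_logits)) / n
    return (1.0 - alpha) * hard + alpha * soft


# Example with two labels: the student agrees with the teacher in the first
# call and disagrees in the second, so the second loss is larger.
agree = distillation_loss([2.0, -1.0], [2.0, -1.0], [1, 0])
disagree = distillation_loss([2.0, -1.0], [-2.0, 1.0], [1, 0])
```

Under these assumptions, `alpha` trades off fitting the ground-truth labels against matching the teacher's softened per-label probabilities; setting `alpha=0.0` reduces the loss to plain per-label binary cross-entropy.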

CLC Number: