Journal of Computer Applications, 2020, Vol. 40, Issue (4): 1016-1022. DOI: 10.11772/j.issn.1001-9081.2019081475

• Artificial intelligence •

Zero-shot image classification based on visual error and semantic attributes

XU Ge1, XIAO Yongqiang2,3,4, WANG Tao1,2, CHEN Kaizhi2,3,4, LIAO Xiangwen2,3,4, WU Yunbing2,3,4   

    1. College of Computer and Control Engineering, Minjiang University, Fuzhou Fujian 350108, China;
    2. College of Mathematics and Computer Science, Fuzhou University, Fuzhou Fujian 350116, China;
    3. Fujian Provincial Key Laboratory of Networking Computing and Intelligent Information Processing(Fuzhou University), Fuzhou Fujian 350116, China;
    4. Digital Fujian Financial Big Data Institute, Fuzhou Fujian 350116, China
  • Received: 2019-09-03 Revised: 2019-10-23 Online: 2020-04-10 Published: 2019-11-18
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61772135, U1605251, 61703195), the Open Fund of the Key Laboratory of Network Data Science and Technology of the Chinese Academy of Sciences (CASNDST201708, CASNDST201606), the Open Fund of the State Key Laboratory of Pattern Recognition (201900041), the Surface Program of Fujian Natural Science Foundation (2017J01755), and the CERNET Innovation Project (NGII20160501).

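  • Corresponding author: LIAO Xiangwen
  • About the authors: XU Ge (born 1978), male, from Fuzhou, Fujian, associate professor, Ph.D.; research interests: artificial intelligence, natural language processing. XIAO Yongqiang (born 1994), male, from Longyan, Fujian, M.S. candidate; research interests: multimodality, artificial intelligence. WANG Tao (born 1987), male, from Fuzhou, Fujian, lecturer, Ph.D.; research interests: scene understanding, object detection and segmentation. CHEN Kaizhi (born 1983), male, from Fuzhou, Fujian, lecturer, Ph.D.; research interests: natural language processing, image recognition, deep learning algorithms. LIAO Xiangwen (born 1980), male, from Fuzhou, Fujian, professor, Ph.D.; research interests: opinion mining, sentiment analysis, natural language processing. WU Yunbing (born 1976), male, from Fuzhou, Fujian, associate professor, M.S.; research interests: knowledge representation and knowledge discovery.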

Abstract: In practical applications of image classification, some categories may have no labeled training data at all. The purpose of Zero-Shot Learning (ZSL) is to transfer knowledge, such as image features, from labeled categories to unlabeled categories so that the unlabeled categories can be classified correctly. However, existing state-of-the-art methods cannot explicitly distinguish at test time whether an input image belongs to a known or an unknown category, which leads to a large performance gap on unlabeled categories between traditional ZSL prediction and Generalized ZSL (GZSL) prediction. Therefore, a method fusing visual error and semantic attributes was proposed to alleviate the prediction bias problem in zero-shot image classification. Firstly, a generative adversarial network framework based on semi-supervised learning was designed to obtain visual error information, which was used to predict whether an image belongs to the known categories. Then, a zero-shot image classification network incorporating semantic attributes was proposed to perform zero-shot image classification. Finally, the zero-shot image classification algorithm combining visual error and semantic attributes was tested on the AwA2 (Animals with Attributes 2) and CUB (Caltech-UCSD-Birds-200-2011) datasets. The experimental results show that, compared with the baseline models, the proposed method effectively alleviates the prediction bias problem and increases the harmonic index H by 31.7 percentage points on the AwA2 dataset and 8.7 percentage points on the CUB dataset.

Key words: Zero-Shot Learning (ZSL), image classification, generative adversarial network, visual error, semantic attribute
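
The abstract reports results in terms of the harmonic index H and describes using visual error to decide whether an image belongs to a known category before classifying it. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes that H is the harmonic mean of seen-class and unseen-class accuracy, as is common in GZSL evaluation (the abstract does not define it), and that a low visual error indicates a known category. The names `harmonic_index`, `gated_predict`, and `threshold` are hypothetical and introduced only for illustration.

```python
from typing import Dict


def harmonic_index(acc_seen: float, acc_unseen: float) -> float:
    """Harmonic mean of seen-class and unseen-class accuracy.

    Assumption: this is the usual GZSL definition of H; the abstract
    itself does not spell out the formula.
    """
    total = acc_seen + acc_unseen
    if total == 0:
        return 0.0
    return 2.0 * acc_seen * acc_unseen / total


def gated_predict(
    visual_error: float,
    threshold: float,
    seen_scores: Dict[str, float],
    unseen_scores: Dict[str, float],
) -> str:
    """Pick a label after routing the image by its visual error.

    Assumption: an error at or below the (hypothetical) threshold means
    the image is treated as a known category, so only known-category
    scores compete; otherwise only unknown-category scores compete.
    """
    scores = seen_scores if visual_error <= threshold else unseen_scores
    # Return the label with the highest score in the chosen label space.
    return max(scores, key=lambda label: scores[label])


if __name__ == "__main__":
    seen = {"horse": 0.9, "dog": 0.1}
    unseen = {"zebra": 0.7, "whale": 0.3}
    print(gated_predict(0.1, 0.5, seen, unseen))
    print(gated_predict(0.9, 0.5, seen, unseen))
    print(harmonic_index(0.8, 0.2))
```

Because H is a harmonic mean, it stays low whenever either accuracy is low, which is why a model biased toward known categories scores poorly on it even if its seen-class accuracy is high.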

