Journal of Computer Applications, 2018, Vol. 38, Issue (11): 3199-3203. DOI: 10.11772/j.issn.1001-9081.2018041349

Image automatic annotation based on transfer learning and multi-label smoothing strategy

WANG Peng1,2, ZHANG Aofan1, WANG Liqin1,2, DONG Yongfeng1,2   

  1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China;
  2. Hebei Province Key Laboratory of Big Data Calculation (Hebei University of Technology), Tianjin 300401, China
  • Received: 2018-04-23 Revised: 2018-06-15 Online: 2018-11-10 Published: 2018-11-10
  • Supported by:
    This work is partially supported by the Basic Research Plan Major Project of Hebei Province (F2016202144) and the Application Basis and Advanced Technology Research Plan of Tianjin (15JCTPJC62000, 16JCYBJC15600).

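  • Corresponding author: WANG Liqin
  • About the authors: WANG Peng (born 1978), male, from Handan, Hebei, associate professor, Ph.D.; main research interests: computer software design and data mining. ZHANG Aofan (born 1992), male, from Shijiazhuang, Hebei, master's student; main research interests: computer vision and machine learning. WANG Liqin (born 1980), female, from Zhangbei, Hebei, experimentalist, Ph.D.; main research interests: data mining and machine learning. DONG Yongfeng (born 1977), male, from Baoding, Hebei, professor, Ph.D.; main research interests: machine learning and big data analysis.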

Abstract: To address the imbalanced label distribution in image annotation datasets and improve annotation performance on rare labels, a Multi-Label Smoothing Unit (MLSU) based on a label smoothing strategy was proposed. During network training, MLSU automatically smooths the high-frequency labels in the dataset so that the network appropriately raises its output values for low-frequency labels, which improves annotation performance on those labels. To address the network over-fitting caused by the small number of images in annotation datasets, a Convolutional Neural Network (CNN) model based on transfer learning was proposed. First, the deep CNN was pre-trained on large public image datasets available on the Internet. Then, the network parameters were fine-tuned on the target dataset, yielding a CNN model with a Multi-Label Smoothing Unit (CNN-MLSU). Experiments were carried out on the benchmark image annotation datasets Corel5K and IAPR TC-12. On Corel5K, the average accuracy and average recall of the proposed method are 5 and 8 percentage points higher, respectively, than those of Convolutional Neural Network Regression (CNN-R). On IAPR TC-12, the average recall of the proposed method is 6 percentage points higher than that of Two-Pass K-Nearest Neighbor (2PKNN_ML). The results show that the transfer-learning-based CNN-MLSU method can effectively prevent network over-fitting and improve annotation performance on low-frequency labels.
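
The abstract does not give the MLSU formula, so the following Python sketch is only one possible reading of "smoothing high-frequency labels", not the authors' implementation. It assumes multi-hot training targets and lowers the positive targets of frequent labels from 1 to 1 - epsilon, which reduces their pull on the network relative to rare labels. The function name `smooth_frequent_labels` and the hyperparameters `freq_threshold` and `epsilon` are hypothetical.

```python
import numpy as np

def smooth_frequent_labels(targets, freq_threshold=0.1, epsilon=0.2):
    """Soften the positive targets of high-frequency labels.

    targets: (n_images, n_labels) multi-hot array of 0/1 values.
    freq_threshold, epsilon: assumed hyperparameters, not taken from the paper.
    """
    targets = np.asarray(targets, dtype=np.float64)
    # Fraction of images that carry each label.
    label_freq = targets.mean(axis=0)
    # Labels counted as "high-frequency" under the assumed threshold.
    is_frequent = label_freq > freq_threshold
    smoothed = targets.copy()
    # Lower positive targets of frequent labels from 1 to 1 - epsilon;
    # rare labels and all negative targets are left unchanged.
    smoothed[:, is_frequent] = targets[:, is_frequent] * (1.0 - epsilon)
    return smoothed, is_frequent

# Toy data: 4 images, 3 labels; label 0 appears on every image.
Y = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 1, 0]])
Y_smooth, frequent = smooth_frequent_labels(Y, freq_threshold=0.6, epsilon=0.2)
```

In a fine-tuning setup, targets such as `Y_smooth` would replace the hard 0/1 labels in the loss of the pre-trained network's new output layer; how the paper actually couples MLSU to the loss is not stated in the abstract.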

Key words: automatic image annotation, multi-label smoothing, transfer learning, Convolutional Neural Network (CNN), image retrieval
