Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (11): 3199-3203. DOI: 10.11772/j.issn.1001-9081.2018041349

• The 7th China Conference on Data Mining (CCDM 2018) •

Image automatic annotation based on transfer learning and multi-label smoothing strategy

WANG Peng1,2, ZHANG Aofan1, WANG Liqin1,2, DONG Yongfeng1,2

  1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China;
  2. Hebei Province Key Laboratory of Big Data Calculation (Hebei University of Technology), Tianjin 300401, China
  • Received: 2018-04-23  Revised: 2018-06-15  Online: 2018-11-10  Published: 2018-11-10
  • Corresponding author: WANG Liqin
  • About the authors: WANG Peng, born in 1978 in Handan, Hebei, Ph. D., is an associate professor; his research interests include computer software design and data mining. ZHANG Aofan, born in 1992 in Shijiazhuang, Hebei, is an M. S. candidate; his research interests include computer vision and machine learning. WANG Liqin, born in 1980 in Zhangbei, Hebei, Ph. D., is an experimentalist; her research interests include data mining and machine learning. DONG Yongfeng, born in 1977 in Baoding, Hebei, Ph. D., is a professor; his research interests include machine learning and big data analysis.
  • Supported by:
    This work is partially supported by the Key Project of the Basic Research Plan of Hebei Province (F2016202144) and the Application Basis and Advanced Technology Research Plan of Tianjin (15JCTPJC62000, 16JCYBJC15600).

Abstract: In order to solve the problem of imbalanced label distribution in image annotation datasets and improve the annotation performance of rare labels, a Multi-Label Smoothing Unit (MLSU) based on the label smoothing strategy was proposed. High-frequency labels in the dataset were automatically smoothed during network training, so that the network appropriately raised the output values of low-frequency labels and thus improved their annotation performance. To address the over-fitting caused by the insufficient number of samples in image annotation datasets, a Convolutional Neural Network (CNN) model based on transfer learning was proposed. Firstly, the deep network was pre-trained on large public image datasets from the Internet; then, the target dataset was used to fine-tune the network parameters, and a Convolutional Neural Network model with the Multi-Label Smoothing Unit (CNN-MLSU) was established. Experiments were carried out on the benchmark image annotation datasets Corel5K and IAPR TC-12. The experimental results show that, on the Corel5K dataset, the average accuracy and average recall of CNN-MLSU are 5 percentage points and 8 percentage points higher than those of the Convolutional Neural Network Regression method (CNN-R); on the IAPR TC-12 dataset, the average recall of CNN-MLSU is 6 percentage points higher than that of the Two-Pass K-Nearest Neighbor model (2PKNN_ML). The results show that the CNN-MLSU method based on transfer learning can effectively prevent network over-fitting while improving the annotation performance of low-frequency labels.
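The abstract only outlines the approach; the exact MLSU formulation, loss function, and backbone network are specified in the full paper. As a rough illustration of the two ideas it describes, the minimal PyTorch sketch below softens the targets of frequent labels in a multi-hot annotation vector and fine-tunes an ImageNet-pre-trained CNN whose final layer is replaced by a multi-label output layer. All names here (smooth_targets, build_model, label_freq, epsilon) and the choice of a VGG-16 backbone are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch (not the authors' code): multi-label smoothing of
    # high-frequency labels plus transfer-learning fine-tuning of a pre-trained CNN.
    import torch.nn as nn
    from torchvision import models

    def smooth_targets(targets, label_freq, epsilon=0.1):
        # targets:    (batch, num_labels) multi-hot 0/1 annotation vectors
        # label_freq: (num_labels,) relative frequency of each label in the training set
        # Frequent positive labels are softened from 1.0 toward 1.0 - epsilon,
        # while rare labels are left almost untouched, nudging the network to
        # raise its outputs for low-frequency annotation words.
        smoothing = epsilon * label_freq / label_freq.max()
        return targets * (1.0 - smoothing)

    def build_model(num_labels):
        # Transfer learning: ImageNet-pre-trained backbone with the final
        # fully connected layer replaced by a multi-label output layer.
        backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        backbone.classifier[6] = nn.Linear(4096, num_labels)
        return backbone

    def train_step(model, images, targets, label_freq, optimizer):
        # Fine-tuning step on the target annotation dataset: sigmoid outputs
        # via BCE-with-logits against the smoothed multi-label targets.
        criterion = nn.BCEWithLogitsLoss()
        optimizer.zero_grad()
        loss = criterion(model(images), smooth_targets(targets, label_freq))
        loss.backward()
        optimizer.step()
        return loss.item()

How strongly each high-frequency label is smoothed (here, proportionally to its frequency) is precisely the part that the MLSU defines in the paper; the sketch only shows where such a unit would plug into a standard fine-tuning loop.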

Key words: automatic image annotation, multi-label smoothing, transfer learning, Convolutional Neural Network (CNN), image retrieval
