Automatic image annotation based on multi-label discriminative dictionary learning

doi:10.11772/j.issn.1001-9081.2017112650

Journal of Computer Applications, 2018, Vol. 38, Issue (5): 1294-1298.

YANG Xiaoling, LI Zhiqing, LIU Yutong

Key Laboratory of Intelligent Computing & Information Processing of Ministry of Education, Xiangtan University, Xiangtan Hunan 411100, China

Received: 2017-11-08 Revised: 2017-11-19 Online: 2018-05-24 Published: 2018-05-10
Contact: YANG Xiaoling
About the authors: YANG Xiaoling (born 1992), female (Tujia ethnic group), from Tongren, Guizhou, is a master's student and CCF member; her research interests include machine learning, computer vision, and image annotation. LI Zhiqing (born 1975), male, from Loudi, Hunan, is an associate professor with a Ph.D. and a CCF member; his research interests include visual perception learning, visual feature extraction, visual information mining, image semantic annotation, and image retrieval. LIU Yutong (born 1992), female, from Yueyang, Hunan, is a master's student and CCF member; her research interests include computer vision, neural networks, and machine learning.

Abstract

To address the semantic gap between low-level visual features and high-level semantics in automatic image annotation, an automatic image annotation method based on multi-label discriminative dictionary learning was proposed, building on traditional dictionary learning. First, multiple types of features were extracted from each image, and their combination was used as the input to the feature space for dictionary learning. Then, a label consistency regularization term was designed to integrate the label information of the original samples into the initial input feature data, and dictionary learning was carried out by combining the label-consistent discriminative dictionary with this regularization term. Finally, the label sparse coding vector was solved from the learned dictionary and the sparse coding matrix to annotate unknown images semantically. Annotation performance was tested on the Corel 5K dataset, where the proposed method reached an average precision of 35% and an average recall of 48%. Compared with the traditional sparse coding method (MSC), these were 10 and 16 percentage points higher, respectively; compared with the Distance Constraint Sparse/Group Sparse Coding (DCSC/DCGSC) method, they were 3 and 14 percentage points higher, respectively. The experimental results show that the proposed method predicts the semantic information of unknown images well and achieves better annotation performance than several currently popular image annotation methods.

Key words: automatic image annotation, dictionary learning, feature representation, sparse coding, image retrieval
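The three steps in the abstract (combine features, learn a dictionary with label information, annotate through sparse codes) can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: it assumes that label consistency is enforced by stacking the label matrix under the feature matrix, and it swaps in ISTA for sparse coding and a least-squares (MOD-style) dictionary update. All names (`combine_features`, `learn_label_consistent_dictionary`, `annotate`, `D`, `W`, `alpha`, `lam`) are hypothetical.

```python
import numpy as np


def combine_features(blocks):
    """Stack feature matrices (each d_i x n) after scaling every column to unit norm."""
    scaled = [b / (np.linalg.norm(b, axis=0, keepdims=True) + 1e-12) for b in blocks]
    return np.vstack(scaled)


def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def sparse_code(D, X, lam=0.1, n_iter=200):
    """ISTA for min_A 0.5 * ||X - D A||_F^2 + lam * ||A||_1."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)  # 1 / squared spectral norm
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        A = soft_threshold(A - step * (D.T @ (D @ A - X)), step * lam)
    return A


def learn_label_consistent_dictionary(X, Y, n_atoms, alpha=1.0, lam=0.1, n_outer=15):
    """Learn a feature dictionary D (d x k) and a label dictionary W (c x k).

    Assumption: features X (d x n) and binary labels Y (c x n) share one sparse
    code, obtained by coding the stacked data [X; sqrt(alpha) * Y].
    """
    d, n = X.shape
    Z = np.vstack([X, np.sqrt(alpha) * Y])
    # Initialise atoms from evenly spaced training columns (deterministic).
    idx = np.round(np.linspace(0, n - 1, n_atoms)).astype(int)
    B = Z[:, idx] / (np.linalg.norm(Z[:, idx], axis=0, keepdims=True) + 1e-12)
    for _ in range(n_outer):
        A = sparse_code(B, Z, lam)
        # Least-squares dictionary update with a small ridge for stability.
        G = A @ A.T + 1e-6 * np.eye(n_atoms)
        B = np.linalg.solve(G, A @ Z.T).T
        B /= np.linalg.norm(B, axis=0, keepdims=True) + 1e-12
    D, W = B[:d], B[d:] / np.sqrt(alpha)
    # Rescale so that the feature atoms have unit norm; W is rescaled to match.
    norms = np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    return D / norms, W / norms


def annotate(D, W, x, lam=0.1, top_k=5):
    """Code an unseen feature vector over D and rank labels by the scores W @ a."""
    a = sparse_code(D, x.reshape(-1, 1), lam)[:, 0]
    scores = W @ a
    return np.argsort(-scores)[:top_k], scores
```

In this sketch `alpha` trades off feature reconstruction against label reconstruction and `lam` controls sparsity; both would need tuning on real data such as Corel 5K, and the evenly spaced initialisation assumes the training columns are not all drawn from a single concept.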

CLC Number: TP391.41

YANG Xiaoling, LI Zhiqing, LIU Yutong. Automatic image annotation based on multi-label discriminative dictionary learning[J]. Journal of Computer Applications, 2018, 38(5): 1294-1298.


