MTRF: a topic model with spatial information

doi:10.11772/j.issn.1001-9081.2015.10.2715

Journal of Computer Applications, 2015, Vol. 35, Issue 10: 2715-2720.

Paper of the 15th China Conference on Machine Learning (CCML 2015)

PAN Zhiyong¹,², LIU Yang¹, LIU Guojun¹, GUO Maozu¹, LI Pan¹

1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China;
2. College of Information Technology and Media, Beihua University, Jilin, Jilin 132013, China

Received: 2015-06-15 Revised: 2015-06-30 Online: 2015-10-10 Published: 2015-10-14
Corresponding author: GUO Maozu (born 1966), male, from Xiajin, Shandong; Ph.D., professor and doctoral supervisor, CCF member. Research interests: machine learning, bioinformatics. Email: maozuguo@hit.edu.cn
About the authors: PAN Zhiyong (born 1980), male, from Jilin, Jilin; experimentalist and Ph.D. candidate. Research interest: machine learning. LIU Yang (born 1976), male, from Harbin, Heilongjiang; Ph.D., associate professor, CCF member. Research interests: machine learning, image processing. LIU Guojun (born 1979), male, from Harbin, Heilongjiang; Ph.D., lecturer. Research interest: machine learning. LI Pan (born 1988), male, from Xingtai, Hebei; M.S. Research interests: machine learning, statistical learning.
Supported by:
the National Natural Science Foundation of China (61171185, 61271346); the Youth Science Foundation of Heilongjiang Province (QC2014C071).

Abstract

Abstract: The word-independence and topic-independence assumptions of topic models ignore the spatial relationships among visual words. To address this limitation, a topic model that incorporates the spatial information of visual words, named Markov Topic Random Field (MTRF), is proposed, together with the view that a "topic" in image processing corresponds to a component part of an object. Because neighboring visual words are very likely to be generated from the same topic, MTRF uses whether neighboring visual words share a topic to decide, during topic generation, whether a topic is drawn from a Markov Random Field (MRF) or from the multinomial distribution of the topic model. Both theoretical analysis and experimental results show that a topic is not an instance of an object but a mid-level feature that expresses one of the object's parts. In image classification experiments, compared with Latent Dirichlet Allocation (LDA), MTRF improved the average accuracy on the Caltech101 dataset by 3.91% and the mean Average Precision (mAP) on the VOC2007 dataset by 2.03%. In addition, MTRF assigned topics to visual words more accurately than LDA and produced mid-level features that represent the parts of objects more effectively. The experimental results show that MTRF makes effective use of spatial information and improves the accuracy of the model.

Key words: topic model, Latent Dirichlet Allocation (LDA) model, Markov Random Field (MRF), spatial relationship, mid-level feature, image classification
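
The switching idea described in the abstract can be illustrated with a toy sampler. This is only a sketch under assumptions that the abstract does not state: visual words lie on a regular grid, each word looks at its upper and left neighbors in raster order, a fixed probability decides whether the topic comes from a Potts-style neighbor prior (standing in for the MRF) or from the image-level multinomial, and no inference is performed. The names `sample_topics`, `same_topic_prob`, and `coupling` are hypothetical and are not taken from the paper.

```python
import math
import random


def sample_topics(height, width, theta, same_topic_prob=0.8, coupling=1.0, seed=0):
    """Sample one topic index per visual word on a height x width grid.

    Hypothetical sketch: for each word (raster order) a switch decides whether
    the topic is drawn from a Potts-style prior over the already-visited
    neighbors (a causal stand-in for an MRF) or from the image-level
    multinomial `theta`, as in LDA.
    """
    rng = random.Random(seed)
    num_topics = len(theta)
    topics = [[None] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Collect topics of the upper and left neighbors, if they exist.
            neighbours = []
            if r > 0:
                neighbours.append(topics[r - 1][c])
            if c > 0:
                neighbours.append(topics[r][c - 1])
            # Assumed switch: use the neighbor prior with a fixed probability.
            use_mrf = bool(neighbours) and rng.random() < same_topic_prob
            if use_mrf:
                # Potts-style weights: exp(coupling * number of agreeing neighbors).
                weights = [math.exp(coupling * neighbours.count(k))
                           for k in range(num_topics)]
            else:
                # Fall back to the multinomial topic proportions of the image.
                weights = theta
            topics[r][c] = rng.choices(range(num_topics), weights=weights)[0]
    return topics
```

Calling `sample_topics(4, 5, [0.2, 0.5, 0.3])` returns a 4 x 5 grid of topic indices. Raising `same_topic_prob` or `coupling` makes neighboring words share a topic more often, which mimics the spatial coherence the model is meant to capture.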

CLC number:

TP181
TP391

PAN Zhiyong, LIU Yang, LIU Guojun, GUO Maozu, LI Pan. MTRF: a topic model with spatial information[J]. Journal of Computer Applications, 2015, 35(10): 2715-2720.

