Cross-media retrieval algorithm based on semantic correlation and topological relationship

doi:10.11772/j.issn.1001-9081.2018030553

Journal of Computer Applications, 2018, Vol. 38, Issue (9): 2529-2534.

DAI Gang^1,2, ZHANG Hong^1,2

1. College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430065, China;
2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan Hubei 430065, China

Received: 2018-03-19 Revised: 2018-05-04 Online: 2018-09-06 Published: 2018-09-10
Corresponding author: ZHANG Hong
About the authors: DAI Gang (born 1994), male, from Jingshan, Hubei, master's student; main research interests: machine learning, cross-media retrieval. ZHANG Hong (born 1979), female, from Xiangyang, Hubei, professor, Ph.D.; main research interests: cross-media retrieval, machine learning, data mining.

Abstract: To address the problem of mining the intrinsic correlation between feature data that share the same semantics across different modalities, a cross-media retrieval algorithm based on Semantic Correlation and Topological Relationship (SCTR) is proposed. On the one hand, the latent correlation among multimedia data with the same semantics is used to construct a multimedia semantic correlation hypergraph. On the other hand, the topological relationship of the multimedia data is mined to build a multimedia nearest neighbor relationship hypergraph. By combining the semantic correlation and the topological relationship, an optimal projection matrix is learned for each media type; the feature vectors of the multimedia data are then projected into a common space, in which cross-media retrieval is performed. On the XMedia dataset, the proposed algorithm achieves an average precision of 51.73% over multiple cross-media retrieval tasks, which is 22.73, 15.23, 11.7 and 9.11 percentage points higher than that of Heterogeneous Metric Learning with Joint Graph Regularization (JGRHML), Cross Modality Correlation Propagation (CMCP), Heterogeneous Similarity measure with Nearest Neighbors (HSNN) and Joint Representation Learning (JRL), respectively. The experimental results show from several aspects that the proposed algorithm effectively improves the average precision of cross-media retrieval.

Key words: cross-media retrieval, semantic information, nearest neighbor relationship, semi-supervised regularization, semantic correlation, sparse regularization
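
The abstract names the building blocks of the method (a nearest neighbor hypergraph, one projection matrix per media type, a common space) but does not give the objective function. The Python sketch below is therefore not the SCTR implementation; it only illustrates the general pipeline under stated assumptions. All function names are hypothetical. The hypergraph Laplacian uses the commonly used normalized form I - Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) with unit hyperedge weights, and the projection is fitted with a plain Laplacian-regularized ridge regression onto label vectors, which stands in for the paper's semi-supervised and sparse regularization and omits the sparse term entirely.

```python
import numpy as np


def knn_hypergraph_incidence(X, k):
    """Return an n x n incidence matrix H (assumed construction):
    hyperedge j contains sample j and its k nearest neighbours."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)  # keep a sample out of its own neighbour list
    H = np.eye(n)  # every sample belongs to its own hyperedge
    for j in range(n):
        H[np.argsort(d2[j])[:k], j] = 1.0
    return H


def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian I - Dv^-1/2 H W De^-1 H^T Dv^-1/2."""
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    dv = H @ w            # weighted vertex degrees
    de = H.sum(axis=0)    # hyperedge degrees (number of vertices per hyperedge)
    theta = (H * (w / de)[None, :]) @ H.T
    inv_sqrt = 1.0 / np.sqrt(dv)
    theta = inv_sqrt[:, None] * theta * inv_sqrt[None, :]
    return np.eye(n) - theta


def fit_projection(X, Y, L, lam=1.0, mu=1e-3):
    """Stand-in learner (not the paper's objective): minimise
    ||XP - Y||^2 + lam * tr(P^T X^T L X P) + mu * ||P||^2 in closed form."""
    d = X.shape[1]
    A = X.T @ X + lam * (X.T @ L @ X) + mu * np.eye(d)
    return np.linalg.solve(A, X.T @ Y)


def rank_by_cosine(query, P_query, gallery, P_gallery, eps=1e-12):
    """Project a query and a gallery of another media type into the common
    space and return gallery indices sorted by descending cosine similarity."""
    q = query @ P_query
    G = gallery @ P_gallery
    sims = (G @ q) / (np.linalg.norm(G, axis=1) * np.linalg.norm(q) + eps)
    return np.argsort(-sims, kind="stable")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.eye(3)[rng.integers(0, 3, size=30)]   # toy one-hot semantics
    img = labels @ rng.normal(size=(3, 8)) + 0.1 * rng.normal(size=(30, 8))
    txt = labels @ rng.normal(size=(3, 5)) + 0.1 * rng.normal(size=(30, 5))
    P_img = fit_projection(img, labels, hypergraph_laplacian(knn_hypergraph_incidence(img, 3)))
    P_txt = fit_projection(txt, labels, hypergraph_laplacian(knn_hypergraph_incidence(txt, 3)))
    print(rank_by_cosine(img[0], P_img, txt, P_txt)[:5])  # text items ranked for an image query
```

In this sketch the label vectors define the common space, so an image query and a text gallery become directly comparable after projection. The toy features in the `__main__` block are synthetic and unrelated to the XMedia dataset.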

CLC number: TP391.3

DAI Gang, ZHANG Hong. Cross-media retrieval algorithm based on semantic correlation and topological relationship[J]. Journal of Computer Applications, 2018, 38(9): 2529-2534.

