Journal of Computer Applications, 2018, Vol. 38, Issue (9): 2529-2534. DOI: 10.11772/j.issn.1001-9081.2018030553

• Data Science and Technology •

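Cross-media retrieval algorithm based on semantic correlation and topological relationship

DAI Gang1,2, ZHANG Hong1,2

  1. College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430065, China;
  2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan Hubei 430065, China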
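  • Received: 2018-03-19  Revised: 2018-05-04  Online: 2018-09-10  Published: 2018-09-06
  • Corresponding author: ZHANG Hong
  • About the authors: DAI Gang, born in 1994 in Jingshan, Hubei, is an M.S. candidate. His research interests include machine learning and cross-media retrieval. ZHANG Hong, born in 1979 in Xiangyang, Hubei, Ph.D., is a professor. Her research interests include cross-media retrieval, machine learning and data mining.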

Abstract: To mine the intrinsic correlation between feature data that share the same semantics across different modalities, a cross-media retrieval algorithm based on Semantic Correlation and Topological Relationship (SCTR) is proposed. On the one hand, the latent correlation among multimedia data with the same semantics is exploited to construct a multimedia semantic correlation hypergraph. On the other hand, the topological relationship of multimedia data is mined to build a multimedia nearest-neighbor relationship hypergraph. By combining the semantic correlation and the topological relationship, an optimal projection matrix is learned for each media type, and the feature vectors of the multimedia data are then projected into a common space in which cross-media retrieval is performed. On the XMedia dataset, the proposed algorithm achieves an average precision of 51.73% over multiple cross-media retrieval tasks, which is 22.73, 15.23, 11.7 and 9.11 percentage points higher than the Heterogeneous Metric Learning with Joint Graph Regularization (JGRHML), Cross Modality Correlation Propagation (CMCP), Heterogeneous Similarity measure with Nearest Neighbors (HSNN) and Joint Representation Learning (JRL) algorithms, respectively. The experimental results show from several perspectives that the proposed algorithm effectively improves the average precision of cross-media retrieval.
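
The retrieval step described in the abstract (one projection matrix per media type, followed by comparison in a common space) can be illustrated with a minimal sketch. This is not the SCTR implementation: the projection matrices `P_IMAGE` and `P_TEXT` below are hand-picked placeholders rather than matrices learned from the two hypergraphs, the toy feature vectors are invented, and cosine similarity is an assumed ranking measure that the abstract does not specify.

```python
import math


def project(features, matrix):
    """Map a d-dimensional feature vector into the common space with a d x k matrix."""
    k = len(matrix[0])
    return [
        sum(features[i] * matrix[i][j] for i in range(len(features)))
        for j in range(k)
    ]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, query_matrix, gallery, gallery_matrix):
    """Rank gallery items of another media type by similarity to the query.

    Both the query and every gallery item are first projected into the
    common space with the matrix of their own media type.
    """
    q = project(query, query_matrix)
    scored = [
        (cosine(q, project(item, gallery_matrix)), index)
        for index, item in enumerate(gallery)
    ]
    # Highest similarity first.
    return [index for _, index in sorted(scored, key=lambda pair: -pair[0])]


# Hypothetical placeholder matrices (assumption: in SCTR these would be learned).
P_IMAGE = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # 3-D image features -> 2-D common space
P_TEXT = [[0.0, 1.0], [1.0, 0.0]]               # 2-D text features  -> 2-D common space

# Invented toy data: one image query and three text documents.
image_query = [2.0, 0.1, 5.0]
text_gallery = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]

ranking = retrieve(image_query, P_IMAGE, text_gallery, P_TEXT)
print(ranking)  # [1, 2, 0]
```

Because each media type has its own matrix, image and text features of different dimensionality become directly comparable once projected. How well the ranking reflects shared semantics then depends entirely on how the matrices are learned.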

Key words: cross-media retrieval, semantic information, nearest neighbor relationship, semi-supervised regularization, semantic correlation, sparse regularization

CLC number: