Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (8): 2187-2192. DOI: 10.11772/j.issn.1001-9081.2020101607

Special topic: Artificial Intelligence


Discriminative cross-modal hashing retrieval algorithm based on multi-level semantics

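LIU Fangming1,2, ZHANG Hong1,2

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065;
  2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan 430065
  • Received: 2020-10-16 Revised: 2021-01-13 Online: 2021-08-10 Published: 2021-08-06
  • Corresponding author: LIU Fangming
  • About the authors: LIU Fangming (born 1996), female, from Huanggang, Hubei, M.S. candidate; main research interests: cross-modal retrieval and machine learning. ZHANG Hong (born 1979), female, from Xiangyang, Hubei, professor, Ph.D.; main research interests: cross-media retrieval, machine learning and data mining.
  • Supported by:
    National Natural Science Foundation of China (61373109).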

Cross-modal retrieval algorithm based on multi-level semantic discriminative guided hashing

LIU Fangming1,2, ZHANG Hong1,2   

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430065, China;
  2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan Hubei 430065, China
  • Received: 2020-10-16 Revised: 2021-01-13 Online: 2021-08-10 Published: 2021-08-06
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61373109).

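Abstract: Most cross-modal hashing methods represent the degree of correlation with a binary matrix, so they cannot capture the deeper semantic information shared by multi-label data, and they also neglect to preserve the semantic structure and the discriminability of the data features. To address these problems, a discriminative cross-modal hashing retrieval algorithm based on multi-level semantics, named ML-SDH, is proposed. The algorithm uses a multi-level semantic similarity matrix to discover deep correlations in cross-modal data, and uses equal guidance to relate the cross-modal hash representations in terms of semantic structure and discriminative classification. This makes it possible to encode multi-label data that carries high-level semantic information, and the structure built to preserve multi-level semantics ensures that the final learned hash codes remain discriminative while preserving semantic similarity. On the NUS-WIDE dataset, with a hash code length of 32 bits, the mean Average Precision (mAP) of the proposed algorithm on the two retrieval tasks is higher than that of Deep Cross-Modal Hashing (DCMH), Pairwise Relationship guided Deep Hashing (PRDH) and Equally-Guided Discriminative Hashing (EGDH) by 19.48, 14.50 and 1.95 percentage points on one task and by 16.32, 11.82 and 2.08 percentage points on the other, respectively.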
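The contrast between a binary correlation matrix and a multi-level one can be illustrated with a small sketch. The abstract does not define the multi-level similarity matrix, so the cosine similarity between multi-hot label vectors used below is only an assumption chosen for illustration, and the function names `binary_similarity` and `multilevel_similarity` are hypothetical.

```python
import numpy as np

def binary_similarity(labels_a, labels_b):
    """S[i, j] = 1 if samples i and j share at least one label, else 0."""
    shared = labels_a @ labels_b.T  # number of labels in common
    return (shared > 0).astype(np.float64)

def multilevel_similarity(labels_a, labels_b, eps=1e-12):
    """Graded similarity in [0, 1].

    Assumption: cosine similarity of multi-hot label vectors. This is a
    stand-in for illustration, not necessarily the paper's definition.
    """
    norm_a = np.linalg.norm(labels_a, axis=1, keepdims=True)
    norm_b = np.linalg.norm(labels_b, axis=1, keepdims=True)
    return (labels_a @ labels_b.T) / np.maximum(norm_a * norm_b.T, eps)

# Three samples over four labels (toy data, not taken from NUS-WIDE).
image_labels = np.array([[1, 1, 0, 0],
                         [1, 0, 0, 0],
                         [0, 0, 1, 1]], dtype=np.float64)
text_labels = image_labels.copy()

S_bin = binary_similarity(image_labels, text_labels)
S_ml = multilevel_similarity(image_labels, text_labels)
```

In this toy data, samples 0 and 1 share one of sample 0's two labels. The binary matrix treats that pair exactly like an identical pair, while the graded matrix assigns it a value strictly between 0 and 1, which is the kind of finer-grained information a multi-level matrix is meant to retain.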

Key words: multi-level semantics, semantic structure, discriminative hashing, semantic guidance, cross-modal retrieval

Abstract: Most cross-modal hashing methods use a binary matrix to represent the degree of correlation, so high-level semantic information in multi-label data cannot be captured, and those methods also ignore the preservation of the semantic structure and the discriminability of the data features. Therefore, a cross-modal retrieval algorithm named ML-SDH (Multi-Level Semantics Discriminative guided Hashing) was proposed. In the algorithm, a multi-level semantic similarity matrix was used to discover deeply correlated information in the cross-modal data, and equally guided cross-modal hashing was used to express the correlations in the semantic structure and discriminative classification. As a result, multi-label data carrying high-level semantic information could be encoded, and the constructed multi-level semantic structure ensured that the final learned hash codes were both discriminative and similarity-preserving. On the NUS-WIDE dataset, with a hash code length of 32 bits, the mean Average Precision (mAP) of the proposed algorithm in the two retrieval tasks is 19.48, 14.50 and 1.95 percentage points and 16.32, 11.82 and 2.08 percentage points higher than those of the DCMH (Deep Cross-Modal Hashing), PRDH (Pairwise Relationship guided Deep Hashing) and EGDH (Equally-Guided Discriminative Hashing) algorithms, respectively.
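For readers unfamiliar with the reported metric, the following is a minimal sketch of how mAP is commonly computed for hash-based retrieval by ranking database items by Hamming distance. It is an assumption that the paper follows this standard protocol; the function names, the {-1, +1} code convention, and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def hamming_distance(query_codes, db_codes):
    """Pairwise Hamming distance between {-1, +1} codes of equal length."""
    n_bits = query_codes.shape[1]
    return 0.5 * (n_bits - query_codes @ db_codes.T)

def mean_average_precision(query_codes, db_codes, relevance):
    """relevance[i, j] = 1 if database item j is relevant to query i."""
    dist = hamming_distance(query_codes, db_codes)
    ap_values = []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i], kind="stable")  # nearest first
        rel = relevance[i, order]
        n_rel = rel.sum()
        if n_rel == 0:
            continue  # queries without relevant items are skipped
        hits = np.cumsum(rel)                 # relevant items seen so far
        ranks = np.arange(1, rel.size + 1)    # 1-based rank positions
        # Average of precision@k over the ranks k that hold a relevant item.
        ap_values.append(float(np.sum(rel * hits / ranks) / n_rel))
    return float(np.mean(ap_values)) if ap_values else 0.0

# One 4-bit query against three database codes (toy data).
query = np.array([[1, 1, -1, -1]], dtype=np.float64)
database = np.array([[1, 1, -1, -1],
                     [1, 1, -1, 1],
                     [-1, -1, 1, 1]], dtype=np.float64)
relevant = np.array([[1, 0, 1]], dtype=np.float64)

map_score = mean_average_precision(query, database, relevant)
```

Here the relevant items land at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2. In a cross-modal setting the same routine would be run twice, once with image codes as queries against text codes and once the other way round, which corresponds to the two retrieval tasks mentioned in the abstract.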

Key words: multi-level semantics, semantic structure, discriminative hashing, semantic guidance, cross-modal retrieval

CLC number: