Journal of Computer Applications, 2023, Vol. 43, Issue 5: 1349-1354. DOI: 10.11772/j.issn.1001-9081.2022030424

• China Conference on Data Mining 2022 (CCDM 2022)

Multi-label cross-modal hashing retrieval based on discriminative matrix factorization

Yu TAN1, Xiaoqin WANG1, Rushi LAN1, Zhenbing LIU1, Xiaonan LUO2

  1. Guangxi Key Laboratory of Image and Graphic Intelligent Processing (Guilin University of Electronic Technology), Guilin, Guangxi 541004, China
  2. Satellite Navigation Positioning and Location Service National and Local Joint Engineering Research Center (Guilin University of Electronic Technology), Guilin, Guangxi 541004, China
  • Received: 2022-04-01 Revised: 2022-07-19 Accepted: 2022-08-03 Online: 2023-05-08 Published: 2023-05-10
  • Contact: Rushi LAN
  • About the authors: TAN Yu, born in 1997, M.S. candidate. Her research interests include cross-modal retrieval and machine learning.
    WANG Xiaoqin, born in 1994, M.S. Her research interests include image retrieval and machine learning.
    LAN Rushi, born in 1986, Ph.D., professor. His research interests include artificial intelligence, image processing, and medical information processing.
    LIU Zhenbing, born in 1980, Ph.D., professor. His research interests include machine learning, image classification, and image restoration.
    LUO Xiaonan, born in 1963, Ph.D., professor. His research interests include machine learning, image classification, and image restoration.
  • Supported by:
    National Natural Science Foundation of China (62172120); Guangxi Science and Technology Program (2019GXNSFFA245014); Open Project of Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2001)

基于判别性矩阵分解的多标签跨模态哈希检索

谭钰1, 王小琴1, 蓝如师1, 刘振丙1, 罗笑南2

  1. 广西图像图形与智能处理重点实验室(桂林电子科技大学), 广西 桂林 541004
  2. 卫星导航定位与位置服务国家地方联合工程研究中心(桂林电子科技大学), 广西 桂林 541004
  • 通讯作者: 蓝如师
  • 作者简介:谭钰(1997—),女,广西南宁人,硕士研究生,主要研究方向:跨模态检索、机器学习
    王小琴(1994—),女,广西桂平人,硕士,主要研究方向:图像检索、机器学习
    蓝如师(1986—),男,广西河池人,教授,博士,主要研究方向:人工智能、图像处理、医学信息处理 rslan2016@163.com
    刘振丙(1980—),男,山东济宁人,教授,博士,主要研究方向:机器学习、图像分类、图像复原
    罗笑南(1963—),男,江西南城人,教授,博士,主要研究方向:机器学习、图像分类、图像复原。
  • 基金资助:
    国家自然科学基金资助项目(62172120);广西科技计划项目(2019GXNSFFA245014);广西图像图形与智能处理重点实验室开发课题(GIIP2001)

Abstract:

Existing cross-modal hashing algorithms underestimate the importance of semantic differences between class labels and ignore the balance condition of hash vectors, which makes the learned hash codes less discriminative. In addition, some methods use label information to construct the similarity matrix but model multi-label data as if it were single-label data, which causes a large semantic loss in multi-label cross-modal retrieval. To preserve accurate similarity relationships between heterogeneous data as well as the balance property of hash vectors, a novel supervised hashing algorithm, Discriminative Matrix Factorization Hashing (DMFH), is proposed. In this method, Collective Matrix Factorization (CMF) of the kernelized features is used to obtain a shared latent subspace. The proportion of labels shared between data items is used to describe the degree of similarity between heterogeneous data. In addition, a balanced matrix is constructed from label balance information to generate hash vectors with the balance property and to maximize the inter-class distances among different class labels. In comparisons with seven advanced cross-modal hashing retrieval methods on two commonly used multi-label datasets, MIRFlickr and NUS-WIDE, DMFH achieves the best mean Average Precision (mAP) on both the I2T (Image to Text) and T2I (Text to Image) tasks, and its mAP is better on the T2I task, indicating that DMFH uses the multi-label semantic information in the text modality more effectively. The effectiveness of the constructed balanced matrix and similarity matrix is also analyzed, verifying that DMFH maintains semantic information and similarity relations and is effective in cross-modal hashing retrieval.
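
The abstract describes similarity as the proportion of labels two items have in common, rather than a yes/no match. The sketch below illustrates that idea only; it is not the paper's formulation. It assumes the proportion is taken as shared labels divided by the union of labels (a Jaccard-style ratio), which the abstract does not specify, and the function names `label_overlap_similarity`, `similarity_matrix`, and `binary_similarity` are hypothetical.

```python
def label_overlap_similarity(labels_a, labels_b):
    """Graded similarity of two multi-hot label vectors.

    Assumption (not stated in the abstract): the proportion of common
    labels is computed as |A and B| / |A or B|.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    shared = sum(1 for a, b in zip(labels_a, labels_b) if a and b)
    union = sum(1 for a, b in zip(labels_a, labels_b) if a or b)
    return shared / union if union else 0.0


def similarity_matrix(label_rows):
    """Pairwise graded similarity for a list of multi-hot label vectors."""
    return [[label_overlap_similarity(r, c) for c in label_rows]
            for r in label_rows]


def binary_similarity(labels_a, labels_b):
    """Single-label-style similarity: 1.0 if any label is shared, else 0.0."""
    return 1.0 if any(a and b for a, b in zip(labels_a, labels_b)) else 0.0


# Toy multi-hot labels over four classes (illustrative data only).
labels = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
]

S = similarity_matrix(labels)
# Items 0 and 1 share two of three labels; items 1 and 2 share one of three.
# The binary rule scores both pairs as 1.0 and so loses that distinction.
```

In this toy case the graded matrix separates a strong overlap from a weak one, which is the kind of semantic detail the abstract says is lost when multi-label data is modeled as single-label data.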

Key words: cross-modal retrieval, matrix factorization, hash learning, balanced vector, multi-label data

摘要:

现有的跨模态哈希算法低估了不同类别标签之间语义差异的重要性,忽略了哈希向量的平衡条件,导致所学习到的哈希码的判别性能差。此外,一些方法利用标签信息构造相似性矩阵,并将多标签数据视为单标签数据进行建模,这在多标签跨模态检索中造成了较大的语义损失。为了保留异构数据之间精确的相似程度和哈希向量的平衡特性,提出了一种新的有监督哈希算法——基于判别性矩阵分解的多标签跨模态哈希检索(DMFH)。该方法利用核化特征的协同矩阵分解(CMF)获得了一个共享的隐式子空间;同时利用数据之间共有标签的比例来描述异构数据的相似程度;此外,利用标签的平衡信息构造平衡矩阵,生成具有平衡特性的哈希向量,并最大化不同类别标签之间的类间距。在两个常用多标签数据集MIRFlickr和NUS-WIDE上与7种先进的跨模态哈希方法进行对比,在“以图搜文”(I2T)和“以文搜图”(T2I)任务上,DMFH均取得了最高的平均精度均值(mAP),而且T2I任务的mAP更优,说明DMFH能够更有效地利用文本模态中的多标签语义信息。还分析了所构造的平衡矩阵与相似性矩阵的有效性,验证了DMFH算法能有效保持语义信息和相似性关系,在多标签跨模式检索中是有效的。

关键词: 跨模态检索, 矩阵分解, 哈希学习, 平衡向量, 多标签数据

CLC Number: