Journal of Computer Applications, 2023, Vol. 43, Issue (5): 1349-1354. DOI: 10.11772/j.issn.1001-9081.2022030424

Special Issue: The 9th China Conference on Data Mining (CCDM 2022)

• China Conference on Data Mining 2022 (CCDM 2022) •

Multi-label cross-modal hashing retrieval based on discriminative matrix factorization

Yu TAN1, Xiaoqin WANG1, Rushi LAN1, Zhenbing LIU1, Xiaonan LUO2

  1. Guangxi Key Laboratory of Image and Graphic Intelligent Processing (Guilin University of Electronic Technology), Guilin Guangxi 541004, China
  2. Satellite Navigation Positioning and Location Service National and Local Joint Engineering Research Center (Guilin University of Electronic Technology), Guilin Guangxi 541004, China
  • Received: 2022-04-01 Revised: 2022-07-19 Accepted: 2022-08-03 Online: 2023-05-08 Published: 2023-05-10
  • Contact: Rushi LAN
  • About author: TAN Yu, born in 1997, M. S. candidate. Her research interests include cross-modal retrieval and machine learning.
    WANG Xiaoqin, born in 1994, M. S. Her research interests include image retrieval and machine learning.
    LAN Rushi, born in 1986, Ph. D., professor. His research interests include artificial intelligence, image processing, and medical information processing.
    LIU Zhenbing, born in 1980, Ph. D., professor. His research interests include machine learning, image classification, and image restoration.
    LUO Xiaonan, born in 1963, Ph. D., professor. His research interests include machine learning, image classification, and image restoration.
  • Supported by:
    National Natural Science Foundation of China (62172120); Guangxi Science and Technology Program (2019GXNSFFA245014); Open Project of Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2001)


谭钰1, 王小琴1, 蓝如师1, 刘振丙1, 罗笑南2

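  • About author: TAN Yu (born 1997), female, from Nanning, Guangxi, M. S. candidate. Her main research interests are cross-modal retrieval and machine learning.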


Existing cross-modal hashing algorithms underestimate the importance of semantic differences between class labels and ignore the balance condition of hash vectors, which makes the learned hash codes less discriminative. In addition, some methods use label information to construct a similarity matrix but model multi-label data as if it were single-label data, which causes a large semantic loss in multi-label cross-modal retrieval. To preserve both the accurate similarity relationships between heterogeneous data and the balance property of hash vectors, a novel supervised hashing algorithm, Discriminative Matrix Factorization Hashing (DMFH), was proposed. In this method, Collective Matrix Factorization (CMF) of the kernelized features was used to obtain a shared latent subspace, and the proportion of labels shared by two data items was used to describe the degree of similarity between heterogeneous data. In addition, a balanced matrix was constructed from label balance information to generate hash vectors with the balance property and to maximize the inter-class distances among different class labels. In comparisons with seven advanced cross-modal hashing retrieval methods on two commonly used multi-label datasets, MIRFlickr and NUS-WIDE, DMFH achieved the best mean Average Precision (mAP) on both the I2T (Image to Text) and T2I (Text to Image) tasks, with the better mAPs obtained on the T2I task, indicating that DMFH uses the multi-label semantic information in the text modality more effectively. The validity of the constructed balanced matrix and similarity matrix was also analyzed, verifying that DMFH maintains semantic information and similarity relations and is effective for cross-modal hashing retrieval.
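
The contrast drawn above between a single-label (binary) similarity matrix and one based on the proportion of common labels can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formula, so the ratio used here (shared labels divided by the union of labels) is an assumption, and the function names `label_proportion_similarity`, `binary_similarity`, and `bit_balance` are hypothetical helpers for illustration, not part of DMFH itself. The last helper only checks the balance property of a set of {-1, +1} codes; it does not reproduce the balanced matrix construction.

```python
import numpy as np


def label_proportion_similarity(labels_a, labels_b):
    """Pairwise similarity = shared labels / union of labels.

    Assumption: a Jaccard-style ratio stands in for the "proportion of
    common labels"; the paper's exact definition may differ.
    """
    a = np.asarray(labels_a, dtype=float)
    b = np.asarray(labels_b, dtype=float)
    shared = a @ b.T  # number of labels both items carry
    union = a.sum(axis=1, keepdims=True) + b.sum(axis=1)[None, :] - shared
    # Items with no labels at all get similarity 0 instead of a division by zero.
    return np.divide(shared, union, out=np.zeros_like(shared), where=union > 0)


def binary_similarity(labels_a, labels_b):
    """Single-label style similarity: 1 if any label is shared, else 0."""
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    return (a @ b.T > 0).astype(float)


def bit_balance(codes):
    """Per-bit mean of {-1, +1} codes; 0 means that bit is perfectly balanced."""
    return np.asarray(codes, dtype=float).mean(axis=0)


# Toy multi-hot label matrices (rows: items, columns: 4 hypothetical classes).
image_labels = np.array([[1, 1, 0, 0],
                         [1, 0, 0, 0],
                         [0, 0, 1, 1]])
text_labels = np.array([[1, 1, 0, 0],
                        [0, 1, 1, 0]])

S_graded = label_proportion_similarity(image_labels, text_labels)
S_binary = binary_similarity(image_labels, text_labels)

# Toy hash codes: each bit is +1 for half of the items and -1 for the rest.
toy_codes = np.array([[1, -1], [-1, 1], [1, 1], [-1, -1]])
balance = bit_balance(toy_codes)
```

In this toy data the first image shares both of its labels with the first text but only one label with the second text; the graded matrix separates these two cases, while the binary matrix scores both pairs as 1.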

Key words: cross-modal retrieval, matrix factorization, hash learning, balanced vector, multi-label data



