Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (7): 2015-2021. DOI: 10.11772/j.issn.1001-9081.2021040660

• Artificial Intelligence •

Deep hashing retrieval algorithm based on meta-learning

Yaru HAN1, Lianshan YAN2, Tao YAO1

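  1. School of Information and Electrical Engineering, Ludong University, Yantai, Shandong 264025
    2. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756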
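  • Received: 2021-04-25 Revised: 2021-09-01 Accepted: 2021-09-07 Online: 2022-07-15 Published: 2022-07-10
  • Corresponding author: Lianshan YAN
  • About the authors: HAN Yaru (born 1995), female, from Jinan, Shandong, M.S. candidate; research interests: multimedia image retrieval, artificial intelligence, and machine learning.
    YAO Tao (born 1981), male, from Yantai, Shandong, associate professor, Ph.D.; research interests: multimedia analysis and computing, computer vision, and machine learning.
  • Funding:
    National Natural Science Foundation of China (61872170)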

Deep hashing retrieval algorithm based on meta-learning

Yaru HAN1, Lianshan YAN2, Tao YAO1

  1. School of Information and Electrical Engineering, Ludong University, Yantai, Shandong 264025, China
    2. School of Information Science and Technology, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
  • Received: 2021-04-25 Revised: 2021-09-01 Accepted: 2021-09-07 Online: 2022-07-15 Published: 2022-07-10
  • Contact: Lianshan YAN
  • About the authors: HAN Yaru, born in 1995, M.S. candidate. Her research interests include multimedia image retrieval, artificial intelligence, and machine learning.
    YAO Tao, born in 1981, Ph.D., associate professor. His research interests include multimedia analysis and computing, computer vision, and machine learning.
  • Supported by:
    National Natural Science Foundation of China (61872170)

Abstract:

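With the development of mobile Internet technology, image data keeps growing in scale, and large-scale image retrieval has become a pressing problem. Owing to their fast retrieval speed and low storage cost, hashing algorithms have attracted wide attention from researchers. Deep-learning-based hashing algorithms need a certain amount of high-quality training data to train a model that achieves good retrieval performance. However, existing hashing methods usually overlook class imbalance in the dataset, which may degrade retrieval performance. To address this problem, a deep hashing retrieval algorithm based on a meta-learning network is proposed. The proposed algorithm automatically learns a weighting function directly from the data. This weighting function is a Multi-Layer Perceptron (MLP) with only one hidden layer; guided by a small amount of unbiased meta data, its parameters can be optimized and updated simultaneously with the parameters of the model during training. The update equation of the meta-learning network parameters can be interpreted as follows: the weights of samples that agree well with the meta-learning data are increased, while the weights of samples that do not agree with it are decreased. The deep hashing retrieval algorithm based on the meta-learning network can effectively reduce the impact of imbalanced data on image retrieval and improve the robustness of the model. Extensive experiments on widely used benchmark datasets such as CIFAR-10 show that the proposed algorithm achieves the best mean Average Precision (mAP) when the imbalance ratio is large; at an imbalance ratio of 200, its mAP is higher than that of the central similarity quantization algorithm, the Asymmetric Deep Supervised Hashing (ADSH) algorithm, and the Fast Scalable Supervised Hashing (FSSH) algorithm by 0.54, 30.93, and 48.43 percentage points, respectively.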
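The weighting function described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the abstract only states that the function is an MLP with one hidden layer, so the use of the per-sample training loss as input, the ReLU hidden activation, the sigmoid output, the hidden size, and the names `init_weight_net`, `sample_weight`, and `weighted_mean_loss` are all assumptions made here. The meta-data-guided update of the parameters is not shown.

```python
import math
import random


def init_weight_net(hidden_size=8, seed=0):
    """Create parameters for a one-hidden-layer MLP (sizes are assumed)."""
    rng = random.Random(seed)
    return {
        "w1": [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)],
        "b1": [0.0] * hidden_size,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)],
        "b2": 0.0,
    }


def sample_weight(params, loss):
    """Map one per-sample loss (assumed input) to a weight in (0, 1)."""
    # Hidden layer with ReLU activation (an assumption, not from the source).
    hidden = [max(0.0, w * loss + b)
              for w, b in zip(params["w1"], params["b1"])]
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    # Sigmoid output keeps every weight strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(-logit))


def weighted_mean_loss(params, losses):
    """Average the per-sample losses after reweighting each one."""
    weights = [sample_weight(params, loss) for loss in losses]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


params = init_weight_net()
batch_losses = [0.2, 1.5, 3.0]  # made-up per-sample losses
print(weighted_mean_loss(params, batch_losses))
```

In a full training loop, the weighted loss would drive the update of the hashing model, while the loss on the small unbiased meta set would drive the update of `params`; both steps need automatic differentiation and are outside this sketch.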

Keywords: deep learning, hashing algorithm, imbalanced data, meta-learning, image retrieval

Abstract:

With the development of mobile Internet technology, the scale of image data keeps growing, and large-scale image retrieval has become an urgent problem. Because of their fast retrieval speed and very low storage consumption, hashing algorithms have received extensive attention from researchers. Deep learning based hashing algorithms need a certain amount of high-quality training data to train the model and reach good retrieval performance. However, existing hashing methods usually ignore the imbalance of data categories in the dataset, which may reduce retrieval performance. Aiming at this problem, a deep hashing retrieval algorithm based on a meta-learning network was proposed, which automatically learns the weighting function directly from the data. The weighting function is a Multi-Layer Perceptron (MLP) with only one hidden layer. Under the guidance of a small amount of unbiased meta data, the parameters of the weighting function are optimized and updated simultaneously with the model parameters during training. The updating equations of the meta-learning network parameters can be interpreted as increasing the weights of samples that are consistent with the meta-learning data and reducing the weights of samples that are not. In this way, the proposed algorithm effectively reduces the impact of imbalanced data on image retrieval and improves the robustness of the model. A large number of experiments were conducted on widely used benchmark datasets such as CIFAR-10. The results show that the proposed algorithm achieves the highest mean Average Precision (mAP) when the imbalance ratio is large; in particular, at an imbalance ratio of 200, its mAP is 0.54, 30.93 and 48.43 percentage points higher than those of the central similarity quantization algorithm, the Asymmetric Deep Supervised Hashing (ADSH) algorithm and the Fast Scalable Supervised Hashing (FSSH) algorithm, respectively.
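For readers unfamiliar with the mAP metric reported above, the sketch below shows one common way to compute it for hash codes: rank the database by Hamming distance to each query, then average the precision at every rank where a same-class item appears. The abstract does not specify the evaluation protocol (for example, whether mAP is computed over the full ranking or only the top-k results), so ranking over the whole database, single-label relevance, and the function names and toy data here are assumptions for illustration only.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def average_precision(query_code, query_label, database):
    """AP of one query over the full Hamming ranking (assumed protocol)."""
    # sorted() is stable, so ties keep their original database order.
    ranked = sorted(database,
                    key=lambda item: hamming_distance(query_code, item[0]))
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == query_label:  # relevant means same class label
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries, database):
    """Mean of the per-query average precisions."""
    aps = [average_precision(code, label, database)
           for code, label in queries]
    return sum(aps) / len(aps)


# Toy 4-bit codes with made-up labels.
database = [
    ((0, 0, 0, 0), "cat"),
    ((0, 0, 0, 1), "cat"),
    ((1, 1, 1, 1), "dog"),
    ((1, 1, 1, 0), "dog"),
]
queries = [((0, 0, 0, 0), "cat"), ((1, 1, 0, 1), "cat")]
print(mean_average_precision(queries, database))
```

Because mAP is a percentage-scale quantity, the gains quoted in the abstract are differences in percentage points between two such scores, not relative improvements.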

Key words: deep learning, hashing algorithm, imbalanced data, meta-learning, image retrieval

CLC Number: