Unsupervised parallel hash image retrieval based on correlation distance

doi:10.11772/j.issn.1001-9081.2020091472

Abstract

Traditional unsupervised hashing models for image retrieval learn too little of the semantic information between images and must be retrained whenever the hash code length changes. To address these problems, an unsupervised framework for retrieval on large-scale image datasets, the unsupervised parallel hash image retrieval model based on correlation distance, was proposed. First, a Convolutional Neural Network (CNN) was used to learn continuous high-dimensional feature variables of the images. Second, a pseudo-label matrix was constructed by measuring the feature variables with the correlation distance, and the hash function was combined with deep learning. Finally, hash codes were generated in parallel so that they gradually approximate the original visual features, which allows hash codes of multiple lengths to be produced in a single training run. Experimental results show that the mean Average Precision (mAP) values of the proposed model on the FLICKR25K dataset for 16-bit, 32-bit, 48-bit and 64-bit hash codes are 0.726, 0.736, 0.738 and 0.738, respectively, which are 9.4, 8.2, 6.2 and 7.3 percentage points higher than those of the Semantic Structure-based Unsupervised Deep Hashing (SSDH) model; the training time of the proposed model is also 6.6 hours shorter than that of SSDH. The proposed model can therefore shorten training time and improve retrieval accuracy in large-scale image retrieval.
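The pseudo-label step can be illustrated with a minimal NumPy sketch. It assumes the standard definition of correlation distance (one minus the Pearson correlation of two feature vectors). The function names, the `near` and `far` thresholds and the three-valued labelling (+1 similar, -1 dissimilar, 0 undecided) are hypothetical choices made for illustration, since the abstract does not give the exact construction used in the paper.

```python
import numpy as np


def correlation_distance_matrix(features):
    """Pairwise correlation distance, d(x, y) = 1 - Pearson correlation(x, y).

    `features` is an (n_images, n_dims) array, e.g. CNN feature vectors.
    """
    # Center each feature vector, then scale it to unit length.
    centered = features - features.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms = np.where(norms == 0, 1.0, norms)  # guard against constant rows
    unit = centered / norms
    # The dot product of unit-length centered vectors is the Pearson correlation.
    corr = np.clip(unit @ unit.T, -1.0, 1.0)
    return 1.0 - corr


def pseudo_label_matrix(features, near=0.1, far=1.0):
    """Build a pseudo-label matrix from correlation distances.

    Assumption (not from the paper): pairs with distance <= `near` are
    labelled +1, pairs with distance >= `far` are labelled -1, and the
    remaining pairs are left at 0 (undecided).
    """
    dist = correlation_distance_matrix(features)
    labels = np.zeros_like(dist, dtype=np.int8)
    labels[dist <= near] = 1
    labels[dist >= far] = -1
    return labels


if __name__ == "__main__":
    # Toy stand-in for CNN features: 4 "images", 4 dimensions each.
    feats = np.array([
        [1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with row 0
        [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with row 0
        [1.0, 3.0, 2.0, 4.0],   # partially correlated with row 0
    ])
    print(np.round(correlation_distance_matrix(feats), 3))
    print(pseudo_label_matrix(feats))
```

Because each vector is centered and normalised before comparison, correlation distance ignores differences in offset and scale between feature vectors, which is why rows 0 and 1 of the toy data are treated as identical.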

Key words: image retrieval, Convolutional Neural Network (CNN), hash algorithm, unsupervised, correlation distance

CLC Number:

TP181

YANG Su, OUYANG Zhi, DU Nisuo. Unsupervised parallel hash image retrieval based on correlation distance[J]. Journal of Computer Applications, 2021, 41(7): 1902-1907.

杨粟, 欧阳智, 杜逆索. 基于相关度距离的无监督并行哈希图像检索[J]. 计算机应用, 2021, 41(7): 1902-1907.
