计算机应用 ›› 2019, Vol. 39 ›› Issue (9): 2623-2628.DOI: 10.11772/j.issn.1001-9081.2019020269

• 网络空间安全 • 上一篇    下一篇

基于Simhash的安全密文排序检索方案

李珍1,2, 姚寒冰1,2, 穆逸诚1   

  1. 1. 武汉理工大学 计算机科学与技术学院, 武汉 430063;
    2. 交通物联网技术湖北省重点实验室(武汉理工大学), 武汉 430070
  • 收稿日期:2019-02-18 修回日期:2019-03-29 发布日期:2019-04-25 出版日期:2019-09-10
  • 通讯作者: 李珍
  • 作者简介:李珍(1994-),女,河南商丘人,硕士研究生,主要研究方向:信息安全;姚寒冰(1976-),男,湖北武汉人,副教授,博士,主要研究方向:信息安全;穆逸诚(1998-),男,湖北武汉人,主要研究方向:信息安全。
  • 基金资助:

    内河航运技术湖北省重点实验室基金资助项目(NHHY2017003);交通物联网技术湖北省重点实验室基金资助项目(2017Ⅲ028-002)。

Secure ranked search scheme based on Simhash over encrypted data

LI Zhen1,2, YAO Hanbing1,2, MU Yicheng1   

  1. 1. College of Computer Science and Technology, Wuhan University of Technology, Wuhan Hubei 430063, China;
    2. Hubei Key Laboratory of Transportation Internet of Things(Wuhan University of Technology), Wuhan Hubei 430070, China
  • Received:2019-02-18 Revised:2019-03-29 Online:2019-04-25 Published:2019-09-10
  • Supported by:

    This work is partially supported by the Fund of Hubei Key Laboratory of Inland Shipping Technology (NHHY2017003), the Fund of Hubei Key Laboratory of Transportation Internet of Things (2017Ⅲ028-002).

摘要:

针对密文检索中存在的计算量大、检索效率不高的问题,提出一种基于Simhash的安全密文排序检索方案。该方案基于Simhash的降维思想构建安全多关键词密文排序检索索引(SMRI),将文档处理成指纹和向量,利用分段指纹和加密向量构建B+树,并采用"过滤-精化"策略进行检索和排序,首先通过分段指纹的匹配进行快速检索,得到候选结果集;然后通过计算候选结果集与查询陷门的汉明距离和向量内积进行排序,带密钥的Simhash算法和安全k近邻(SkNN)算法保证了检索过程的安全性。实验结果表明,与基于向量空间模型(VSM)的方案相比,基于SMRI的排序检索方案计算量小,能节约时间和空间成本,检索效率高,适用于海量加密数据的快速安全检索。

关键词: 密文检索, 排序检索, Simhash, 隐私保护, 安全k近邻

Abstract:

Concerning the large computation and low search efficiency in ciphertext retrieval, a secure ranked search scheme based on Simhash was proposed. In this scheme, a Secure Multi-keyword Ranked search Index (SMRI) was constructed based on the dimensionality reduction idea of Simhash, the documents were processed into fingerprints and vectors, the B+ tree was built with the segmented fingerprints and encrypted vectors and the "filter-refine" strategy was adopted to searching and sorting. Firstly, the candidate result set was obtained by matching the segmented fingerprints to perform the quick retrieval, then the top-k results were ranked by calculating the Hamming distance and the vector inner product between candidate result set and query trapdoor, and the Simhash algorithm with secret key and the Secure k-Nearest Neighbors (SkNN) algorithm ensured the security of the retrieval process. Simulation results show that compared with the method based on Vector Space Model (VSM), the SMRI-based ranked search scheme has lower computational complexity, saves time and space cost, and has higher search efficiency. It is suitable for fast and secure retrieval of massive ciphertext data.

Key words: ciphertext retrieval, ranked search, Simhash, privacy-preserving, Secure k-Nearest Neighbors (SkNN)

中图分类号: