Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (2): 343-347. DOI: 10.11772/j.issn.1001-9081.2017071869

• Cyberspace Security •

Multi-keyword ciphertext search method in cloud storage environment

YANG Hongyu, WANG Yue

  1. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2017-08-01 Revised: 2017-09-26 Online: 2018-02-10 Published: 2018-02-10
  • Corresponding author: YANG Hongyu
  • About the authors: YANG Hongyu (born 1969), male, from Changchun, Jilin, is a professor with a Ph.D. and a CCF member; his main research interest is network information security. WANG Yue (born 1991), female, from Zhangye, Gansu, is a master's student; her main research interest is network information security.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (60776807, 61179045), the National Science and Technology Major Project (2012ZX03002002), and the Science and Technology Foundation of Civil Aviation of China (MHRD201009, MHRD201205).

Abstract: To address the low efficiency and lack of adaptability of existing multi-keyword ciphertext search methods in cloud storage environments, a method for Multi-keyword Ranked Search over Encrypted cloud data based on Improved Quality Hierarchical Clustering (MRSE-IQHC) is proposed. First, document vectors are constructed using the Term Frequency-Inverse Document Frequency (TF-IDF) method and the Vector Space Model (VSM). Second, an Improved Quality Hierarchical Clustering (IQHC) algorithm is proposed to cluster the document vectors, and a document index and a cluster index are built. Third, the K-Nearest Neighbor (KNN) query algorithm is used to encrypt the indexes. Finally, search requests are constructed from user-defined keyword weights, and the top k most relevant documents are retrieved in the ciphertext state. Experimental results show that, compared with the Multi-keyword Ranked Search over Encrypted cloud data (MRSE) method and the Multi-keyword Ranked Search over Encrypted data based on Hierarchical Clustering Index (MRSE-HCI) method, the proposed method shortens the search time on average by 44.3% and 34.2% for the same number of searched documents, by 32.4% and 13.2% for the same number of returned documents, and by 36.9% and 19.4% for the same number of search keywords, and it increases accuracy by 10.8% and 8.6%, respectively. The proposed MRSE-IQHC method thus achieves high search efficiency and accuracy for multi-keyword ciphertext search in cloud storage environments.
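The first and last steps named in the abstract (TF-IDF document vectors in a vector space model, and a search request built from user-defined keyword weights that returns the top k documents) can be illustrated with a minimal plaintext sketch. It is not the authors' implementation: the function names, the toy documents, and the smoothed IDF form log(1 + N/df) are assumptions made here for illustration, and the IQHC clustering and KNN-based index encryption steps are left out entirely.

```python
import math
from collections import Counter


def build_vocabulary(documents):
    """Return the sorted list of distinct keywords across all documents."""
    return sorted({word for doc in documents for word in doc})


def tfidf_vectors(documents, vocabulary):
    """Build one L2-normalised TF-IDF vector per document (VSM representation).

    Assumption: IDF is computed as log(1 + N / df); the paper may use a
    different weighting variant.
    """
    n_docs = len(documents)
    doc_freq = Counter(word for doc in documents for word in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid division by zero for empty documents
        vec = [
            (counts[w] / length) * math.log(1 + n_docs / doc_freq[w])
            for w in vocabulary
        ]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def query_vector(weights, vocabulary):
    """Map user-defined keyword weights onto the vocabulary dimensions."""
    return [float(weights.get(w, 0.0)) for w in vocabulary]


def top_k(vectors, query, k):
    """Rank documents by inner product with the query; return the best k indices."""
    scores = [sum(d * q for d, q in zip(vec, query)) for vec in vectors]
    ranked = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy corpus (hypothetical): each document is a list of extracted keywords.
documents = [
    ["cloud", "storage", "search", "cloud"],
    ["cipher", "search", "index"],
    ["cloud", "cipher", "privacy"],
]
vocab = build_vocabulary(documents)
vecs = tfidf_vectors(documents, vocab)

# The user weights "cloud" twice as heavily as "privacy".
query = query_vector({"cloud": 2.0, "privacy": 1.0}, vocab)
print(top_k(vecs, query, k=2))
```

In this sketch the relevance score is a plain inner product between the document vector and the weighted query vector. An inner-product score of this kind is what secure KNN style encryption is designed to preserve, so in an encrypted setting the same ranking step would run on encrypted index and query vectors rather than on the plaintext ones used here.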

Key words: cloud storage, multi-keyword search, Term Frequency-Inverse Document Frequency (TF-IDF), Vector Space Model (VSM), clustering, privacy protection

CLC number: