Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (9): 2554-2559. DOI: 10.11772/j.issn.1001-9081.2018020429

Ciphertext retrieval ranking method based on counting Bloom filter

LI Yong1, XIANG Zhongqi2   

  1. College of Information Engineering, Qujing Normal University, Qujing Yunnan 655011, China;
  2. College of Mathematics and Computer Science, Shangrao Normal University, Shangrao Jiangxi 334001, China
  • Received:2018-03-05 Revised:2018-04-22 Online:2018-09-10 Published:2018-09-06
  • Contact: LI Yong
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (11761057).

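  • About the authors: LI Yong (born 1984), male, from Xinyu, Jiangxi, lecturer, M.S.; his research interests include computer networks, information security, and cloud computing. XIANG Zhongqi (born 1979), male, from Linyi, Shandong, associate professor, Ph.D.; his research interests include fractal theory and wavelet analysis.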

Abstract: Retrieving ciphertext in a cloud computing environment is difficult. Existing searchable encryption schemes suffer from low time efficiency, file retrieval indexes that do not support updates, and retrieval results that cannot be ranked accurately. To solve these problems, a file retrieval index was first constructed based on a counting Bloom filter: the keywords of the file set were hash-mapped to a counting Bloom filter index vector, which realized keyword-based ciphertext retrieval and supported dynamic updating of the ciphertext retrieval index. Secondly, because the counting Bloom filter itself has no semantic function, it cannot rank retrieval results by relevance; a keyword frequency matrix and the Term Frequency-Inverse Document Frequency (TF-IDF) model were therefore introduced to compute the relevance scores of keywords, so that retrieval results could be ranked by relevance score. Finally, theoretical and experimental performance analyses show that the proposed method is secure, updatable, sortable, and efficient.

Key words: cloud computing, counting Bloom filter, Term Frequency-Inverse Document Frequency (TF-IDF) model, relevance score, sortable ciphertext retrieval
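
The two building blocks named in the abstract can be illustrated with a minimal plaintext sketch. It is not the authors' scheme: the abstract does not describe the encryption, the trapdoor generation, the hash functions, or the layout of the keyword frequency matrix, so none of that is modeled here. The class and function names, the salted SHA-256 hashing, the filter size, the one-filter-per-file layout, and the specific TF-IDF variant are all assumptions made for illustration.

```python
import hashlib
import math
from collections import Counter


class CountingBloomFilter:
    """Counting Bloom filter: each slot holds a counter, so keywords can be removed."""

    def __init__(self, size=1024, num_hashes=4):  # sizes are illustrative assumptions
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _positions(self, keyword):
        # Assumed hashing: SHA-256 salted with the hash index, reduced modulo size.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{keyword}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, keyword):
        for pos in self._positions(keyword):
            self.counters[pos] += 1

    def remove(self, keyword):
        # Best-effort guard: a false positive could still pass this check.
        if keyword in self:
            for pos in self._positions(keyword):
                self.counters[pos] -= 1

    def __contains__(self, keyword):
        # False positives are possible; false negatives are not.
        return all(self.counters[pos] > 0 for pos in self._positions(keyword))


def build_index(documents):
    """Build one counting Bloom filter per file (an assumed index layout)."""
    index = {}
    for doc_id, words in documents.items():
        cbf = CountingBloomFilter()
        for keyword in set(words):
            cbf.add(keyword)
        index[doc_id] = cbf
    return index


def relevance_scores(documents, query_keywords):
    """Sum tf * idf over the query keywords, with idf = ln(N / df)."""
    n_docs = len(documents)
    term_counts = {doc_id: Counter(words) for doc_id, words in documents.items()}
    scores = {}
    for doc_id, counts in term_counts.items():
        total = sum(counts.values())
        score = 0.0
        for keyword in query_keywords:
            df = sum(1 for c in term_counts.values() if keyword in c)
            if df == 0 or counts[keyword] == 0:
                continue
            score += (counts[keyword] / total) * math.log(n_docs / df)
        scores[doc_id] = score
    return scores


def search(index, documents, query_keywords):
    """Filter candidate files with the index, then rank them by relevance score."""
    candidates = [doc_id for doc_id, cbf in index.items()
                  if all(keyword in cbf for keyword in query_keywords)]
    scores = relevance_scores(documents, query_keywords)
    return sorted(candidates, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical toy file set, used only to exercise the sketch.
documents = {
    "f1": ["cloud", "cipher", "cloud", "index"],
    "f2": ["cipher", "filter", "bloom", "index"],
    "f3": ["cloud", "storage", "backup", "index"],
}
index = build_index(documents)
print(search(index, documents, ["cloud"]))
```

In this toy file set, f1 is ranked ahead of f3 for the query "cloud" because the keyword makes up half of f1 but only a quarter of f3. Storing counters rather than bits is what allows a keyword to be removed from the index without rebuilding it, which corresponds to the updatability claimed in the abstract.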
