Journal of Computer Applications, 2023, Vol. 43, Issue (7): 1994-2000. DOI: 10.11772/j.issn.1001-9081.2022071111

• The 39th CCF National Database Conference (NDBC 2022)

面向云环境密文排序检索的字典划分向量空间模型

陆佳行1, 戴华1,2, 刘源龙1, 周倩3, 杨庚1,2

    1.南京邮电大学 计算机学院, 南京 210023
    2.江苏省大数据安全与智能处理重点实验室(南京邮电大学), 南京 210023
    3.南京邮电大学 现代邮政学院, 南京 210023
  • 收稿日期:2022-07-12 修回日期:2022-08-25 接受日期:2022-08-29 发布日期:2023-07-20 出版日期:2023-07-10
  • 通讯作者: 戴华
  • 作者简介:陆佳行(1997—),女,江苏宜兴人,硕士研究生,CCF会员,主要研究方向:数据管理、云计算安全;
    戴华(1982—),男,江苏盐城人,教授,博士,CCF会员,主要研究方向:云计算安全、隐私保护;
    刘源龙(2000—),男,江苏连云港人,硕士研究生,主要研究方向:数据管理、云计算安全;
    周倩(1983—),女,江苏兴化人,讲师,博士,CCF会员,主要研究方向:信息安全、隐私保护;
    杨庚(1961—),男,江苏建湖人,教授,博士,CCF会员,主要研究方向:云计算安全、隐私保护。
  • 基金资助:
    国家自然科学基金资助项目(61872197)

Dictionary partition vector space model for ciphertext ranked search in cloud environment

Jiaxing LU1, Hua DAI1,2, Yuanlong LIU1, Qian ZHOU3, Geng YANG1,2

    1.School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China
    2.Jiangsu Key Laboratory of Big Data Security and Intelligent Processing (Nanjing University of Posts and Telecommunications), Nanjing, Jiangsu 210023, China
    3.School of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China
  • Received:2022-07-12 Revised:2022-08-25 Accepted:2022-08-29 Online:2023-07-20 Published:2023-07-10
  • Contact: Hua DAI
  • About the authors: LU Jiaxing, born in 1997, M. S. candidate. Her research interests include data management and cloud computing security.
    DAI Hua, born in 1982, Ph. D., professor. His research interests include cloud computing security and privacy protection.
    LIU Yuanlong, born in 2000, M. S. candidate. His research interests include data management and cloud computing security.
    ZHOU Qian, born in 1983, Ph. D., lecturer. Her research interests include information security and privacy protection.
    YANG Geng, born in 1961, Ph. D., professor. His research interests include cloud computing security and privacy protection.
  • Supported by:
    National Natural Science Foundation of China (61872197)

摘要:

针对传统向量空间模型(TVSM)生成的向量维度高,计算文档与检索关键词相关度的向量点积运算耗时长的问题,提出一种面向云环境密文排序检索的字典划分向量空间模型(DPVSM)。首先给出DPVSM的具体定义,并证明了DPVSM中检索关键词与文档的相关度得分与TVSM中的相关度得分完全相等;然后,采用等长字典划分方法,提出加密向量生成算法和文档与检索关键词相关度得分计算算法。实验结果表明,DPVSM文档向量的空间开销远少于TVSM,且文档数量越多开销降低越多;此外,DPVSM的检索向量的空间开销以及相关度得分计算的耗时也远低于TVSM。显然,DPVSM在生成向量的空间效率和相关度得分计算的时间效率上均优于TVSM。

关键词: 云计算, 向量空间模型, 可搜索加密, 字典划分, 多关键词检索

Abstract:

To address the high dimensionality of the vectors generated by the Traditional Vector Space Model (TVSM) and the time-consuming vector dot products needed to compute the relevance between documents and queried keywords, a Dictionary Partition Vector Space Model (DPVSM) for ciphertext ranked search in cloud environments is proposed. First, DPVSM is formally defined, and the relevance score between queried keywords and documents in DPVSM is proved to be exactly equal to that in TVSM. Then, based on an equal-length dictionary partition method, an encrypted vector generation algorithm and an algorithm for calculating the relevance score between documents and queried keywords are proposed. Experimental results show that the space overhead of DPVSM document vectors is much lower than that of TVSM, and that the reduction grows as the number of documents increases. In addition, the space overhead of DPVSM query vectors and the time consumed by relevance score calculation are also much lower than those of TVSM. DPVSM therefore outperforms TVSM in both the space efficiency of the generated vectors and the time efficiency of relevance score calculation.

Key words: cloud computing, vector space model, searchable encryption, dictionary partition, multi-keyword search
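
The abstract states that DPVSM partitions the dictionary into equal-length parts and that its relevance scores equal those of TVSM. The following is a minimal plaintext sketch of one way such a partition could preserve the dot product. It is not the paper's algorithm: encryption is omitted entirely, and the function names (`partition`, `full_score`, `partitioned_score`), the block length of 4, and the toy weights are assumptions made only for illustration. The sketch assumes that blocks containing only zeros need not be stored, which is one plausible source of the reported space savings.

```python
from typing import Dict, List


def partition(vector: List[float], block_len: int) -> Dict[int, List[float]]:
    """Split a dictionary-length vector into equal-length blocks.

    Only blocks with at least one non-zero weight are kept, keyed by
    block index (an illustrative assumption, not the paper's definition).
    """
    blocks: Dict[int, List[float]] = {}
    for start in range(0, len(vector), block_len):
        block = vector[start:start + block_len]
        if any(block):
            blocks[start // block_len] = block
    return blocks


def full_score(doc: List[float], query: List[float]) -> float:
    """TVSM-style relevance: dot product over the whole dictionary."""
    return sum(d * q for d, q in zip(doc, query))


def partitioned_score(doc_blocks: Dict[int, List[float]],
                      query_blocks: Dict[int, List[float]]) -> float:
    """Sum of per-block dot products over blocks present in both vectors."""
    shared = doc_blocks.keys() & query_blocks.keys()
    return sum(
        d * q
        for i in shared
        for d, q in zip(doc_blocks[i], query_blocks[i])
    )


# Toy dictionary of 12 keywords, partitioned into blocks of 4.
doc = [0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25, 0.0]
query = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]

doc_blocks = partition(doc, 4)
query_blocks = partition(query, 4)

# The all-zero middle block is dropped, yet the score is unchanged.
assert partitioned_score(doc_blocks, query_blocks) == full_score(doc, query)
```

Dropping a block in which either vector is all zeros removes only zero terms from the dot product, so in this sketch the partitioned score matches the full score while fewer entries are stored and multiplied.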

CLC Number: