Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (8): 2431-2438. DOI: 10.11772/j.issn.1001-9081.2022071108

• Data Science and Technology •

Enhancement and extension of relational database full-text search with a lightweight caching strategy

Ting YANG, Ruoyu MO, Xiujuan ZHANG, Zhousen ZHU

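  1. School of Physics and Electronic Engineering, Sichuan Normal University, Chengdu 610101
  • Received: 2022-07-29 Revised: 2022-09-19 Accepted: 2022-09-19 Online: 2023-01-15 Published: 2023-08-10
  • Corresponding author: Zhousen ZHU
  • About the authors: YANG Ting, female, born in 1997 in Guangyuan, Sichuan, M.S. candidate. Main research interest: intelligent information processing.
    MO Ruoyu, female, born in 1994 in Guangyuan, Sichuan, M.S. candidate. Main research interest: intelligent information processing.
    ZHANG Xiujuan, female, born in 1996 in Nanchong, Sichuan, M.S. candidate. Main research interest: intelligent information processing.
  • Funding:
    National Social Science Foundation of China (20BMZ092)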

Enhancement and expansion of full-text search in relational databases based on lightweight caching strategy

Ting YANG, Ruoyu MO, Xiujuan ZHANG, Zhousen ZHU   

  1. School of Physics and Electronic Engineering, Sichuan Normal University, Chengdu, Sichuan 610101, China
  • Received: 2022-07-29 Revised: 2022-09-19 Accepted: 2022-09-19 Online: 2023-01-15 Published: 2023-08-10
  • Contact: Zhousen ZHU
  • About author: YANG Ting, born in 1997, M.S. candidate. Her research interests include intelligent information processing.
    MO Ruoyu, born in 1994, M.S. candidate. Her research interests include intelligent information processing.
    ZHANG Xiujuan, born in 1996, M.S. candidate. Her research interests include intelligent information processing.
  • Supported by:
    National Social Science Foundation of China (20BMZ092)

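Abstract:

To address the low efficiency and high resource usage of existing full-text search schemes for relational databases (RDBs), a lightweight RDB full-text search model with an enhanced auxiliary cache is proposed. First, the model builds a Redis-based inverted index and uses the cached index to narrow the search scope, so that efficient in-memory data processing relieves the I/O bottleneck of the relational database and improves overall system performance. Second, to keep search results accurate and up to date, an index synchronization strategy is proposed, and an incremental index component is designed and implemented to hide the details of index processing, which improves the usability and generality of the model. Finally, an index update mechanism based on access heat is provided for hot data to reduce the memory footprint of the inverted index. Experimental results show that, while maintaining the response speed and accuracy of relational database full-text search, the proposed model consumes 48.8% to 60.9% less space than the MySQL full-text index and 85.2% to 96.2% less than Elasticsearch, demonstrating that the model is feasible and effective in practical applications.

Key words: MySQL, Redis, full-text search, inverted index, consistency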

Abstract:

To address the low efficiency and high resource consumption of existing full-text search schemes for Relational DataBases (RDBs), a lightweight RDB full-text search model with an enhanced secondary cache was proposed. Firstly, an inverted index was built on Redis, and the cached index was used to narrow the search scope, so that efficient in-memory data processing relieved the I/O bottleneck of the relational database and improved the overall performance of the system. Secondly, to ensure the accuracy and timeliness of the search results, an index synchronization strategy was proposed, and an incremental index component was designed and implemented to hide the index processing details, thereby improving the usability and generality of the model. Finally, an index update mechanism based on access heat was provided for hotspot data to reduce the memory usage of the inverted index. Experimental results show that, while maintaining the response speed and accuracy of full-text search in relational databases, the proposed model consumes 48.8% to 60.9% less space than the MySQL full-text index and 85.2% to 96.2% less than Elasticsearch, verifying that the proposed model is feasible and effective in practical applications.

Key words: MySQL, Redis, full-text search, inverted index, consistency
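
The three mechanisms named in the abstract (a cached inverted index that narrows the search scope, incremental index maintenance, and heat-based index trimming) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `CachedInvertedIndex`, the function `search`, the `max_terms` budget, the whitespace tokenizer, and the evict-the-coldest-term rule are all hypothetical, and plain Python dictionaries and sets stand in for Redis sets and for the relational table so that the example runs without any server.

```python
from collections import defaultdict


class CachedInvertedIndex:
    """Toy in-memory stand-in for a Redis-backed inverted index (illustrative only)."""

    def __init__(self, max_terms):
        self.max_terms = max_terms    # hypothetical memory budget, counted in terms
        self.postings = {}            # term -> set of row ids (plays the role of a Redis set)
        self.heat = defaultdict(int)  # term -> number of times the term was queried

    def add_row(self, row_id, text):
        # Incremental update on insert. Whitespace tokenization is a simplification;
        # real text (especially Chinese) would need a proper tokenizer.
        for term in set(text.lower().split()):
            self.postings.setdefault(term, set()).add(row_id)
        self._evict_cold_terms()

    def remove_row(self, row_id, text):
        # Incremental update on delete: drop the row id from every posting list.
        for term in set(text.lower().split()):
            ids = self.postings.get(term)
            if ids is not None:
                ids.discard(row_id)
                if not ids:
                    del self.postings[term]

    def candidates(self, query):
        """Return ids of rows containing every query term, or None on a cache miss."""
        terms = query.lower().split()
        if not terms or any(t not in self.postings for t in terms):
            return None  # term unknown or evicted: the caller must ask the database
        for t in terms:
            self.heat[t] += 1
        return set.intersection(*(self.postings[t] for t in terms))

    def _evict_cold_terms(self):
        # Assumed policy: drop the least-queried term (ties broken alphabetically)
        # until the index fits the budget.
        while len(self.postings) > self.max_terms:
            coldest = min(self.postings, key=lambda t: (self.heat[t], t))
            del self.postings[coldest]


def search(index, table, query):
    """Use the cached index when possible; otherwise scan the table (a dict here)."""
    ids = index.candidates(query)
    if ids is not None:
        # In a real system these ids would narrow an SQL query (WHERE id IN ...).
        return sorted(ids)
    terms = query.lower().split()
    return sorted(
        rid for rid, text in table.items()
        if all(t in text.lower().split() for t in terms)
    )
```

A cache miss returns `None` rather than an empty set, so an evicted term is never mistaken for a term that matches no rows; the fallback scan keeps results correct at the cost of speed, which is the trade-off a heat-based policy is meant to confine to rarely queried terms.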

CLC Number: