Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (4): 1172-1180. DOI: 10.11772/j.issn.1001-9081.2023050590

• Advanced computing •

DFS-Cache: a memory-efficient persistent client cache for distributed file systems

Ruixuan NI1, Miao CAI1,2, Baoliu YE1,2,3

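  1. College of Computer and Information, Hohai University, Nanjing 211100
  2. Key Laboratory of Water Big Data Technology of Ministry of Water Resources (Hohai University), Nanjing 211100
  3. State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023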
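  • Received: 2023-05-16 Revised: 2023-07-29 Accepted: 2023-07-31 Online: 2023-08-10 Published: 2024-04-10
  • Corresponding author: Miao CAI
  • About the authors: NI Ruixuan, born in 1999 in Nanjing, Jiangsu, male, M.S. candidate. His main research interest is distributed systems.
    CAI Miao, born in 1991 in Yancheng, Jiangsu, male, Ph.D., assistant researcher, CCF member. His main research interests include memory/storage systems and operating systems. mcai@hhu.edu.cn
    YE Baoliu, born in 1976 in Nantong, Jiangsu, male, Ph.D., professor, CCF senior member. His main research interests include distributed systems, cloud computing, and wireless networks.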
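  • Supported by:
    National Natural Science Foundation of China (61832005); Fundamental Research Funds for the Central Universities (B220202073); Natural Science Foundation of Jiangsu Province (BK20220973); China Postdoctoral Science Foundation (2022M711014); Jiangsu Planned Projects for Postdoctoral Research Funds (2021K635C)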

DFS-Cache: memory-efficient and persistent client cache for distributed file systems

Ruixuan NI1, Miao CAI1,2, Baoliu YE1,2,3

  1. College of Computer and Information, Hohai University, Nanjing, Jiangsu 211100, China
  2. Key Laboratory of Water Big Data Technology of Ministry of Water Resources (Hohai University), Nanjing, Jiangsu 211100, China
  3. State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing, Jiangsu 210023, China
  • Received: 2023-05-16 Revised: 2023-07-29 Accepted: 2023-07-31 Online: 2023-08-10 Published: 2024-04-10
  • Contact: Miao CAI
  • About the authors: NI Ruixuan, born in 1999, M.S. candidate. His research interests include distributed systems.
    CAI Miao, born in 1991, Ph.D., assistant researcher. His research interests include memory/storage systems and operating systems.
    YE Baoliu, born in 1976, Ph.D., professor. His research interests include distributed systems, cloud computing, and wireless networks.
  • Supported by:
    National Natural Science Foundation of China (61832005); Fundamental Research Funds for the Central Universities (B220202073); Natural Science Foundation of Jiangsu Province (BK20220973); China Postdoctoral Science Foundation (2022M711014); Jiangsu Planned Projects for Postdoctoral Research Funds (2021K635C)

Abstract:

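To effectively reduce cache defragmentation overhead and improve the cache hit ratio under data-intensive workflows, a persistent client cache for distributed file systems, DFS-Cache (Distributed File System Cache), is proposed. DFS-Cache is designed and implemented on Non-Volatile Memory (NVM); it guarantees data persistence and crash consistency and greatly shortens cold-start time. DFS-Cache comprises a cache defragmentation mechanism based on virtual memory remapping and a cache space management strategy based on Time-To-Live (TTL). The former exploits the fact that NVM can be addressed directly by the memory controller: it dynamically modifies the mapping between virtual and physical addresses to achieve zero-copy memory defragmentation. The latter is a grouping management strategy that separates hot and cold data and relies on the remapping-based defragmentation mechanism to manage cache space more efficiently. Experiments on real Intel Optane persistent memory devices, using standard benchmarks such as Fio and Filebench, show that, compared with the commercial distributed file systems MooseFS and GlusterFS, DFS-Cache improves system throughput by up to 5.73 times and 1.89 times.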

Key words: non-volatile memory, distributed file system, client cache, cache defragmentation, cold-hot data grouping, cache design
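
The idea behind remapping-based, zero-copy defragmentation can be illustrated with a toy model. The sketch below is not the DFS-Cache implementation, which the abstract does not detail; it only assumes a page table represented as a dictionary from virtual page numbers to physical frame numbers. The function name `compact_by_remap` and all values are hypothetical. Compaction rewrites the mapping so that live virtual pages become contiguous, while the contents of the physical frames stay where they are.

```python
def compact_by_remap(page_table: dict[int, int]) -> dict[int, int]:
    """Return a new page table whose live virtual pages are contiguous from 0.

    Only the virtual-to-physical mapping changes; frame contents are never copied.
    """
    live_pages = sorted(page_table)  # live virtual page numbers, in address order
    return {new_vpage: page_table[old_vpage]
            for new_vpage, old_vpage in enumerate(live_pages)}


# Hypothetical state: physical frames hold cached data, and the live
# virtual pages (0, 4 and 9) are scattered across a fragmented range.
frames = {7: b"A", 3: b"B", 9: b"C"}   # physical frame -> contents
fragmented = {0: 7, 4: 3, 9: 9}        # virtual page -> physical frame

compacted = compact_by_remap(fragmented)
data_in_order = [frames[compacted[v]] for v in sorted(compacted)]
```

In this model the virtual range shrinks from ten pages to three, and reading the pages in order still yields the same data, because only the table entries were rewritten. A real system would have to update the actual page-table entries for the NVM-backed region and keep them crash-consistent, which this sketch does not attempt.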

Abstract:

To effectively reduce cache defragmentation overhead and improve the cache hit ratio in data-intensive workflows, a persistent client cache for distributed file systems, namely DFS-Cache (Distributed File System Cache), was proposed. DFS-Cache was designed and implemented on Non-Volatile Memory (NVM), and it ensured data persistence and crash consistency while significantly reducing cold-start time. DFS-Cache consisted of a cache defragmentation mechanism based on virtual memory remapping and a cache space management strategy based on Time-To-Live (TTL). The former exploited the fact that NVM could be directly addressed by the memory controller: by dynamically modifying the mapping between virtual addresses and physical addresses, it achieved zero-copy memory defragmentation. The latter was a grouping management strategy that separated cold and hot data and, with the support of the remapping-based cache defragmentation mechanism, improved the efficiency of cache space management. Experiments were conducted on real Intel Optane persistent memory devices with standard benchmarks such as Fio and Filebench. Compared with the commercial distributed file systems MooseFS and GlusterFS, the proposed client cache increased system throughput by up to 5.73 times and 1.89 times.

Key words: Non-Volatile Memory (NVM), distributed file system, client cache, cache defragmentation, cold-hot data grouping, cache design
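
One plausible reading of TTL-based, cold-hot separated grouping is sketched below. The abstract does not specify the policy, so everything here is an assumption made for illustration: the function name `group_by_ttl`, the fixed-width bucketing rule, and the sample values are all hypothetical. Blocks with similar remaining lifetimes are placed in the same group, so blocks that are likely to expire together sit together and a whole group can be reclaimed at once.

```python
from collections import defaultdict


def group_by_ttl(expiry_by_block: dict[str, float],
                 now: float,
                 bucket_seconds: float) -> dict[int, list[str]]:
    """Bucket block ids by remaining lifetime; bucket 0 expires soonest."""
    groups: dict[int, list[str]] = defaultdict(list)
    for block, expiry in expiry_by_block.items():
        remaining = max(0.0, expiry - now)          # already-expired blocks go to bucket 0
        groups[int(remaining // bucket_seconds)].append(block)
    return dict(groups)


# Hypothetical expiry timestamps (seconds) for four cached blocks.
expiries = {"a": 105.0, "b": 130.0, "c": 109.0, "d": 200.0}
groups = group_by_ttl(expiries, now=100.0, bucket_seconds=10.0)
```

Here blocks `a` and `c` expire within the next ten seconds and land in bucket 0, while `b` and `d` land in later buckets. Keeping short-lived and long-lived blocks apart in this way is intended to limit how often expiring data leaves holes among long-lived data, which is the kind of fragmentation the remapping mechanism then has to repair.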

CLC number: