Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (5): 1389-1392. DOI: 10.11772/j.issn.1001-9081.2017102934

• Advanced Computing •

Portable Operating System Interface (POSIX) compatibility technology for a mass small file system

CHEN Bo1, HE Lianyue1,2, YAN Weiwei2, XU Zhaomiao1, XU Jun1

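  1. College of Computer, National University of Defense Technology, Changsha 410073;
    2. Beijing Netclouds Information Technology Corporation Limited, Beijing 100067
  • Received: 2017-12-13 Revised: 2017-12-14 Online: 2018-05-10 Published: 2018-05-24
  • Corresponding author: CHEN Bo
  • About the authors: CHEN Bo (born 1993), male, from Xiangtan, Hunan, M.S. candidate; research interests: distributed file systems, cloud computing. HE Lianyue (born 1971), male, from Wuyi, Zhejiang, research fellow, Ph.D.; research interests: distributed file systems, information security. YAN Weiwei (born 1991), male, from Nantong, Jiangsu, M.S.; research interests: distributed file systems. XU Zhaomiao (born 1994), male, from Changde, Hunan, M.S. candidate; research interests: distributed file systems. XU Jun (born 1993), male, from Wuyi, Zhejiang, M.S. candidate; research interests: distributed file systems.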

Portable operating system interface of UNIX compatibility technology in mass small distributed file system

CHEN Bo1, HE Lianyue1,2, YAN Weiwei2, XU Zhaomiao1, XU Jun1   

  1. College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China;
    2. Beijing Netclouds Information Technology Corporation Limited, Beijing 100070, China
  • Received: 2017-12-13 Revised: 2017-12-14 Online: 2018-05-10 Published: 2018-05-24
  • Contact: CHEN Bo

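Abstract: The mass small file system SMDFS (Mass Small Distributed File System), developed on top of the Hadoop Distributed File System (HDFS), inherits HDFS's lack of compliance with Portable Operating System Interface (POSIX) constraints. To address this problem, a POSIX compatibility technique based on a local cache and an efficient metadata management technique based on a data staging area are proposed. First, a data staging area is set up to redirect file streams opened in read-write mode. Then an asynchronous thread pool model is built to synchronize the mirror files held in the staging area, so that all POSIX-related file operations are carried through from the user layer to the storage layer. In addition, a metadata cache built on a skip list improves the efficiency of metadata operations such as listing a directory. Tests show that, compared with the Linux client of HDFS, SMDFS3.0, which implements these techniques, improves random read performance by more than 10 times and sequential read and sequential write performance by about 3 to 4 times; its random write performance reaches 20% of that of the local file system; and the directory-based metadata cache design makes the directory List operation nearly 10 times more efficient. However, because a client mounted through Filesystem in Userspace (FUSE) incurs extra overhead, such as switches between kernel mode and user mode, the Linux client of SMDFS3.0 loses about 50% of its performance relative to the system's Java interface.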
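The staging-area redirection and asynchronous synchronization described in the abstract can be illustrated with a minimal Java sketch. It is not the SMDFS implementation: the class `StagingAreaSketch`, its methods, and the use of a second local directory as a stand-in for the storage layer are all assumptions made for illustration. The idea it shows is that random-offset writes, which HDFS itself does not support, are applied to a local mirror file, and a worker pool later copies the finished mirror file to the backing store.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: local staging area plus asynchronous sync pool. */
final class StagingAreaSketch implements AutoCloseable {
    private final Path stagingDir; // local cache holding mirror files
    private final Path backingDir; // stand-in for the distributed storage layer
    private final ExecutorService syncPool = Executors.newFixedThreadPool(2);

    StagingAreaSketch(Path stagingDir, Path backingDir) {
        this.stagingDir = stagingDir;
        this.backingDir = backingDir;
    }

    /** Convenience factory for experiments: both directories are fresh temp dirs. */
    static StagingAreaSketch inTempDirs() {
        try {
            return new StagingAreaSketch(
                    Files.createTempDirectory("staging"),
                    Files.createTempDirectory("backing"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Random write: redirected to the local mirror file, which accepts any offset. */
    void writeAt(String name, long offset, byte[] data) {
        try (FileChannel ch = FileChannel.open(stagingDir.resolve(name),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            long pos = offset;
            while (buf.hasRemaining()) {
                pos += ch.write(buf, pos); // a single call may write only part of the buffer
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called when the file is closed: queue an asynchronous upload of the mirror file. */
    void release(String name) {
        // A failure here is captured in the returned Future, which this sketch ignores.
        syncPool.submit(() -> {
            try {
                Files.copy(stagingDir.resolve(name), backingDir.resolve(name),
                        StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Reads the synchronized copy back from the stand-in storage layer. */
    byte[] readBacking(String name) {
        try {
            return Files.readAllBytes(backingDir.resolve(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Drains pending uploads and stops the pool. */
    @Override
    public void close() {
        syncPool.shutdown();
        try {
            syncPool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this sketch the caller writes at arbitrary offsets, calls `release` when the file is closed, and calls `close` to wait for outstanding uploads. A production design would also need per-file ordering of uploads and error reporting, both of which are omitted here.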

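Keywords: mass small file system, distributed file system, Portable Operating System Interface (POSIX) compatibility, metadata cache, cloud storage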

Abstract: To address the problem that SMDFS (Mass Small Distributed File System), a mass small file system built on HDFS (Hadoop Distributed File System), is not compatible with POSIX (Portable Operating System Interface of UNIX) constraints, a POSIX compatibility technology based on a local cache and an efficient metadata management technology based on a temporary data staging area were proposed. Firstly, a data staging area was set up to redirect file streams opened in read-write mode, and then an asynchronous thread pool model was established to synchronize the mirror files in the staging area, thereby completing all POSIX-related file operations from the user layer to the storage layer. In addition, a metadata cache built on a skip list was used to optimize metadata operations such as listing a directory. The test results show that, compared with the Linux client of HDFS, SMDFS3.0 improves random read performance by more than ten times and sequential read and sequential write performance by about three to four times. Random write performance reaches 20% of that of the local file system. Besides, the efficiency of the directory List operation improves by about 10 times. However, due to the additional switching between kernel mode and user mode introduced by FUSE (Filesystem in Userspace), the Linux client of SMDFS3.0 incurs a performance penalty of about 50% compared with the Java interface.
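The skip-list metadata cache mentioned in the abstract can be sketched in Java as follows. This is only an illustration under stated assumptions: the class `DirectoryMetadataCacheSketch`, the `FileMeta` fields, and the one-skip-list-per-directory layout are hypothetical, and the JDK's `ConcurrentSkipListMap` is used as the skip list rather than whatever structure SMDFS actually implements. The point it shows is that a List operation served from the cache returns sorted child names without contacting the metadata server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical sketch: per-directory metadata cache backed by skip lists. */
final class DirectoryMetadataCacheSketch {

    /** Minimal metadata record; the fields are illustrative only. */
    static final class FileMeta {
        final long size;
        final long mtimeMillis;

        FileMeta(long size, long mtimeMillis) {
            this.size = size;
            this.mtimeMillis = mtimeMillis;
        }
    }

    // Directory path -> skip list of child name -> metadata, kept in sorted order.
    private final Map<String, ConcurrentSkipListMap<String, FileMeta>> dirs =
            new ConcurrentHashMap<>();

    /** Inserts or updates one child entry; expected O(log n) in the skip list. */
    void put(String dir, String name, FileMeta meta) {
        dirs.computeIfAbsent(dir, d -> new ConcurrentSkipListMap<>()).put(name, meta);
    }

    /** Drops one child entry, for example after an unlink. */
    void remove(String dir, String name) {
        ConcurrentSkipListMap<String, FileMeta> children = dirs.get(dir);
        if (children != null) {
            children.remove(name);
        }
    }

    /** Looks up a single entry, as a stat-like call would. */
    Optional<FileMeta> stat(String dir, String name) {
        ConcurrentSkipListMap<String, FileMeta> children = dirs.get(dir);
        return children == null ? Optional.empty() : Optional.ofNullable(children.get(name));
    }

    /**
     * Lists a directory from the cache. An empty Optional means a cache miss,
     * in which case the caller would fall back to the metadata server.
     */
    Optional<List<String>> list(String dir) {
        ConcurrentSkipListMap<String, FileMeta> children = dirs.get(dir);
        if (children == null) {
            return Optional.empty();
        }
        List<String> names = new ArrayList<>(children.keySet()); // already sorted
        return Optional.of(names);
    }
}
```

Keeping one sorted structure per directory makes listing a single in-order traversal, while inserts and removals stay logarithmic in expectation. Cache invalidation when another client changes the directory is not covered by this sketch.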

Key words: mass small file system, distributed file system, Portable Operating System Interface of UNIX (POSIX) compatibility, metadata cache, cloud storage

CLC Number: