Journal of Computer Applications, 2020, Vol. 40, Issue (2): 347-351. DOI: 10.11772/j.issn.1001-9081.2019081366

• Paper from the 2019 National Annual Conference on Open Distributed and Parallel Computing (DPCS 2019)

Automatic tuning of Ceph parameters based on random forest and genetic algorithm

Yu CHEN, Yingchi MAO

  1. College of Computer and Information, Hohai University, Nanjing, Jiangsu 211100, China
  • Received: 2019-07-31  Revised: 2019-09-17  Accepted: 2019-09-23  Online: 2019-09-29  Published: 2020-02-10
  • Contact: Yingchi MAO
  • About the author: CHEN Yu, born in 1994, native of Nantong, Jiangsu, M.S. candidate. His research interests include distributed computing and parallel processing.
  • Supported by:
    the National Key Research and Development Program of China (2018YFC0407905); the Key Research and Development Project of China Huaneng Group (HNKJ17-21)

Abstract:

The performance of a Ceph system is significantly affected by its configuration parameters. When the configuration of a Ceph cluster is optimized, the parameters are numerous and their meanings are complex, which makes fast and accurate optimization difficult. To address this problem, a parameter tuning method based on Random Forest (RF) and Genetic Algorithm (GA) is proposed to adjust the Ceph parameter configuration automatically and thereby optimize Ceph system performance. The RF algorithm is used to build a performance prediction model for the Ceph system, and the output of this model serves as the input of the GA, which then optimizes the parameter configuration scheme automatically and iteratively. Simulation results show that, compared with the default parameter configuration, the tuned configuration improves the read and write performance of the Ceph file system by about 1.4 times, and that the time spent searching for the optimum is far lower than that of the black-box parameter tuning method.

Key words: Ceph, parameter configuration, Random Forest (RF), Genetic Algorithm (GA), automatic tuning
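
The pipeline described in the abstract (an RF model predicts performance, and the GA searches configurations using that prediction as its fitness) can be illustrated with a minimal sketch. This is not the authors' implementation. It rests on several assumptions: the three integer parameters and their ranges in `BOUNDS`, the `DEFAULT` configuration, and the synthetic `fake_benchmark` function are all hypothetical placeholders (no real Ceph option names or measurements are used), and the small bagged-tree regressor and the GA operators (truncation selection, uniform crossover, random-reset mutation, elitism) are one plausible choice among many.

```python
import random

# Hypothetical search space: three integer parameters (NOT real Ceph options).
BOUNDS = [(1, 64), (1, 64), (1, 64)]
DEFAULT = [8, 8, 8]  # assumed "default configuration" for this demo only


def fake_benchmark(cfg, rng):
    """Synthetic stand-in for a measured throughput (assumption, not Ceph data)."""
    a, b, c = cfg
    return -(a - 40) ** 2 - (b - 20) ** 2 - 0.5 * (c - 50) ** 2 + rng.gauss(0, 5)


def sse(rows):
    """Sum of squared errors of the targets around their mean."""
    ys = [y for _, y in rows]
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def build_tree(rows, depth, rng):
    """Regression tree: a leaf is a float, a node is (feature, threshold, left, right)."""
    mean = sum(y for _, y in rows) / len(rows)
    if depth == 0 or len(rows) < 6:
        return mean
    best = None
    for f in rng.sample(range(len(BOUNDS)), 2):  # random feature subset per split
        # Dropping the largest value guarantees both sides of the split are non-empty.
        for t in sorted({x[f] for x, _ in rows})[:-1]:
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return mean
    _, f, t, left, right = best
    return (f, t, build_tree(left, depth - 1, rng), build_tree(right, depth - 1, rng))


def predict_tree(node, x):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node


def train_forest(samples, n_trees, rng):
    # Bagging: each tree is trained on a bootstrap resample of the measurements.
    return [build_tree([rng.choice(samples) for _ in samples], 4, rng)
            for _ in range(n_trees)]


def predict_forest(forest, x):
    return sum(predict_tree(tree, x) for tree in forest) / len(forest)


def genetic_search(fitness, rng, pop_size=30, generations=40, mutation_rate=0.2):
    """Maximize `fitness` over integer configurations within BOUNDS."""
    population = [list(DEFAULT)] + [
        [rng.randint(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # truncation selection
        children = [list(population[0])]        # elitism: best survives unchanged
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mom, dad)]  # uniform crossover
            for i, (lo, hi) in enumerate(BOUNDS):
                if rng.random() < mutation_rate:
                    child[i] = rng.randint(lo, hi)  # random-reset mutation
            children.append(child)
        population = children
    return max(population, key=fitness)


def run_demo(seed=0):
    rng = random.Random(seed)
    samples = []
    for _ in range(120):  # "measure" 120 random configurations
        cfg = [rng.randint(lo, hi) for lo, hi in BOUNDS]
        samples.append((cfg, fake_benchmark(cfg, rng)))
    forest = train_forest(samples, 15, rng)

    def fitness(cfg):
        return predict_forest(forest, cfg)  # model output is the GA's fitness

    best = genetic_search(fitness, rng)
    return best, fitness(best), fitness(DEFAULT)


if __name__ == "__main__":
    best, best_score, default_score = run_demo()
    print("best configuration:", best)
    print("predicted score, tuned vs default:", best_score, default_score)
```

Because the GA only queries the prediction model and never the system itself, each fitness evaluation is cheap. This is consistent with the abstract's claim that the search takes far less time than black-box tuning, where every candidate would need an actual benchmark run.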

CLC Number: