Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (3): 613-619. DOI: 10.11772/j.issn.1001-9081.2017.03.613

• The 4th CCF Big Data Conference (CCF BIGDATA2016) •

Fine-grained cloud storage scheduling scheme based on erasure code

LIAO Hui (廖辉), XUE Guangtao (薛广涛), QIAN Shiyou (钱诗友), LI Minglu (李明禄)

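  1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2016-09-21; Revised: 2016-10-18; Online: 2017-03-10; Published: 2017-03-22
  • Corresponding author: XUE Guangtao
  • About the authors: LIAO Hui (born 1991), male, from Nanfeng, Jiangxi, is a master's student whose research interests include cloud computing and big data. XUE Guangtao (born 1976), male, from Jinan, Shandong, is a professor, Ph.D., and CCF member whose research interests include mobile and wireless computing, social networks, distributed computing, wireless sensor networks, cloud computing, and big data. QIAN Shiyou (born 1977), male, from Lianyungang, Jiangsu, is an assistant researcher, Ph.D., whose research interests include cloud computing and big data. LI Minglu (born 1965), male, from Chongqing, is a professor, Ph.D., and CCF member whose research interests include vehicular ad hoc networks, wireless sensor networks, cloud computing, and big data analytics.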
  • Supported by:
    National High Technology Research and Development Program (863 Program) of China (2015AA01A202).

Fine-grained scheduling policy based on erasure code

LIAO Hui, XUE Guangtao, QIAN Shiyou, LI Minglu   

  1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2016-09-21; Revised: 2016-10-18; Online: 2017-03-10; Published: 2017-03-22
  • Supported by:
    This work is partially supported by the National High Technology Research and Development Program (863 Program) of China (2015AA01A202).

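Abstract: To address long data retrieval delay and unstable data download in cloud storage systems, a scheduling scheme based on storage node load information and erasure coding is proposed. First, files are encoded with an erasure code before storage to reduce the size of each data copy, and multiple threads download concurrently to speed up data retrieval. Second, load information from a large number of storage nodes is analyzed to identify the performance indicators that affect delay, the existing cloud storage architecture is optimized, and a load-information-based cloud storage scheduling algorithm, LOAD-ALGORITHM, is designed. Finally, a cloud computing platform is built with the open-source project OpenStack, and the scheme is deployed and tested on it using real user request data. Experimental results show that, compared with existing work, the scheduling algorithm reduces average data retrieval delay by up to 15% and delay fluctuation by up to 40%. In a real cloud platform environment, the scheme effectively improves the speed and stability of data retrieval and lowers data retrieval delay, yielding a better user experience.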

Keywords: cloud storage system, erasure code, scheduling algorithm, average delay, stability

Abstract: To address long data retrieval delay and unstable data download in cloud storage systems, a scheduling scheme based on storage node load information and erasure coding was proposed. Firstly, erasure coding was used to improve the delay performance of data retrieval in cloud storage, and parallel threads were used to download multiple data copies simultaneously. Secondly, load information from a large number of storage nodes was analyzed to determine which performance indicators affect delay, and a new scheduling algorithm based on load information was proposed. Finally, the open-source project OpenStack was used to build a real cloud computing platform, on which the algorithm was tested with erasure coding and real user request traces. Experimental results show that, compared with other scheduling policies, the proposed scheme reduces average delay by up to 15% and delay volatility by up to 40%. The scheduling policy thus effectively improves the delay performance and stability of data retrieval on a real cloud computing platform, achieving a better user experience.

Key words: cloud storage system, erasure code, scheduling algorithm, average delay, stability
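
The abstract does not spell out how the load-based scheduler works internally, so the following Python sketch only illustrates the general idea it describes: store a file as erasure-coded chunks, pick the least-loaded storage nodes, fetch from them in parallel threads, and decode. Everything here is an assumption made for illustration: the function names (`encode`, `decode`, `pick_nodes`, `fetch_file`), the single scalar load score per node, and the toy (k+1, k) XOR parity code, which stands in for whatever erasure code the paper actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal chunks plus one XOR parity chunk (toy (k+1, k) code)."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]


def decode(chunks: dict[int, bytes], k: int, length: int) -> bytes:
    """Rebuild the data from any k of the k+1 chunks (index k is the parity chunk)."""
    if len(chunks) < k:
        raise ValueError("need at least k chunks to decode")
    chunks = dict(chunks)
    missing = [i for i in range(k) if i not in chunks]
    if missing:
        # XOR of the k available chunks (parity included) yields the missing data chunk.
        chunks[missing[0]] = reduce(xor_bytes, chunks.values())
    return b"".join(chunks[i] for i in range(k))[:length]


def pick_nodes(loads: dict[int, float], k: int) -> list[int]:
    """Return the ids of the k nodes with the lowest (hypothetical) load score."""
    return sorted(loads, key=loads.get)[:k]


def fetch_file(stored: dict[int, bytes], loads: dict[int, float],
               k: int, length: int) -> bytes:
    """Fetch chunks from the k least-loaded nodes in parallel, then decode."""
    nodes = pick_nodes(loads, k)
    with ThreadPoolExecutor(max_workers=k) as pool:
        # The dictionary lookup stands in for a network read from a storage node.
        fetched = dict(zip(nodes, pool.map(stored.__getitem__, nodes)))
    return decode(fetched, k, length)


# Usage: node i holds chunk i; node 1 is the most loaded and is skipped.
data = b"erasure-coded object"
stored = dict(enumerate(encode(data, 4)))
loads = {0: 0.2, 1: 0.9, 2: 0.1, 3: 0.4, 4: 0.3}
assert fetch_file(stored, loads, 4, len(data)) == data
```

Because any k of the k+1 chunks suffice, the scheduler is free to avoid the busiest node, which is the property that lets load information shape both the average delay and its variability.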

CLC number: