Journal of Computer Applications, 2017, Vol. 37, Issue (6): 1574-1579. DOI: 10.11772/j.issn.1001-9081.2017.06.1574

• Advanced computing •

Dynamic resource allocation strategy in Spark Streaming

LIU Bei, TAN Xinming, CAO Wenbin

  1. School of Computer Science & Technology, Wuhan University of Technology, Wuhan, Hubei 430063, China
  • Received: 2016-11-25 Revised: 2016-12-22 Online: 2017-06-10 Published: 2017-06-14
  • Corresponding author: LIU Bei
  • About the authors: LIU Bei (born 1993), male, from Xiantao, Hubei, M.S. candidate; research interests: big data applications, mobile Internet. TAN Xinming (born 1961), male, from Jingzhou, Hubei, professor, Ph.D.; research interests: software engineering methods, Internet of Things technology and systems. CAO Wenbin (born 1991), male, from Xuchang, Henan, M.S. candidate; research interests: mobile Internet, processing platforms for big data environments.
  • Supported by:
    This work is partially supported by the Key Project of the Natural Science Foundation of Hubei Province (2014CFA050).

Abstract: When Spark Streaming serves as the stream processing component of a hybrid big data computing platform, the existing resource allocation strategies have a long resource adjustment cycle and cannot meet the individual needs of multiple applications and users. To address these problems, a Dynamic Resource Allocation strategy for Multi-application (DRAM) was proposed. DRAM adds application-level global variables to control the dynamic resource allocation process. First, the feedback from historical execution data and the application global variables are obtained. Then, the amount of resources to add or remove is calculated. Finally, the increase or decrease of resources is executed. The experimental results show that the proposed strategy can effectively adjust application resource quotas and that, under both stable and unstable data streams, it reduces the processing delay compared with the Streaming and Core strategies of the original Spark platform. The proposed strategy can also improve the utilization of cluster resources.
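
The three steps named in the abstract (obtain feedback and global variables, calculate the change, execute it) can be pictured as a small control loop. The sketch below is illustrative only: the abstract does not give DRAM's formulas, so the threshold rule, the `AppSettings` fields, and the function names `compute_delta` and `apply_delta` are all assumptions, not the authors' method or any Spark API.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class AppSettings:
    """Hypothetical per-application global variables (field names are assumptions)."""
    min_executors: int
    max_executors: int
    scale_up_ratio: float = 0.9    # grow when mean processing time exceeds this share of the batch interval
    scale_down_ratio: float = 0.5  # shrink when it falls below this share


def compute_delta(history, batch_interval, settings, current):
    """Step 2: decide how many executors to add (+) or remove (-).

    `history` holds recent batch processing times in seconds (step 1 feedback).
    """
    if not history:
        return 0  # no feedback yet, leave the allocation unchanged
    load = mean(history) / batch_interval
    if load > settings.scale_up_ratio:
        target = current + 1
    elif load < settings.scale_down_ratio:
        target = current - 1
    else:
        target = current
    # Keep the result inside the application's own quota.
    target = max(settings.min_executors, min(settings.max_executors, target))
    return target - current


def apply_delta(current, delta):
    """Step 3: stand-in for asking the cluster manager to add or remove executors."""
    return current + delta


# Usage: batches take about as long as the 2 s interval, so one executor is added.
settings = AppSettings(min_executors=2, max_executors=8)
executors = 4
delta = compute_delta([1.9, 2.1, 2.0], batch_interval=2.0, settings=settings, current=executors)
executors = apply_delta(executors, delta)
```

Keeping the per-application limits in a separate settings object mirrors the idea of controlling allocation through application-level variables: each application can carry its own bounds and thresholds without changing the adjustment logic.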

Key words: Spark, real-time data stream, multi-application, dynamic resource allocation

CLC number: