Journal of Computer Applications, 2013, Vol. 33, Issue (08): 2158-2162.

• Advanced computing •

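Tasks assignment optimization in Hadoop

HUANG Chengzhen, WANG Lei, LIU Xiaolong, KUANG Yaping

  1. Department of Automation, University of Science and Technology of China, Hefei, Anhui 230027, China
  • Received: 2013-03-05 Revised: 2013-04-12 Published: 2013-08-01 Online: 2013-09-11
  • Contact: WANG Lei
  • About the authors: HUANG Chengzhen (born 1987), male, from Chongqing, M.S. candidate; research interests: cloud computing, big data, Hadoop;
    WANG Lei (born 1972), male, from Suzhou, Anhui, associate professor, Ph.D.; research interests: computer networks, media processing, cloud computing and cloud storage;
    LIU Xiaolong (born 1989), male, from Chongqing, M.S. candidate; research interests: streaming media, network transmission and control;
    KUANG Yaping (born 1991), female, from Huaibei, Anhui, M.S. candidate; research interests: cloud computing, virtualization, network transmission and control.
  • Supported by:

    the Fundamental Research Funds for the Central Universities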

Abstract: Hadoop is widely used for the parallel processing of big data. Most of its existing task assignment strategies target homogeneous environments, or fail to make full use of global cluster information, or cannot balance execution efficiency against algorithmic complexity in heterogeneous environments. To address these problems, a task assignment algorithm for heterogeneous environments, named λ-Flow, is proposed. λ-Flow divides the task assignment process, previously completed in a single pass, into multiple rounds. In each round, it assigns tasks dynamically based on the current cluster state and the execution results of the previous round, and it repeats this until all tasks have been assigned, with the aim of achieving the best execution efficiency. Comparative experiments against other algorithms show that λ-Flow adapts better to dynamic changes in the cluster and effectively reduces job execution time.

Key words: Hadoop, MapReduce, task assignment, heterogeneous environment, minimum-cost maximum flow
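
The round-based idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the λ-Flow algorithm itself: the abstract does not give the algorithm's details, and the keywords suggest that the per-round assignment is solved as a minimum-cost maximum-flow problem, which is replaced here by a simple greedy rule for brevity. The names `Node`, `assign_round`, `run_job` and the parameter `round_size` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    speed: float            # true throughput (tasks per time unit), unknown to the scheduler
    estimate: float = 1.0   # scheduler's current estimate of the throughput


def assign_round(nodes, batch):
    """Greedy stand-in for the per-round assignment step (assumption).

    Each task goes to the node whose estimated finish time would be
    smallest after receiving it; ties go to the node listed first.
    """
    load = {n.name: 0 for n in nodes}
    for _ in range(batch):
        best = min(nodes, key=lambda n: (load[n.name] + 1) / n.estimate)
        load[best.name] += 1
    return load


def run_job(nodes, total_tasks, round_size):
    """Assign tasks in rounds, refining speed estimates after each round."""
    remaining = total_tasks
    elapsed = 0.0
    history = []
    while remaining > 0:
        batch = min(round_size, remaining)
        plan = assign_round(nodes, batch)
        # Simulated execution: a round ends when its slowest node finishes.
        times = {n.name: plan[n.name] / n.speed for n in nodes}
        elapsed += max(times.values())
        # Feedback from the finished round: observed throughput per node.
        for n in nodes:
            if plan[n.name] > 0:
                n.estimate = plan[n.name] / times[n.name]
        remaining -= batch
        history.append(plan)
    return elapsed, history


if __name__ == "__main__":
    multi = run_job([Node("fast", 4.0), Node("slow", 1.0)], 20, round_size=10)
    single = run_job([Node("fast", 4.0), Node("slow", 1.0)], 20, round_size=20)
    print("two rounds:", multi)
    print("one round: ", single)
```

In this toy setup, with two hypothetical nodes of speeds 4 and 1 and 20 tasks, the two-round run finishes in 7.0 simulated time units and the single-round run in 10.0, because the second round shifts work toward the node that proved faster in the first. The figures describe only this simulation and say nothing about the paper's experimental results.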

CLC number: