Journal of Computer Applications, 2015, Vol. 35, Issue (12): 3383-3386. DOI: 10.11772/j.issn.1001-9081.2015.12.3383

• Advanced Computing •

Resource matching maximum set job scheduling algorithm under Hadoop

ZHU Jie1,2, LI Wenrui1,2, ZHAO Hong1,2, LI Ying1,2

  1. School of Information Engineering, Nanjing Xiaozhuang University, Nanjing, Jiangsu 211171, China;
  2. Key Laboratory of Trusted Cloud Computing and Big Data Analysis, Nanjing, Jiangsu 211171, China
  • Received: 2015-06-10 Revised: 2015-07-20 Online: 2015-12-10 Published: 2015-12-10
  • Corresponding author: ZHU Jie (born 1979), female, from Taizhou, Jiangsu; lecturer, M.S.; main research interests: cloud computing, distributed computing
  • About the authors: LI Wenrui (born 1981), female, from Kaifeng, Henan; associate professor, Ph.D.; main research interests: cloud computing, service computing. ZHAO Hong (born 1982), female, from Harbin, Heilongjiang; lecturer, Ph.D.; main research interests: artificial intelligence, distributed computing. LI Ying (born 1975), female, from Harbin, Heilongjiang; lecturer, Ph.D.; main research interest: service computing.
  • Supported by:
    National Natural Science Foundation of China (61202136); Science and Technology Project of Jiangsu Province (BY2013095-3-11); Natural Science Research Project of Jiangsu Higher Education Institutions (13KJD520007); Research Project of Nanjing Xiaozhuang University (2012NXY14, 2013NXY99).

Abstract: To address the low execution efficiency of jobs that require a high proportion of resources under current hierarchical-queue job scheduling algorithms, a resource matching maximum set algorithm is proposed. The algorithm analyzes job characteristics and introduces percentage of completion, waiting time, priority, and number of reschedulings as factors of an urgency value, so that jobs with a high resource proportion or a long waiting time are considered first, which improves fairness among jobs. A double-queue structure is used to preferentially select jobs with high urgency values within the total amount of available resources, and, among candidate job sets with different resource proportions, the set containing the largest number of jobs is chosen to balance scheduling. A worked comparison with the Max-min fairness algorithm shows that the proposed algorithm lowers the average waiting time of a job set and improves resource utilization. Experimental results show that the algorithm shortens the execution time of single-type job sets composed of jobs with different resource proportions by 18.73%, and the execution time of the high-resource-proportion jobs among them by 27.26%; for mixed-type job sets the corresponding reductions are 22.36% and 30.28%. The proposed algorithm effectively reduces the waiting of jobs with a high resource proportion and improves overall job execution efficiency.

Key words: Hadoop, hierarchical queue, job scheduling, maximum set, Max-min fairness
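
For orientation, the Max-min fairness baseline named in the abstract can be sketched for a single divisible resource as follows. This is the textbook progressive-filling formulation, not the implementation evaluated in the paper or the one used by Hadoop's schedulers; the function name `max_min_fair` and the example demands are illustrative assumptions.

```python
def max_min_fair(capacity, demands):
    """Textbook max-min fair split of one divisible resource.

    Illustrative sketch only: the name and signature are assumptions,
    not taken from the paper or from Hadoop.
    """
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    # Jobs whose demand is not yet met, smallest demand first.
    pending = sorted(range(len(demands)), key=lambda i: demands[i])

    while pending and remaining > 1e-12:
        share = remaining / len(pending)   # equal share of what is left
        i = pending[0]
        need = demands[i] - alloc[i]
        if need <= share:
            # The smallest demand fits inside the equal share: satisfy it
            # fully and redistribute the surplus among the remaining jobs.
            alloc[i] += need
            remaining -= need
            pending.pop(0)
        else:
            # No pending job can be fully satisfied: split equally and stop.
            for j in pending:
                alloc[j] += share
            remaining = 0.0
            pending = []
    return alloc


if __name__ == "__main__":
    # Hypothetical demands against a capacity of 10 units.
    print(max_min_fair(10, [2, 2.6, 4, 5]))
```

With these hypothetical demands the allocation is approximately 2, 2.6, 2.7 and 2.7: the two small demands are met in full, and the two largest are capped at the equal residual share.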

CLC number: