Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (8): 2066-2069.

• Advanced computing •

Dependable grid job scheduling mechanism

Yong-Cai Tao1, Lei Shi2

  1. Zhengzhou University
  2. School of Information Engineering, South Campus, Zhengzhou University
  • Received: 2010-02-23 Revised: 2010-03-17 Online: 2010-07-30 Published: 2010-08-01
  • Contact: Yong-Cai Tao
  • Supported by:
    Construction of the Wuhan high-performance grid node for medical image processing

Abstract: To cope with the dynamic nature of grid environments, this paper proposes a Dependable Grid Job Scheduling (DGJS) mechanism. DGJS classifies submitted jobs by completion deadline into three levels (high QoS, low QoS, and no QoS), and each level has a different scheduling priority. Based on resource availability prediction, DGJS applies a reliability-cost-based job scheduling strategy that schedules jobs, as far as possible, onto resource nodes with high reliability. In addition, DGJS applies different fault-tolerant strategies to jobs of different QoS levels, which provides fault tolerance while saving grid resources. Experimental results show that, in dynamic grid environments, DGJS increases the job success ratio and reduces the job finish time compared with traditional grid job scheduling algorithms.

Key words: job scheduling, grid, resource failure, fault tolerance, Markov chain
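
The abstract names three ingredients: deadline-based QoS classification, reliability-cost-based placement driven by availability prediction, and per-level fault tolerance. The Python sketch below shows one way these could fit together. It is an illustration under stated assumptions, not the authors' algorithm: the classification threshold (`tight_factor`), the two-state up/down Markov chain, the cost formula, and the mapping from QoS level to fault-tolerance strategy are all hypothetical, since the abstract does not specify them.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class QoS(IntEnum):
    """Smaller value means higher scheduling priority."""
    HIGH = 0
    LOW = 1
    NONE = 2


@dataclass
class Job:
    name: str
    run_time: float            # estimated execution time, in hours
    deadline: Optional[float]  # hours from now; None means no deadline


@dataclass
class Node:
    name: str
    p_stay_up: float   # assumed P(up at t+1 | up at t)
    p_recover: float   # assumed P(up at t+1 | down at t)
    load: float = 0.0  # hours of work already assigned


def classify(job: Job, tight_factor: float = 2.0) -> QoS:
    """Hypothetical rule: a deadline within tight_factor * run_time is HIGH."""
    if job.deadline is None:
        return QoS.NONE
    return QoS.HIGH if job.deadline <= tight_factor * job.run_time else QoS.LOW


def availability(node: Node, steps: int) -> float:
    """Probability that a node that is up now is up after `steps`
    transitions of a two-state (up/down) Markov chain."""
    p_up = 1.0
    for _ in range(steps):
        p_up = p_up * node.p_stay_up + (1.0 - p_up) * node.p_recover
    return p_up


# Hypothetical mapping; the abstract only says the strategies differ by level.
FAULT_TOLERANCE = {
    QoS.HIGH: "replication",
    QoS.LOW: "checkpointing",
    QoS.NONE: "retry",
}


def schedule(jobs: list[Job], nodes: list[Node]) -> list[tuple[str, str, str]]:
    """Place jobs in QoS-priority order on the node with the lowest assumed
    reliability cost: (queued work + run time) / predicted availability."""
    plan = []
    for job in sorted(jobs, key=classify):
        steps = max(1, round(job.run_time))

        def cost(node: Node) -> float:
            # Guard against division by zero for a node predicted to be down.
            return (node.load + job.run_time) / max(availability(node, steps), 1e-9)

        best = min(nodes, key=cost)
        best.load += job.run_time
        plan.append((job.name, best.name, FAULT_TOLERANCE[classify(job)]))
    return plan


if __name__ == "__main__":
    jobs = [Job("a", 2.0, None), Job("b", 2.0, 3.0), Job("c", 2.0, 10.0)]
    nodes = [Node("flaky", 0.5, 0.5), Node("stable", 0.99, 0.9)]
    for assignment in schedule(jobs, nodes):
        print(assignment)
```

Dividing the expected completion time by the predicted availability penalizes unreliable nodes without excluding them, so a lightly loaded but less reliable node can still win once the reliable nodes are busy. A real system would estimate the transition probabilities from node failure history rather than take them as given.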