计算机应用 ›› 2012, Vol. 32 ›› Issue (01): 158-162.DOI: 10.3724/SP.J.1087.2012.00158

• 先进计算 • 上一篇    下一篇

基于深度优先贪婪搜索的可重构硬件任务划分算法

陈乃金1,2   

  1. 1. 安徽工程大学 计算机与信息学院,安徽 芜湖 241000
    2. 嵌入式系统与服务计算教育部重点实验室(同济大学),上海 201804
  • 收稿日期:2011-07-26 修回日期:2011-09-22 发布日期:2012-02-06 出版日期:2012-01-01
  • 通讯作者: 陈乃金
  • 作者简介:陈乃金(1972-),男,安徽合肥人,讲师,博士研究生,主要研究方向:可重构计算、时域划分。
  • 基金资助:

    国家863计划项目(2009AA011705);芜湖市科技计划自然科学资金资助项目(芜科计字[2009]190号);安徽省教育厅自然科学资金资助项目(KJ2007B247, KJ2010B018);安徽省高等学校青年教师科研资助计划自然科学基金资助项目(2007jq1086)

Reconfigurable hardware task partitioning algorithm based on depth first greedy search

CHEN Nai-jin1,2   

  1. 1. College of Computer and Information Engineering, Anhui Polytechnic University, Wuhu Anhui 241000, China
    2. Key Laboratory of Embedded Systems and Service Computing (Tongji University), Ministry of Education, Shanghai 201804, China
  • Received:2011-07-26 Revised:2011-09-22 Online:2012-02-06 Published:2012-01-01
  • Contact: CHEN Nai-jin

摘要: 针对可重构计算硬件任务划分通信成本较小化的问题,提出了一种基于深度优先贪婪搜索划分(DFGSP)算法。首先,从待调度的就绪队列中取出队首任务,在某一硬件面积约束下,按深度优先搜索(DFS)方式扫描一个计算密集型任务转换来的有向无环图(DAG),逐个划入满足要求的节点;然后,一遇到不满足面积要求的任务节点时,就计算当前划分模块间输出边数(可量化为通信成本);最后,跳过当前不满足要求的任务节点,继续搜索该点之后处于就绪状态的节点,当搜索到满足要求的点时,按加入该点后不增加当前划分块间输出边数和尽可能填满可重构运算阵列的原则进行。实验结果表明,与现有的簇划分(CBP)、簇层次敏感两种划分算法相比,提出的算法获得了最小划分模块数和平均跨模块间I/O边数最小的均值,通过实际验证,算法显著地改善了硬件任务的划分效果,而且运行开销没有明显增加。

关键词: 可重构计算, 时域划分, 深度优先贪婪搜索, 通信成本, 资源约束, 硬件碎片

Abstract: This paper proposed a hardware task partitioning algorithm according to the problems of communication cost minimum in reconfigurable computing, called DFGSP (Depth First Greedy Search Partitioning). At first, the front task was taken from the ready queue, a Directed Acyclic Graph (DAG), which was transformed from a computing-intensive task, was scanned and partitioned by Depth First Search (DFS). Then, the number of outputting-edges (quantized to communication cost) of current partitioning module was computed when the task node did not meet the area constraints. Finally, the ready task node, which considered sufficiently partitioning module outputting-edges which were not increasing and made good use of reconfigurable resources hardware fragment as soon as possible, was scanned and partitioned, after skipping the task node which did not meet the area constraints. In comparison with the Cluster-Based Partitioning (CBP) algorithm and Level Sensitive Cluster-Based Partitioning (LSCBP) algorithm, the experimental results show that the proposed algorithm can obtain the least number of partitioning modules and the least average number of input-output edges crossing modules, and the practical results indicate the proposed algorithm gets a prominent improvement in hardware task partitioning performance over previous algorithms, while the run-time efficiency is preserved.

Key words: reconfigurable computing, temporal partitioning, depth first greedy search, communication cost, resource restraint, hardware fragment

中图分类号: