计算机应用 ›› 2013, Vol. 33 ›› Issue (08): 2171-2176.

• Advanced Computing •

基于循环分块的流水粒度优化算法

刘晓娴1,2,赵荣彩1,2,丁锐1,2,李雁冰1,2   

  1. 数学工程与先进计算国家重点实验室,郑州 450002
    2. 信息工程大学,郑州 450002;
  • 收稿日期:2013-02-18 修回日期:2013-03-25 出版日期:2013-08-01 发布日期:2013-09-11
  • 通讯作者: 刘晓娴
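  • About the authors: LIU Xiaoxian (born 1985), female, from Yifeng, Jiangxi, Ph.D. candidate; research interests: parallel compilation, high-performance computing;
    ZHAO Rongcai (born 1957), male, from Luoyang, Henan, professor, doctoral supervisor, CCF senior member; research interests: parallel compilation, high-performance computing, decompilation techniques;
    DING Rui (born 1984), male, from Huaxian, Henan, Ph.D. candidate; research interests: parallel compilation, high-performance computing;
    LI Yanbing (born 1989), male, from Longxi, Gansu, master's student; research interests: parallel compilation.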
  • 基金资助:

    “核高基”国家科技重大专项

Pipelining granularity optimization algorithm based on loop tiling

LIU Xiaoxian1,2,ZHAO Rongcai1,2,DING Rui1,2,LI Yanbing1,2   

  1. Information Engineering University, Zhengzhou Henan 450002, China
    2. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou Henan 450002, China
  • Received:2013-02-18 Revised:2013-03-25 Online:2013-09-11 Published:2013-08-01
  • Contact: LIU Xiaoxian
  • Supported by:

    CHB National Major Science and Technology Project Foundation of China under Grant

摘要: 当计算划分层迭代数目较大,或是循环体单次迭代工作量较大,但可用的并行线程数目较小时,传统的基于循环分块的流水粒度优化方法无法进行处理。为此,提出一种基于循环分块减小流水粒度的方法,并根据流水并行循环的代价模型实现最优流水粒度的求解,设计实现了一个流水计算粒度的优化算法。对有限差分松弛法(FDR)的波前循环和时域有限差分法(FDTD)中典型循环的测试表明,与传统的流水粒度选择方法相比,所提算法能够得到更优的循环分块大小。

关键词: 自动并行化, 流水并行, 流水粒度, 循环分块, 代价模型

Abstract: When a pipelined loop has a large number of iterations at the computation-partitioning level, or a heavy per-iteration workload in its body, while only a few parallel threads are available, each thread does so much work between two synchronizations that the degree of parallelism is low. The traditional pipelining granularity optimization approach based on loop tiling cannot handle this situation. To solve this problem, an approach that decreases the pipelining granularity through loop tiling was proposed. The optimal pipelining granularity was derived from a cost model for pipelined parallel loops, and a pipelining granularity optimization algorithm was designed and implemented. Tests on the wavefront loops of Finite Difference Relaxation (FDR) and on representative loops of Finite Difference Time Domain (FDTD) show that the proposed algorithm obtains better loop tile sizes than the traditional pipelining granularity selection method.

Key words: automatic parallelization, pipelining parallelization, pipelining granularity, loop tiling, cost model
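
The trade-off behind pipelining granularity can be illustrated with a small sketch. It is not the cost model from the paper, which is not reproduced on this page. It assumes a common textbook simplification: the outer loop dimension is split evenly across threads, the inner dimension is tiled into blocks, and each thread synchronizes with its neighbor once per block. The function names, parameters, and cost constants are all hypothetical.

```python
import math


def pipeline_time(block, n_outer, n_inner, threads, work, sync):
    """Estimated run time of a pipelined loop nest (assumed textbook model).

    block   : tile size along the inner (pipelined) dimension
    n_outer : iterations of the partitioned outer dimension
    n_inner : iterations of the inner dimension being tiled
    threads : number of parallel threads
    work    : cost of one loop-body iteration (hypothetical unit)
    sync    : cost of one synchronization between threads (same unit)
    """
    stages = math.ceil(n_inner / block)            # blocks each thread processes
    per_stage = (n_outer / threads) * block * work + sync
    # Pipeline fill (threads - 1 stages) plus steady state (stages).
    return (stages + threads - 1) * per_stage


def best_block(n_outer, n_inner, threads, work, sync):
    """Exhaustively pick the tile size that minimizes the modeled time."""
    return min(
        range(1, n_inner + 1),
        key=lambda b: pipeline_time(b, n_outer, n_inner, threads, work, sync),
    )


if __name__ == "__main__":
    args = dict(n_outer=8, n_inner=1000, threads=4, work=1.0, sync=50.0)
    b = best_block(**args)
    print("tile size:", b, "modeled time:", pipeline_time(b, **args))
```

Under this model a very small tile is dominated by synchronization cost, while a very large tile leaves threads idle during pipeline fill, so the modeled optimum lies between the two extremes.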
