Journal of Computer Applications, 2023, Vol. 43, Issue (11): 3327-3333. DOI: 10.11772/j.issn.1001-9081.2022111760

• 2022 National Annual Conference on Open Distributed and Parallel Computing (DPCS 2022)

Parallel computing algorithm of grid-based distributed Xin’anjiang hydrological model

Qian LIU1, Yangming ZHANG1,2, Dingsheng WAN1

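  1. College of Computer and Information Engineering, Hohai University, Nanjing, Jiangsu 211100, China
  2. Bank of Nanjing Company Limited, Nanjing, Jiangsu 210019, China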
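  • Received: 2022-11-24 Revised: 2023-02-15 Accepted: 2023-02-17 Online: 2023-03-08 Published: 2023-11-10
  • Corresponding author: Dingsheng WAN
  • About the authors: LIU Qian (born 1998), male, from Nanjing, Jiangsu, M.S. candidate, CCF member. His research interests include parallel computing of distributed hydrological models.
    ZHANG Yangming (born 1997), male, from Xuzhou, Jiangsu, M.S. candidate, CCF member. His research interests include parallel computing of distributed hydrological models.
    WAN Dingsheng (born 1963), male, from Liyang, Jiangsu, professor, CCF member. His research interests include data management and data mining. dshwan@hhu.edu.cn
  • Supported by:
    National Key Research and Development Program of China (2018YFC1508106)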

Abstract:

In recent years, the Grid-based distributed Xin’anjiang hydrological Model (GXM) has played an important role in flood forecasting. However, simulating a flood process involves very large volumes of data and computation, and the computing time of GXM grows exponentially as the model warm-up period lengthens, which seriously degrades its computational efficiency. To address this, a parallel computing algorithm for GXM based on grid flow direction division and dynamic-priority Directed Acyclic Graph (DAG) scheduling was proposed. Firstly, the model parameters, model components, and model calculation process were analyzed. Secondly, a parallel algorithm for GXM based on grid flow direction division was proposed from the perspective of spatial parallelism to improve the computational efficiency of the model. Finally, a DAG task scheduling algorithm based on dynamic priority was proposed, which schedules tasks during GXM computation by constructing a DAG of the grid computing nodes and dynamically updating their priorities, thereby reducing data skew in the model calculation. In experiments on the Dali River basin in Shaanxi Province and the Tunxi basin in Anhui Province, with a warm-up period of 30 days and a data resolution of 1 km, the proposed algorithm achieved maximum speedup ratios of 4.03 and 4.11, respectively, over the traditional serial algorithm, effectively improving the computing speed and resource utilization of GXM.

Key words: Grid-based distributed Xin’anjiang hydrological Model (GXM), grid flow direction division, parallel computing, Directed Acyclic Graph (DAG), task scheduling
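
The scheduling idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `schedule_cells`, the `downstream` mapping, and the toy basin are hypothetical, and the priority rule used here (remaining path length to the outlet, assigned when a cell becomes ready) is only a stand-in, since the abstract does not state the paper's actual priority function. The sketch assumes each grid cell drains into exactly one downstream cell, that the outlet is marked with `None`, and that a cell can be computed only after all of its upstream cells.

```python
import heapq
from collections import defaultdict


def schedule_cells(downstream, workers):
    """Group grid cells into batches that could run in parallel.

    downstream: dict mapping each cell id to the cell it drains into
                (None for the outlet). Every cell must appear as a key.
    workers:    maximum number of cells computed per batch.
    """
    # Count how many upstream cells each cell is still waiting for.
    pending = defaultdict(int)
    for cell, nxt in downstream.items():
        pending[cell] += 0
        if nxt is not None:
            pending[nxt] += 1

    def path_len(cell):
        # Number of hops from this cell to the outlet (assumed priority).
        hops = 0
        while downstream[cell] is not None:
            cell = downstream[cell]
            hops += 1
        return hops

    # Ready queue: cells with no unfinished upstream cells.
    # heapq is a min-heap, so the priority is negated.
    ready = [(-path_len(c), c) for c, k in pending.items() if k == 0]
    heapq.heapify(ready)

    batches = []
    while ready:
        size = min(workers, len(ready))
        batch = [heapq.heappop(ready)[1] for _ in range(size)]
        batches.append(batch)
        # Finishing a cell may release its downstream neighbour.
        for cell in batch:
            nxt = downstream[cell]
            if nxt is not None:
                pending[nxt] -= 1
                if pending[nxt] == 0:
                    heapq.heappush(ready, (-path_len(nxt), nxt))
    return batches


# Toy basin: a and b drain into c; c and d drain into the outlet e.
basin = {"a": "c", "b": "c", "c": "e", "d": "e", "e": None}
print(schedule_cells(basin, workers=2))  # [['a', 'b'], ['c', 'd'], ['e']]
```

With two workers the five cells of the toy basin finish in three batches instead of five serial steps. Favoring cells that lie far from the outlet is one simple way to keep long flow paths from becoming the bottleneck, which is the kind of load imbalance the abstract refers to as data skew.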

CLC Number: