Journal of Computer Applications ›› 2013, Vol. 33 ›› Issue (06): 1540-1552. DOI: 10.3724/SP.J.1087.2013.01540

• Advanced computing •

First-principles nonlocal projector potential calculation on GPU clusters

FU Jiyun1,2, JIA Weile1,2, CAO Zongyan1, WANG Long1, YE Huang1, CHI Xuebin1

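  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. Graduate School of Chinese Academy of Sciences, Beijing 100190, China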
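  • Received: 2012-12-27 Revised: 2013-02-22 Published: 2013-06-01 Online: 2013-06-05
  • Corresponding author: FU Jiyun
  • About the authors: FU Jiyun (born 1985), female, from Anyang, Henan, master's student; main research interests: computer software and theory, parallel computing, high-performance computing. JIA Weile (born 1985), male, from Hengshui, Hebei, master's student; main research interests: computer software and theory, parallel computing, high-performance computing. CAO Zongyan (born 1981), male, from Xiangtan, Hunan, Ph.D. candidate; main research interest: high-performance computer systems.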
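  • Supported by:

    National Natural Science Foundation of China (61202054); National High Technology Research and Development Program of China (863 Program) (2010AA012301); Knowledge Innovation Program of the Chinese Academy of Sciences (CNIC_ZR_201202); Informatization Special Project of the Chinese Academy of Sciences for the 12th Five-Year Plan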

First-principle nonlocal projector potential calculation on GPU cluster

FU Jiyun1,2, JIA Weile1,2, CAO Zongyan1, WANG Long1, YE Huang1, CHI Xuebin1

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. Graduate School of Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2012-12-27 Revised: 2013-02-22 Online: 2013-06-05 Published: 2013-06-01
  • Contact: FU Jiyun

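Abstract: Plane wave pseudopotential density functional theory (PWP-DFT) calculation is the most widely used method in materials computation, and the projector calculation is an important part of the self-consistent iteration in the PWP-DFT method. Because the projector potential calculation had become the bottleneck in accelerating the software, a graphics processing unit (GPU) acceleration algorithm was proposed for this part. Taking the characteristics of the GPU into account, the algorithm 1) uses a new parallel mechanism to compute the nonlocal projector potential; 2) redesigns the data distribution structure; 3) reduces memory usage; and 4) introduces a method for resolving the data dependencies in the algorithm. A speedup of 18 to 57 times was obtained, bringing each molecular dynamics step down to 12 s. The measured timings of this module on the GPU platform are analyzed in detail, and the computational bottleneck of the algorithm on GPU clusters is discussed.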

Key words: first principles, density functional theory, plane wave pseudopotential, nonlocal projector potential, GPU acceleration

Abstract: Plane Wave Pseudopotential (PWP) Density Functional Theory (DFT) calculation is the most widely used method in materials computation. The projector calculation is an important part of the self-consistent iteration in PWP-DFT, and it often becomes the bottleneck when accelerating the software. Taking the features of the Graphics Processing Unit (GPU) into account, an acceleration algorithm was therefore proposed that: 1) uses a new parallel mechanism to compute the nonlocal projector potential; 2) redesigns the data distribution structure; 3) reduces memory usage; and 4) resolves the data dependency problem in the algorithm. The algorithm achieved a speedup of 18 to 57 times and reduced each molecular dynamics step to 12 seconds. The measured running time of this module on the GPU platform is analyzed in detail, and the computational bottleneck of the algorithm on GPU clusters is discussed.

Key words: first-principle, Density Functional Theory (DFT), Plane Wave Pseudopotential (PWP), nonlocal projector potential, GPU speedup
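
As a rough illustration of why the projector step can raise a data dependency problem, the following minimal Python sketch applies a nonlocal potential in the common separable form, V_NL ψ = Σ_p |β_p⟩ D_p ⟨β_p|ψ⟩, on a toy one-dimensional grid. It is not the authors' implementation. The separable form, the 1-D grid, and every name in it (`Projector`, `apply_nonlocal`, `shared_points`) are assumptions made for illustration only; the abstract does not specify the actual data layout or the method used to resolve the dependency.

```python
# Illustrative sketch only: a toy 1-D grid stands in for the real-space grid,
# and all names here are hypothetical, not taken from the paper.
from dataclasses import dataclass
from typing import List


@dataclass
class Projector:
    """One localized projector, stored only on the grid points it covers."""
    indices: List[int]    # grid points inside the projector's cutoff region
    values: List[float]   # projector amplitude at those grid points
    weight: float         # coupling coefficient D_p for this projector


def apply_nonlocal(projectors: List[Projector], psi: List[float],
                   dv: float = 1.0) -> List[float]:
    """Return sum_p |beta_p> D_p <beta_p|psi> evaluated on the grid."""
    # Step 1 (gather): each inner product <beta_p|psi> reads psi but writes
    # only its own coefficient, so the projectors are independent here.
    coeffs = []
    for p in projectors:
        overlap = sum(b * psi[i] for i, b in zip(p.indices, p.values)) * dv
        coeffs.append(p.weight * overlap)

    # Step 2 (scatter-add): projectors of neighbouring atoms may cover the
    # same grid points, so parallel writes to `out` would conflict. This
    # serial loop sidesteps the issue; a parallel version needs some form
    # of conflict handling (for example atomic additions).
    out = [0.0] * len(psi)
    for p, c in zip(projectors, coeffs):
        for i, b in zip(p.indices, p.values):
            out[i] += c * b
    return out


def shared_points(projectors: List[Projector]) -> List[int]:
    """Grid indices written by more than one projector (the conflict set)."""
    seen, shared = set(), set()
    for p in projectors:
        for i in p.indices:
            if i in seen:
                shared.add(i)
            else:
                seen.add(i)
    return sorted(shared)
```

In this sketch the gather step parallelizes trivially, while the scatter-add step is where overlapping projectors touch the same output entries; `shared_points` simply lists those entries for a given set of projectors.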

CLC number: