Optimization of spherical Voronoi diagram generating algorithm based on graphic processing unit

Journal of Computer Applications, 2015, Vol. 35, Issue (6): 1564-1566. DOI: 10.11772/j.issn.1001-9081.2015.06.1564

WANG Lei¹, WANG Pengfei¹, ZHAO Xuesheng¹, LU Lituo²

1. College of Geoscience and Surveying Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China;
2. Beijing Company, China Petroleum Engineering Company Limited, Beijing 100085, China

Received: 2015-01-07 Revised: 2015-04-02 Online: 2015-06-12
Corresponding author: WANG Lei (born 1989), male, from Suzhou, Anhui; Ph.D. candidate; research interests: spherical Voronoi diagrams, GPU parallel computing; wl890627@163.com
About the authors: WANG Pengfei (born 1991), male, from Dezhou, Shandong; M.S. candidate; research interests: 3D modeling, visualization of massive 3D models. ZHAO Xuesheng (born 1967), male, from Heze, Shandong; professor, Ph.D.; research interests: 3D GIS, spatial modeling for the digital Earth. LU Lituo (born 1986), male, from Shijiazhuang, Hebei; assistant engineer, M.S.; research interests: parallel computing, Global Positioning System.
Supported by:
National Natural Science Foundation of China (41171306); Specialized Research Fund for the Doctoral Program of Higher Education (20130023110001).

Abstract:

The spherical Voronoi diagram generating algorithm based on distance computation and comparison between Quaternary Triangular Mesh (QTM) cells is more precise than the dilation algorithm. However, it must compute and compare the distance from every cell to all seed points, which makes it inefficient. To address this, the algorithm was first implemented with Graphic Processing Unit (GPU) parallel computing and then optimized with respect to access to three kinds of GPU memory: shared memory, constant memory, and registers. Finally, an experimental system was developed in C++ and Compute Unified Device Architecture (CUDA) to compare the efficiency of the algorithm before and after optimization. The experimental results show that appropriate use of the different memories improves the algorithm's efficiency substantially, and that the speed-up ratio grows with the data scale.

Key words: spherical Voronoi diagram, Compute Unified Device Architecture (CUDA), shared memory, constant memory, register
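The distance-comparison step described in the abstract can be illustrated with a minimal serial sketch in C++. This is not the authors' implementation: the names `Vec3`, `greatCircleDistance`, and `assignCellsToSeeds` are hypothetical, and representing each QTM cell by a unit vector at its centre and measuring great-circle distance are assumptions made only for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A point on the unit sphere, stored as a 3D unit vector (assumed representation).
using Vec3 = std::array<double, 3>;

// Great-circle (angular) distance in radians between two unit vectors.
double greatCircleDistance(const Vec3& a, const Vec3& b) {
    double dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    // Clamp so rounding error cannot push the argument of acos outside [-1, 1].
    dot = std::max(-1.0, std::min(1.0, dot));
    return std::acos(dot);
}

// For every cell centre, return the index of the nearest seed point.
// Every cell is compared against every seed, so the cost is
// O(cells * seeds); this is the workload the abstract calls inefficient.
std::vector<std::size_t> assignCellsToSeeds(const std::vector<Vec3>& cells,
                                            const std::vector<Vec3>& seeds) {
    assert(!seeds.empty());  // at least one seed is required
    std::vector<std::size_t> owner(cells.size(), 0);
    for (std::size_t c = 0; c < cells.size(); ++c) {
        double best = greatCircleDistance(cells[c], seeds[0]);
        for (std::size_t s = 1; s < seeds.size(); ++s) {
            const double d = greatCircleDistance(cells[c], seeds[s]);
            if (d < best) {  // strict comparison: ties keep the lower seed index
                best = d;
                owner[c] = s;
            }
        }
    }
    return owner;
}
```

Each iteration of the outer loop is independent of the others, which is what makes the method a natural fit for one GPU thread per cell. How the authors distribute the data among shared memory, constant memory, and registers is not stated in the abstract, so the sketch does not attempt to model it.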

WANG Lei, WANG Pengfei, ZHAO Xuesheng, LU Lituo. Optimization of spherical Voronoi diagram generating algorithm based on graphic processing unit[J]. Journal of Computer Applications, 2015, 35(6): 1564-1566.
