Parallel design and implementation of scale invariant feature transform algorithm based on OpenCL

Journal of Computer Applications, 2016, Vol. 36, Issue 7: 1801-1806. DOI: 10.11772/j.issn.1001-9081.2016.07.1801

XU Chuanpei^1,2, WANG Guang^1,2

1. School of Electrical Engineering and Automation, Guilin University of Electronic Technology, Guilin Guangxi 541004, China;
2. Guangxi Key Laboratory of Automatic Detecting Technology and Instruments, Guilin University of Electronic Technology, Guilin Guangxi 541004, China

Received: 2015-12-10; Revised: 2016-02-22; Online: 2016-07-14; Published: 2016-07-10

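Corresponding author: WANG Guang
About the authors: XU Chuanpei (born 1968), female, from Guilin, Guangxi, professor, Ph.D.; main research interests: automatic test buses and systems, computer-aided design, and testing technology. WANG Guang (born 1989), male, from Shangqiu, Henan, master's student; main research interest: image processing.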

Abstract

Abstract: The Scale Invariant Feature Transform (SIFT) algorithm has poor real-time performance. To address this problem, a parallel SIFT algorithm optimized with the Open Computing Language (OpenCL) was proposed. First, the original algorithm was restructured for parallel execution by splitting and recombining its steps and by reorganizing how feature points are indexed in memory, so that intermediate results could be exchanged entirely within video memory. Then, each step was given a parallel design that reuses global memory objects, shares local memory, and optimizes memory access, which improves data-read efficiency and reduces transfer latency. Finally, a fine-grained parallel SIFT implementation was completed in OpenCL on a Graphics Processing Unit (GPU) platform and ported to a Central Processing Unit (CPU) platform. With registration results close to those of the original algorithm, the parallel algorithm achieved feature-extraction speedups of 10.51 to 19.33 times on the GPU platform and 2.34 to 4.74 times on the CPU platform. The experimental results show that the OpenCL-accelerated SIFT algorithm can improve the real-time performance of image registration and overcomes a drawback of Compute Unified Device Architecture (CUDA): because CUDA is difficult to port, it cannot make full use of the various computing cores in a heterogeneous system.

Key words: Scale Invariant Feature Transform (SIFT) algorithm, Open Computing Language (OpenCL), multiplexed memory object, fine-grained parallelism, heterogeneous system
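
The abstract names the reuse of global memory objects as one of its strategies but gives no detail. One common way to reuse buffers, assumed here for illustration and not necessarily the authors' scheme, is ping-pong buffering: two work buffers swap the roles of input and output at each blur level, so no new buffer is allocated per level. The Python sketch below simulates that idea on the host for a difference-of-Gaussians stack. The names `blur3x3`, `dog_pingpong`, `dog_naive`, `W`, `H`, and `LEVELS` are hypothetical, and a 3x3 binomial blur stands in for the Gaussian filter.

```python
# Sketch only: simulates buffer reuse with plain Python lists.
# In OpenCL host code the two work buffers would typically be cl_mem objects
# created once (clCreateBuffer) and re-bound in swapped order (clSetKernelArg).

W, H = 8, 8      # hypothetical image size
LEVELS = 4       # hypothetical number of blurred levels in one octave


def blur3x3(src, dst):
    """3x3 binomial blur with clamped borders, written into a preallocated dst."""
    k = (1, 2, 1)
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for dy in (-1, 0, 1):
                yy = min(max(y + dy, 0), H - 1)
                for dx in (-1, 0, 1):
                    xx = min(max(x + dx, 0), W - 1)
                    acc += k[dy + 1] * k[dx + 1] * src[yy * W + xx]
            dst[y * W + x] = acc / 16.0


def dog_pingpong(image):
    """Return LEVELS - 1 difference layers using only two work buffers."""
    bufs = [list(image), [0.0] * (W * H)]  # allocated once, then reused
    cur = 0
    layers = []
    for _ in range(LEVELS - 1):
        nxt = 1 - cur
        blur3x3(bufs[cur], bufs[nxt])      # read one buffer, write the other
        layers.append([b - a for a, b in zip(bufs[cur], bufs[nxt])])
        cur = nxt                          # swap roles for the next level
    return layers


def dog_naive(image):
    """Reference version that allocates a fresh buffer for every level."""
    levels = [list(image)]
    for _ in range(LEVELS - 1):
        dst = [0.0] * (W * H)
        blur3x3(levels[-1], dst)
        levels.append(dst)
    return [[b - a for a, b in zip(levels[i], levels[i + 1])]
            for i in range(LEVELS - 1)]


# Small synthetic test image (hypothetical data).
image = [float((x * 7 + y * 13) % 32) for y in range(H) for x in range(W)]
```

Both functions produce the same layers; the ping-pong version simply keeps the number of work buffers fixed at two regardless of `LEVELS`, which is the property that matters when each buffer is a device memory object.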

CLC Number: TP391.4

XU Chuanpei, WANG Guang. Parallel design and implementation of scale invariant feature transform algorithm based on OpenCL[J]. Journal of Computer Applications, 2016, 36(7): 1801-1806.

Related Articles

[1] JIANG Zetao, XU Juanjuan. Heterogenous cross-domain identity authentication scheme based on signcryption in cloud environment[J]. Journal of Computer Applications, 2020, 40(3): 740-746.
[2] BO Dan, LI Zongchun, WANG Xiaonan, QIAO Hanwen. Accelerated KAZE-SIFT feature extraction algorithm for oblique images[J]. Journal of Computer Applications, 2019, 39(7): 2093-2097.
[3] LIN Zecheng, ZHU Jianqing, LIAO Shengcai, LI Stan Z. Uniform SILTP based background modeling and its implementation on Intel HD graphics[J]. Journal of Computer Applications, 2015, 35(8): 2274-2279.
[4] ZHAO Xia, ZHU Qing, XIAO Xiongwu, LI Deren, GUO Bingxuan, ZHANG Peng, HU Han, DING Yulin. Automatic matching method for aviation oblique images based on homography transformation[J]. Journal of Computer Applications, 2015, 35(6): 1720-1725.
[5] Li-min ZHANG, Shang-bo ZHOU. Feature matching of scale invariant feature transform images based on fractional differential approach[J]. Journal of Computer Applications, 2011, 31(4): 1019-1023.