[1] LOWE D G.Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2):91-110. [2] KE Y, SUKTHANKAR R. PCA-SIFT:a more distinctive representation for local image descriptors[C]//CVPR 2004:Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington, DC:IEEE Computer Society, 2004:Ⅱ506-Ⅱ513. [3] BAY H, TUYTELAARS T, VAN GOOL L. SURF:speeded up robust features[C]//ECCV 2006:Proceedings of the 9th European Conference on Computer Vision, Part Ⅰ. Berlin:Springer, 2006:404-417. [4] LUO J, GWUN O. A comparison of SIFT, PCA-SIFT and SURF[J]. Journal of Business Education, 2009, 3(4):143-152. [5] 张杰,柴志雷,喻津.基于GPU的图像特征并行计算方法[J].计算机科学,2015,42(10):297-324.(ZHANG J, CHAI Z L, YU J. Parallel computation method of image features based on GPU[J]. Computer Science, 2015, 42(10):297-324.) [6] 肖汉,郭运宏,周清雷.面向CPU+GPU异构计算的SIFT特征匹配并行算法[J].同济大学学报:自然科学版,2013,41(11):1732-1737.(XIAO H, GUO Y H, ZHOU Q L. Parallel algorithm of CPU and GPU-oriented heterogeneous computation in SIFT feature matching[J]. Journal of Tongji University (Natural Science), 2013, 41(11):1732-1737.) [7] LU M. Fast implementation of scale invariant feature transform based on CUDA[J]. Applied Mathematics & Information Sciences, 2013, 7(2):717-722. [8] WANG G, RISTER B, CAVALLARO J R. Workload analysis and efficient OpenCL-based implementation of SIFT algorithm on a smartphone[C]//GlobalSIP 2013:Proceedings of the 2013 IEEE Global Conference on Signal and Information Processing. Piscataway, NJ:IEEE, 2013:759-762. [9] RISTER B, WANG G, WU M, et al. A fast and efficient SIFT detector using the mobile GPU[C]//ICASSP 2013:Proceedings of 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway, NJ:IEEE, 2013:2674-2678. [10] 董小社,刘超,王恩东,等.面向GPU异构并行系统的多任务流编程模型[J].计算机学报,2014,37(7):1638-1646.(DONG X S, LIU C, WANG E D, et al. A multi task-stream programing model for GPU based on heterogeneous parallel system[J]. Chinese Journal of Computers, 2014, 37(7):1638-1646.) [11] PENNYCOOK S J, HAMMOND S D, WRIGHT S A, et al. An investigation of the performance portability of OpenCL[J]. Journal of Parallel & Distributed Computing, 2013, 73(11):1439-1450. [12] TIAN L, MENG C, ZHOU F. A two-level task scheduler on multiple DSP system for OpenCL[J]. Advances in Mechanical Engineering, 2014:Article ID 754835. [13] 陈刚,吴百锋.面向OpenCL模型的GPU性能优化[J].计算机辅助设计与图形学学报,2011,23(4):571-581.(CHEN G, WU B F. GPU performance optimization targeting OpenCL model[J]. Journal of Computer-Aided Design & Computer Graphics, 2011, 23(4):571-581.) [14] YAN W, SHI X, YAN X, et al. Computing OpenSURF on OpenCL and general purpose GPU[J]. International Journal of Advanced Robotic Systems, 2013, 10(4):301-319. [15] SANCHEZ L M, FERNANDEZ J, SOTOMAYOR R, et al. A comparative study and evaluation of parallel programming models for shared-memory parallel architectures[J]. New Generation Computing, 2013, 31(3):139-161. [16] JANG B, CHOI M, KIM K K. Algorithmic GPGPU memory optimization[C]//ISOCC 2013:Proceedings of the 2013 International SoC Design Conference. Piscataway, NJ:IEEE, 2013:154-157. [17] 肖汉,马歌,周清雷.面向OpenCL架构的Harris角点检测算法[J].计算机科学,2014,41(7):306-309.(XIAO H, MA G, ZHOU Q L. Harris corner detection algorithm on OpenCL architecture[J]. Computer Science, 2014, 41(7):306-309.) [18] FANG J, SIPS H, VARBANESCU A L. Aristotle:a performance impact indicator for the OpenCL kernels using local memory[J]. Scientific Programming, 2014, 22(3):239-257. [19] 闫钧华,杭谊青,许俊峰,等.基于CUDA的高分辨率数字视频图像配准快速实现[J].仪器仪表学报,2014,35(2):380-386.(YAN J H, HANG Y Q, XU J F, et al. Quick realization of CUDA-based registration of high-resolution digital video images[J]. Chinese Journal of Scientific Instrument, 2014, 35(2):380-386.) |