[1] 高伟,赵荣彩,韩林,等.SIMD自动向量化编译优化概述[J].软件学报,2015,26(6):1265-1284. (GAO W, ZHAO R C, HAN L,et al. Research on SIMD auto-vectorization compiling optimization[J]. Journal of Software, 2015,26(6):1265-1284.) [2] 彭飞,顾乃杰,高翔,等.龙芯3B的SIMD编译优化及分析[J].小型微型计算机系统,2012,33(12):2733-2737. (PENG F, GU N J, GAO X, et al. SIMD compiler optimization and analysis based on Godson-3B processor[J]. Journal of Chinese Computer Systems, 2012, 33(12):2733-2737.) [3] 陈书明,刘胜,万江华,等.协同多核DSP YHFT-QMBase:体系结构及实现[J].中国科学:信息科学,2015,45(4):560-573. (CHEN S M, LIU S, WAN J H, et al. Coordinate multi-core DSP YHFT-QMBase:architecture and implementation[J]. SCIENCE CHINA (Informationis), 2015, 45(4):560-573.) [4] 王向前,洪一,王昊,等.魂芯DSP的编译器设计与优化[J].电子学报,2015,43(8):1656-1661. (WANG X Q, HONG Y, WANG H, et al. Compiler design and optimization for BWDSP[J]. Acta Electronica Sinica, 2015, 43(8):1656-1661.) [5] CHEN L, JIANG P, AGRAWAL G. Exploiting recent SIMD architectural advances for irregular applications[C]//Proceedings of the 2016 IEEE/ACM International Symposium on Code Generation and Optimization. Piscataway, NJ:IEEE, 2016:47-58. [6] LEIßA R, HAFFNER I, HACK S. Sierra:a SIMD extension for C++[C]//WPMVP' 14:Proceedings of the 2014 Workshop on Programming Models for SIMD/Vector Processing. New York:ACM, 2014:17-24. [7] HUO X, REN B, AGRAWAL G. A programming system for Xeon Phis with runtime SIMD parallelization[C]//ICS' 14:Proceedings of the 28th ACM International Conference on Supercomputing. New York:ACM, 2014:283-292. [8] EVANS G C, ABRAHAM S, KUHN B, et al. Vector seeker:a tool for finding vector potential[C]//WPMVP' 14:Proceedings of the 2014 Workshop on Programming Models for SIMD/Vector Processing. New York:ACM, 2014:41-48. [9] KENNEDY K, ALLEN J R. Optimizing Compilers for Modern Architectures:A Dependence-based Approach[M]. San Francisco, CA:Morgan Kaufmann, 2002. [10] NUZMAN D, ZAKS A. Outer-loop vectorization:revisited for short SIMD architectures[C]//PACT' 08:Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques. Piscataway, NJ:IEEE, 2008:2-11. [11] TRIFUNOVIC K, NUZMAN D, COHEN A, et al. Polyhedral-model guided loop-nest auto-vectorization[C]//PACT' 09:Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques. Piscataway, NJ:IEEE, 2009:327-337. [12] KONG M, VERAS R, STOCK K, et al. When polyhedral transformations meet SIMD code generation[C]//PLDI' 13:Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design & Implementation. New York:ACM, 2013:127-138. [13] LARSEN S, AMARASINGHE S. Exploiting superword level parallelism with multimedia instruction sets[J]. ACM Sigplan Notices, 2000, 35(5):145-156. [14] WANG Y, WANG D, CHEN S, et al. Iteration interleaving-based SIMD lane partition[J]. ACM Transactions on Architecture & Code Optimization, 2016, 12(4):Article No. 58. |