[1] SHEN Z Y, HU Z A, LIAO X K, et al. Parallel compilation methods[M]. Beijing: National Defense Industry Press, 2000. (in Chinese)
[2] LIAO C H. A compile-time OpenMP cost model[D]. Houston: University of Houston, 2007.
[3] TRIFUNOVIC K, NUZMAN D, COHEN A, et al. Polyhedral-model guided loop-nest auto-vectorization[C]// Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques. Washington, DC: IEEE Computer Society, 2009: 327-337.
[4] BONDHUGULA U, GUNLUK O, DASH S, et al. A model for fusion and code motion in an automatic parallelizing compiler[C]// Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques. Washington, DC: IEEE Computer Society, 2010: 343-352.
[5] SHARAPOV I, KROEGER R, DELAMATER G, et al. A case study in top-down performance estimation for a large-scale parallel application[C]// Proceedings of the 11th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. New York: ACM, 2006: 81-89.
[6] CONG J, YUAN B. Energy-efficient scheduling on heterogeneous multi-core architecture[C]// Proceedings of the 2012 ACM/IEEE International Symposium on Low Power Electronics and Design. New York: ACM, 2012: 345-350.
[7] CHEN T, RAGHAVAN R, DALE J N, et al. Cell Broadband Engine architecture and its first implementation - a performance view[J]. IBM Journal of Research and Development, 2007, 51(5): 559-572.
[8] SKOVHEDE K, LARSEN M N, VINTER B. Extending distributed shared memory for the Cell Broadband Engine to a channel model[C]// Proceedings of the 10th International Conference on Applied Parallel and Scientific Computing. Berlin: Springer-Verlag, 2012, 7133: 108-118.
[9] KAPASI U J, RIXNER S, DALLY W J, et al. Programmable stream processors[J]. Computer, 2003, 36(8): 54-62.
[10] KINDRATENKO V V. Novel computing architecture[J]. Computing in Science & Engineering, 2009, 11(3): 54-57.
[11] BLAGOJEVIC F, FENG X Z, CAMERON K W, et al. Modeling multigrain parallelism on heterogeneous multi-core processors: a case study of the Cell BE[C]// Proceedings of the 2008 International Conference on High-Performance Embedded Architectures and Computers. Berlin: Springer, 2008: 38-52.
[12] SHAN H Z, BLAGOJEVIC F, MIN S J, et al. A programming model performance study using the NAS parallel benchmarks[J]. Scientific Programming, 2010, 18(3/4): 153-167.