[1]BENOIT A, MELHEM R, RENAUD-GOUD P, et al. Power-aware Manhattan routing on chip multiprocessors [C]// Proceedings of 2012 IEEE 26th International Parallel and Distributed Processing Symposium. Piscataway: IEEE, 2012:189-200.[2]JIN H Q, JESPEREN D, MEHROTRA P, et al. High performance computing using MPI and OpenMP on multi-core parallel systems [J]. Parallel Computing, 2011, 37(9):562-575.[3]BONDHUGULA U K R. Effective automatic parallelization and locality optimization using the polyhedral model [D]. Ohio: The Ohio State University, 2008.[4]AKHTER S, ROBERTS J. Multi-core programming: increasing performance through software multi-threading [M]. Hillsboro: Intel Corporation, 2006: 13-27.[5]CYTRON R. Doacross: beyond vectorization for multiprocessors[C]// Proceedings of the 1986 International Conference on Parallel Processing. Piscataway: IEEE, 1986: 836-844.[6]CHEN D-K, YEW P-C. An empirical study on DOACROSS loops [C]// Proceedings of Supercomputing. New York: ACM, 1991:620-632.[7]HURSON A R, LIM J T, KAVI K M, et al. Parallelization of DOALL and DOACROSS loops — a survey [J]. Advances in Computers, 1997, 45:53-103.[8]LIN Y-T, WANG S-C, SHIH W-L, et al. Enable OpenCL compiler with Open64 infrastructures [C]// 2011 IEEE 13th International Conference on High Performance Computing and Communications. Piscataway: IEEE, 2011:863-868.[9]富弘毅, 丁滟, 宋伟,等. 一种利用并行复算实现的OpenMP容错机制[J].软件学报,2012, 23(2): 411-427.[10]THOMAN P, JORDAN H, PELLEGRINI S, et al. Automatic OpenMP loop scheduling: a combined compiler and runtime approach [C]// IWOMP12: Proceedings of 8th International Conference on OpenMP in a Heterogeneous World. Berlin: Springer-Verlag, 2012:88-101.[11]ALLEN R, KENNEDY K. Optimizing compilers for modern architectures: a dependence-based approach[M]. San Francisco: Morgan Kaufmann Publisher, 2001: 63-68.[12]TAFLOVE A. Computational electrodynamics [M]. London: Artech House Publishers, 1995.[13]马琳.反馈指导的流水计算性能调优[D].北京:中国科学院计算技术研究所,2005. |