[1] 张朋. 多核平台下串行程序的并行化改造[D]. 成都:电子科技大学,2015:1-3.(ZHANG P. Parallel transformation of serialprogram under multi-core platform[D]. Chengdu:University of Electronic Science and Technology of China,2015:1-3.) [2] 张潇, 支天. 面向多核处理器的机器学习推理框架[J]. 计算机研究与发展,2019,56(9):1977-1987. (ZHANG X,ZHI T. Machine learning inference framework on multi-core processor[J]. Journal of Computer Research and Development,2019,56(9):1977-1987.) [3] HIRATA H, NUNOME A. Decoupling computation and result write-back for thread-level parallelization[J]. International Journal of Software Innovation,2020,8(3):19-34. [4] 马巧梅. 基于程序特征的线程划分方法的研究[J]. 计算机科学与探索,2018,12(6):872-885.(MA Q M. Research of thread partitioning approach based on program characteristics[J]. Journal Frontiers of Computer Science and Technology,2018,12(6):872-885.) [5] 刘聪. 面向多核体系结构的并行优化关键技术研究[D]. 长沙:国防科技大学,2014:1-7. (LIU C. Research on the key techniques of parallelization and optimization for multi-core architecture[D]. Changsha:National University of Defense Technology,2014:1-7.) [6] VIJAYKUMAR T N,SOHI G S. Task selection for a multiscalar processor[C]//Proceedings of the 31st Annual ACM/IEEE International Symposium on Microarchitecture. Piscataway:IEEE, 1998:81-92. [7] HAMMOND L,HUBBERT B A,SIU M,et al. The Stanford Hydra CMP[J]. IEEE Micro,2000,20(2):71-84. [8] HAMMOND L,CARLSTROM B D,WONG V,et al. Transactional coherence and consistency:simplifying parallel hardware and software[J]. IEEE Micro,2004,24(6):92-103. [9] STEFFAN J G,COLOHAN C,ZHAI A,et al. The STAMPede approach to thread-level speculation[J]. ACM Transactions on Computer Systems,2005,23(3):253-300. [10] 刘斌, 赵银亮, 韩博, 等. 基于性能预测的推测多线程循环选择方法[J]. 电子与信息学报,2014,36(11):2768-2774.(LIU B, ZHAO Y L,HAN B,et al. A loop selection approach based on performance prediction for speculative multithreading[J]. Journal of Electronics and Information Technology,2014,36(11):2768-2774.) [11] 李美蓉, 赵银亮. 一种基于推测代价评估的推测多线程并行粒度调节方法[J]. 计算机应用与软件,2019,36(4):29-36,90. (LI M R,ZHAO Y L. A parallel granularity tuning approach for speculative multithreading based on speculative cost evaluation[J]. Computer Applications and Software,2019,36(4):29-36,90.) [12] SALAMANCA J,AMARAL J N,ARAUJO G. Using hardwaretransactional-memory support to implement thread-level speculation[J]. IEEE Transactions on Parallel and Distributed Systems,2018,29(2):466-480. [13] FALK H,ALTMEYER S,HELLINCKX P,et al. TACLeBench:a benchmark collection to support worst-case execution time research[C]//Proceedings of the 16th International Workshop on Worst-Case Execution Time Analysis. Wadern:Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik,2016:No. 2. [14] CARVALHO H,NELISSEN G,ZAYKOV P. mcQEMU:timeaccurate simulation of multi-core platforms using QEMU[C]//Proceedings of the 23rd Euromicro Conference on Digital System Design. Piscataway:IEEE,2020:81-88. [15] POORHOSSEINI M,NEBEL W,GRÜTTNER K. A compiler comparison in the RISC-V ecosystem[C]//Proceedings of the 2020 International Conference on Omni-layer Intelligent Systems. Piscataway:IEEE,2020:1-6. [16] SEGARRA J,CORTADELLA J,TEJERO R G,et al. Automatic safe data reuse detection for the WCET analysis of systems with data caches[J]. IEEE Access,2020,8:192379-192392. [17] 李颖颖, 庞建民, 李雁冰, 等. 一种面向众核处理器的嵌套循环多维并行识别方法[J]. 计算机应用研究,2018,35(11):3311-3314.(LI Y Y,PANG J M,LI Y B,et al. Multi-dimensional parallelism recognition method of nested loop for many-core processors[J]. Application Research of Computers,2018,35(11):3311-3314.) [18] 王耀彬, 安虹, 郭锐, 等. 用线程级推测技术在多核体系结构上并行化科学计算应用[J]. 小型微型计算机统,2010,31(2):264-270.(WANG Y B,AN H,GUO R,et al. Exposing threadlevel speculation parallelism in scientific applications on multicore architecture[J]. Journal of Chinese Computer Systems,2010,31(2):264-270.) [19] SCHOEBERL M, NIELSEN C. A stack cache for real-time systems[C]//Proceedings of the IEEE 19th International Symposium on Real-Time Distributed Computing. Piscataway:IEEE,2016:150-157. |