Machine learning based online mapping approach for heterogeneous multi-core processor system
AN Xin1,2, ZHANG Ying1,2, KANG An1,2, CHEN Tian1,2, LI Jianhua1,2
1. School of Computer and Information, Hefei University of Technology, Hefei Anhui 230601, China; 2. Anhui Provincial Key Laboratory of Affective Computing and Advanced Intelligent Machine(Hefei University of Technology), Hefei Anhui 230601, China
Abstract: Heterogeneous Multi-core Processor (HMP) platforms have become the mainstream solution for modern embedded system design, and online mapping or scheduling plays a vital role in fully exploiting their high performance and low power consumption. To address the dynamic mapping of application tasks on HMPs, a mapping and scheduling approach based on a machine learning prediction model was proposed. First, a machine learning model was constructed to predict and evaluate the performance of different mapping strategies rapidly and efficiently, providing support for online scheduling. Second, the machine learning model was integrated with a genetic algorithm to search efficiently for the optimal resource allocation strategy. Finally, a Motion Joint Photographic Experts Group (M-JPEG) decoder was used to verify the effectiveness of the proposed approach. The experimental results show that, compared with the Round Robin Scheduler (RRS) and the sampling scheduling approach, the proposed online mapping/scheduling approach reduced the average execution time by about 19% and 28%, respectively.
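The combination of a fast performance predictor with a genetic search, as outlined in the abstract, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: `predict_time` is a hypothetical analytic stand-in for the trained machine learning model, the task costs and core speeds are made-up numbers, and the genetic operators (elitist selection, one-point crossover, per-gene mutation) are generic textbook choices.

```python
import random


def predict_time(mapping, task_cost, core_speed):
    """Hypothetical stand-in for the learned model: estimated makespan.

    mapping[i] is the core index assigned to task i. The estimate is the
    load of the busiest core, with each task's cost scaled by core speed.
    """
    load = [0.0] * len(core_speed)
    for task, core in enumerate(mapping):
        load[core] += task_cost[task] / core_speed[core]
    return max(load)


def round_robin(n_tasks, n_cores):
    """Baseline mapping: tasks are assigned to cores in cyclic order."""
    return [i % n_cores for i in range(n_tasks)]


def ga_search(task_cost, core_speed, pop_size=30, generations=40,
              mutation_rate=0.1, seed=0):
    """Genetic search over mappings, scored only by the predictor."""
    rng = random.Random(seed)
    n_tasks, n_cores = len(task_cost), len(core_speed)

    def fitness(m):
        return predict_time(m, task_cost, core_speed)

    # Seed the population with the round-robin mapping (an assumption of
    # this sketch) so the result is never worse than that baseline.
    pop = [round_robin(n_tasks, n_cores)]
    pop += [[rng.randrange(n_cores) for _ in range(n_tasks)]
            for _ in range(pop_size - 1)]

    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 2]          # elitism: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_tasks)   # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(n_cores) if rng.random() < mutation_rate
                     else gene for gene in child]
            children.append(child)
        pop = elite + children

    best = min(pop, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    # Illustrative numbers only: six tasks, two fast and two slow cores.
    costs = [4.0, 2.0, 6.0, 3.0, 5.0, 1.0]
    speeds = [2.0, 2.0, 1.0, 1.0]
    mapping, predicted = ga_search(costs, speeds)
    print("mapping:", mapping, "predicted time:", predicted)
```

Because the search evaluates candidates only through the predictor, each fitness evaluation is a cheap function call instead of a run on the target platform, which is what makes this kind of search plausible in an online setting. The quality of the resulting mapping is then bounded by the accuracy of the prediction model.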