Machine learning based online mapping approach for heterogeneous multi-core processor system

Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (6): 1753-1759. DOI: 10.11772/j.issn.1001-9081.2018112311

AN Xin^1,2, ZHANG Ying^1,2, KANG An^1,2, CHEN Tian^1,2, LI Jianhua^1,2

1. School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230601, China;
2. Anhui Provincial Key Laboratory of Affective Computing and Advanced Intelligent Machine (Hefei University of Technology), Hefei, Anhui 230601, China

Received: 2018-11-21 Revised: 2019-01-07 Online: 2019-06-17 Published: 2019-06-10
Corresponding author: AN Xin
About the authors: AN Xin (born 1987), male, from Weifang, Shandong; associate professor, Ph.D., CCF member; research interests: embedded system design and verification, machine learning. ZHANG Ying (born 1994), female, from Bozhou, Anhui; master's student; research interests: embedded systems, systems-on-chip, machine learning. KANG An (born 1995), male, from Xingtai, Hebei; master's student; research interests: embedded software and applications, machine learning. CHEN Tian (born 1974), female, from Hefei, Anhui; associate professor, Ph.D., CCF senior member; research interests: low-power testing of very-large-scale integrated circuits and systems-on-chip, design for testability, wearable computing. LI Jianhua (born 1985), male, from Feixi, Anhui; associate research fellow, Ph.D.; research interests: computer architecture, non-volatile memory, systems-on-chip.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61502140, 61474035).

Abstract

Abstract: Heterogeneous Multi-core Processors (HMPs) platforms have become the mainstream solution for modern embedded systems, and online mapping or scheduling plays a vital role in fully exploiting their high performance and low power consumption. To address the dynamic mapping of application tasks on HMPs, an online mapping and scheduling approach based on a machine learning prediction model was proposed. On the one hand, a machine learning model was constructed to predict and evaluate the performance of different mapping schemes quickly and efficiently, providing support for online scheduling. On the other hand, the machine learning model was integrated into a genetic algorithm to efficiently find the (near-)optimal resource allocation scheme. Finally, a Motion Joint Photographic Experts Group (M-JPEG) decoder was used to verify the effectiveness of the proposed approach. The experimental results show that, compared with the common Round Robin Scheduler (RRS) and sampling scheduling approaches, the proposed approach reduces the average execution time by about 28% and 19% respectively.

Key words: Heterogeneous Multi-core Processors (HMPs), machine learning, dynamic resource allocation, performance prediction, mapping and scheduling
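
The idea of using a performance predictor as the fitness function of a genetic algorithm can be illustrated with a minimal sketch. This is not the authors' implementation: `predict_time` is only a hand-written stand-in for a trained prediction model, and the platform (`CORE_SPEED`), the workload (`TASK_WORK`), and all parameter values are hypothetical choices made for illustration.

```python
import random

# Hypothetical platform: cores 0-1 are "big" (fast), cores 2-3 are "little".
CORE_SPEED = [2.0, 2.0, 1.0, 1.0]
# Hypothetical amount of work per task (arbitrary units).
TASK_WORK = [2.0, 2.0, 8.0, 6.0, 4.0, 4.0]


def predict_time(mapping):
    """Stand-in for a trained performance predictor (an assumption).

    mapping[i] is the core assigned to task i. The estimate is simply
    the finishing time of the most heavily loaded core.
    """
    load = [0.0] * len(CORE_SPEED)
    for task, core in enumerate(mapping):
        load[core] += TASK_WORK[task] / CORE_SPEED[core]
    return max(load)


def round_robin_mapping():
    """Baseline: assign tasks to cores in cyclic order."""
    return [t % len(CORE_SPEED) for t in range(len(TASK_WORK))]


def genetic_search(pop_size=20, generations=40, mutation_rate=0.1, seed=0):
    """Search for a low-cost mapping, scoring candidates with predict_time."""
    rng = random.Random(seed)
    n_tasks, n_cores = len(TASK_WORK), len(CORE_SPEED)
    # Start from the round-robin baseline plus random mappings.
    population = [round_robin_mapping()]
    while len(population) < pop_size:
        population.append([rng.randrange(n_cores) for _ in range(n_tasks)])
    for _ in range(generations):
        # Keep the better half (lower predicted time is better).
        population.sort(key=predict_time)
        survivors = population[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_tasks)      # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n_tasks):             # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_cores)
            children.append(child)
        population = survivors + children
    return min(population, key=predict_time)


best = genetic_search()
baseline = round_robin_mapping()
print("GA mapping:", best, "predicted time:", predict_time(best))
print("round-robin mapping:", baseline, "predicted time:", predict_time(baseline))
```

Because the baseline is placed in the initial population and the better half always survives, the mapping returned by this sketch is never predicted to be slower than round-robin. In an online setting, the point of a fast predictor is that each candidate can be scored without actually running it on the hardware.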

CLC Number: TP302.7

AN Xin, ZHANG Ying, KANG An, CHEN Tian, LI Jianhua. Machine learning based online mapping approach for heterogeneous multi-core processor system[J]. Journal of Computer Applications, 2019, 39(6): 1753-1759.

