Online task scheduling algorithm for big data analytics based on cumulative running work

doi:10.11772/j.issn.1001-9081.2019010073

Journal of Computer Applications, 2019, Vol. 39, Issue (8): 2431-2437. DOI: 10.11772/j.issn.1001-9081.2019010073

• Frontier, interdisciplinary and comprehensive applications •

LI Yefei¹, XU Chao², XU Daoqiang³, ZOU Yunfeng², ZHANG Xiaoda¹, QIAN Zhuzhong¹

1. Department of Computer Science and Technology, Nanjing University, Nanjing Jiangsu 210023, China;
2. Electric Power Research Institute, State Grid Jiangsu Electric Power Company Limited, Nanjing Jiangsu 210000, China;
3. State Grid Jiangsu Electric Power Company Limited, Nanjing Jiangsu 210000, China

Received: 2019-01-11 Revised: 2019-04-12 Online: 2019-04-17 Published: 2019-08-10
Corresponding author: QIAN Zhuzhong
About the authors: LI Yefei (born 1979), male, from Chizhou, Anhui, engineer, M.S.; main research interest: big data processing. XU Chao (born 1989), male, from Laiwu, Shandong, engineer, M.S.; main research interests: big data processing, information security. XU Daoqiang (born 1978), male, from Dandong, Liaoning, senior engineer; main research interests: network virtualization, electric power informatization. ZOU Yunfeng (born 1977), male, from Fengcheng, Jiangxi, senior engineer, M.S.; main research interests: electric power information technology, big data. ZHANG Xiaoda (born 1991), male, from Xingtai, Hebei, Ph.D. candidate; main research interest: distributed big data processing. QIAN Zhuzhong (born 1980), male, from Changshu, Jiangsu, associate professor, Ph.D.; main research interests: distributed systems, data center networks.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61472181) and the Natural Science Foundation of Jiangsu Province (BK20151392).

Abstract

Abstract: To execute jobs efficiently without prior knowledge in big data analytics systems such as Hadoop and Spark, a Cumulative Running Work (CRW) based task scheduler, CRWScheduler, was proposed. The scheduler moves a running job from a low-weight queue to a high-weight one based on its CRW. When allocating resources to a job, it considers both the queue the job is in and the job's instantaneous resource usage, which significantly improves overall system performance without requiring prior knowledge of jobs. A prototype of CRWScheduler was implemented on Apache Hadoop YARN. Experimental results on a 28-node benchmark cluster show that, compared with the YARN fair scheduler, CRWScheduler reduces the average Job Flow Time (JFT) by 21% and the 95th percentile JFT by up to 35%. Further improvements can be obtained when CRWScheduler cooperates with task-level schedulers.

Key words: data analytics system, Job Flow Time (JFT), fairness, starvation-free
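
The queue-switching and allocation idea described in the abstract can be illustrated with a small Python sketch. This is not the CRWScheduler implementation: the abstract gives no formulas, so the two queue weights, the CRW threshold, the definition of CRW as containers held multiplied by elapsed time, and the usage-to-weight allocation rule are all assumptions made here for illustration, and the names `Job`, `tick`, and `pick_next` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical parameters: the abstract does not state weights or a threshold.
LOW_WEIGHT = 1.0
HIGH_WEIGHT = 4.0
CRW_THRESHOLD = 100.0  # assumed unit: container-seconds


@dataclass
class Job:
    name: str
    running: int = 0    # containers held right now (instantaneous usage)
    crw: float = 0.0    # cumulative running work observed so far
    queue: str = "low"  # assumption: every job starts in the low-weight queue

    def weight(self) -> float:
        return HIGH_WEIGHT if self.queue == "high" else LOW_WEIGHT


def tick(jobs, dt):
    """Advance time by dt seconds: accumulate work, then re-queue by CRW."""
    for job in jobs:
        job.crw += job.running * dt
        # Direction of the move follows the abstract: low-weight to high-weight.
        if job.queue == "low" and job.crw >= CRW_THRESHOLD:
            job.queue = "high"


def pick_next(jobs):
    """Pick the job that receives the next free container.

    Assumed rule: the smallest usage-to-weight ratio wins, so both the queue
    (through its weight) and the instantaneous usage influence the choice.
    Ties are broken by job name to keep the result deterministic.
    """
    return min(jobs, key=lambda j: (j.running / j.weight(), j.name))


if __name__ == "__main__":
    a = Job("A", running=8)
    b = Job("B", running=4)
    tick([a, b], dt=15.0)          # A accumulates 120, B accumulates 60
    print(a.queue, b.queue)        # high low
    print(pick_next([a, b]).name)  # A (ratio 2.0 versus 4.0 for B)
```

Note that no job size or runtime estimate appears anywhere in the sketch: the only inputs are quantities the scheduler can observe while jobs run, which is what scheduling without prior knowledge requires.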

CLC Number:

TP316.4

LI Yefei, XU Chao, XU Daoqiang, ZOU Yunfeng, ZHANG Xiaoda, QIAN Zhuzhong. Online task scheduling algorithm for big data analytics based on cumulative running work[J]. Journal of Computer Applications, 2019, 39(8): 2431-2437.

