Efficient failure recovery method for stream data processing system

doi:10.11772/j.issn.1001-9081.2021122108

Journal of Computer Applications, 2022, Vol. 42, Issue 11: 3337-3345. DOI: 10.11772/j.issn.1001-9081.2021122108

CCF Bigdata 2021

Yang LIU^1,2,3, Yangyang ZHANG^1,2, Haoyi ZHOU^1,2,4

^1. Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing 100191, China
^2. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
^3. ShenYuan Honors College, Beihang University, Beijing 100191, China
^4. College of Software, Beihang University, Beijing 100191, China

Received: 2021-12-15 Revised: 2022-02-27 Accepted: 2022-03-04 Online: 2022-04-18 Published: 2022-11-10
Contact: Haoyi ZHOU
About the authors: LIU Yang, born in 1999, Ph. D. candidate. His research interests include distributed systems and graph processing systems.
ZHANG Yangyang, born in 1991, Ph. D. candidate. His research interests include distributed systems, machine learning and graph processing.
ZHOU Haoyi, born in 1991, Ph. D., lecturer. His research interests include big data systems and machine learning.
Supported by:
National Natural Science Foundation of China (U20B2053); Open Project of State Key Laboratory of Software Development Environment (SKLSDE-2020ZX-12)

Author details: LIU Yang (1999-), male, born in Datong, Shanxi, Ph. D. candidate, CCF member. His research interests include distributed systems and graph computing systems.
ZHANG Yangyang (1991-), male, born in Baoding, Hebei, Ph. D. candidate, CCF member. His research interests include distributed systems, machine learning and graph computing.
ZHOU Haoyi (1991-), male, born in Deyang, Sichuan, lecturer, Ph. D., CCF member. His research interests include big data systems and machine learning. haoyi@buaa.edu.cn

Abstract

To address the problem that the stream data processing system Flink cannot handle a single point of failure efficiently, Flink+, a new fault-tolerant system based on incremental state and backup, was proposed. Firstly, backup operators and data paths were established in advance. Secondly, the output data in the dataflow graph was cached, spilling to disk if necessary. Thirdly, task state was synchronized during system snapshots. Finally, backup tasks and cached data were used to resume computation after a system failure. In the system experiments, Flink+ did not significantly increase fault-tolerance overhead during failure-free operation; when handling a single point of failure in single-machine and distributed environments, compared with Flink, the proposed system reduced the failure recovery time by 96.98% at a task parallelism of 8 on a single machine and by 88.75% at a task parallelism of 16 in the distributed setting. Experimental results show that combining incremental state with backup can effectively reduce the recovery time from a single point of failure in a stream system and enhance the robustness of the system.
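
The four steps in the abstract can be illustrated with a toy model. The sketch below is not the Flink+ implementation and uses no Flink API; the names `CountTask`, `UpstreamBuffer`, `snapshot` and `recover` are hypothetical. It assumes a single stateful operator, copies the full state at each snapshot rather than an increment, and keeps the upstream cache in memory only (no spill to disk).

```python
from dataclasses import dataclass, field


@dataclass
class CountTask:
    """Toy stateful operator: counts how often each key was seen."""
    state: dict = field(default_factory=dict)

    def process(self, key):
        self.state[key] = self.state.get(key, 0) + 1


@dataclass
class UpstreamBuffer:
    """Caches the records sent downstream since the last snapshot."""
    records: list = field(default_factory=list)

    def send(self, key, task):
        self.records.append(key)  # cache the output before delivering it
        task.process(key)

    def truncate(self):
        self.records.clear()  # records covered by a snapshot can be dropped


def snapshot(primary, backup, buffer):
    """Synchronize the backup's state with the primary, then trim the cache.

    Simplification: the whole state is copied, not an incremental delta.
    """
    backup.state = dict(primary.state)
    buffer.truncate()


def recover(backup, buffer):
    """Promote the pre-built backup: replay cached records on the synced state."""
    for key in buffer.records:
        backup.process(key)
    return backup


# Demo: the backup task exists before any failure occurs.
primary, backup, buf = CountTask(), CountTask(), UpstreamBuffer()
for k in ["a", "b", "a"]:
    buf.send(k, primary)
snapshot(primary, backup, buf)
for k in ["b", "c"]:
    buf.send(k, primary)

state_before_failure = dict(primary.state)
restored = recover(backup, buf)  # the primary is assumed to have failed here
print(restored.state == state_before_failure)
```

In this model recovery only replays the records cached since the last snapshot, so its cost depends on the snapshot interval rather than on restarting the whole job.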

Key words: stream data processing system, failure recovery, distributed checkpoint, state backup, Apache Flink


CLC Number:

TP311.5

Yang LIU, Yangyang ZHANG, Haoyi ZHOU. Efficient failure recovery method for stream data processing system[J]. Journal of Computer Applications, 2022, 42(11): 3337-3345.


Figures/Tables 9

Fig. 1 System architecture of Flink+

Fig. 2 Backup-based snapshot

Fig. 3 Upstream backup

Fig. 4 Fault recovery

Fig. 5 Flow of Flink system after single point of failure

Fig. 6 Flow of Flink+ system after single point of failure

Tab. 1 Recovery time comparison of Flink and Flink+ systems in standalone mode

Task parallelism	Flink recovery time/ms	Flink+ recovery time/ms	Recovery time reduction/%
1	2 045.0	440.0	78.48
2	1 539.8	150.0	90.26
4	1 393.6	139.5	89.99
8	1 457.6	44.0	96.98
16	1 663.6	56.0	96.63

Tab. 2 Recovery time comparison of Flink and Flink+ systems in distributed environment

Task parallelism	Flink recovery time/ms	Flink+ recovery time/ms	Recovery time reduction/%
4	1 406.0	202.0	85.63
8	1 325.2	157.7	88.10
16	1 319.5	148.5	88.75
32	1 417.4	199.7	85.91
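
The reduction column in Tab. 1 and Tab. 2 is consistent with the relative difference (Flink time minus Flink+ time, divided by Flink time). The short sketch below recomputes it from the tabulated recovery times; the function and variable names are illustrative, and the first Flink entry of Tab. 1 is read as 2 045.0 ms, the value that matches the listed 78.48%.

```python
def reduction_percent(flink_ms: float, flink_plus_ms: float) -> float:
    """Relative reduction in recovery time, in percent."""
    return (flink_ms - flink_plus_ms) / flink_ms * 100.0


# task parallelism -> (Flink recovery time/ms, Flink+ recovery time/ms)
STANDALONE = {  # Tab. 1
    1: (2045.0, 440.0),
    2: (1539.8, 150.0),
    4: (1393.6, 139.5),
    8: (1457.6, 44.0),
    16: (1663.6, 56.0),
}
DISTRIBUTED = {  # Tab. 2
    4: (1406.0, 202.0),
    8: (1325.2, 157.7),
    16: (1319.5, 148.5),
    32: (1417.4, 199.7),
}

for name, table in (("standalone", STANDALONE), ("distributed", DISTRIBUTED)):
    for parallelism, (flink_ms, flink_plus_ms) in table.items():
        pct = reduction_percent(flink_ms, flink_plus_ms)
        print(f"{name:12s} parallelism={parallelism:<3d} reduction={pct:6.2f}%")
```

The same ratio can also be read as a speedup: at a task parallelism of 8 in standalone mode, 1 457.6 ms against 44.0 ms is roughly a 33-fold shorter recovery.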

Fig. 7 Comparison of CPU usage between Flink and Flink+

