| 1 | ZHANG Y Y, LI J X, ZHANG Y M, et al. FreeLauncher: lossless failure recovery of parameter servers with ultralight replication[C]// Proceedings of the IEEE 41st International Conference on Distributed Computing Systems. Piscataway: IEEE, 2021: 472-482.  10.1109/icdcs51616.2021.00052 | 
																													
																						| 2 | ZHANG Y Y, LI J X, SUN C G, et al. HotML: a DSM‑based machine learning system for social networks[J]. Journal of Computational Science, 2018, 26: 478-487.  10.1016/j.jocs.2017.09.006 | 
																													
																						| 3 | CARBONE P, KATSIFODIMOS A, EWEN S, et al. Apache Flink: stream and batch processing in a single engine[J]. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2015, 38(4): 28-38. | 
																													
																						| 4 | CARBONE P, EWEN S, FÓRA G, et al. State management in Apache Flink: consistent stateful distributed stream processing[J]. Proceedings of the VLDB Endowment, 2017, 10(12): 1718-1729.  10.14778/3137765.3137777 | 
																													
																						| 5 | GARCÍA‑GIL D, RAMÍREZ‑GALLEGO S, GARCÍA S, et al. A comparison on scalability for batch big data processing on Apache Spark and Apache Flink[J]. Big Data Analytics, 2017, 2: No.1.  10.1186/s41044-016-0020-2 | 
																													
																						| 6 | MENG X R, BRADLEY J, YAVUZ B, et al. MLlib: machine learning in Apache Spark[J]. Journal of Machine Learning Research, 2016, 17: 1-7. | 
																													
																						| 7 | ZAHARIA M, DAS T, LI H Y, et al. Discretized streams: fault‑tolerant streaming computation at scale[C]// Proceedings of the 24th ACM Symposium on Operating Systems Principles. New York: ACM, 2013: 423-438.  10.1145/2517349.2522737 | 
																													
																						| 8 | ZAHARIA M, CHOWDHURY M, DAS T, et al. Resilient distributed datasets: a fault‑tolerant abstraction for in‑memory cluster computing[C]// Proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation. Berkeley: USENIX Association, 2012: 15-28. | 
																													
																						| 9 | ZAHARIA M, DAS T, LI H Y, et al. Discretized streams: an efficient and fault‑tolerant model for stream processing on large clusters[C]// Proceedings of the 4th USENIX Workshop on Hot Topics in Cloud Computing. Berkeley: USENIX Association, 2012: No.10.  10.21236/ada575859 | 
																													
| 10 | TOSHNIWAL A, TANEJA S, SHUKLA A, et al. Storm@Twitter[C]// Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. New York: ACM, 2014: 147-156.  10.1145/2588555.2595641 | 
																													
																						| 11 | IQBAL M H, SOOMRO T R. Big data analysis: Apache Storm perspective[J]. International Journal of Computer Trends and Technology, 2015, 19(1): 9-14.  10.14445/22312803/ijctt-v19p103 | 
																													
																						| 12 | NOGHABI S A, PARAMASIVAM K, PAN Y, et al. Samza: stateful scalable stream processing at LinkedIn[J]. Proceedings of the VLDB Endowment, 2017, 10(12): 1634-1645.  10.14778/3137765.3137770 | 
																													
| 13 | CHANDY K M, LAMPORT L. Distributed snapshots: determining global states of distributed systems[J]. ACM Transactions on Computer Systems, 1985, 3(1): 63-75.  10.1145/214451.214456 | 
																													
| 14 | CARBONE P, FÓRA G, EWEN S, et al. Lightweight asynchronous snapshots for distributed dataflows[EB/OL]. (2015-06-29) [2021-12-15]. | 
																													
																						| 15 | 段泽源. 大数据流式处理系统负载均衡与容错机制的研究[D]. 北京:华北电力大学, 2017: 28-30. | 
																													
|  | DUAN Z Y. Research on load balancing and fault tolerant mechanism of big data stream processing system[D]. Beijing: North China Electric Power University, 2017: 28-30. | 
																													
																						| 16 | 孙大为,张广艳,郑纬民. 大数据流式计算:关键技术及系统实例[J]. 软件学报, 2014, 25(4):839-862.  10.13328/j.cnki.jos.004558 | 
																													
																						|  | SUN D W, ZHANG G Y, ZHENG W M. Big data stream computing: technologies and instances[J]. Journal of Software, 2014, 25(4): 839-862.  10.13328/j.cnki.jos.004558 | 
																													
																						| 17 | LI H L, WU J, JIANG Z, et al. Integrated recovery and task allocation for stream processing[C]// Proceedings of the IEEE 36th International Performance Computing and Communications Conference. Piscataway: IEEE, 2017: 1-8.  10.1109/pccc.2017.8280443 | 
																													
																						| 18 | LI H L, WU J, JIANG Z, et al. Task allocation for stream processing with recovery latency guarantee[C]// Proceedings of the 2017 IEEE International Conference on Cluster Computing. Piscataway: IEEE, 2017: 379-383.  10.1109/cluster.2017.10 | 
																													
																						| 19 | AKIDAU T, BALIKOV A, BEKIROĞLU K, et al. MillWheel: fault‑tolerant stream processing at Internet scale[J]. Proceedings of the VLDB Endowment, 2013, 6(11): 1033-1044.  10.14778/2536222.2536229 | 
																													
| 20 | GUO J, AGRAWAL G. Smart Streaming: a high‑throughput fault‑tolerant online processing system[C]// Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops. Piscataway: IEEE, 2020: 396-405.  10.1109/ipdpsw50202.2020.00075 | 
																													
																						| 21 | LIN C F, ZHAN J J, CHEN H H, et al. Ares: a high performance and fault‑tolerant distributed stream processing system[C]// Proceedings of the IEEE 26th International Conference on Network Protocols. Piscataway: IEEE, 2018: 176-186.  10.1109/icnp.2018.00027 | 
																													
																						| 22 | VENKATARAMAN S, PANDA A, OUSTERHOUT K, et al. Drizzle: fast and adaptable stream processing at scale[C]// Proceedings of the 26th Symposium on Operating Systems Principles. New York: ACM, 2017: 374-389.  10.1145/3132747.3132750 | 
																													
																						| 23 | LIU P C, XU H L, SILVA D DA, et al. FP4S: fragment‑based parallel state recovery for stateful stream applications[C]// Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium. Piscataway: IEEE, 2020: 1102-1111.  10.1109/ipdps47924.2020.00116 | 
																													
																						| 24 | XU H L, LIU P C, CRUZ‑DIAZ S, et al. SR3: customizable recovery for stateful stream processing systems[C]// Proceedings of the 21st International Middleware Conference. New York: ACM, 2020: 251-264.  10.1145/3423211.3425681 | 
																													
																						| 25 | van RENESSE R, SCHNEIDER F B. Chain replication for supporting high throughput and availability[C]// Proceedings of the 6th Symposium on Operating Systems Design and Implementation. Berkeley: USENIX Association, 2004: 91-104. | 
																													
| 26 | TERRACE J, FREEDMAN M J. Object storage on CRAQ: high‑throughput chain replication for read‑mostly workloads[C]// Proceedings of the 2009 USENIX Annual Technical Conference. Berkeley: USENIX Association, 2009: No.11.  10.1109/msp.2009.28 |