[1] DEAN J, GHEMAWAT S. MapReduce: simplified data processing on large clusters[J]. Communications of the ACM, 2008, 51(1): 107-113. [2] ZHENG Q. Improving MapReduce fault tolerance in the cloud [C]//Proceedings of the 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum. Piscataway: IEEE Press, 2010: 1-6. [3] DONG X. Hadoop internals: in-depth study of MapReduce [M]. Beijing: China Machine Press, 2013. (董西成. Hadoop 技术内幕: 深入解析 MapReduce 架构设计与实现原理[M]. 北京: 机械工业出版社, 2013.) [4] KO S Y, HOQUE I, CHO B, et al. On availability of intermediate data in cloud computations [C]//HotOS: Proceedings of the 12th Conference on Hot Topics in Operating Systems. Berkeley: USENIX Association, 2009: 6-6. [5] WANG G, BUTT A R, PANDEY P, et al. A simulation approach to evaluating design decisions in MapReduce setups [C]//MASCOTS 2009: Proceedings of the 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems. Piscataway: IEEE Press, 2009: 1-11. [6] DINU F, NG T E. Understanding the effects and implications of compute node related failures in Hadoop [C]//HPDC 2012: Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing. New York: ACM Press, 2012: 187-198. [7] ZHU H. Fault tolerance for MapReduce in the cloud environment [D]. Shanghai: Shanghai Jiao Tong University, 2012. (朱浩. 云环境下MapReduce容错技术的研究[D]. 上海:上海交通大学, 2012.) [8] HU P, DAI W. Enhancing fault tolerance based on Hadoop cluster [J]. International Journal of Database Theory and Application, 2014, 7(1): 37-48. [9] CHEN H, LUO W, LI M. Survey of primary/backup copy based real-time and fault-tolerant scheduling in distributed systems[J]. Application Research of Computers, 2012, 29(11):4017-4022. (陈晗鸣, 罗威, 李明辉. 分布式系统中基于主/副版本的实时容错调度综述[J]. 计算机应用研究, 2012, 29(11):4017-4022.) [10] ZHANG Z, LI Y. Analysis and study on under cloud computing multiple sets of fault tolerance strategy architecture of MapReduce [J]. Microelectronics and Computer, 2014, 31(1): 52-55. (张治斌, 李燕歌. 云计算下MapReduce多组容错机制架构的分析与研究[J]. 微电子学与计算机, 2014, 31(1): 52-55.) [11] DEN P, LI M, HE C. Research on namenode single point of fault solution[J]. Computer Engineering, 2012, 38(21): 40-44. (邓鹏, 李枚毅, 何诚. Namenode单点故障解决方案研究[J]. 计算机工程, 2012, 38(21): 40-44.) [12] LIAO F, WANG C, CHEN S. Fault-tolerant scheduling algorithm for cloud computing based on task backup[J]. Computer Engineering, 2012,38(24): 17-20. (廖福蓉, 王成良, 陈蜀宇. 基于任务备份的云计算容错调度算法[J]. 计算机工程, 2012, 38(24): 17-20.) |