[1]
GUPTA B, RAHIMI S, YANG Y. A novel roll-back mechanism for performance enhancement of asynchronous checkpointing and recovery[J]. Informatica, 2007, 3(11): 1-13.
[2]
ELNOZAHY E N, ALVISI L, WANG Y M, et al. A survey of rollback-recovery protocols in message-passing systems[J]. ACM Computing Surveys, 2002, 34(3): 375-408.
[3]
WANG Y M, CHUNG P Y, LIN I J, et al. Checkpoint space reclamation for uncoordinated checkpointing in message-passing systems[J]. IEEE Transactions on Parallel and Distributed Systems, 1995, 6(5): 546-554.
[4]
RUSCIO J F, HEFFNER M A, VARADARAJAN S. DejaVu: Transparent user-level checkpointing, migration, and recovery for distributed systems[C]// SC06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing. New York: ACM, 2006: 158.
[5]
MALONEY A, GOSCINSKI A. A survey and review of the current state of rollback-recovery for cluster systems[J]. Concurrency and Computation: Practice and Experience, 2009, 21(12): 1632-1666.
[6]
TRIPATHY M, TRIPATHY C R. A new coordinated checkpointing and rollback recovery scheme for distributed shared memory clusters[J]. International Journal of Distributed and Parallel Systems, 2011, 2(1): 49-58.
[7]
PRIYA S B, RAVICHANDRAN T. Fault tolerance and recovery for grid application reliability using check pointing mechanism[J]. International Journal of Computer Applications, 2011, 26(5): 32-37.
[8]
BOUTEILLER A, HERAULT T, BOSILCA G, et al. Correlated set coordination in fault tolerant message logging protocols[C]// Euro-Par11: Proceedings of the 17th International Conference on Parallel Processing. Berlin: Springer-Verlag, 2011: 51-64.
[9]
CHANDY K M, LAMPORT L. Distributed snapshots: Determining global states of distributed systems[J]. ACM Transactions on Computer Systems, 1985, 3(1): 63-75.
[10]
ELNOZAHY E N, JOHNSON D B, ZWAENEPOEL W. The performance of consistent checkpointing[C]// Proceedings of the 11th Symposium on Reliable Distributed Systems. [S.l.]: IEEE, 1992: 39-47.
[11]
WEI X H, JU J B. Checkpointing algorithms in distributed systems[J]. Chinese Journal of Computers, 1998, 21(4): 367-375. (in Chinese)
[12]
CI Y W, ZHANG Z, ZUO D C, et al. Scalable multi-cycle checkpointing[J]. Journal of Software, 2010, 21(2): 218-230. (in Chinese)
[13]
RANA M, PANGHAL A, PANGHAL S. Checkpointing based rollback recovery in distributed systems[J]. Journal of Current Computer Science and Technology, 2011, 1(6): 45-49.
[14]
WANG D S, SHEN M M, ZHENG W M, et al. A checkpoint-based rollback recovery and process migration system[J]. Journal of Software, 1999, 10(1): 68-73. (in Chinese)
[15]
WANG D S, SHAO M L. A coordinated checkpointing algorithm with O(n) message complexity[J]. Journal of Software, 2003, 14(1): 43-48. (in Chinese)