[1] J. T. Daly, "A higher order estimate of the optimum checkpoint interval for restart dumps," Future Generation Computer Systems, vol. 22, no. 3, pp. 303-312, 2004.
[2] A. Benoit, A. Cavelan, Y. Robert, and H. Sun, "Multi-level checkpointing and silent error detection for linear workflows," Journal of Computational Science, vol. 28, pp. 398-415, Apr. 2017.
[3] Y. Du, L. Marchal, G. Pallez, and Y. Robert, "Optimal checking strategies for iterative applications," IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 3, pp. 507-522, Mar. 2022.
[4] A. Benoit, A. Cavelan, F. Cappello, P. Raghavan, Y. Robert, and H. Sun, "Coping with silent and fail-stop errors at scale by combining replication and checkpointing," Journal of Parallel and Distributed Computing, vol. 122, no. 1, pp. 209-225, Aug. 2018.
[5] Y. Lee, "Reliability analysis of checkpointing model with multiple verification mechanism," Bulletin of the Korean Mathematical Society, vol. 56, no. 6, pp. 1435-1445, Nov. 2019.
[6] J. W. Young, "A first order approximation to the optimal checkpoint interval," Communications of the ACM, vol. 17, no. 9, pp. 530-531, Sept. 1974.
[7] M. S. Bouguerra, D. Trystram, and F. Wagner, "Complexity analysis of checkpoint scheduling with variable costs," IEEE Transactions on Computers, vol. 62, no. 6, pp. 1269-1275, Mar. 2013.