Energy-efficient checkpointing in high-throughput cycle-stealing distributed systems

Forshaw, Matthew; McGough, A. Stephen; Thomas, Nigel

Authors

Matthew Forshaw

A. Stephen McGough

Nigel Thomas



Abstract

Checkpointing is a fault-tolerance mechanism commonly used in High Throughput Computing (HTC) environments to allow long-running computational tasks to execute on compute resources that are subject to hardware or software failures, as well as to interruptions from resource owners and more important tasks. Until recently, many researchers have focused on the performance gains achieved through checkpointing, but with growing scrutiny of the energy consumption of IT infrastructures it is increasingly important to understand the energy impact of checkpointing within an HTC environment. In this paper we demonstrate, through trace-driven simulation of real-world datasets, that existing checkpointing strategies are inadequate at maintaining an acceptable level of energy consumption whilst maintaining the performance gains expected with checkpointing. Furthermore, we identify factors important in deciding whether to exploit checkpointing within an HTC environment, and propose novel strategies to curtail the energy consumption of checkpointing approaches whilst maintaining the performance benefits.

Citation

Forshaw, M., McGough, A. S., & Thomas, N. (2015). Energy-efficient checkpointing in high-throughput cycle-stealing distributed systems. https://doi.org/10.1016/j.entcs.2014.12.013

Conference Name: Seventh International Workshop on Practical Applications of Stochastic Modelling (PASM)
Conference Location: Newcastle, UK
Publication Date: Jan 5, 2015
Deposit Date: Jan 11, 2015
Publicly Available Date: Feb 3, 2015
Volume: 310
Pages: 65-90
Series ISSN: 1571-0661
DOI: https://doi.org/10.1016/j.entcs.2014.12.013
Keywords: Energy efficiency, Checkpointing, Migration, Fault tolerance, Desktop Grids
