On Energy-efficient Checkpointing in High-throughput Cycle-stealing Distributed Systems

Forshaw, Matthew; McGough, A. Stephen; Thomas, Nigel; Helfert, Markus; Krempels, Karl-Heinz; Donnellan, Brian

Authors

Matthew Forshaw

A. Stephen McGough

Nigel Thomas

Markus Helfert

Karl-Heinz Krempels

Brian Donnellan



Abstract

Checkpointing is a fault-tolerance mechanism commonly used in High Throughput Computing (HTC) environments to allow the execution of long-running computational tasks on compute resources subject to hardware and software failures and interruptions from resource owners. With increasing scrutiny of the energy consumption of IT infrastructures, it is important to understand the impact of checkpointing on the energy consumption of HTC environments. In this paper we demonstrate through trace-driven simulation on real-world datasets that existing checkpointing strategies are inadequate at maintaining an acceptable level of energy consumption whilst reducing the makespan of tasks. Furthermore, we identify factors important in deciding whether to employ checkpointing within an HTC environment, and propose novel strategies to curtail the energy consumption of checkpointing approaches.

Citation

Forshaw, M., McGough, A. S., Thomas, N., Helfert, M., Krempels, K., & Donnellan, B. (2014). On Energy-efficient Checkpointing in High-throughput Cycle-stealing Distributed Systems. In Proceedings of the 3rd International Conference on Smart Grids and Green IT Systems (SMARTGREENS 2014), 3-4 April 2014, Barcelona, Spain (262-272). https://doi.org/10.5220/0004958302620267

Conference Name: Proceedings of the 3rd International Conference on Smart Grids and Green IT Systems
Publication Date: Jan 1, 2014
Deposit Date: Jan 11, 2015
Publicly Available Date: Feb 3, 2015
Pages: 262-272
Book Title: Proceedings of the 3rd International Conference on Smart Grids and Green IT Systems (SMARTGREENS 2014), 3-4 April 2014, Barcelona, Spain
DOI: https://doi.org/10.5220/0004958302620267
Keywords: Energy efficiency, Checkpointing, Migration, Fault tolerance, Desktop grids
