Matthew Forshaw
Energy-efficient checkpointing in high-throughput cycle-stealing distributed systems
Forshaw, Matthew; McGough, A. Stephen; Thomas, Nigel
Authors
A. Stephen McGough
Nigel Thomas
Abstract
Checkpointing is a fault-tolerance mechanism commonly used in High Throughput Computing (HTC) environments to allow the execution of long-running computational tasks on compute resources subject to hardware or software failures as well as interruptions from resource owners and more important tasks. Until recently many researchers have focused on the performance gains achieved through checkpointing, but now with growing scrutiny of the energy consumption of IT infrastructures it is increasingly important to understand the energy impact of checkpointing within an HTC environment. In this paper we demonstrate through trace-driven simulation of real-world datasets that existing checkpointing strategies are inadequate at maintaining an acceptable level of energy consumption whilst maintaining the performance gains expected with checkpointing. Furthermore, we identify factors important in deciding whether to exploit checkpointing within an HTC environment, and propose novel strategies to curtail the energy consumption of checkpointing approaches whilst maintaining the performance benefits.
Citation
Forshaw, M., McGough, A. S., & Thomas, N. (2015). Energy-efficient checkpointing in high-throughput cycle-stealing distributed systems. https://doi.org/10.1016/j.entcs.2014.12.013
Conference Name | Seventh International Workshop on Practical Applications of Stochastic Modelling (PASM) |
---|---|
Conference Location | Newcastle, UK |
Publication Date | Jan 5, 2015 |
Deposit Date | Jan 11, 2015 |
Publicly Available Date | Feb 3, 2015 |
Volume | 310 |
Pages | 65-90 |
Series ISSN | 1571-0661 |
DOI | https://doi.org/10.1016/j.entcs.2014.12.013 |
Keywords | Energy efficiency, Checkpointing, Migration, Fault tolerance, Desktop Grids |
Files
Published Conference Proceeding
(426 Kb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by-nc-sa/3.0
Copyright Statement
© 2015 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-SA license (http://creativecommons.org/licenses/by-nc-sa/3.0/).