WestminsterResearch

Checkpointing of parallel applications in a grid environment

Sajadah, Kreeteeraj and Terstyanszky, Gabor and Winter, Stephen and Kacsuk, Peter K. (2008) Checkpointing of parallel applications in a grid environment. In: Kacsuk, Peter K. and Lovas, Robert and Nemeth, Zsolt, (eds.) Distributed and parallel systems: in focus: desktop grid computing. Springer, Boston, MA, pp. 179-187. ISBN 9780387794471

Full text not available from this repository.

Official URL: http://dx.doi.org/10.1007/978-0-387-79448-8

Abstract

Jobs in Grid workflows are exposed to different types of failure, so fault-tolerance mechanisms are needed to ensure a good level of reliability during the execution of Grid jobs. Checkpointing is the most common method of achieving fault tolerance, but much work remains to improve its efficiency. The paper gives an overview of a solution for checkpointing parallel applications executed on multiple sites in a Grid environment. The mechanism is an improvement on the PGRADE checkpointing solution.

Item Type: Book Section
Uncontrolled Keywords: Checkpointing, First Order Approximation, Natural Synchronisation Points, Critical Region
Research Community: University of Westminster > Electronics and Computer Science, School of
ID Code: 6829
Deposited On: 07 May 2009 16:16
Last Modified: 07 May 2009 16:16
