Checkpointing Programs

Benefits of Using condor_compile

Compiling your program with condor_compile allows it to run as a job in the “standard” universe, which supports special Condor features like checkpointing and I/O redirection.

“Checkpointing” means that if Condor has to stop your job and move it to another computer for any reason, such as a person logging on to the computer it was running on, Condor can save the state of your program and use that state to restart it on another computer from where it left off. So, if your program takes 8 hours to run to completion but Condor has to evict it from a computer after 6 hours, Condor will start your program on another computer at the 6-hour mark instead of from the beginning.

“I/O redirection” means that any files your program needs are read from the computer you submitted the job from. So, if you submit a program from your home directory that needs a file called input.txt in that directory, the program will open the copy on your computer instead of looking for it on the execute machine where the program is actually running.
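
Neither feature requires Condor-specific calls in your source code. As a rough illustration, the sketch below reads input.txt with ordinary stdio functions. The function name sum_input and the one-integer-per-line file layout are assumptions made for this example, not part of Condor.

```c
#include <stdio.h>

/* Hypothetical helper: sum the integers stored in the file at `path`.
 * Returns 0 on success and stores the result in *total; returns -1 if
 * the file cannot be opened. Nothing in this function is Condor-specific. */
int sum_input(const char *path, long *total)
{
    FILE *in = fopen(path, "r");   /* ordinary stdio call */
    long value;

    if (in == NULL)
        return -1;

    *total = 0;
    /* Read whitespace-separated integers until the input runs out. */
    while (fscanf(in, "%ld", &value) == 1)
        *total += value;

    fclose(in);
    return 0;
}
```

A main function that calls sum_input("input.txt", &total) and prints the result would complete the program. When such a program runs in the standard universe, the fopen call is the kind of file access that Condor redirects to the submit machine.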

How to Use condor_compile

Luckily, Condor makes it easy to rebuild your code to incorporate these features. All you have to do is put condor_compile in front of the command that you normally use to compile your program. For example, if you normally run

gcc -o HellowOrld HellowOrld.c

to compile your HellowOrld program, put condor_compile in front of that command to build the program with the extra Condor features:

condor_compile gcc -o HellowOrld HellowOrld.c
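
Once the program is built this way, it can be submitted to the standard universe. The following is a minimal sketch of a submit description file, assuming a Condor installation that provides the standard universe. The file name HellowOrld.sub and the output, error, and log file names are hypothetical choices for this example.

```
# HellowOrld.sub (hypothetical file name)
universe   = standard
executable = HellowOrld
output     = HellowOrld.out
error      = HellowOrld.err
log        = HellowOrld.log
queue
```

Under these assumptions, the job would be submitted with condor_submit HellowOrld.sub.
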
condor/submit/checkpointing.1312553977.txt.gz · Last modified: 2011/08/05 14:19 by garrettheath4
CC Attribution-Noncommercial-Share Alike 4.0 International