Condor is well suited to running batch jobs. A typical Condor workload is a single program submitted several times, each time with a different set of command-line arguments that tell that run which part of the work to do. For example, if you have a large data set to be processed by a program called myProgram and you want the data split into four independent chunks, you can submit the following four commands to be run by Condor:
myProgram data.txt 1 4
myProgram data.txt 2 4
myProgram data.txt 3 4
myProgram data.txt 4 4
myProgram must then be written to interpret its command-line arguments as <data_file> <my_chunk> <total_chunks>.
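As a rough illustration of that argument convention, here is a minimal Python sketch. The function names (parse_args, chunk_bounds, run) and the choice to split the file by lines are assumptions made for this example only; the real myProgram may divide its data quite differently.

```python
def parse_args(argv):
    """Interpret argv as <data_file> <my_chunk> <total_chunks>."""
    if len(argv) != 4:
        raise ValueError("usage: myProgram <data_file> <my_chunk> <total_chunks>")
    return argv[1], int(argv[2]), int(argv[3])


def chunk_bounds(n_items, my_chunk, total_chunks):
    """Return the half-open range [start, end) owned by a 1-based chunk."""
    if total_chunks < 1 or not 1 <= my_chunk <= total_chunks:
        raise ValueError("expected 1 <= my_chunk <= total_chunks")
    base, extra = divmod(n_items, total_chunks)
    # The first `extra` chunks each take one additional item,
    # so chunk sizes differ by at most one.
    start = (my_chunk - 1) * base + min(my_chunk - 1, extra)
    end = start + base + (1 if my_chunk <= extra else 0)
    return start, end


def run(argv):
    """Process only the lines of the data file that belong to this chunk."""
    data_file, my_chunk, total_chunks = parse_args(argv)
    with open(data_file) as f:
        lines = f.readlines()
    start, end = chunk_bounds(len(lines), my_chunk, total_chunks)
    for line in lines[start:end]:
        # Placeholder for the real per-record work (assumption).
        print(line.rstrip("\n"))
```

A script built on this sketch would call run(sys.argv) from its entry point, so that "myProgram data.txt 2 4" handles only the second of four chunks.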
To tell Condor how to run your jobs, you submit a Condor submission file to the Condor system. Continuing the previous example, the submission file for running myProgram on Condor in four chunks would be as follows; each queue statement submits one job using the arguments and output settings that precede it:
#################################
# Process data.txt with myProgram
# in 4 equal-sized chunks
#################################
executable = /home/<UserName>/myProgram
universe = vanilla
log = myProgram_4p.log

arguments = data.txt 1 4
output = output1.txt
queue

arguments = data.txt 2 4
output = output2.txt
queue

arguments = data.txt 3 4
output = output3.txt
queue

arguments = data.txt 4 4
output = output4.txt
queue
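Because the per-chunk stanzas differ only in the chunk number, a submission file like this one can be generated rather than typed by hand. The following Python sketch shows one way to do that; build_submit_file is a hypothetical helper written for this example and is not part of Condor.

```python
def build_submit_file(executable, data_file, total_chunks, log_file):
    """Return submission-file text with one queue statement per chunk."""
    lines = [
        f"executable = {executable}",
        "universe = vanilla",
        f"log = {log_file}",
    ]
    for chunk in range(1, total_chunks + 1):
        # One stanza per job: its arguments, its output file, then queue.
        lines += [
            "",
            f"arguments = {data_file} {chunk} {total_chunks}",
            f"output = output{chunk}.txt",
            "queue",
        ]
    return "\n".join(lines) + "\n"


# Reproduce the four-chunk example, without the comment header.
print(build_submit_file("/home/<UserName>/myProgram", "data.txt", 4, "myProgram_4p.log"))
```

Changing the third argument from 4 to another value yields the matching number of stanzas, which keeps the <total_chunks> argument and the number of queued jobs in step.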