[CS241] Re: CS241 Digest, Vol 3, Issue 4
Leo Meyerovich
Leo_Meyerovich at brown.edu
Fri Dec 8 03:32:26 EST 2006
Below is a file that may make working with hogwash a lot easier.
iterations: how many times to run through a sequence of batch files
runner: command to run (runlocally, rungrid -a 32 ...)
batches: the sequence of the generated batch files to repeat, in order
polls: for every batch file, how many seconds to wait between status polls
(it might be worth adding a new variable for an initial wait time)
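One way the initial-wait idea could look, as a rough sketch only. The names `initwaits` and `poll_delay` and the wait values are hypothetical and are not part of the script below.

```bash
#!/bin/bash
# Hypothetical sketch: a per-batch initial wait before the first status poll.
batches=("hog-map.batch" "hog-reducer.batch")
polls=(3 3)        # seconds between status polls, as in the script
initwaits=(10 0)   # assumed values: extra seconds to wait before the first poll

# Print the delay to use before poll number $2 (1-based) of batch index $1.
poll_delay() {
    local idx=$1 pollnum=$2
    if [ "$pollnum" -eq 1 ]; then
        # first poll: initial wait plus the normal interval
        echo $(( initwaits[idx] + polls[idx] ))
    else
        echo "${polls[idx]}"
    fi
}
```

Inside the polling loop this would replace the plain sleep with something like `sleep "$(poll_delay "$sqn" "$n")"`, where `n` is a poll counter that would also have to be added.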
The script records errors and will clearerror and rerun if some batch file
in an iteration fails, so quahog should be OK to use (the migrate
option should probably be used; I'm not sure if it works).
It's rough, as I don't really know bash, but here goes:
------
#!/bin/bash
iterations=2
runner=runlocally
#runner="rungrid -a 32 -q short"
#runner="runquahog -m"
batches=("hog-map.batch" "hog-reducer.batch") #ordered
processes=(4 1) #for each batch file, 0 ... n-1 processes
polls=(3 3) #in seconds, time between sty status polls
#errorfile='/dev/null'
errorfile=errors
let "numseq=${#batches[@]} - 1" #index of the last batch (currently unused)
for i in `seq $iterations`
do
    echo ITERATION $i
    for ai in `seq ${#batches[@]}`
    do
        let "sqn=$ai - 1"
        batch=${batches[$sqn]}
        hjobs=${processes[$sqn]}
        let "jm1=$hjobs - 1"
        #status line expected from sty once every job has finished
        if [ "$jm1" = 0 ]; then
            f="Finished: 0 [$hjobs]"
        else
            f="Finished: 0-$jm1 [$hjobs]"
        fi
        #run current sequence ($runner is unquoted on purpose so its options split into words)
        sty $batch $runner _ exit > /dev/null &
        #keep polling status until all finished and no errors
        #if an error, clearerror and run again
        fstat="" #reset so the previous batch's status is not reused
        estat=""
        while [[ "$fstat" != "$f" || -n $estat ]]
        do
            sleep ${polls[$sqn]}
            fstat=`sty $batch status _ exit | grep "Finished"`
            estat=`sty $batch status _ exit | grep "Error"`
            # echo "not done? " batch:$batch finished:$fstat errors:$estat
            # echo "error? " $estat
            if [ -n "$estat" ]
            then
                echo "error, clearing and restarting: " $estat
                echo iteration $i batch $batch >> $errorfile
                sty $batch errorsummary _ exit | tail -n +3 | egrep -v '^(Bye!)' >> $errorfile
                sty $batch clearerror _ $runner _ exit > /dev/null
            fi
        done
        echo sequence $ai $batch \($hjobs jobs\) done
    done
    echo
done;
echo "done!"
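
The expected-status string and the wait-until-finished loop could also be pulled out into functions, which makes them easy to try without hogwash. This is only a sketch: `expected_finished` and `poll_until_done` are hypothetical names, the `Finished: ...` format is simply what the script above compares against (not taken from sty documentation), and the error handling is left out.

```bash
#!/bin/bash
# Build the "Finished" line the script expects for n processes (0 .. n-1).
expected_finished() {
    local n=$1
    if [ "$n" -eq 1 ]; then
        echo "Finished: 0 [1]"
    else
        echo "Finished: 0-$(( n - 1 )) [$n]"
    fi
}

# Poll a status command until it reports the expected "Finished" line.
# $1 = process count, $2 = seconds between polls, remaining args = status command.
# Hypothetical helper; the script above would pass: sty "$batch" status _ exit
poll_until_done() {
    local want interval=$2 fstat=""
    want=$(expected_finished "$1")
    shift 2
    while [ "$fstat" != "$want" ]; do
        sleep "$interval"
        fstat=$("$@" | grep "Finished")
    done
    echo "$fstat"
}
```

Passing the status command as arguments means a stub function can stand in for sty when trying it out, e.g. `poll_until_done 4 0 fake_status`.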