[torqueusers] serial submission order on dual processor cluster

Richard Ross rross at hrl.com
Thu Oct 4 15:14:59 MDT 2007


Hopefully this is the best forum for this question. We are running
Scyld 4.0 and Taskmaster 2.0 on a Penguin cluster with dual-processor
AMD nodes. We regularly submit many serial jobs using the following PBS
script (modeled on the sample in the Taskmaster docs):

################
#PBS -N <job name>

echo "Running on Node : $BEOWULF_JOB_MAP"

echo Start Date: `date`

echo Dir: $PWD

echo "##########"
echo ""

bpsh $BEOWULF_JOB_MAP <executable and arguments>

echo ""
echo "##########"

echo End Date: `date`
###############

This loads jobs onto the machine in the following order (node #), so
each node receives two jobs before the next node receives any:
9 9 8 8 7 7 6 6 5 5 4 4 3 3 37 37 36 36 35 35 34 34 .....

How can we have it put one job on each node until every node has one,
and only then start putting a second job on each node?
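
To make the desired order concrete, here is a small shell sketch. It is
an illustration only, not a scheduler setting: the node list is copied
from the order shown above, slots=2 assumes two job slots per
dual-processor node, and desired_order is just a hypothetical helper
name.

```shell
#!/bin/sh
# Illustration only: prints the placement order we would like to see.
# Node list copied from the observed order; slots=2 is an assumption
# (two job slots per dual-processor node).
nodes="9 8 7 6 5 4 3 37 36 35 34"
slots=2

desired_order() {
    out=""
    pass=1
    # One pass per slot: visit every node once before reusing any node.
    while [ "$pass" -le "$slots" ]; do
        for node in $nodes; do
            out="$out$node "
        done
        pass=$((pass + 1))
    done
    # Strip the single trailing space before printing.
    echo "${out% }"
}

desired_order
```

That is, every node gets its first job before any node gets its second.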

The main reason is that some jobs require more than 50% of a node's
memory, and the queue seems oblivious to this. I have looked widely for
a quick tutorial on adding memory resource limits but just can't sort
this out. Any help or links would be greatly appreciated.

================================================================
Richard S. Ross, Ph.D.
Manager, Computational Physics Department
Senior Research Staff Physicist
HRL Laboratories, LLC
3011 Malibu Canyon Road
Malibu, CA  90265-4797
Email: rross -at- hrl -dot- com
Phone: (310) 317-5022
Fax: (310) 317-5485
Personal Web: home.earthlink.net/~rsross
================================================================



