[torqueusers] serial submission order on dual processor cluster

Richard Ross rross at hrl.com
Fri Oct 5 16:34:20 MDT 2007


Thanks for the reply, Joshua.

Actually this doesn't solve the problem. I am using this script to  
launch a single serial job at a time. I then submit this script  
multiple times, from multiple working directories, in order to do  
parametric studies.

Specifically, if this script is called run_serial, I execute the
following from the bash command line:
for dir in <list of working directories>
do
   pushd "$dir"
   qsub -d "$(pwd)" run_serial
   popd
done

This runs through all of the working directories that I have set up
for the different instances of the job. The parameters are all
contained in input files that differ by directory, so I can run a
common job script that only needs the executable name. Each time the
PBS script is called, then, $BEOWULF_JOB_MAP holds just one node
number, which it gets from PBS or Torque (I think), so there is no
order within the job for me to rearrange. My question is: how do I
change the order in which nodes are assigned across these separate
jobs? One option would be to use resource requests in Torque to guide
it (load, memory requirements, etc.), but I can't find a tutorial on
my system that explains how to do that easily. The documentation is
fairly anemic, in my opinion.
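
For reference, the placement rule Joshua describes can be sketched in
a few lines of bash. This is only an illustration of how a
colon-separated map is read in order, not how Scyld or Torque
implement it; the function name show_placement is made up for this
example.

```shell
#!/bin/bash
# Hypothetical helper (not part of Scyld or Torque): print the node that
# each process would land on, given a colon-separated map written in the
# style of BEOWULF_JOB_MAP.
show_placement() {
    local map="$1"
    local i=0
    local node
    local IFS=':'          # split the map on colons, inside this function only
    for node in $map; do
        echo "process $i -> node $node"
        i=$((i + 1))
    done
}

show_placement "0:0:1:1"   # both slots on node 0 first, then node 1
show_placement "0:1:0:1"   # alternate between node 0 and node 1
show_placement "9"         # what a single serial submission sees
```

With a one-entry map, which is what each of my serial submissions
sees, there is nothing inside the job to reorder.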

Richard

On Oct 5, 2007, at 2:59 PM, Joshua Bernstein wrote:

> Hi Richard,
>
>> Hopefully this is the best forum for this question. We are
>> running Scyld 4.0 and Taskmaster 2.0 on a Penguin cluster with
>> dual processor AMD nodes. We regularly submit many serial jobs
>> using the following PBS script (modeled on the sample in the
>> Taskmaster docs):
>
> This place works fine, but don't be afraid to contact Penguin
> support directly for these sorts of questions. That is what we are
> here for.
>
>> ################
>> #PBS -N <job name>
>> echo "Running on Node : $BEOWULF_JOB_MAP"
>> echo Start Date: `date`
>> echo Dir: $PWD
>> echo "##########"
>> echo ""
>> bpsh $BEOWULF_JOB_MAP <executable and arguments>
>> echo ""
>> echo "##########"
>> echo End Date: `date`
>> ###############
>> This loads jobs on the machine in the following order (node #):
>> 9 9 8 8 7 7 6 6 5 5 4 4 3 3 37 37 36 36 35 35 34 34 .....
>> How do we have it put one job on each node until all nodes have
>> one, and only then start placing a second job on each?
>
> The order in which the jobs are launched depends on the order in
> which each node number appears in the BEOWULF_JOB_MAP environment
> variable. For the sake of demonstration, consider a smaller, but
> still applicable, case with a 4-process job. If my BEOWULF_JOB_MAP
> (sometimes abbreviated to just BJM) is set to 0:0:1:1, then the
> first two processes would each be started on node 0, followed by
> the third and fourth processes on node 1.
>
> Order is significant! If I change the order of BJM, I can place the
> processes in the order I'd like, so consider
> BEOWULF_JOB_MAP=0:1:0:1. Here a process would be placed on node 0,
> then node 1, then wrap around back to node 0, and then node 1 again.
>
> Does that accomplish what you are looking to do?
>
> -Joshua Bernstein
> Software Engineer
> Penguin Computing
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>

================================================================
Richard S. Ross, Ph.D.
Manager, Computational Physics Department
Senior Research Staff Physicist
HRL Laboratories, LLC
3011 Malibu Canyon Road
Malibu, CA  90265-4797
Email: rross -at- hrl -dot- com
Phone: (310) 317-5022
Fax: (310) 317-5485
Personal Web: home.earthlink.net/~rsross
================================================================



