[torqueusers] serial submission order on dual processor cluster
Richard Ross
rross at hrl.com
Fri Oct 5 16:34:20 MDT 2007
Thanks for the reply, Joshua.
Actually, this doesn't solve the problem. I use this script to
launch a single serial job at a time, and I submit it
multiple times, from multiple working directories, to run
parametric studies.
Specifically, if this script is called run_serial, I will execute the
following from the bash command line:
    for dir in <list of working directories>
    do
        pushd "$dir"
        qsub -d "$(pwd)" run_serial
        popd
    done
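The submission loop above can be sketched as a small function with a dry-run default. This is only an illustration: `submit_all` and `QSUB_CMD` are hypothetical names that do not appear in the thread, and the only qsub usage assumed is the `-d` form already shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): submit run_serial once per
# working directory. QSUB_CMD defaults to a dry run that only prints the
# command; set QSUB_CMD=qsub to submit for real.
QSUB_CMD=${QSUB_CMD:-echo qsub}

submit_all() {
    local dir
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "skipping $dir: not a directory" >&2
            continue
        fi
        # Run in a subshell so the caller's working directory is untouched.
        # QSUB_CMD is left unquoted on purpose so "echo qsub" splits in two.
        ( cd "$dir" && $QSUB_CMD -d "$PWD" run_serial )
    done
}
```

Calling `submit_all <list of working directories>` without setting `QSUB_CMD` prints one qsub command per directory, which makes it easy to check the submission order before anything reaches the queue.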
This loops over all of the working directories that I have set up
for the different instances of the job. The parameters
are all contained in input files that differ by directory, so I
can run a common job script that only needs the executable name. Each
time the PBS script runs, $BEOWULF_JOB_MAP therefore holds just
one node number, which it gets from PBS or Torque (I think). My question is:
how do I change that order? One option is to use resource
variables in Torque to guide placement (e.g. load, memory requirements),
but I can't find a tutorial on my system that explains how
to do that easily. The documentation is fairly anemic, in my opinion.
Richard
On Oct 5, 2007, at 2:59 PM, Joshua Bernstein wrote:
> Hi Richard,
>
>> Hopefully this is the best forum for this question. We are
>> running Scyld 4.0 and Taskmaster 2.0 on a Penguin cluster with
>> dual processor AMD nodes. We regularly submit many serial jobs
>> using the following PBS script (modeled on the sample in the
>> Taskmaster docs):
>
> This place works fine, but don't be afraid to contact Penguin
> support directly for these sorts of questions. That is what we are
> here for.
>
>> ################
>> #PBS -N <job name>
>> echo "Running on Node : $BEOWULF_JOB_MAP"
>> echo Start Date: `date`
>> echo Dir: $PWD
>> echo "##########"
>> echo ""
>> bpsh $BEOWULF_JOB_MAP <executable and arguments>
>> echo ""
>> echo "##########"
>> echo End Date: `date`
>> ###############
>> This loads jobs on the machine in the following order (node #):
>> 9 9 8 8 7 7 6 6 5 5 4 4 3 3 37 37 36 36 35 35 34 34 .....
>> How do we have it put one job on each node until all are filled,
>> and only then put a second job on each node?
>
> The order in which the jobs are launched depends on the order in which
> each node number appears in the BEOWULF_JOB_MAP environment
> variable. For the sake of demonstration, consider a smaller, but
> still applicable, case with a 4-process job. If my BEOWULF_JOB_MAP
> (sometimes abbreviated to just BJM) is set to 0:0:1:1, then the first
> two processes would each be started on node 0, followed by the
> third and fourth processes on node 1.
>
> Order is significant! If I change the order of BJM, I can place the
> processes in the order I'd like, so consider
> BEOWULF_JOB_MAP=0:1:0:1. Here a process would be placed on node 0, then
> node 1, then wrap around back to node 0, and then node 1 again.
>
> Does that accomplish what you are looking to do?
>
> -Joshua Bernstein
> Software Engineer
> Penguin Computing
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
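The ordering rule in the quoted reply (0:0:1:1 fills node 0 first, 0:1:0:1 alternates between nodes) can be sketched as a small reordering helper. This is a minimal illustration only: `round_robin_map` is a hypothetical name, and nothing in the thread confirms whether a map rewritten this way can be handed back to the scheduler. It also applies only to multi-entry maps, not to the single-entry map each serial job receives.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): reorder a colon-separated
# node map so that each node is used once per pass (round-robin) instead
# of in consecutive blocks. Nodes keep their first-seen order.
round_robin_map() {
    printf '%s\n' "$1" | awk -F: '
    {
        # Count the slots per node, remembering first-seen order.
        for (i = 1; i <= NF; i++) {
            if (!($i in count)) order[++n] = $i
            count[$i]++
        }
        # Emit one slot per node on each pass until every slot is used.
        out = ""
        left = NF
        while (left > 0) {
            for (j = 1; j <= n; j++) {
                if (count[order[j]] > 0) {
                    out = out (out == "" ? "" : ":") order[j]
                    count[order[j]]--
                    left--
                }
            }
        }
        print out
    }'
}
```

For example, `round_robin_map 0:0:1:1` prints `0:1:0:1`, the second ordering discussed in the reply.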
================================================================
Richard S. Ross, Ph.D.           Manager, Computational Physics Department
Senior Research Staff Physicist  Email: rross -at- hrl -dot- com
HRL Laboratories, LLC            Phone: (310) 317-5022
3011 Malibu Canyon Road          Fax  : (310) 317-5485
Malibu, CA 90265-4797            Personal Web : home.earthlink.net/~rsross
================================================================