[torqueusers] Job can not be allocated correctly

Greenseid, Joseph M (IS) Joseph.Greenseid at ngc.com
Tue Mar 2 08:40:34 MST 2010


are you using torque for scheduling or something else (like maui)?

i've had this problem with maui if maui's JOBNODEMATCHPOLICY is not set to EXACTNODE in the maui.cfg.  if you're using torque's scheduler, though, i am not sure what the equivalent is.
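for reference, the setting i mean looks like this in maui.cfg (a sketch, not a full config — the rest of the file is site-specific):

```
# maui.cfg fragment: EXACTNODE tells Maui to honor the nodes=X:ppn=Y
# request literally instead of packing all tasks onto fewer nodes
JOBNODEMATCHPOLICY EXACTNODE
```

maui needs a restart after changing this for it to take effect.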

--Joe


________________________________

From: torqueusers-bounces at supercluster.org on behalf of Weiguang Chen
Sent: Tue 3/2/2010 4:25 AM
To: torqueusers maillist
Subject: [torqueusers] Job can not be allocated correctly


Hi, all

In fact, I am not sure whether Torque or mpich causes this problem. I will illustrate the problem with the following example script:
#!/bin/bash
### Job name
#PBS -N name
#PBS -q batch
### number of nodes and processes per node
#PBS -l nodes=2:ppn=4
### Job's error output  
#PBS -e error
### Job's general output
#PBS -o stdout

cd $PBS_O_WORKDIR
echo "Job begin at "`date`
# program examples
mpiexec -n 8 $PBS_O_WORKDIR/cpi
echo "Job stop at "`date`

exit 0

cpi is an example program from the mpich package. Each node in our cluster has two processors with 4 cores each, i.e. 8 cores per node. But when I submit this job, the output is as follows:
Process 0 on node5
Process 1 on node5
Process 2 on node5
Process 3 on node5
Process 5 on node5
Process 6 on node5
Process 4 on node5
Process 7 on node5  

All processes ran on one node, even though I allocated 2 nodes. I don't know what causes this or how to solve it.
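One thing worth checking from inside the job (a sketch, assuming mpich2's mpd-based mpiexec; the -machinefile option and the second hostname shown are illustrative, not from the original post):

```
# Inspect the host list Torque actually assigned to this job; for
# nodes=2:ppn=4 it should contain two distinct hostnames, 4 lines each.
cat $PBS_NODEFILE

# Hand that list to mpiexec explicitly, in case it is not picking up
# the Torque allocation on its own and is starting all ranks locally.
mpiexec -machinefile $PBS_NODEFILE -n 8 $PBS_O_WORKDIR/cpi
```

If $PBS_NODEFILE already lists both nodes but all ranks still land on one host, the problem is on the mpiexec/mpd side rather than in Torque's allocation.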
Thanks

PS: Torque version:2.4.6, mpich:2-1.1.1p1, mpiexec:0.83
--
Best Wishes 
ChenWeiguang

************************************************
#               Chen, Weiguang 
#
#    Postgraduate,  Ph. D
#  75 University Road, Physics Building  #  218
#  School of Physics & Engineering
#  Zhengzhou University
#  Zhengzhou, Henan 450052  CHINA
#
#  Tel: 86-13203730117;
#  E-mail: chenweiguang82 at gmail.com;
#            chenweiguang82 at qq.com
#
**********************************************



