[torqueusers] 8-core node "busy" with only 3 jobs
troy at osc.edu
Thu Oct 5 08:38:41 MDT 2006
On Thu, 2006-10-05 at 12:37 +0200, Jacques Foury wrote:
> I've just added a quad-opteron dual-core to my cluster.
> I added that line to nodes file :
> callas99 np=8 opteron
> and re-launched pbs_server.
> Users submitted jobs, but when reaching 3 jobs on the node, remaining
> jobs are Queued and don't run :
> # pbsnodes -a callas99
> state = busy
> np = 8
> properties = opteron,callas
> ntype = cluster
> jobs = 0/10994.ulmo.math.u-bordeaux1.fr, 1/10995.ulmo.math.u-bordeaux1.fr,
> 2/10996.ulmo.math.u-bordeaux1.fr
> status = opsys=linux,uname=Linux callas99 2.6.12-12mdksmp #1 SMP
> Fri Sep 9 17:20:34 CEST 2005 x86_64,sessions=17408 17523
> 17536,nsessions=3,nusers=1,idletime=1200524,totmem=33006508kb,availmem=32791272kb,physmem=32802532kb,ncpus=8,loadave=3.00,netload=2486994073,state=busy,jobs=10994.ulmo.math.u-bordeaux1.fr 10995.ulmo.math.u-bordeaux1.fr 10996.ulmo.math.u-bordeaux1.fr,rectime=1160044144
> Why is this node tagged "busy" with only 3 jobs running ? I
> particularly don't understand that part :
> If ncpus=8 and loadave=3.00 it seems to me that state should be
> free !!!
> Maui's checkjob to a queued job says :
> job cannot run in partition DEFAULT (idle procs do not meet
> requirements : 0 of 1 procs found)
> Why does my np=8 seem not to be taken by the MAUI/Torque system ?
> did I forget something ?
> Thanks for any help.
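What's in $PBS_HOME/mom_priv/config on that node? Does it have
$ideal_load or $max_load set? pbs_mom reports the node as busy once its
load average rises above $max_load, regardless of np, and only reports
it free again after the load drops below $ideal_load.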
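
A quick way to look for those two settings is sketched below. The
function name show_load_limits is made up for illustration, and the
fallback path /var/spool/torque is only an assumption about where
PBS_HOME lives on your install; pass the real config path if it differs.

```shell
# Print any $ideal_load / $max_load lines from a pbs_mom config file.
# show_load_limits is a hypothetical helper, not part of Torque.
# The default path assumes PBS_HOME=/var/spool/torque when unset.
show_load_limits() {
    cfg="${1:-${PBS_HOME:-/var/spool/torque}/mom_priv/config}"
    grep -E '^[[:space:]]*\$(ideal_load|max_load)[[:space:]]' "$cfg"
}
```

Run "show_load_limits" as root on callas99. No output means neither
threshold is set in that file, so the load limits are probably not the
cause.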
Also, check the jobs and make sure they're not asking for multiple
processors. Maui generally won't overcommit a node unless you
specifically tell it to do so, so if it thinks all the processors on the
node are allocated, it won't run anything else there.
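
To see what the running jobs actually requested, something along these
lines should do. It is only a sketch: requested_resources is a
hypothetical helper name, and it simply filters the full job listing
from "qstat -f" down to the processor-related Resource_List lines.

```shell
# Filter "qstat -f" output (read on stdin) down to the lines that show
# how many nodes/processors a job requested.
# requested_resources is a hypothetical helper, not part of Torque.
requested_resources() {
    grep -E '^[[:space:]]*Resource_List\.(nodes|ncpus|neednodes)'
}

# Usage on the server host, for one of the jobs listed on the node:
#   qstat -f 10994 | requested_resources
```

If the three jobs together turn out to ask for eight processors (for
example via ppn= in Resource_List.nodes), the node is fully allocated as
far as Maui is concerned, whatever the load average says.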
Troy Baer troy at osc.edu
Science & Technology Support http://www.osc.edu/hpc/
Ohio Supercomputer Center 614-292-9701