[torqueusers] 8-core node "busy" with only 3 jobs

Troy Baer troy at osc.edu
Thu Oct 5 08:38:41 MDT 2006


On Thu, 2006-10-05 at 12:37 +0200, Jacques Foury wrote:
> I've just added a quad-socket, dual-core Opteron node to my cluster.
> 
> I added this line to the nodes file:
> 
> callas99 np=8 opteron
> 
> and re-launched pbs_server.
> 
> Users submitted jobs, but once 3 jobs are running on the node, the
> remaining jobs stay queued and don't run:
> 
> 
> # pbsnodes -a callas99
> callas99
>      state = busy
>      np = 8
>      properties = opteron,callas
>      ntype = cluster
>      jobs = 0/10994.ulmo.math.u-bordeaux1.fr, 1/10995.ulmo.math.u-bordeaux1.fr, 2/10996.ulmo.math.u-bordeaux1.fr
>      status = opsys=linux,uname=Linux callas99 2.6.12-12mdksmp #1 SMP Fri Sep 9 17:20:34 CEST 2005 x86_64,sessions=17408 17523 17536,nsessions=3,nusers=1,idletime=1200524,totmem=33006508kb,availmem=32791272kb,physmem=32802532kb,ncpus=8,loadave=3.00,netload=2486994073,state=busy,jobs=10994.ulmo.math.u-bordeaux1.fr 10995.ulmo.math.u-bordeaux1.fr 10996.ulmo.math.u-bordeaux1.fr,rectime=1160044144
> 
> 
> Why is this node tagged "busy" with only 3 jobs running? In
> particular, I don't understand this part:
> 
> "ncpus=8,loadave=3.00,netload=2486994073,state=busy"
> 
> If ncpus=8 and loadave=3.00, it seems to me that the state should be
> free!
> 
> 
> 
> Maui's checkjob on a queued job says:
> 
> job cannot run in partition DEFAULT (idle procs do not meet
> requirements : 0 of 1 procs found)
> 
> Why does my np=8 not seem to be taken into account by the Maui/Torque
> system?
> 
> Did I forget something?
> 
> Thanks for any help.

What's in $PBS_HOME/mom_priv/config on that node?  Does it have
$ideal_load or $max_load set?
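
A quick way to check is something like the sketch below, which prints
whichever of the two thresholds are set. As I understand it, pbs_mom
reports the node as busy once its load average reaches $max_load and
only reports it free again after the load drops below $ideal_load, so a
$max_load of 3 or less would fit the loadave=3.00 / state=busy pair in
your output. The function name show_load_limits and the fallback path
/var/spool/torque are assumptions on my part, not something from your
setup; point it at your own $PBS_HOME.

```shell
# Sketch: print any load thresholds set in a pbs_mom config file.
# ASSUMPTION: /var/spool/torque is only a common default for PBS_HOME;
# pass the config path explicitly if your install differs.
show_load_limits() {
    config="${1:-${PBS_HOME:-/var/spool/torque}/mom_priv/config}"
    # Match lines such as "$max_load 3.0" or "$ideal_load 2.0".
    grep -E '^[[:space:]]*\$(ideal_load|max_load)[[:space:]]' "$config"
}

# Usage, run on the compute node (callas99 here):
#   show_load_limits
#   show_load_limits /path/to/mom_priv/config
```

If nothing is printed, neither threshold is set in that file and the
"busy" state is probably coming from somewhere else.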

Also, check the jobs and make sure they're not asking for multiple
processors.  Maui generally won't overcommit a node unless you
specifically tell it to do so, so if it thinks all the processors on the
node are allocated, it won't run anything else there.
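
To see what the three running jobs actually asked for, a filter along
these lines can be run over the full qstat listing. It is only a
sketch: job_requests is a name I made up, and it assumes the usual
"qstat -f" layout with a "Job Id:" line followed by indented
"Resource_List.*" attributes. A request with a ppn or ncpus value
greater than 1 would mean a job holds more than one of the node's 8
processors.

```shell
# Sketch: reduce `qstat -f` output (read from stdin) to the job id and
# the processor-related resource requests of each job.
# ASSUMPTION: the attribute names nodes, ncpus and nodect are the ones
# of interest here; extend the pattern if your jobs use other requests.
job_requests() {
    grep -E '^Job Id:|Resource_List\.(nodes|ncpus|nodect)[[:space:]]*='
}

# Usage on the server, with the job ids from the pbsnodes output:
#   qstat -f 10994 10995 10996 | job_requests
```

Maui's own view of the node can be compared against this with
"checknode callas99".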

	--Troy
-- 
Troy Baer                       troy at osc.edu
Science & Technology Support    http://www.osc.edu/hpc/
Ohio Supercomputer Center       614-292-9701


