[torqueusers] Weird queue behavior.

Garrick Staples garrick at usc.edu
Wed Apr 19 18:31:05 MDT 2006


On Tue, Apr 18, 2006 at 08:52:30AM -0600, John Hanks alleged:
> Hi,
> 
> I have two queues, parallel and dedicated. Parallel is supposed to catch
> any job requesting less than nodes=32:ppn=4 and dedicated gets any job
> larger than that. But I'm getting weird behavior like this:
> 
> # A nine node, 4 processor per node job works:
> griznog@uinta ~ $ qsub -I -l nodes=9:ppn=4
> qsub: waiting for job 10349.uinta.hpc.usu.edu to start
> 
> # A ten node, 4 ppn job doesn't.
> griznog@uinta ~ $ qsub -I -l nodes=10:ppn=4
> qsub: Job rejected by all possible destinations
> 
> # However, a 20 node 2 ppn job does
> griznog@uinta ~ $ qsub -I -l nodes=20:ppn=2
> qsub: waiting for job 10351.uinta.hpc.usu.edu to start
> 
> What am I doing wrong here that allows > ~36 CPU jobs unless I pack all
> the processors on each node?
> 
> Queue configuration follows.
> 
> Thanks,
> 
> jbh
> 
> Qmgr: p q parallel
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue parallel
> #
> create queue parallel
> set queue parallel queue_type = Execution
> set queue parallel resources_max.nodect = 32
> set queue parallel resources_max.nodes = 32:ppn=4
> set queue parallel resources_max.walltime = 24:00:00
> set queue parallel resources_min.nodect = 1
> set queue parallel resources_min.nodes = 1:ppn=2
> set queue parallel resources_default.nodes = 1:ppn=2
> set queue parallel resources_default.walltime = 01:00:00
> set queue parallel resources_available.nodect = 62
> set queue parallel resources_available.nodes = 62:ppn=4
> set queue parallel max_user_run = 8
> set queue parallel enabled = True
> set queue parallel started = True
> Qmgr: p q dedicated
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue dedicated
> #
> create queue dedicated
> set queue dedicated queue_type = Execution
> set queue dedicated resources_max.nodect = 62
> set queue dedicated resources_max.nodes = 62:ppn=4
> set queue dedicated resources_max.walltime = 08:00:00
> set queue dedicated resources_min.nodect = 33
> set queue dedicated resources_min.nodes = 33:ppn=4
> set queue dedicated resources_default.nodes = 33:ppn=4
> set queue dedicated resources_default.walltime = 01:00:00
> set queue dedicated resources_available.nodect = 62
> set queue dedicated resources_available.nodes = 62
> set queue dedicated enabled = True
> set queue dedicated started = True

"nodes" is a string, not an integer, so it is only useful as a
default.  resources_min.nodes and resources_max.nodes don't have any
meaning.
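
As a rough sketch only, and assuming the two queues quoted above, the
string-valued limits could be dropped so that only the integer nodect
limits decide where a job lands.  Note that nodect counts nodes, not
processors, so ppn plays no part in this split; whether that matches
the intended policy is an assumption:

```
# Remove the string-valued min/max limits from both queues.
unset queue parallel resources_max.nodes
unset queue parallel resources_min.nodes
unset queue dedicated resources_max.nodes
unset queue dedicated resources_min.nodes
# The nodect limits already in the quoted config stay as they are:
#   parallel:  resources_min.nodect = 1,  resources_max.nodect = 32
#   dedicated: resources_min.nodect = 33, resources_max.nodect = 62
```

resources_default.nodes is left in place, since a default is the one
use of "nodes" that still makes sense.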

I don't see a routing queue or your server's default queue here.  Your
qsub examples above don't use -q, so I don't know which queue is being
used.
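
If the goal is for qsub without -q to pick between the two queues, one
possible setup is a routing queue set as the server default.  This is
only a sketch: the queue name "route" is made up for illustration, and
nothing in the quoted config shows what the server currently uses:

```
# Hypothetical routing queue named "route"; destinations are tried in order.
create queue route
set queue route queue_type = Route
set queue route route_destinations = parallel
set queue route route_destinations += dedicated
set queue route enabled = True
set queue route started = True
# Jobs submitted without -q go to the routing queue.
set server default_queue = route
```

Running "qmgr -c 'print server'" shows the current default_queue, and
"qstat -f" on a job ID shows which queue that job ended up in.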

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California