[torqueusers] Do I have to define the ncpus for a compute node?

Gareth.Williams at csiro.au Gareth.Williams at csiro.au
Sat Jan 14 03:08:25 MST 2012


Hi Ryan,

Unset resources_default.nodes on the batch queue (in qmgr: unset queue batch resources_default.nodes). You don't need it.

The nodes resource is fighting with the procs resource. You should set only one or the other for a given job (setting neither is fine for serial tasks).
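
As a rough sketch, the change could be applied from the shell as follows. The run_qmgr wrapper and the DRY_RUN variable are made up for illustration and are not part of Torque. By default the script only prints the qmgr commands, so nothing changes until you set DRY_RUN=0 and run it as root on the pbs_server host.

```shell
#!/bin/sh
# Hypothetical helper: print each qmgr command instead of running it,
# unless DRY_RUN=0 is set in the environment.
run_qmgr() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf "qmgr -c '%s'\n" "$1"
    else
        qmgr -c "$1"
    fi
}

# Drop the queue-level default that conflicts with per-job requests.
run_qmgr 'unset queue batch resources_default.nodes'
# List the queue afterwards to confirm the attribute is gone.
run_qmgr 'list queue batch'
```

With the default removed, a job that asks for nothing is treated as a serial task, and a job that asks for nodes=...:ppn=... or procs=... is checked against that request alone.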

Gareth

From: Ryan Golhar [mailto:ngsbioinformatics at gmail.com]
Sent: Saturday, 14 January 2012 4:31 AM
To: Torque Users Mailing List
Subject: Re: [torqueusers] Do I have to define the ncpus for a compute node?

So that's what's throwing me off.  I already configured the queue using:

[root@bic database]# qmgr -c 'create queue batch'
[root@bic database]# qmgr -c 'set queue batch queue_type = execution'
[root@bic database]# qmgr -c 'set queue batch started = true'
[root@bic database]# qmgr -c 'set queue batch enabled = true'
[root@bic database]# qmgr -c 'set queue batch resources_default.nodes=1:ppn=1'

[root@bic database]# qmgr -c "set queue batch keep_completed=120"
[root@bic database]# qmgr -c "set server default_queue=batch"
[root@bic database]# qmgr -c "set server query_other_jobs = true"

I assumed, by default, if the user doesn't specify any resources, a job would consume 1 core on 1 node.  My nodes file shows:

[root@bic hg19]# cat /var/spool/torque/server_priv/nodes
compute-0-0 np=8
compute-0-1 np=8
compute-0-2 np=8
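
To make the slot count explicit, here is a small sketch that sums the np= values in a nodes file like the one above. The total_np function and the NODES_FILE override are hypothetical names used only for this example. The default path is the one shown in the cat command, and the script prints nothing if that file is not readable.

```shell
#!/bin/sh
# total_np FILE: sum the np=<count> fields of a Torque nodes file.
total_np() {
    awk '{ for (i = 2; i <= NF; i++)
               if ($i ~ /^np=[0-9]+$/) total += substr($i, 4) }
         END { print total + 0 }' "$1"
}

# Assumed location; override with NODES_FILE=/path/to/nodes if yours differs.
nodes_file=${NODES_FILE:-/var/spool/torque/server_priv/nodes}
if [ -r "$nodes_file" ]; then
    echo "total slots: $(total_np "$nodes_file")"
fi
```

For the three nodes listed above this should report 24 slots in total, 8 per node.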

So Torque knows there are 8 CPUs per node, and I haven't set a limit on how many resources a job can use.  To me, requesting 2 CPUs on 1 node should have succeeded.
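
For reference, a request for 2 cores on 1 node can be written as the job script sketched below. The queue name batch comes from the qmgr commands above. The job name ppn-test and the count_slots helper are made up for illustration. Inside a running job Torque sets PBS_NODEFILE to a file with one line per allocated core. Outside a job the script simply reports 0.

```shell
#!/bin/sh
#PBS -N ppn-test
#PBS -q batch
#PBS -l nodes=1:ppn=2

# count_slots FILE: print the number of lines in FILE, or 0 if it is unreadable.
count_slots() {
    if [ -n "$1" ] && [ -r "$1" ]; then
        echo $(( $(wc -l < "$1") ))
    else
        echo 0
    fi
}

echo "cores allocated: $(count_slots "${PBS_NODEFILE:-}")"
```

If the queue accepts the job, the output file should show 2 allocated cores. A rejection at qsub time points at the queue or server configuration rather than at the nodes.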

On Fri, Jan 13, 2012 at 11:18 AM, Axel Kohlmeyer <akohlmey at cmm.chem.upenn.edu> wrote:
On Fri, Jan 13, 2012 at 10:59 AM, Ryan Golhar <ngsbioinformatics at gmail.com> wrote:
> Hi - I have a ROCKS cluster running and installed Torque.  I'm able to
> submit 1 core, 1 cpu jobs without problem.  I tried submitting a job that
> requested 4 cpus on 1 node using
>
> #PBS -l nodes=1:ppn=4
>
> in my job submission script.  When I submit the job however, I get the
> error:
>
> qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
> (nodes file is empty or requested nodes exceed all systems)
>
> If I run pbsnodes, I see:
>
> compute-0-0
>      state = free
>      np = 8
>      ntype = cluster
>      status = rectime=1326469800,varattr=,jobs=,state=free,netload=1720539412488,gres=,loadave=0.01,ncpus=8,physmem=16431248kb,availmem=17311704kb,totmem=17451364kb,idletime=339141,nusers=0,nsessions=? 15201,sessions=? 15201,uname=Linux compute-0-0.local 2.6.18-238.19.1.el5 #1 SMP Fri Jul 15 07:31:24 EDT 2011 x86_64,opsys=linux
>      gpus = 0
>
>
> All my compute nodes have 8 cpus.  Do I need to tell Torque this?  I thought
> Torque could figure this out from np=8 or ncpus=8.

The error message says that the request exceeds the queue's resource limits.
That check happens before Torque looks at any nodes, so you probably have
to adjust the queue configuration.

axel.


>
> Ryan
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>



--
Dr. Axel Kohlmeyer    akohlmey at gmail.com
http://sites.google.com/site/akohlmey/

Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.