[torqueusers] Do I have to define the ncpus for a compute node?

Ryan Golhar ngsbioinformatics at gmail.com
Sat Jan 14 06:48:18 MST 2012


Thanks Gareth.  I removed that setting, using

qmgr -c 'unset queue batch resources_default.nodes'

but I'm still getting the same error. I can submit jobs that request 1-3
ppn, but not 4 ppn.
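
For what it's worth, a quick way to check whether some other limit is still
capping the request (a sketch only, assuming a stock Torque install with qmgr
and pbsnodes on the path) is to dump the full queue and server configuration
and the node list:

qmgr -c 'print queue batch'
qmgr -c 'print server'
pbsnodes -a

A leftover resources_max.* or resources_available.* entry in that output would
be one possible explanation for a rejection at ppn=4 even after the default
nodes setting is gone.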



On Sat, Jan 14, 2012 at 5:08 AM, <Gareth.Williams at csiro.au> wrote:

> Hi Ryan,
>
> Unset queue batch resources_default.nodes - you don't need that.
>
> The nodes resource is fighting with the procs resource. You need to set
> only one or the other for a given job (for a serial task you don't need
> either).
>
> Gareth
>
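To make Gareth's point concrete, a minimal sketch of the two mutually
exclusive request forms (use one or the other in a given job script, not
both):

  4 processors, all on one node:
      #PBS -l nodes=1:ppn=4

  4 processors, placed wherever the scheduler can fit them:
      #PBS -l procs=4
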
>
> From: Ryan Golhar [mailto:ngsbioinformatics at gmail.com]
> Sent: Saturday, 14 January 2012 4:31 AM
> To: Torque Users Mailing List
> Subject: Re: [torqueusers] Do I have to define the ncpus for a compute
> node?
>
>
> So that's what's throwing me off.  I already configured the queue using:
>
> [root at bic database]# qmgr -c 'create queue batch'
> [root at bic database]# qmgr -c 'set queue batch queue_type = execution'
> [root at bic database]# qmgr -c 'set queue batch started = true'
> [root at bic database]# qmgr -c 'set queue batch enabled = true'
> [root at bic database]# qmgr -c 'set queue batch resources_default.nodes=1:ppn=1'
>
> [root at bic database]# qmgr -c "set queue batch keep_completed=120"
> [root at bic database]# qmgr -c "set server default_queue=batch"
> [root at bic database]# qmgr -c "set server query_other_jobs = true"
>
> I assumed, by default, if the user doesn't specify any resources, a job
> would consume 1 core on 1 node.  My nodes file shows:
>
> [root at bic hg19]# cat /var/spool/torque/server_priv/nodes
> compute-0-0 np=8
> compute-0-1 np=8
> compute-0-2 np=8
>
> So Torque knows there are 8 cpus per node, and I haven't set a maximum
> limit on how many resources a job could use.  To me, requesting 2 cpus on 1
> node should have succeeded.
>
>
> On Fri, Jan 13, 2012 at 11:18 AM, Axel Kohlmeyer <
> akohlmey at cmm.chem.upenn.edu> wrote:
>
> On Fri, Jan 13, 2012 at 10:59 AM, Ryan Golhar
> <ngsbioinformatics at gmail.com> wrote:
> > Hi - I have a ROCKS cluster running and installed Torque.  I'm able to
> > submit 1 core, 1 cpu jobs without problem.  I tried submitting a job that
> > requested 4 cpus on 1 node using
> >
> > #PBS -l nodes=1:ppn=4
> >
> > in my job submission script.  When I submit the job however, I get the
> > error:
> >
> > qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes
> > (nodes file is empty or requested nodes exceed all systems)
> >
> > If I run pbsnodes, I see:
> >
> > compute-0-0
> >      state = free
> >      np = 8
> >      ntype = cluster
> >      status = rectime=1326469800,varattr=,jobs=,state=free,
> >        netload=1720539412488,gres=,loadave=0.01,ncpus=8,physmem=16431248kb,
> >        availmem=17311704kb,totmem=17451364kb,idletime=339141,nusers=0,
> >        nsessions=? 15201,sessions=? 15201,uname=Linux compute-0-0.local
> >        2.6.18-238.19.1.el5 #1 SMP Fri Jul 15 07:31:24 EDT 2011 x86_64,opsys=linux
> >      gpus = 0
> >
> >
> > All my compute nodes have 8 cpus.  Do I need to tell Torque this?  I
> > thought Torque could figure this out from np=8 or ncpus=8.
>
> the error message says that the request exceeds the queue configuration.
> that is being checked before it looks at any nodes. thus you probably have
> to adjust the queue configuration.
>
> axel.
>
>
> >
> > Ryan
> >
>
>
>
> --
> Dr. Axel Kohlmeyer    akohlmey at gmail.com
> http://sites.google.com/site/akohlmey/
>
> Institute for Computational Molecular Science
> Temple University, Philadelphia PA, USA.
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>

