[torqueusers] Submitting jobs to use multiprocessors.

hitesh chugani hiteshschugani at gmail.com
Thu Mar 20 12:30:59 MDT 2014


Hi Gus,


Did you create a $TORQUE/pbs_server/nodes file? *Yes*

What are the contents of that file?

<node1> np=2
<node2> np=2
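
A quick sanity check on that file, as a minimal sketch only. It assumes the usual one-node-per-line layout (name followed by np=N) and treats a missing np= as np=1; the helper name nodes_with_np is made up for illustration. It counts how many nodes advertise at least a given np.

```shell
#!/bin/sh
# Hypothetical helper: count nodes in a Torque nodes file with np >= $2.
# Assumption: one node per line, "<name> np=<N> [properties...]";
# a line without np= is treated as np=1.
nodes_with_np() {
    awk -v want="$2" '
        /^[ \t]*(#|$)/ { next }               # skip comments and blank lines
        {
            np = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^np=[0-9]+$/) { sub(/^np=/, "", $i); np = $i + 0 }
            if (np >= want) count++
        }
        END { print count + 0 }
    ' "$1"
}

# Example call (path as quoted in this thread):
#   nodes_with_np "$TORQUE/pbs_server/nodes" 2
```

With both nodes at np=2, a count of 2 for ppn=2 would mean nodes=2:ppn=2 fits as far as the nodes file is concerned.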

What is the output of "pbsnodes -a"?

<node1>
     state = free
     np = 2
     ntype = cluster
     status = rectime=1395339913,varattr=,jobs=,state=free,netload=8159659934,gres=,loadave=0.00,ncpus=2,physmem=3848508kb,availmem=15671808kb,totmem=16300340kb,idletime=89,nusers=2,nsessions=22,sessions=2084 2619 2839 2855 2873 2877 2879 2887 2889 2916 2893 2891 3333 6665 3053 8036 25960 21736 22263 23582 26141 30680,uname=Linux lws81 2.6.18-371.4.1.el5 #1 SMP Wed Jan 8 18:42:07 EST 2014 x86_64,opsys=linux
     mom_service_port = 15002
     mom_manager_port = 15003

<node2>
     state = free
     np = 2
     ntype = cluster
     status = rectime=1395339913,varattr=,jobs=,state=free,netload=2817775035,gres=,loadave=0.00,ncpus=8,physmem=16265764kb,availmem=52900464kb,totmem=55259676kb,idletime=187474,nusers=3,nsessions=4,sessions=11923 17547 20030 29392,uname=Linux lws10.uncc.edu 2.6.18-371.4.1.el5 #1 SMP Wed Jan 8 18:42:07 EST 2014 x86_64,opsys=linux
     mom_service_port = 15002
     mom_manager_port = 15003


Did you enable scheduling in the pbs_server? *Maui is enabled*
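
To tell "Maui is running" apart from "pbs_server has scheduling switched on", a minimal sketch along these lines may help. The function name sched_report is hypothetical; it only calls qmgr (Torque) and checkjob (Maui) when they are found in PATH, and the job id argument is optional.

```shell
#!/bin/sh
# Hypothetical helper: show the pbs_server scheduling attribute and,
# if a job id is given, Maui's view of that job.
# Each tool is skipped with a note when it is not in PATH.
sched_report() {
    jobid="$1"   # optional job id, as printed by qsub
    if command -v qmgr >/dev/null 2>&1; then
        # Look for the scheduling attribute in the server configuration.
        qmgr -c 'print server' | grep -i 'scheduling'
    else
        echo "qmgr: not found"
    fi
    if [ -n "$jobid" ]; then
        if command -v checkjob >/dev/null 2>&1; then
            # Maui reports why a job is idle, deferred, or blocked.
            checkjob -v "$jobid"
        else
            echo "checkjob: not found"
        fi
    fi
}
```

If the scheduling attribute turns out to be false, it can be set with qmgr -c 'set server scheduling = true'.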


Did you keep the --enable-cpuset configuration option? *No. I have disabled
it*


I am able to run single-processor jobs on one or two nodes (nodes=1:ppn=1
and nodes=2:ppn=1). But when I try to run multiprocessor jobs (nodes=2:ppn=2,
with the nodes having 2 and 8 ncpus), the job remains in the queue. I am
able to force the job to run via qrun. I am using the Maui scheduler.
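
For reference, a minimal sketch of a job script for this kind of request. The job name and the report_slots helper are placeholders for illustration, not the exact script in use here. It relies only on PBS_NODEFILE, which Torque sets inside a job to a file with one line per allocated processor slot.

```shell
#!/bin/sh
#PBS -N multiproc_test
#PBS -l nodes=2:ppn=2
#PBS -j oe

# Hypothetical helper: report how many slots a node file lists,
# then the number of slots per host.
report_slots() {
    slots=$(wc -l < "$1" | tr -d ' ')
    echo "allocated slots: $slots"
    sort "$1" | uniq -c
}

# Outside a job PBS_NODEFILE is unset, so fall back to /dev/null
# (which simply reports zero slots).
report_slots "${PBS_NODEFILE:-/dev/null}"
```

For nodes=2:ppn=2 the node file would be expected to hold four lines, two per host, once the job actually starts.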


Please help.


Thanks,
Hitesh chugani.





On Mon, Mar 17, 2014 at 7:35 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:

> Hi Hitesh
>
> Did you create a $TORQUE/pbs_server/nodes file?
> What are the contents of that file?
> What is the output of "pbsnodes -a"?
>
> Make sure the nodes file is there.
> If not, create it again, and restart pbs_server.
>
> Did you enable scheduling in the pbs_server?
>
> Also:
>
> Did you keep the --enable-cpuset configuration option?
> If you did:
> Do you have a /dev/cpuset directory on your nodes?
> Do you have a type cpuset filesystem mounted on /dev/cpuset
> on the nodes?
>
> Check this link:
>
>
> http://docs.adaptivecomputing.com/torque/Content/topics/3-nodes/linuxCpusetSupport.htm
>
> Still in the topic of cpuset:
>
> Are you perhaps running cgroups on the nodes (the cgconfig service)?
>
> I hope this helps,
> Gus Correa
>
> On 03/17/2014 05:45 PM, hitesh chugani wrote:
> > Hello,
> >
> > I have reconfigured torque to disable NUMA support. I am able to run
> > a single-node, single-processor job (nodes=1:ppn=1). But when I try to
> > run multiprocessor jobs (nodes=2:ppn=2, with nodes having 2 and 8 ncpus),
> > the job remains in the queue. I am able to force the job to run via
> > qrun. I am using the Maui scheduler. Can anyone please tell me what may be
> > the issue? Is it something to do with the Maui scheduler? Thanks.
> >
> > Regards,
> > Hitesh Chugani.
> >
> >
> > On Mon, Mar 17, 2014 at 12:40 PM, hitesh chugani
> > <hiteshschugani at gmail.com> wrote:
> >
> >     I tried the nodes=X:ppn=Y option. It still didn't work. I guess it has
> >     something to do with the NUMA option being enabled. I am looking into
> >     this issue and will let you guys know. Thanks a lot.
> >
> >
> >
> >     On Thu, Mar 13, 2014 at 10:22 AM, Ken Nielson
> >     <knielson at adaptivecomputing.com> wrote:
> >
> >         Glen is right. There is a regression with procs.
> >
> >
> >         On Wed, Mar 12, 2014 at 5:29 PM, <glen.beane at gmail.com> wrote:
> >
> >             I think there is a regression in Torque and procs only works
> >             with Moab now. Try nodes=X:ppn=Y
> >
> >
> >             On Mar 12, 2014, at 6:26 PM, hitesh chugani
> >             <hiteshschugani at gmail.com> wrote:
> >
> >>             Hi all,
> >>
> >>
> >>             I am trying to submit a job that uses multiple
> >>             processors (I added #PBS -l procs=4 to the job script),
> >>             but the job remains queued forever. I am using 2
> >>             compute nodes (ncpus=8 and 2). Any idea why it is not
> >>             running? Please help.
> >>
> >>             I have installed torque using this configuration option.
> >>             *./configure --enable-unixsockets --enable-cpuset
> >>             --enable-geometry-requests --enable-numa-support *
> >>
> >>
> >>
> >>
> >>             Thanks,
> >>             Hitesh Chugani.
> >>             Student Linux specialist
> >>             University of North Carolina, Charlotte
> >>             _______________________________________________
> >>
> >>             torqueusers mailing list
> >>             torqueusers at supercluster.org
> >>             http://www.supercluster.org/mailman/listinfo/torqueusers
> >
> >         --
> >         Ken Nielson
> >         +1 801.717.3700 office +1 801.717.3738 fax
> >         1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
> >         www.adaptivecomputing.com