[torqueusers] only one processor is used when using qsub -l procs flag

Xiangqian Wang jascha.wang at gmail.com
Mon Jan 16 23:56:29 MST 2012


The processor allocation when requesting 'nodes=1:ppn=3' is correct, both
with Maui 3.3.1 and Maui 3.2.6p21.

I tried adding "JOBNODEMATCHPOLICY EXACTNODE" to the Maui 3.3.1 config file,
but the processor allocation for the "procs" syntax is still one. I compared
the config files of Maui 3.3.1 and Maui 3.2.6p21 and saw no difference
between them.

Probably I should use Maui 3.2.6p21 for the moment if I want to submit jobs
with the "procs" syntax.

BTW, I'm weighing the strengths and weaknesses of the "procs" syntax: I
don't want to have to think about the hardware configuration and its
current usage, but maybe that convenience comes at some cost in
performance.
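
For reference, a minimal sketch of the two request styles [job.sh is just a
placeholder name for any job script]:

  # let the scheduler place the 3 processors wherever it likes
  qsub -l procs=3 job.sh

  # ask explicitly for 3 processors on a single node
  qsub -l nodes=1:ppn=3 job.sh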

Thank you, Gustavo!

Xiangqian


2012/1/16 Gustavo Correa <gus at ldeo.columbia.edu>

> PS - Hi Xiangqian.
>
> Maybe you need to add this line to your maui.cfg [and restart Maui]
> for the 'procs=Z' syntax to work as you expect:
>
> JOBNODEMATCHPOLICY EXACTNODE
>
> I *think* the default is
>
> JOBNODEMATCHPOLICY EXACTPROC
>
> which expects your node to have the exact number of processors you
> requested [i.e. 3].
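>
> As a minimal sketch [how you restart Maui depends on the install; a SysV
> init script is just one common setup]:
>
>   # add to maui.cfg:
>   JOBNODEMATCHPOLICY EXACTNODE
>
>   # then restart the scheduler so it rereads the config, e.g.:
>   /etc/init.d/maui restart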
>
> See Appendix F of the Maui Administrator's Guide for details.
>
> I am not sure, but my recollection is that somebody reported a similar
> problem on the list before, and this was the suggested solution.
>
> I hope this helps,
> Gus Correa
>
> On Jan 16, 2012, at 10:21 AM, Gustavo Correa wrote:
>
> > Hi Xiangqian
> >
> > For what it is worth, I use Maui 3.2.6p21, and I don't have the problem
> > you described.
> > I don't know the behavior in Maui 3.3.1, but as you reported, 3.2.6p21
> > also works correctly for you with the 'nodes=1:ppn=3' syntax.
> > I am happy with 3.2.6p21.
> >
> > There is still a chance that a change in the Maui 3.3.1 maui.cfg may fix
> > this glitch, but I don't know what it would be.  Most likely it has to do
> > with the node allocation policies, and how Maui translates 'procs' into
> > nodes and ppn.
> > Somebody on the list more savvy than I am may be able to clarify this point.
> >
> > I confess I prefer the more detailed 'nodes=X:ppn=Y' syntax,
> > because it is explicit about the resources you are requesting,
> > and apparently it avoids the issue that hit you.
> >
> > Have you tried the 'nodes=1:ppn=3' syntax in Maui 3.3.1?
> > I wonder if it would work there too.
> >
> > I hope this helps,
> > Gus Correa
> >
> >
> > On Jan 16, 2012, at 1:43 AM, Xiangqian Wang wrote:
> >
> >> thanks, Gustavo
> >>
> >> Sorry for the misspelling in my previous email; I rechecked it and
> >> corrected it as follows:
> >>
> >> I tested Torque 2.5.8 and Maui 3.3.1 on a CentOS 6.0 node. The job
> >> script is:
> >>
> >> #!/bin/sh
> >> #PBS -N procsjob
> >> #PBS -l procs=3
> >> #PBS -q batch
> >> ping localhost -c 100
> >>
> >> and qstat outputs "exec_host = snode02/0".
> >> I replaced it with the new job script:
> >>
> >> #!/bin/sh
> >> #PBS -N procsjob
> >> #PBS -l nodes=1:ppn=3
> >> #PBS -q batch
> >> ping localhost -c 100
> >> and qstat outputs "exec_host = snode02/2+snode02/1+snode02/0".
> >>
> >> I switched from Maui 3.3.1 to Maui 3.2.6p21 and tested again; qstat
> >> outputs "exec_host = snode02/2+snode02/1+snode02/0" for both scripts.
> >> Maybe it's a Maui 3.3.1 problem?
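> >>
> >> In case it is useful, this is how I check the allocation [33 is just
> >> the id of my test job; 'qstat -f <jobid>' prints the full attributes]:
> >>
> >>   qstat -f 33 | grep exec_host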
> >>
> >>
> >> 2012/1/14 Gustavo Correa <gus at ldeo.columbia.edu>
> >> Hi Xiangqian
> >>
> >> Is it a typo in your email or did you comment out this line in your
> Torque/PBS script?
> >> [Note the double hash ##.]
> >>
> >>> ##PBS -l procs=3
> >>
> >> Have you tried this form instead?
> >>
> >> #PBS -l nodes=1:ppn=3
> >>
> >> For more details check 'man qsub' and 'man pbs_resources'.
> >>
> >> I hope it helps,
> >> Gus Correa
> >>
> >> On Jan 13, 2012, at 4:10 AM, Xiangqian Wang wrote:
> >>
> >>> My demo Torque+Maui cluster has one node with np=4 set for it. I want
> >>> to submit a job requesting 3 processors, but when it starts running, I
> >>> see only one processor being used (qstat shows "exec_host = snode02/0").
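> >>>
> >>> For reference, the node is declared in the TORQUE nodes file roughly
> >>> like this [the exact path depends on the install prefix]:
> >>>
> >>>   # $TORQUE_HOME/server_priv/nodes
> >>>   snode02 np=4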
> >>>
> >>> I use Torque 2.5.6 and Maui 3.3.1. If anyone can help me out, it'll be
> >>> greatly appreciated.
> >>>
> >>> The submit script is something like:
> >>>
> >>> #!/bin/sh
> >>> #PBS -N procsjob
> >>> ##PBS -l procs=3
> >>> #PBS -q batch
> >>> The output of checkjob is:
> >>>
> >>> checking job 33
> >>> State: Running
> >>> Creds:  user:wangxq  group:wangxq  class:batch  qos:DEFAULT
> >>> WallTime: 00:00:00 of 1:00:00
> >>> SubmitTime: Fri Jan 13 17:07:43
> >>>  (Time Queued  Total: 00:00:01  Eligible: 00:00:01)
> >>> StartTime: Fri Jan 13 17:07:44
> >>> Total Tasks: 1
> >>> Req[0]  TaskCount: 1  Partition: DEFAULT
> >>> Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
> >>> Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
> >>> Exec:  ''  ExecSize: 0  ImageSize: 0
> >>> Dedicated Resources Per Task: PROCS: 1
> >>> Utilized Resources Per Task:  [NONE]
> >>> Avg Util Resources Per Task:  [NONE]
> >>> Max Util Resources Per Task:  [NONE]
> >>> NodeAccess: SHARED
> >>> NodeCount: 0
> >>> Allocated Nodes:
> >>> [snode02:1]
> >>> Task Distribution: snode02
> >>>
> >>> IWD: [NONE]  Executable:  [NONE]
> >>> Bypass: 0  StartCount: 1
> >>> PartitionMask: [ALL]
> >>> Flags:       RESTARTABLE
> >>> Reservation '33' (00:00:00 -> 1:00:00  Duration: 1:00:00)
> >>> PE:  1.00  StartPriority:  1