[Mauiusers] Questions regarding "tasks" and the cfg file.

Michael Homa mhoma at uic.edu
Wed Sep 17 14:21:43 MDT 2008

On Wed, 17 Sep 2008, Garrick Staples wrote:

> On Wed, Sep 17, 2008 at 11:12:41AM -0500, Michael Homa alleged:
> > In order to test, I wrote a simple hello_world program in C. When I submit
> > the program for execution, I see that the number of tasks is 6:
> >
> > Job ID    Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
> > 317       mhoma    dedicate hello_worl    --   1     6    --- 00:30  Q --
> What did the job actually request?  nodes=1:ppn=6?  ncpus=6?  Neither of those
> requests can be answered with quad-proc machines.

Hi Garrick:

I didn't have a "-l nodes" option in the script:

  #PBS -N hello_world
  #PBS -q dedicated

and did not specify a -l on qsub (qsub script). When I add the -l option:

  #PBS -N hello_world
  #PBS -q dedicated
  #PBS -l nodes=1:ppn=1

I get the same result:

  checking job 322

  State: Idle
  Creds:  user:mhoma  group:users  class:dedicated  qos:DEFAULT
  WallTime: 00:00:00 of 00:30:00
  SubmitTime: Wed Sep 17 14:45:29
    (Time Queued  Total: 00:01:41  Eligible: 00:01:41)

  Total Tasks: 1

  Req[0]  TaskCount: 1  Partition: ALL
  Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
  Opsys: [NONE]  Arch: [NONE]  Features: [dedicated]
  Dedicated Resources Per Task: PROCS: 6   <----------- I find this interesting
                                                        but where is it
                                                        getting it.
  PE:  6.00  StartPriority:  1
  job cannot run in partition DEFAULT (idle procs do not meet requirements :
  0 of 6 procs found)
  idle procs:  28  feasible procs:   0
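
For what it's worth, the arithmetic behind that "0 of 6 procs found" message
can be sketched in a few lines of Python. This is only my guess at the logic,
not Maui's actual code: the function names are made up for illustration, and
it assumes that a task has to fit entirely on one node. The sample text is
trimmed from the checkjob output above, and 4 is the np of each dedicated node.

```python
import re

# Trimmed copy of the checkjob output for job 322 (annotation removed).
CHECKJOB_TEXT = """\
Total Tasks: 1
Req[0]  TaskCount: 1  Partition: ALL
Dedicated Resources Per Task: PROCS: 6
idle procs:  28  feasible procs:   0
"""


def summarize(text):
    """Return (tasks, procs_per_task, total_procs) parsed from checkjob text."""
    tasks = int(re.search(r"TaskCount:\s*(\d+)", text).group(1))
    procs_per_task = int(re.search(r"PROCS:\s*(\d+)", text).group(1))
    return tasks, procs_per_task, tasks * procs_per_task


def max_tasks_per_node(procs_per_node, procs_per_task):
    """Assumption: a task cannot span nodes, so only whole tasks count."""
    return procs_per_node // procs_per_task


if __name__ == "__main__":
    tasks, per_task, total = summarize(CHECKJOB_TEXT)
    print("tasks:", tasks, "procs per task:", per_task, "total procs:", total)
    print("tasks that fit on an np=4 node:", max_tasks_per_node(4, per_task))
```

With 6 procs per task and np=4 nodes, no task fits on any single node, which
would line up with the "feasible procs: 0" line even though 28 procs are idle.
With 1 proc per task, four tasks would fit on each node.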

The only place I figure it may come from is the torque configuration
for the dedicated queue:

        resources_max.ncpus = 6

But my understanding from reading the queue configuration guide (and feel
free to tell me I'm full of crap) is that resources_max.ncpus is the
maximum number of processors a single job can request in the queue, not
the default number of processors allocated per job when the user does not
include a "-l nodes" argument.
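
To make my reading concrete, here is a small Python sketch of how I would
expect a maximum and a default to behave. It is a hypothetical model, not
Torque's code: the function name, the argument names, and the fallback of
1 processor when nothing is set are all my own assumptions.

```python
def effective_ncpus(requested=None, resources_default=None, resources_max=None):
    """Model of the expected behaviour: the default fills in a missing
    request, and the maximum only rejects requests that are too large."""
    ncpus = requested if requested is not None else resources_default
    if ncpus is None:
        ncpus = 1  # assumed fallback when neither a request nor a default exists
    if resources_max is not None and ncpus > resources_max:
        raise ValueError("request of %d exceeds resources_max of %d"
                         % (ncpus, resources_max))
    return ncpus


if __name__ == "__main__":
    # No -l option and no default, only resources_max.ncpus = 6.
    print(effective_ncpus(resources_max=6))
```

Under this model a job submitted without any -l option would end up with
1 processor rather than 6, which is not what checkjob is showing me.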

> > The dedicated queue has three dual CPU, dual cores and was established in
> > torque:
> >
> >   argo17-1 np=4 Linux2.i86pc dualcore amd smp dedicated
> >   argo18-2 np=4 Linux2.i86pc dualcore amd smp dedicated
> >   argo18-3 np=4 Linux2.i86pc dualcore amd smp dedicated

I've always wanted to ask this question. Does np refer to "real,
physical processors" or does it refer to the total number of cores?
If the former, then argo17-1 should be:
  argo17-1 np=2:ppn=2 Linux2.i86pc dualcore amd smp dedicated

If the latter, then:
  argo17-1 np=4 Linux2.i86pc dualcore amd smp dedicated
is correct.

> Don't change the number of CPUs in a task.  Down that road lies madness.

ok. Technically "done that road lies more madness."

> >    2) I'm unclear as to how the "task" number is derived? I noticed that
> >       my hello_world has a PE of 6. Is that a coincidence or does the
> >       resulting PE become the number of tasks? Why six processors for
> >       hello_world?
> We would need to see that actual request.

I'm not being funny, but how does one get the request? From the checkjob


And, I don't want to forget to say, thank you for your help.

