[torquedev] PPN/Node state bug?

Simon Toth SimonT at mail.muni.cz
Tue Sep 15 03:05:01 MDT 2009


>> Until all processes are busy pbsnodes will report the node as free. This
>> is because there are available processes free for that node. Once all
>> processes are used the state becomes job-exclusive because at this point
>> there are no processes available.
> 
> Can you elaborate on the difference between shared and exclusive then?
> 
> Torque doesn't support cpus (at least I don't know about it), so, if I
> have a 5cpu MOM I set the np to 5 on the server, and then request the
> cpus using ppn, when I submit jobs.
> 
> Logically I would assume that if the job is exclusive and I request
> ppn=3 and my job is assigned to a MOM with np=10 no other jobs will run
> on this MOM.
> 
> I have added support for npfree to the server, so the scheduler can see
> how many cpus are still free on the MOM, but because the server reports
> the MOM as free (even if there is an exclusive job), I can't distinguish
> between shared and exclusive jobs and run all as shared.
> 
> [please clarify where I'm going wrong]

After reading the code, it's clear to me that the server actually works
this way:

exclusive = per sub-mom/process exclusive
shared = non-exclusive (any amount can run on a sub-mom)

For the setup I'm developing the scheduler for, shared is completely
unusable and exclusive actually means shared.
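
To make the behaviour described above concrete, here is a toy model in
Python. It is only a sketch of my reading, not Torque code: ToyNode,
run_exclusive and state are made-up names, and the model assumes that an
"exclusive" job claims exactly ppn slots and that the node is reported
as free while any slot is unclaimed.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Toy model of one MOM with `np` process slots (hypothetical, not Torque code)."""
    np: int
    used: int = 0  # slots currently claimed by "exclusive" jobs

    def state(self) -> str:
        # Reported as free while at least one slot is still unclaimed.
        return "free" if self.used < self.np else "job-exclusive"

    def run_exclusive(self, ppn: int) -> bool:
        # "exclusive" here reserves only ppn slots, not the whole MOM.
        if self.np - self.used < ppn:
            return False
        self.used += ppn
        return True


node = ToyNode(np=10)
node.run_exclusive(3)
print(node.state())           # 7 slots are still unclaimed
print(node.run_exclusive(7))  # a second job is placed on the same MOM
print(node.state())           # all 10 slots are now claimed
```

In this model a ppn=3 job on an np=10 MOM leaves the node "free", and a
second job can still be placed on it, which is why exclusive behaves
like shared from the scheduler's point of view.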

So, before I jump into implementing the exclusive functionality, I
just want to make sure that Torque doesn't support exclusive jobs/nodes
in the sense of: "no other job will run on the same MOM while an
exclusive job is running".
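
For clarity, the semantics I have in mind could be sketched roughly like
this in Python. Everything here is hypothetical (the Mom class,
can_accept, start and the node_exclusive flag are illustration only);
npfree is modelled on the counter I mentioned earlier, but this is not
how the server computes it.

```python
from dataclasses import dataclass, field


@dataclass
class Mom:
    """Hypothetical scheduler-side view of one MOM."""
    np: int
    jobs: list = field(default_factory=list)  # (ppn, node_exclusive) pairs

    @property
    def npfree(self) -> int:
        # Slots not claimed by any running job.
        return self.np - sum(ppn for ppn, _ in self.jobs)

    def can_accept(self, ppn: int, node_exclusive: bool) -> bool:
        if any(excl for _, excl in self.jobs):
            return False  # a running exclusive job owns the whole MOM
        if node_exclusive and self.jobs:
            return False  # an exclusive job needs an otherwise empty MOM
        return ppn <= self.npfree

    def start(self, ppn: int, node_exclusive: bool) -> bool:
        if not self.can_accept(ppn, node_exclusive):
            return False
        self.jobs.append((ppn, node_exclusive))
        return True


mom = Mom(np=10)
print(mom.start(3, node_exclusive=True))   # accepted on the empty MOM
print(mom.npfree)                          # slots left over
print(mom.start(1, node_exclusive=False))  # rejected despite free slots
```

The point of the sketch is that free slots alone are not enough for
placement: the scheduler also has to know whether a running job holds
the MOM exclusively.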

-- 
Mgr. Simon Toth
CESNET z.s.p.o.
Zikova 4
160 00 Praha 6
Czech Republic

