[torquedev] PPN/Node state bug?

Simon Toth SimonT at mail.muni.cz
Tue Sep 15 09:56:04 MDT 2009


> Until all processes are busy pbsnodes will report the node as free. This
> is because there are available processes free for that node. Once all
> nodes are used the state becomes job-exclusive because at this point
> there are no processes available.

I have re-read the code and the documentation for Torque and I have to
confirm that this is a bug.

According to the documentation, an exclusive job is a job that has been
assigned exclusively to one node (which won't run any other jobs during
that time), and such a node will have the state job-exclusive regardless
of how many of its CPUs are free or in use.

Even more seriously, the server will run an exclusive job and a sharing
job on the same node. The final node state (job-exclusive or job-sharing)
depends on the order in which the jobs are run.

To reproduce:
- start the server
- start one mom
- set np for that mom's node to >= 2
- submit an exclusive job
- submit a sharing job
- qrun exclusive-job-id
- qrun sharing-job-id

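The node will end up in the job-sharing state, even though the exclusive
job is still running on it. Swap the order of the two qrun calls and the
node ends up in the job-exclusive state instead.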
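
To make the order dependence concrete, here is a small Python toy model.
It is only a sketch of the behaviour described above, not Torque code:
the Node class and the functions run_job_observed, run_job_documented and
final_state are hypothetical names made up for illustration. The
"observed" variant assumes the node state is simply overwritten by
whichever job was started last, which is what the reproduction suggests;
the "documented" variant refuses to mix an exclusive job with any other
job on the same node.

```python
from dataclasses import dataclass, field

FREE = "free"
JOB_EXCLUSIVE = "job-exclusive"
JOB_SHARING = "job-sharing"


@dataclass
class Node:
    """Toy stand-in for a compute node; not a Torque data structure."""
    np: int
    state: str = FREE
    jobs: list = field(default_factory=list)


def run_job_observed(node, job_id, exclusive):
    """Assumed model of the reported bug: no exclusivity check is made,
    and the last job started overwrites the node state."""
    if len(node.jobs) >= node.np:
        raise RuntimeError("no free processors")
    node.jobs.append(job_id)
    node.state = JOB_EXCLUSIVE if exclusive else JOB_SHARING


def run_job_documented(node, job_id, exclusive):
    """Model of the documented rule: an exclusive job never shares a node."""
    if node.state == JOB_EXCLUSIVE:
        raise RuntimeError("node is held by an exclusive job")
    if exclusive and node.jobs:
        raise RuntimeError("node already runs other jobs")
    if len(node.jobs) >= node.np:
        raise RuntimeError("no free processors")
    node.jobs.append(job_id)
    node.state = JOB_EXCLUSIVE if exclusive else JOB_SHARING


def final_state(run_job, order):
    """Start the jobs in the given order on a fresh np=2 node and
    return the resulting node state."""
    node = Node(np=2)
    for job_id, exclusive in order:
        run_job(node, job_id, exclusive)
    return node.state


if __name__ == "__main__":
    excl_first = [("exclusive-job", True), ("sharing-job", False)]
    share_first = [("sharing-job", False), ("exclusive-job", True)]
    print(final_state(run_job_observed, excl_first))
    print(final_state(run_job_observed, share_first))
```

With run_job_observed the two orders give different final states, which
mirrors the qrun experiment. With run_job_documented the second job is
rejected in either order, so the state can no longer depend on ordering.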

-- 
Mgr. Simon Toth
CESNET z.s.p.o.
Zikova 4
160 00 Praha 6
Czech Republic
