[torqueusers] Torque 4.1.4: Running jobs discrepancy

Joerg Blank j.blank at fz-juelich.de
Fri Jan 11 11:49:43 MST 2013


On 11.01.2013 18:17, David Beer wrote:

> If you don't think this is due to the above-described scenario, can you
> provide some more details of what happens to get into this state? How
> long does this state persist? Does it get cleaned up? Do you have
> messages about rejected job obituaries in the server logs?

I think I cleaned that up by fixing the job numbers in serverdb, which
were left over in a broken state from before the 2.5.x upgrade.

This happened on an array job with the max-run parameter "-t 0-100%20".
Those jobs did not run (I checked on the node), but some functions
counted them as using a slot (and they did not show up in others).
There were 20 jobs not on hold, but whenever one was scheduled, it could
not start because of the 20-job limit. This created another ghost job.
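
To illustrate the effect, here is a minimal Python sketch of a slot-limit
check. It is only a toy model of what I observed, not TORQUE's actual
code: the function names and the counters are made up for illustration,
and the parser only handles a simple "first-last%limit" spec.

```python
def parse_array_spec(spec):
    """Split a '-t' style spec such as '0-100%20' into (first, last, slot_limit)."""
    range_part, _, limit_part = spec.partition("%")
    first, last = (int(x) for x in range_part.split("-"))
    slot_limit = int(limit_part) if limit_part else None
    return first, last, slot_limit


def can_start(counted_as_running, slot_limit):
    """A sub-job may start only while the counted jobs are below the limit."""
    return slot_limit is None or counted_as_running < slot_limit


first, last, slot_limit = parse_array_spec("0-100%20")

# Hypothetical counters: the stale count includes 20 ghost jobs,
# while nothing is actually running on the nodes.
ghost_jobs = 20
really_running = 0

print(can_start(really_running, slot_limit))  # True: what should happen
print(can_start(ghost_jobs, slot_limit))      # False: the stale count blocks the start
```

As long as the stale count stays at the limit, no further sub-job of the
array can start, which matches what I saw.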

Regards,
Jörg Blank



