[Mauiusers] maui not scheduling when no resources available
Jan.Ploski at offis.de
Thu Dec 13 05:59:59 MST 2007
mauiusers-bounces at supercluster.org wrote on 12/13/2007 01:05:06 PM:
> Something similar happened when requesting hosts with "slc3 && slc4":
> no nodes fit that condition and maui hung.
> So, is it a bug? Is anyone having the same problem? Any workaround?
I once had the same kind of problem: a job stuck at the front of the
queue prevented other jobs from executing even though checkjob reported
"can run" for them. In my case, it was due to an inconsistency between
Maui's and TORQUE's view of the available resources. Maui kept trying to
assign a job to an already occupied resource because it thought the jobs
running there each used 0 processors, and TORQUE rejected these attempts.
Maybe the output of diagnose -n <name of the offline node>, diagnose -j
<job id>, and diagnose -r will provide additional clues?
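
If it helps, the checks above could be collected in one go with a small
script along these lines. This is only a sketch: the function names and
the node name and job id in the usage note are made up for illustration,
and it assumes the Maui client commands (checkjob, diagnose) are on the
PATH of the machine where it runs.

```python
import shutil
import subprocess


def diagnostic_commands(node, job_id):
    """Build the Maui commands mentioned above for one node and one job.

    `node` and `job_id` are placeholders supplied by the caller.
    """
    return [
        ["checkjob", job_id],
        ["diagnose", "-n", node],
        ["diagnose", "-j", job_id],
        ["diagnose", "-r"],
    ]


def run_diagnostics(node, job_id):
    """Run each command and return its combined output keyed by command line.

    A command whose executable is not found on PATH maps to None
    instead of raising, so the rest of the report is still produced.
    """
    results = {}
    for cmd in diagnostic_commands(node, job_id):
        key = " ".join(cmd)
        if shutil.which(cmd[0]) is None:
            results[key] = None
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[key] = proc.stdout + proc.stderr
    return results
```

For example, run_diagnostics("node01", "1234") (both values hypothetical)
returns a dictionary whose entries can be saved and compared with what
TORQUE reports for the same node.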
Another thing you might try is setting the node 'down' (kill pbs_mom
on it) rather than 'offline' to see whether that changes anything.