[torqueusers] all jobs in Q state

Glen Beane glen.beane at gmail.com
Thu Feb 22 10:04:19 MST 2007


On 2/22/07, Krause, Roland <Roland.Krause at amtc-dresden.com> wrote:
> Hi all,
>
> today I have the following problem with torque. We have a 40-node cluster
> with 4 time-shared and 36 cluster nodes.
> All jobs I submit to torque are going to state 'Q', even if I request a
> definitely free node.
>
> qsub -V
> to get a time-shared node does not work, although they are all "free".
> I thought that if I request a time-shared node, I will always
> immediately get a running session on such a node?
>
> qsub -V -l nodes=nameOfFreeNode
> to get a cluster node does not work, although this node is "free" as
> well.
>
> Comment is: Not running: Draining system to allow starving job to run
>
> Quick help would be highly appreciated
> Regards,
> Roland

Are you running the FIFO scheduler that comes with TORQUE? It looks
like you have a job in your queue that has waited longer than the
scheduler's starvation threshold, so it is now considered starving.
The FIFO scheduler will not run any new jobs until it has enough free
nodes to run the starving job, even if it could backfill those jobs
onto nodes that are currently free without pushing back the start time
of the starving job.
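
To make the draining behaviour concrete, here is a toy model of one
scheduling cycle. It is only a sketch of the policy described above, not
the actual pbs_sched code; the function name, the job tuples and the
max_starve parameter are all made up for illustration.

```python
# Toy model of a "drain for the starving job" policy.
# Not TORQUE code: every name here is hypothetical.

def schedule(queue, free_nodes, max_starve):
    """Return the names of the jobs started in this cycle.

    queue      -- list of (name, nodes_needed, seconds_waited), oldest first
    free_nodes -- number of nodes currently free
    max_starve -- wait time in seconds after which a job counts as starving
    """
    starving = [job for job in queue if job[2] > max_starve]
    if starving:
        name, need, _ = starving[0]
        # Drain: either the starving job fits now, or nothing starts at all,
        # even if smaller jobs would fit on the free nodes.
        return [name] if need <= free_nodes else []

    # No starving job: start whatever fits, in queue order.
    started = []
    for name, need, _ in queue:
        if need <= free_nodes:
            started.append(name)
            free_nodes -= need
    return started


# A 40-node job that has waited over a day blocks a 1-node job,
# although 30 nodes are free.
blocked = schedule([("big", 40, 90000), ("small", 1, 10)], 30, 86400)

# The same queue before "big" has crossed the threshold: "small" runs.
normal = schedule([("big", 40, 100), ("small", 1, 10)], 30, 86400)
```

In the first call nothing starts, which matches the "Draining system to
allow starving job to run" comment you are seeing. If I remember
correctly, the real FIFO scheduler controls this through the
help_starving_jobs and max_starve entries in its sched_config file, but
please check the file on your own installation rather than taking my
word for the names.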

You might want to try Maui or Moab instead; both schedulers support
backfill.

