[torqueusers] all jobs in Q state
glen.beane at gmail.com
Thu Feb 22 10:04:19 MST 2007
On 2/22/07, Krause, Roland <Roland.Krause at amtc-dresden.com> wrote:
> Hi all,
> Today I have the following problem with torque. We have a 40-node cluster
> with 4 time-shared and 36 cluster nodes.
> All jobs I submit to torque go to state 'Q', even if I request a
> node that is definitely free.
> qsub -V
> to get a time-shared node does not work, although they are all "free".
> I thought that if I request a time-shared node, I would always
> immediately get a running session on such a node?
> qsub -V -l nodes=nameOfFreeNode
> to get a cluster node does not work, although this node is "free" as
> Comment is: Not running: Draining system to allow starving job to run
> Quick help would be highly appreciated.
Are you running the FIFO scheduler that comes with Torque? It looks
like you have a job in your queue that has been waiting long enough
to be considered starving. The FIFO scheduler will not start any new
jobs until enough nodes are free to run the starving job, even when a
new job could backfill onto currently free nodes without pushing back
the start time of the starving job.
You might want to try Maui or Moab instead.
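
To make the draining behaviour described above concrete, here is a minimal Python sketch of a single scheduling pass. It is a toy model, not Torque's actual scheduler code: the `Job` class, the `schedule_pass` function, and the 24-hour `MAX_STARVE` threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical starvation threshold in seconds (an assumption for this sketch).
MAX_STARVE = 24 * 3600


@dataclass
class Job:
    name: str
    nodes: int   # number of nodes the job requests
    waited: int  # seconds the job has spent in the queue


def schedule_pass(queue, free_nodes, max_starve=MAX_STARVE):
    """Return the names of jobs started in one pass of a toy strict-FIFO model.

    If a starving job cannot start, the pass stops and nothing else is
    started, even if smaller jobs would fit on the free nodes.
    """
    started = []
    starving = [j for j in queue if j.waited >= max_starve]

    # Starving jobs go first, longest wait first.
    for job in sorted(starving, key=lambda j: -j.waited):
        if job.nodes <= free_nodes:
            free_nodes -= job.nodes
            started.append(job.name)
        else:
            # "Draining system to allow starving job to run":
            # hold back every other job until this one fits.
            return started

    # No blocked starving job: start whatever fits, in queue order.
    for job in queue:
        if job.waited < max_starve and job.nodes <= free_nodes:
            free_nodes -= job.nodes
            started.append(job.name)
    return started


# A starving 40-node job blocks a 1-node job even though 4 nodes are free.
print(schedule_pass([Job("big", 40, 30 * 3600), Job("small", 1, 60)], 4))  # []
# Without a starving job, the small job starts right away.
print(schedule_pass([Job("big", 40, 3600), Job("small", 1, 60)], 4))  # ['small']
```

A backfilling scheduler would instead start the small job in the first case, provided doing so does not delay the starving job.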