[torqueusers] pbs_sched only few jobs running
knielson at adaptivecomputing.com
Fri Sep 3 09:45:03 MDT 2010
----- Original Message -----
From: "Frederick Kramer" <kramer at ikf.uni-frankfurt.de>
To: torqueusers at supercluster.org
Sent: Friday, September 3, 2010 3:43:50 AM
Subject: [torqueusers] pbs_sched only few jobs running
We have a small cluster set up with around 20 CPU cores.
Currently we are facing the following problem: The queues are filled with a few hundred jobs but only 8 are running. pbsnodes says that the nodes are free.
How can I find out what's wrong?
Or is this a common problem?
Under TORQUE, the free state for a node means that it has processors available to run jobs. It does not mean that no jobs are running on the node.
If the jobs waiting in the queue require more processors than are currently available, those jobs will wait until enough processors become free.
If you have a node that is configured with np=4 in the nodes file and it currently has a job running on two of its processors, then the number of processors available on that node is two. A job that needs more than two processors on that node will have to wait.
With only 20 CPU cores, 8 running jobs is not an unreasonable number, depending on how many processors each job is requesting.
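
The reasoning above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not TORQUE code: it does not call TORQUE or parse pbsnodes output, the function names are hypothetical, and the state logic is simplified to the two cases discussed here (a node with no processors left is shown as job-exclusive; down or offline nodes are ignored). The 2-processors-per-job figure at the end is an assumption for the example, not something reported in the original question.

```python
# Hypothetical sketch of the processor accounting described above.
# Nothing here talks to TORQUE; all names and numbers are illustrative.

def free_processors(np, in_use):
    """Processors still available on a node configured with np=<np>."""
    if not 0 <= in_use <= np:
        raise ValueError("in_use must be between 0 and np")
    return np - in_use

def node_state(np, in_use):
    """Simplified state: 'free' while any processor is available."""
    return "free" if free_processors(np, in_use) > 0 else "job-exclusive"

def fits_on_node(requested, np, in_use):
    """True if a job asking for `requested` processors can start on this node."""
    return requested <= free_processors(np, in_use)

# The np=4 node from the text, with a job running on two processors.
print(node_state(4, 2))        # free, even though a job is running
print(fits_on_node(2, 4, 2))   # True
print(fits_on_node(3, 4, 2))   # False, the job has to wait

# Assumed example: 8 running jobs, each requesting 2 processors, on 20 cores.
print(20 - 8 * 2)              # 4 processors left for the whole cluster
```

Under that assumption, only 4 of the 20 cores would remain, so a queued job asking for more than that, or for more than any single node has free, would stay queued while every node with a spare processor still reports free.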