[torqueusers] momctl error - A description
garrick at clusterresources.com
Tue Mar 6 14:54:48 MST 2007
On Tue, Mar 06, 2007 at 04:53:16PM -0500, michael young alleged:
> Sorry about that.
> We have a cluster of Sun servers.
> 1 master and 12 slave nodes.
> AMD Opteron Processor 248 2.2 GHz, 4GB ram, 74 GB SCSI HD
> It runs Spartan '04 on Red Hat Enterprise Linux AS release 4 (Nahant
> Update 1).
> master node's name: cluster
> slave node's names: he1 - he12
> When a job is submitted to the cluster, it runs only on the master node.
> It does not pass any work to the slave nodes.
While the job is running, does 'qstat -n <jobid>' show that the job is
assigned to a node?
Since your master node isn't running pbs_mom, TORQUE cannot be starting
the job there itself, which implies that the problem is in your job
script. Is your job script using $PBS_NODEFILE to spawn the processes?
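
For reference, here is a minimal sketch of a job script that reads
$PBS_NODEFILE, the file TORQUE writes with one line per allocated
processor slot. The resource request, the helper function names
(job_hosts, job_slots), and the commented launcher line are assumptions
for illustration only; how Spartan itself expects to be given a host
list is not covered here.

```shell
#!/bin/sh
#PBS -l nodes=4:ppn=2
# Sketch only: the resource request above is an assumed example.

# Print each host assigned to the job once, in the order TORQUE listed it.
# $PBS_NODEFILE names a file with one line per allocated processor slot.
job_hosts() {
    awk '!seen[$0]++' "$PBS_NODEFILE"
}

# Count the processor slots (lines) in the node file.
job_slots() {
    grep -c '' "$PBS_NODEFILE"
}

# Only report when running under TORQUE, i.e. when the node file exists.
if [ -n "$PBS_NODEFILE" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "hosts: $(job_hosts | tr '\n' ' ')"
    echo "slots: $(job_slots)"
    # A parallel launcher would then be pointed at the same file, e.g.
    # (hypothetical program name, flag spelling depends on the launcher):
    # mpirun -np "$(job_slots)" -machinefile "$PBS_NODEFILE" ./my_program
fi
```

If the script never consults $PBS_NODEFILE, every process it starts
stays on the node where the script itself runs, whatever node list the
job was assigned.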