[torqueusers] momctl error - A description
mhyoung at valdosta.edu
Tue Mar 6 15:31:46 MST 2007
Garrick Staples wrote:
>On Tue, Mar 06, 2007 at 04:53:16PM -0500, michael young alleged:
>>Sorry about that.
>>We have a cluster of Sun servers.
>>1 master and 12 slave nodes.
>>AMD Opteron Processor 248 2.2 GHz, 4GB ram, 74 GB SCSI HD
>>It runs Spartan '04 on Red Hat Enterprise Linux AS release 4 (Nahant
>>master node's name: cluster
>>slave nodes' names: he1 - he12
>>When a job is submitted to the cluster, it runs only on the master node.
>>It does not pass any work to the slave nodes.
>While the job is running, does 'qstat -n <jobid>' show that the job is
>assigned to a node?
How do I determine the jobid?
Just running qstat gives no output.
>Since your master node isn't running pbs_mom, this implies that the
>problem is in your job script. Is your job script using $PBS_NODEFILE
>to spawn the processes?
Where do I find the job script?
I ran 'env' and there is no "$PBS_NODEFILE" variable.
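
For reference, a minimal sketch of what a job script that reads $PBS_NODEFILE might look like. It assumes a standard TORQUE setup, where the variable is set only in the environment of a running job (so it would not be expected to show up under 'env' in a login shell). The job name, the resource request, and the list_nodes helper are illustrative assumptions, not taken from this cluster.

```shell
#!/bin/sh
#PBS -N nodefile-example
#PBS -l nodes=2:ppn=1

# list_nodes is a hypothetical helper: it prints each node assigned to the
# job once. $PBS_NODEFILE points to a file with one line per allocated
# processor slot, so a node can appear on several lines; sort -u removes
# the duplicates.
list_nodes() {
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        sort -u "$PBS_NODEFILE"
    else
        echo "PBS_NODEFILE is not set; not running inside a job" >&2
        return 1
    fi
}

# A real job script would launch work on each node here (for example via
# an MPI launcher); this sketch only reports the nodes it was given.
for node in $(list_nodes); do
    echo "would start a process on $node"
done
```

Submitting a script like this with qsub prints the job id, and that id is what 'qstat -n <jobid>' expects. If the application is started by a wrapper rather than by hand, the wrapper would be the place to look for an equivalent script.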