[torqueusers] Compute nodes can not work
Greenseid, Joseph M (IS)
Joseph.Greenseid at ngc.com
Wed Apr 29 07:50:37 MDT 2009
It sounds like the compute nodes are having trouble sending files back to the head node once the job is complete. From a compute node, can you ssh back to the head node with no password?
Do the .out and .err files end up in the "undelivered" directory of the PBS directory on the compute nodes?
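The two checks above can be sketched as a small shell snippet to run on a compute node as the user who submitted the job. This is only a rough sketch: the host name node1 is taken from the original post, the PBS directory /var/spool/torque is an assumed install location that may differ on your cluster, and the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical defaults; adjust both to match your site.
HEAD_NODE="${HEAD_NODE:-node1}"            # node1 is the server node in the original post
PBS_HOME="${PBS_HOME:-/var/spool/torque}"  # assumed PBS directory, not confirmed by the post

# Return 0 only if ssh to the given host works without any password prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
can_ssh_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Print the names of any files left in the "undelivered" directory
# under the given PBS directory. Prints nothing if the directory is absent.
list_undelivered() {
    dir="$1/undelivered"
    if [ -d "$dir" ]; then
        ls -A "$dir"
    fi
}

# Usage, on a compute node:
#   can_ssh_without_password "$HEAD_NODE" && echo "passwordless ssh OK"
#   list_undelivered "$PBS_HOME"
```

If the ssh check fails, or the listing shows leftover output files, that would point to the file copy back to the head node as the problem. Reading the undelivered directory may require root, depending on its permissions.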
From: torqueusers-bounces at supercluster.org on behalf of baibart
Sent: Wed 4/29/2009 8:07 AM
Subject: [torqueusers] Compute nodes can not work
I have 3 nodes. The server node also takes part in computing. When I submit a job that uses the other two compute nodes, the job cannot run.
For example, qsub 1.job:
#PBS -l nodes=3:ppn=2
#PBS -N pbs
The job state is E.
If I choose one node, it must be the head node (because if I turn off the head node, it also does not work). In that case the job runs.
#PBS -l nodes=1:ppn=2
#PBS -N pbs
node1 is my server node.
All nodes are free. They seem OK.
Each node has the same config.
Thanks in advance!