[torqueusers] Compute nodes can not work

Greenseid, Joseph M (IS) Joseph.Greenseid at ngc.com
Wed Apr 29 07:50:37 MDT 2009

It sounds like the compute nodes are having trouble sending files back to the head node once the job is complete. From a compute node, can you ssh back to the head node without being asked for a password?
Do the .out and .err files end up in the "undelivered" directory of the PBS directory on the compute nodes?
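
The two checks above could be scripted roughly as follows. This is only a sketch: the helper names check_ssh and count_undelivered are made up for illustration, and it assumes the PBS directory is /var/spool/torque and the head node is node1, as in the original message below.

```shell
#!/bin/sh
# Hypothetical helpers for the two checks; names are illustrative only.

# Succeeds only if ssh to the given host works without a password prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Prints how many files sit in the "undelivered" directory under the
# given PBS directory (prints 0 if the directory does not exist).
count_undelivered() {
    find "$1/undelivered" -type f 2>/dev/null | wc -l | tr -d ' '
}
```

Run as the job owner on each compute node, for example: check_ssh node1 && echo "ssh ok", then count_undelivered /var/spool/torque. A non-zero count would suggest the job ran but its output could not be copied back.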


From: torqueusers-bounces at supercluster.org on behalf of baibart
Sent: Wed 4/29/2009 8:07 AM
To: torqueusers
Subject: [torqueusers] Compute nodes can not work

Hi all,
      I have 3 nodes, and the server node also takes part in computing. When I submit a job that uses the other two compute nodes, the job does not run.
      For example: qsub 1.job
      cat 1.job 
      #PBS -l nodes=3:ppn=2
      #PBS -N pbs
       cat $PBS_NODEFILE>/home/pbs/3
       The job state is E.
       If I request a single node, the job has to use the head node (with the head node turned off it does not work either), and then it runs.
       #PBS -l nodes=1:ppn=2
       #PBS -N pbs
       cat $PBS_NODEFILE>/home/pbs/3
       cat 3
       node1 is my server node.
       pbsnodes -a
       All nodes are free; they seem OK.
       cat /var/spool/torque/mom_priv/config
       $clienthost node1
       $logevent 255
       Each node has the same config.
       Thanks in advance!
