[torqueusers] Obit rejected by server?
garrick at usc.edu
Fri Feb 3 09:48:40 MST 2006
On Thu, Feb 02, 2006 at 09:29:36AM -0600, Richard Rowbatham alleged:
> Can anyone tell me what might cause a "server rejected job obit - 15008"
> message (see log excerpt below). These errors cause the TORQUE server to
> never vacate a completed job, so no scheduling takes place after the
> first n jobs are assigned to my n nodes.
15008 is PBSE_BADHOST. pbs_server can reply with that because of a
name resolution problem, a host ACL violation, failed rhosts
authentication, etc.
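A quick way to rule out the name resolution case is to check that a
hostname resolves forward and then back to the same name. Here is a
minimal sketch in Python that uses only the standard socket module and
knows nothing about TORQUE. check_host is a hypothetical helper name,
and the two resolver arguments are there only so it can be tried out
without live DNS:

```python
import socket

def check_host(name, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a one-line status for forward and reverse lookup of name."""
    try:
        addr = forward(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return f"{name}: forward lookup failed ({exc})"
    try:
        canonical, aliases, _ = reverse(addr)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return f"{name}: {addr}, reverse lookup failed ({exc})"
    if name not in [canonical] + list(aliases):
        return f"{name}: {addr} maps back to {canonical}, not {name}"
    return f"{name}: {addr} ok"
```

Running check_host on the server's name from a compute node, and on
each node's name from the server, should show whether the two sides
agree on what each host is called.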
> 02/01/2006 21:43:07;0002; pbs_mom;Svr;Log;Log opened
> 02/01/2006 21:43:07;0001; pbs_mom;Svr;pbs_mom;addclient, host
> localhost not found
localhost not found? Looks like you have deeper problems. Did
/etc/hosts get broken?
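To see whether /etc/hosts still carries a localhost entry, something
like the following could be used. It is only a sketch: names_in_hosts
is a hypothetical helper, and it assumes the usual hosts-file layout of
an address followed by one or more names, with # starting a comment:

```python
def names_in_hosts(text):
    """Map each hostname or alias in hosts-file text to its list of addresses."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue  # blank or comment-only line
        addr, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), []).append(addr)
    return table
```

Feeding it the contents of /etc/hosts and looking up "localhost" in the
result should return at least one loopback address; if the key is
missing, the file is the likely culprit.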
> 02/01/2006 21:48:09;0001; pbs_mom;Job;3823.pbssrv1;server rejected job
> obit - 15008
Is there anything in the server's logfile about why it rejected the obit?
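If the server log is large, pulling out just the records for that job
may help. The sketch below assumes the server log uses the same
six-field, semicolon-separated layout as the pbs_mom excerpt quoted
above, which is an assumption on my part. job_log_lines is a
hypothetical helper, and the path to the server log is left to you:

```python
def job_log_lines(lines, job_id):
    """Yield (timestamp, message) for log records whose object name is job_id."""
    for line in lines:
        parts = line.rstrip("\n").split(";", 5)
        if len(parts) != 6:
            continue  # not a record in the assumed six-field layout
        timestamp, _code, _daemon, _kind, name, message = parts
        if name.strip() == job_id:
            yield timestamp, message
```

Passing it an open log file and "3823.pbssrv1" would list every record
logged against that job, in file order.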
Garrick Staples, Linux/HPCC Administrator
University of Southern California