[torqueusers] Obit rejected by server?

Garrick Staples garrick at usc.edu
Fri Feb 3 09:48:40 MST 2006


On Thu, Feb 02, 2006 at 09:29:36AM -0600, Richard Rowbatham alleged:
> Can anyone tell me what might cause a "server rejected job obit - 15008"
> message (see log excerpt below). These errors cause the TORQUE server to
> never vacate a completed job, so no scheduling takes place after the
> first n jobs are assigned to my n nodes. 

15008 is PBSE_BADHOST.  pbs_server can reply with that for several
reasons: a name resolution problem, a host ACL violation, a failed
rhosts-style authorization, etc.
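
To rule out the name resolution case, it can help to check that the
node's hostname resolves forward and backward consistently, run from
the server host. The following is only a rough sketch of such a check,
not what pbs_server does internally; check_host is a hypothetical
helper name, and it only looks at IPv4.

```python
import socket


def check_host(name):
    """Return (addresses, reverse_name) for a hostname.

    addresses is an empty list if the forward lookup fails;
    reverse_name is None if the reverse lookup fails.
    """
    try:
        # Forward lookup: hostname -> list of IPv4 addresses.
        _, _, addresses = socket.gethostbyname_ex(name)
    except OSError:
        return [], None
    try:
        # Reverse lookup on the first address: address -> primary name.
        reverse_name = socket.gethostbyaddr(addresses[0])[0]
    except OSError:
        reverse_name = None
    return addresses, reverse_name
```

If check_host("yournode") gives an empty list, or a reverse name that
is not the name the server expects, that would be worth fixing before
looking at ACLs.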

 

> 02/01/2006 21:43:07;0002;   pbs_mom;Svr;Log;Log opened
> 
> 02/01/2006 21:43:07;0001;   pbs_mom;Svr;pbs_mom;addclient, host
> localhost not found

localhost not found?  Looks like you have deeper problems.  Did
/etc/hosts get broken?
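
One way to check is to look for a localhost entry in the hosts file
directly. A minimal sketch, assuming the usual "address name alias..."
layout with "#" comments; hosts_entries is a hypothetical helper, not
part of TORQUE.

```python
def hosts_entries(text, name="localhost"):
    """Return the addresses in hosts-file text that list `name`
    as a hostname or alias."""
    found = []
    for line in text.splitlines():
        # Drop anything after a comment marker.
        line = line.split("#", 1)[0]
        fields = line.split()
        # fields[0] is the address, the rest are names for it.
        if len(fields) >= 2 and name in fields[1:]:
            found.append(fields[0])
    return found
```

Feeding it the contents of /etc/hosts on the node should normally
return at least one loopback address; an empty list would match the
"localhost not found" message.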


> 02/01/2006 21:48:09;0001;   pbs_mom;Job;3823.pbssrv1;server rejected job
> obit - 15008

Is there anything in the server's logfile about why it rejected the obit?
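
To pull out just the entries for one job, something like the sketch
below may help. It assumes the server log uses the same
semicolon-separated layout as the pbs_mom lines quoted above
(timestamp;code;daemon;type;id;message); job_lines is a hypothetical
helper, and the log file path is left as a command-line argument since
it depends on the installation.

```python
import sys


def job_lines(lines, job_id):
    """Return (timestamp, message) pairs for log lines whose
    object id field equals job_id."""
    hits = []
    for raw in lines:
        # Split into at most six fields so that semicolons inside
        # the message itself are preserved.
        parts = raw.rstrip("\n").split(";", 5)
        if len(parts) == 6 and parts[4] == job_id:
            hits.append((parts[0], parts[5]))
    return hits


# Usage: python job_lines.py <logfile> <jobid>
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as log:
        for timestamp, message in job_lines(log, sys.argv[2]):
            print(timestamp, message)
```

Running it against the server's log for that day with 3823.pbssrv1 as
the job id should show whatever the server recorded around 21:48:09.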

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California