[torqueusers] All Jobs stucked in undelivered dir
Garrick Staples
garrick at usc.edu
Thu Mar 2 14:20:43 MST 2006
On Thu, Mar 02, 2006 at 10:45:21PM +0800, group hpc alleged:
> Hi all,
>
> Torque has not been working well on our server recently. The output and error
> files cannot be copied back to the user's home directory; instead they are
> stored in /var/spool/pbs/undelivered. Can anyone show me how to fix this? Thanks.
The user should have gotten an email with the exact error message.
> Btw, does anyone know what this means: "pbs_mom;Req;dis_reply_write;DIS
> reply failure, -1"?
> I have included the pbs_mom log below:
That doesn't look good. MOM is trying to reply to the queue requests,
but is failing. Do you have any port filtering on the pbs_server host
or anything like that?
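
One quick way to check for filtering is to test whether the TCP ports are
reachable in each direction (from the compute node to the server, and from the
server to the node). Below is a minimal sketch in Python. It assumes the
default TORQUE ports (15001 for pbs_server, 15002 and 15003 for pbs_mom),
which your site may have changed, and the host names are hypothetical
placeholders. It only shows whether a plain TCP connection succeeds, not
whether the PBS protocol exchange itself works.

```python
import socket

# Assumed default TORQUE ports; verify against /etc/services or your build.
DEFAULT_PORTS = {"pbs_server": 15001, "pbs_mom": 15002, "pbs_mom_rm": 15003}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical host names; replace them with your own server and node.
    checks = [
        ("pbs-server.example.org", DEFAULT_PORTS["pbs_server"]),
        ("node01.example.org", DEFAULT_PORTS["pbs_mom"]),
        ("node01.example.org", DEFAULT_PORTS["pbs_mom_rm"]),
    ]
    for host, port in checks:
        state = "reachable" if port_open(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

Run it on the compute node to test the server port, and on the server host to
test the MOM ports. A port that is unreachable from one side only would point
to a firewall rule on the other host.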
--
Garrick Staples, Linux/HPCC Administrator
University of Southern California