[torqueusers] Missing home directory causes pbs_mom to crash ?
csamuel at vpac.org
Wed Oct 27 21:29:39 MDT 2004
One thing we see occasionally: when a node loses access to home directories
(due to stale NFS file handles, for instance), any job submitted to the
affected node will cause pbs_mom on it to crash, and in the logs we see:
10/28/2004 09:19:42;0001; pbs_mom;Svr;pbs_mom;No such file or directory (2)
in fork_to_user, invalid home directory '/home/san02/rajdas' specified,
errno=2 (No such file or directory)
10/28/2004 09:19:42;0080; pbs_mom;Req;req_reject;Reject reply code=15035
( MSG=invalid home directory '/home/san02/rajdas' specified, errno=2 (No such
file or directory)), aux=0, type=54, from PBS_Server at mgtnode
This seems to be similar to the error that Roy is seeing, though not
identical, but we don't use automounters so we see it only when we're having
This is the same snapshot of Torque as Roy's (torque-1.1.0p4-snap.1098376627).
The odd thing is that ours crashes the mom whilst his doesn't, and that may be
because on his systems his automounter has mounted the directory whilst on
ours it's still absent.
I'm trying to work further on making a reproducible test case for a crash.
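As a rough illustration of the condition the log above reports, here is a
minimal sketch of a node-side check that a home directory actually resolves.
This is not Torque's code and is not how fork_to_user is implemented; the
function names check_home and check_user_home are hypothetical, and the sketch
only assumes that a missing or stale home directory shows up as an OSError
from stat().

```python
import errno
import os
import pwd
import stat


def check_home(path):
    """Return (ok, errno_name) for a directory path such as a user's home.

    ok is True only if the path can be stat()ed and is a directory.
    errno_name is the symbolic errno (e.g. "ENOENT", "ESTALE") on failure.
    """
    try:
        st = os.stat(path)
    except OSError as exc:
        # Map the numeric errno to its symbolic name where one is known.
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    if not stat.S_ISDIR(st.st_mode):
        return False, "ENOTDIR"
    return True, None


def check_user_home(username):
    """Look up the user's home in the password database and check it."""
    home = pwd.getpwnam(username).pw_dir
    return check_home(home)
```

Something like check_user_home("rajdas") could be run on a suspect node before
jobs are sent to it; a False result with ENOENT would match the errno=2 in the
log lines above.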
Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia