[torqueusers] read of pipe for sid job error
Glen Beane
beaneg at umcs.maine.edu
Mon Sep 20 11:52:36 MDT 2004
On my OS X cluster, I keep getting errors from pbs_mom in the form of
"read of pipe for sid job xxx got 0 not 8".
If I kill pbs_mom on the node with signal 15 and then reboot the node,
the problem often seems to go away. Just restarting pbs_mom never
fixes it.
This error is coming from the start_process function. The relevant
block of code starts around line 2199:
    if (i != sizeof(sjr))
      {
      sprintf(log_buffer, "read of pipe for sid job %s got %d not %d",
        pjob->ji_qs.ji_jobid,
        i,
        sizeof(sjr));
      log_err(j,id,log_buffer);
      return(-1);
      }
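For context, read() on a pipe returns 0 only at end-of-file, that is,
when every write end has been closed and no data is pending. So "got 0
not 8" suggests the process on the other end of the pipe exited or
closed its end before sending the 8-byte record. Below is a minimal
sketch of that behavior. It is not TORQUE code: struct sid_record and
read_sid_record() are hypothetical names made up for illustration, and
the two-field layout is only an assumption chosen to give an 8-byte
record.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical 8-byte stand-in for the record sent back over the pipe. */
struct sid_record {
    int32_t code;
    int32_t session;
};

/*
 * Fork a child that either writes one record to a pipe (child_writes != 0)
 * or exits without writing anything (child_writes == 0).
 * Returns whatever the parent's read() returned, or -1 if setup failed.
 */
ssize_t read_sid_record(int child_writes)
{
    int fds[2];
    struct sid_record rec = {0, 0};
    ssize_t n;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: only the write end is needed. */
        close(fds[0]);
        if (child_writes) {
            rec.session = (int32_t)getpid();
            if (write(fds[1], &rec, sizeof(rec)) != (ssize_t)sizeof(rec))
                _exit(1);
        }
        _exit(0); /* exiting closes the child's write end */
    }

    /*
     * Parent: close its own copy of the write end first, otherwise
     * read() would block instead of reporting end-of-file.
     */
    close(fds[1]);
    n = read(fds[0], &rec, sizeof(rec));
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Calling read_sid_record(1) should return 8, while read_sid_record(0)
returns 0, which is the same condition the check above reports. If that
is what is happening here, the thing to look for is why the child side
dies or closes the pipe early on the affected nodes.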
Any help troubleshooting this problem would be greatly appreciated.