[torqueusers] read of pipe for sid job error

Glen Beane beaneg at umcs.maine.edu
Mon Sep 20 11:52:36 MDT 2004


On my OS X cluster, I keep getting errors from pbs_mom in the form of
"read of pipe for sid job xxx got 0 not 8".

If I kill pbs_mom on the node with signal 15, then reboot the node,
often the problem will seem to go away. Just restarting pbs_mom never
fixes the problem.


This error comes from the start_process function; the relevant
block of code starts around line 2199:

if (i != sizeof(sjr))
{
  sprintf(log_buffer, "read of pipe for sid job %s got %d not %d",
    pjob->ji_qs.ji_jobid,
    i,
    sizeof(sjr));

  log_err(j,id,log_buffer);

  return(-1);
}
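
For context on what "got 0" means: read() on a pipe returns 0 only at
end-of-file, that is, when every write end has been closed and no data is
left. So this message suggests the child side closed the pipe or exited
before writing its record. The sketch below reproduces that condition in
isolation. It is not TORQUE code: struct sid_record and read_sid_record()
are hypothetical names made up for the illustration, and the real record
layout is not shown in this post.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the record the child sends back to the
 * parent. The real structure used by pbs_mom is not shown here. */
struct sid_record {
    pid_t sid;
    int   status;
};

/* Fork a child that either writes one record to the pipe or exits
 * without writing anything. Returns the byte count reported by the
 * parent's read(), or -1 if pipe(), fork() or read() failed. */
ssize_t read_sid_record(int child_writes)
{
    int fds[2];
    struct sid_record rec;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: only the write end is needed. */
        close(fds[0]);
        if (child_writes) {
            rec.sid = getpid();
            rec.status = 0;
            if (write(fds[1], &rec, sizeof(rec)) != (ssize_t)sizeof(rec))
                _exit(1);
        }
        /* Exiting closes the write end; if nothing was written,
         * the parent's read() sees end-of-file and returns 0. */
        _exit(0);
    }

    /* Parent: close our copy of the write end, otherwise read()
     * would block forever instead of returning 0 at end-of-file. */
    close(fds[1]);
    n = read(fds[0], &rec, sizeof(rec));
    close(fds[0]);
    waitpid(pid, NULL, 0);

    return n;
}
```

Calling read_sid_record(0) models a child that dies before reporting and
returns 0, which is the case the log message describes; read_sid_record(1)
returns sizeof(struct sid_record). If this is what is happening on the
node, the question becomes why the child exits or closes the pipe early.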


Any help troubleshooting this problem would be greatly appreciated.



