[torqueusers] Bug? in torque 2.0.0p2

Åke Sandgren ake.sandgren at hpc2n.umu.se
Mon Dec 19 03:17:12 MST 2005


On Sat, 2005-12-17 at 11:29 +0100, Åke Sandgren wrote:
> > This means the child process that will eventually become the job has
> > died.  Unfortunately, this is really really hard to debug.
> > 
> > Have you tried 2.0.0p4?  We're always fixing bugs and have improved
> > logging in this area.  Be sure you configure with --enable-syslog.
> > 
> > One unfixed bug that can cause this is with the job's environmental
> > variables.  Vars with newlines and commas can break things.
> 
> It's still there in 2.0.0p4 although the error message this time says
> Bad file descriptor (9) in TMomFinalizeJob3, read of pipe for sid failed
> for job 200045.ingrid-h.hpc2n.umu.se (0 of 8 bytes)
> 
> I'll try to find this next week...

The problem was a newline in one of the environment variables.

I have written a patch for qsub to handle this and a few other things
regarding the set_job_env function. Take a look at it and use the parts
you feel are worth using.
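
For anyone hitting the same error before picking up the patch, here is a
rough sketch of the kind of check that would have caught it. It is not
taken from qsub.patch and does not touch set_job_env; the helper names
env_value_is_unsafe and first_unsafe_env_entry are made up for
illustration. It only assumes what was said in this thread: newlines and
commas in environment variable values can break job startup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from qsub.patch): returns 1 if the value
 * contains a newline or a comma, the two characters reported in this
 * thread to break things, and 0 otherwise. */
int env_value_is_unsafe(const char *value)
{
    assert(value != NULL);
    return strpbrk(value, "\n,") != NULL;
}

/* Hypothetical helper: scans a NULL-terminated array of "NAME=value"
 * strings (same layout as environ) and returns the index of the first
 * entry whose value is unsafe, or -1 if every entry looks fine.
 * Entries without '=' are skipped. */
int first_unsafe_env_entry(char *const envp[])
{
    for (int i = 0; envp[i] != NULL; i++) {
        const char *eq = strchr(envp[i], '=');
        if (eq != NULL && env_value_is_unsafe(eq + 1))
            return i;
    }
    return -1;
}
```

Running first_unsafe_env_entry over environ before submitting with the
full environment would point at the offending variable, which can then
be unset or cleaned up by hand.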
-------------- next part --------------
A non-text attachment was scrubbed...
Name: qsub.patch
Type: text/x-patch
Size: 3724 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20051219/4ee3e30b/qsub.bin

