[torqueusers] Interpreting Exit_status in server accounting files
Daniel.G.Roberts at sanofi-aventis.com
Tue Jan 10 12:16:36 MST 2006
I have exit status numbers of 143 and 271 on my system.
Any thoughts on what these particular status numbers translate to?
On Tue, 2006-01-10 at 13:35 +0100, Ole Holm Nielsen wrote:
> Hi Jeroen,
> Thanks a lot. Signals 1-31 are defined in /usr/include/asm/signal.h
> but then I don't understand "Exit_status" values of 126 and 127,
> since there aren't any signals of those values. Maybe exit status
> 126 and 127 have some special meaning within Torque?
> Jeroen van den Muyzenberg wrote:
> > The exit status should be (haven't checked) the return from the exec'd
> > job. We've had a look at them recently and they do seem to conform to:
> > Exit_status >> 8 # Actual exit value
> > Exit_status & 127 # Signal number if thus killed
> > Exit_status & 128 # True if a core dump happened
> > Jeroen
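Applying the three expressions quoted above literally to the values from the original question gives the breakdown below. This is only a sketch in Python: the helper name decode_wait_style is made up for illustration, and whether Torque really stores a wait()-style status in Exit_status is Jeroen's unverified suggestion, not something confirmed in this thread.

```python
def decode_wait_style(exit_status):
    """Split a value with the three quoted expressions (hypothetical helper)."""
    return {
        "exit_value": exit_status >> 8,         # Exit_status >> 8
        "signal": exit_status & 127,            # Exit_status & 127
        "core_dump": bool(exit_status & 128),   # Exit_status & 128
    }


# Values taken from the accounting files mentioned in the original question.
for status in (1, 126, 127, 139, 143, 265, 271):
    print(status, decode_wait_style(status))
```

Read this way, 126 and 127 come out as "signal" 126 and 127, which is exactly the mismatch Ole points out, so this scheme may not be the whole story.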
> > On Tue, 10 Jan 2006, Ole Holm Nielsen wrote:
> >> I'm working on the "pbsacct" accounting package for Torque/PBS
> >> and would like to understand the meaning of the "Exit_status"
> >> numbers in the server accounting files. Unfortunately, I
> >> haven't been able to find a list of exit status values in the
> >> Torque source tree. Going through some of our accounting files,
> >> I find a number of jobs with non-zero "Exit_status" values
> >> such as: 1, 126, 127, 139, 143, 265, 271.
> >> Question: How do I assign a meaning to these "Exit_status" values
> >> so that I can decide whether or not to flag a job termination as OK
> >> (or just sort of OK) or as "failed" in the accounting output?
> >> It would also be nice to know if a job exited because of wall or
> >> cpu time exceeded.
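One possible way to sort the values listed above into categories, sketched in Python. It rests on two assumptions that are not confirmed in this thread: that the job script's shell follows the usual convention of returning 126 for a command that could not be executed, 127 for a command that was not found, and 128+N for a command killed by signal N; and that Torque reports 256+N when the job itself is killed by signal N. The function name classify is hypothetical.

```python
def classify(exit_status):
    """Map an Exit_status value to a rough category (assumptions in comments)."""
    if exit_status == 0:
        return "ok"
    if exit_status >= 256:
        # Assumption: Torque adds 256 when the job itself was killed by a signal.
        return "job killed by signal %d" % (exit_status - 256)
    if exit_status > 128:
        # Assumption: shell convention of 128+N for a command killed by signal N.
        return "command in job killed by signal %d" % (exit_status - 128)
    if exit_status == 127:
        return "command not found"        # shell convention
    if exit_status == 126:
        return "command not executable"   # shell convention
    # Anything else is treated as the job's own exit value.
    return "exited with status %d" % exit_status


for status in (1, 126, 127, 139, 143, 265, 271):
    print(status, classify(status))
```

Under these assumptions 143 and 271 both point at signal 15 (SIGTERM on Linux), 139 at signal 11 (SIGSEGV), and 265 at signal 9 (SIGKILL). A signal number on its own does not say why the signal was sent, so this does not answer the wall or cpu time question.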