[torqueusers] Interpreting Exit_status in server accounting files

Jeroen van den Muyzenberg Jeroen.vandenMuyzenberg at csiro.au
Tue Jan 10 04:58:00 MST 2006


The exit status should be (I haven't checked) the return status of the
exec'd job. We've had a look at these values recently and they do seem
to conform to:

     Exit_status >> 8 # Actual exit value
     Exit_status & 127 # Signal number if thus killed
     Exit_status & 128 # True if a core dump happened
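
As a minimal sketch of the decoding above, here it is in Python, applied to the values from the question quoted below. It assumes Exit_status really is laid out as those three expressions describe, which has not been checked against the Torque source. The function name decode_exit_status is made up for illustration.

```python
def decode_exit_status(exit_status):
    """Split an Exit_status number using the three expressions above.

    Assumption: the number follows the layout described in this message;
    this has not been verified against the Torque source.
    """
    return {
        "exit_value": exit_status >> 8,          # actual exit value
        "signal": exit_status & 127,             # signal number if thus killed
        "core_dumped": bool(exit_status & 128),  # true if a core dump happened
    }


if __name__ == "__main__":
    # Sample values taken from the question quoted below.
    for status in (1, 126, 127, 139, 143, 265, 271):
        print(status, decode_exit_status(status))
```

Under this reading, 139 comes out as signal 11 with the core-dump bit set, and 271 as exit value 1 with signal 15. Whether that is the right interpretation for every value in the list is exactly what would need checking against real jobs.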

Jeroen

On Tue, 10 Jan 2006, Ole Holm Nielsen wrote:

> I'm working on the "pbsacct" accounting package for Torque/PBS
> and would like to understand the meaning of the "Exit_status"
> numbers in the server accounting files.  Unfortunately, I
> haven't been able to find a list of exit status values in the
> Torque source tree.  Going through some of our accounting files,
> I find a number of jobs with non-zero "Exit_status" values
> such as: 1, 126, 127, 139, 143, 265, 271.
>
> Question: How do I assign a meaning to these "Exit_status" values
> so that I can decide whether or not to flag a job termination as OK
> (or just sort of OK) or as "failed" in the accounting output?
> It would also be nice to know if a job exited because of wall or
> cpu time exceeded.
>
> Thanks,
> Ole
>
> -- 
> Ole Holm Nielsen
> Department of Physics, Technical University of Denmark
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>

Jeroen van den Muyzenberg
CSIRO High Performance Scientific Computing
Bureau of Meteorology/CSIRO HPCCC -
High Performance Computing and Communications Centre
Ph: +61 3 9669 8111 Fax: +61 3 9669 8112
Jeroen.vandenMuyzenberg at csiro.au

