[torqueusers] Interpreting Exit_status in server accounting files

Jeroen van den Muyzenberg Jeroen.vandenMuyzenberg at csiro.au
Tue Jan 10 16:06:32 MST 2006


Erm... I'll take that back. I've had a brief look, and it seems that
exit_status can be assigned one of the JOB_EXIT_* definitions in
job.h.

However, I can't see the correlation between those definitions and the
final reported value.

???

Jeroen

On Tue, 10 Jan 2006, Jeroen van den Muyzenberg wrote:

> The exit status should be (I haven't checked) the return value from the
> exec'd job. We've had a look at them recently, and they do seem to
> conform to:
>
>     Exit_status >> 8     # Actual exit value
>     Exit_status & 127    # Signal number, if killed by a signal
>     Exit_status & 128    # True if a core dump happened
>
> Jeroen
>
> On Tue, 10 Jan 2006, Ole Holm Nielsen wrote:
>
>>  I'm working on the "pbsacct" accounting package for Torque/PBS
>>  and would like to understand the meaning of the "Exit_status"
>>  numbers in the server accounting files.  Unfortunately, I
>>  haven't been able to find a list of exit status values in the
>>  Torque source tree.  Going through some of our accounting files,
>>  I find a number of jobs with non-zero "Exit_status" values
>>  such as: 1, 126, 127, 139, 143, 265, 271.
>>
>>  Question: How do I assign a meaning to these "Exit_status" values
>>  so that I can decide whether to flag a job termination as OK
>>  (or just sort of OK) or as "failed" in the accounting output?
>>  It would also be nice to know whether a job exited because its
>>  wall or CPU time limit was exceeded.
>>
>>  Thanks,
>>  Ole
>>
>>  --
>>  Ole Holm Nielsen
>>  Department of Physics, Technical University of Denmark
>
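
For anyone who wants to try the bit layout from the quoted reply against
their own accounting files, here is a minimal Python sketch. It assumes
Exit_status is a wait()-style status word, which is exactly the assumption
the follow-up message at the top retracts, so treat it as an illustration
of the proposed decoding and not as confirmed Torque behaviour. The names
decode_exit_status and DecodedStatus are made up for this example.

```python
from typing import NamedTuple


class DecodedStatus(NamedTuple):
    exit_value: int    # Exit_status >> 8
    signal: int        # Exit_status & 127 (0 means no signal)
    core_dumped: bool  # Exit_status & 128


def decode_exit_status(status: int) -> DecodedStatus:
    """Split a status word using the layout proposed in the quoted reply.

    Assumption: the value is laid out like a wait()-style status word.
    The thread does not confirm that Torque stores it this way.
    """
    return DecodedStatus(
        exit_value=status >> 8,
        signal=status & 127,
        core_dumped=bool(status & 128),
    )


if __name__ == "__main__":
    # The non-zero values listed in the original question.
    for status in (1, 126, 127, 139, 143, 265, 271):
        print(status, decode_exit_status(status))
```

Under this layout 265 splits into exit value 1 and signal 9, while 139
splits into signal 11 with the core-dump bit set. Whether that reading is
the right one for a given site is the open question in this thread.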

