[torquedev] epilogue job exit code

Gareth.Williams at csiro.au
Thu Jun 16 05:45:46 MDT 2011


Hi All,

Can anyone confirm that there is a bug in the job exit code (10th argument to epilogue)?  I get the right Exit_status in the server and accounting logs, but in the epilogue I always seem to get one particular number regardless of the exit code, and it appears to be my numeric uid (18686).

Some of the relevant code is in prolog.c:
      sprintf(exit_stat,"%d",
              pjob->ji_qs.ji_un.ji_exect.ji_exitstat);
and
      arg[10] = exit_stat;
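
To make the mechanism concrete, here is a minimal standalone sketch of that pattern: an integer exit status is formatted as decimal text and the resulting string is placed in slot 10 of an argument vector. The helper name set_exit_arg and its parameters are hypothetical and are not taken from prolog.c; only the "%d" formatting and the arg[10] slot come from the lines quoted above.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of TORQUE): format exit_status as decimal
 * text into buf and store the pointer in slot 10 of the argument vector.
 * The caller must supply an arg array with at least 11 slots.
 */
static void set_exit_arg(char *arg[], char *buf, size_t buflen, int exit_status)
{
    assert(buflen > 0);

    /* snprintf bounds the write, unlike the plain sprintf in the excerpt */
    snprintf(buf, buflen, "%d", exit_status);

    arg[10] = buf;
}
```

With this shape, whatever integer is passed in is exactly what the script sees as its 10th argument, so a wrong value in the epilogue would point at the integer being read rather than at the formatting step.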

The server code in req_jobobit.c, which works OK, looks similar:

  sprintf(acctbuf, msg_job_end_stat,
          pjob->ji_qs.ji_un.ji_exect.ji_exitstat);
where:
lib/Liblog/pbs_messages.c:char *msg_job_end_stat = "Exit_status=%d";

But this is mom code versus server code, so perhaps the mom data structure is not populated correctly (at least not at the point where the value is read!).
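
As a purely illustrative guess at how an unrelated value such as a uid could show up in that field: the ji_un in the member path suggests a union, and a union read through a different member than the one that was written returns whatever bytes happen to overlap. The sketch below uses hypothetical structures (server_view, mom_view, job_state), NOT the real TORQUE definitions, and I have not established that this is what actually happens in the mom.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts for illustration only, NOT TORQUE's definitions. */
struct server_view { int addr; int port;     int exitstat; };
struct mom_view    { int addr; int exitstat; int uid;      };

union job_state {
    struct server_view exect;  /* view assumed to be used by one daemon */
    struct mom_view    momt;   /* view assumed to be used by the other  */
};

/*
 * Fill the union through the mom view, then read the exit status through
 * the server view. The two exitstat fields sit at different offsets, so
 * the read picks up the bytes of momt.uid instead.
 */
static int read_via_server_view(int exitstat, int uid)
{
    union job_state js;

    /* In these made-up layouts the third int of each view overlaps. */
    assert(offsetof(struct server_view, exitstat) ==
           offsetof(struct mom_view, uid));

    js.momt.addr = 0;
    js.momt.exitstat = exitstat;
    js.momt.uid = uid;

    return js.exect.exitstat;
}
```

In this made-up layout, calling read_via_server_view(1, 18686) returns 18686 rather than 1, which is the same kind of symptom I am seeing.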

I'm working with the 3.0.2 snapshot, but I have noticed this problem for a while without reporting it. I thought I might be mishandling high-numbered arguments in the shell script, but I reckon I have ruled that out just now.

regards,

Gareth

