[torqueusers] Negative (-2) Exit_status

Sam Rash srash at yahoo-inc.com
Wed Jan 3 18:11:43 MST 2007


I frequently see this error when a job tries to execute on a box that
doesn't mount my homedir (as suggested).  

Sam Rash
srash at yahoo-inc.com
408-349-7312
vertigosr37

-----Original Message-----
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Chris Samuel
Sent: Wednesday, December 27, 2006 8:10 PM
To: torqueusers at supercluster.org
Subject: Re: [torqueusers] Negative (-2) Exit_status

On Wednesday 27 December 2006 22:38, Alberto Simões wrote:

> In fact that was my conclusion as well. I tried to run a program that
> returns -2
>
>    int main(void) { return -2; }
>
> as a PBS job, but as processes in Unix return an unsigned char, I get
> 254... not -2 as an exit_status.
>
> Thus, I still do not understand that -2
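
[Note: the wrap-around described above can be checked directly. The sketch below assumes a POSIX system; observed_exit_status is a hypothetical helper name, not part of PBS or TORQUE. It forks a child that exits with the given code and returns what the parent sees through waitpid().]

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a child that exits with `code` and return the
 * exit status the parent observes, or -1 if fork/waitpid fails or the
 * child did not exit normally. */
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
        _exit(code);    /* child: only the low 8 bits reach the parent */

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);    /* always in the range 0..255 */
}
```

[Calling observed_exit_status(-2) yields 254, matching the observation above: a program's own return value cannot surface as a negative exit status.]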

Aha!  It's an internal PBS error code in the pbs_mom called JOB_EXEC_FAIL2 
(-2) and it would log an error that says:

   job exec failure, after files staged, no retry

So I reckon you need to go and hassle your cluster admin about why your job is
failing.  Looking in src/resmom/start_exec.c it can be anything from the
prolog script failing through not enough resources through not being able to
find your home directory, etc.
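
[Note: a minimal sketch of how a script or tool might tell the two cases apart. Only the name JOB_EXEC_FAIL2, its value (-2), and the log message come from this thread; describe_exit_status is a hypothetical helper, and treating every other negative value as a pbs_mom internal code is an assumption inferred from the discussion, not taken from the TORQUE source.]

```c
/* Name and value as quoted in this thread. */
#define JOB_EXEC_FAIL2 (-2)

/* Hypothetical helper: give a rough reading of a job's Exit_status. */
const char *describe_exit_status(int exit_status)
{
    if (exit_status == JOB_EXEC_FAIL2)
        return "job exec failure, after files staged, no retry";

    /* Assumption: a process can only report 0..255, so any other
     * negative value must also have been set by pbs_mom itself. */
    if (exit_status < 0)
        return "negative value: set by pbs_mom, not by the job's program";

    return "non-negative value: not a negative pbs_mom error code";
}
```

[For example, describe_exit_status(-2) returns the log message quoted above, while describe_exit_status(254) falls through to the last case, which is what a program returning -2 would actually produce.]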

Good luck!
Chris
-- 
 Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia



