[torqueusers] PBS unable to execute Job.

Ashley Wright a2.wright at qut.edu.au
Tue Sep 6 17:13:39 MDT 2005


Hi,

Today I tried to submit a job with qsub under TORQUE, and it failed. It 
was working on Monday, and I don't think any configuration has changed 
since then. ssh is working fine, and /home is NFS-mounted.

Can anybody help me? Below is the output from various log files.


When I run 'qsub time' (time is my script) I get a mail back which says:
PBS Job Id: 889.auriga.qut.edu.au
Job Name:   time
Aborted by PBS Server
Job cannot be executed
See Administrator for help


I get the following lines in server_log:
09/07/2005 08:59:48;0100;PBS_Server;Job;889.auriga.qut.edu.au;enqueuing into gen_30min, state 1 hop 1
09/07/2005 08:59:48;0008;PBS_Server;Job;889.auriga.qut.edu.au;Job Queued at request of wright4 at auriga.qut.edu.au, owner = wright4 at auriga.qut.edu.au, job name = time, queue = gen_30min
09/07/2005 08:59:48;0040;PBS_Server;Svr;auriga.qut.edu.au;Scheduler sent command new
09/07/2005 08:59:49;0100;PBS_Server;Req;;Type StatusNode request received from root at auriga.qut.edu.au, sock=10
09/07/2005 08:59:49;0100;PBS_Server;Req;;Type StatusQueue request received from root at auriga.qut.edu.au, sock=10
09/07/2005 08:59:49;0100;PBS_Server;Req;;Type StatusJob request received from root at auriga.qut.edu.au, sock=10
09/07/2005 08:59:49;0100;PBS_Server;Req;;Type ModifyJob request received from root at auriga.qut.edu.au, sock=10
09/07/2005 08:59:49;0008;PBS_Server;Job;889.auriga.qut.edu.au;Job Modified at request of root at auriga.qut.edu.au
09/07/2005 08:59:49;0100;PBS_Server;Req;;Type RunJob request received from root at auriga.qut.edu.au, sock=10
09/07/2005 08:59:49;0008;PBS_Server;Job;889.auriga.qut.edu.au;Job Run at request of root at auriga.qut.edu.au
09/07/2005 08:59:49;0100;PBS_Server;Req;;Type JobObituary request received from pbs_mom at node010, sock=13
09/07/2005 08:59:49;0010;PBS_Server;Job;889.auriga.qut.edu.au;Exit_status=-1 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:00
09/07/2005 08:59:49;0100;PBS_Server;Job;889.auriga.qut.edu.au;dequeuing from gen_30min, state EXITING


I get the following lines in mom_log on node010:
09/07/2005 08:59:49;0001;   pbs_mom;Job;TMomFinalizeJob3;job not started, Failure job exec failure, before files staged, no retry
09/07/2005 08:59:49;0001;   pbs_mom;Job;889.auriga.qut.edu.au;ALERT:  job failed phase 3 start, server will retry
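
For anyone following a single job through these logs, here is a minimal 
Python sketch that splits a log entry into its semicolon-separated fields 
and keeps only the entries for one job. It assumes each entry sits on a 
single line (the line wrapping above comes from the mail client), and the 
field names (timestamp, code, daemon, kind, name, message) are my own 
labels for what appears in the excerpts, not official names. The helpers 
parse_record and records_for_job are hypothetical and not part of TORQUE.

```python
from typing import Iterable, Iterator, NamedTuple


class LogRecord(NamedTuple):
    # Field names are illustrative labels, not taken from TORQUE itself.
    timestamp: str
    code: str
    daemon: str
    kind: str
    name: str
    message: str


def parse_record(line: str) -> LogRecord:
    """Split one unwrapped log line into its six fields."""
    # Split at most five times so semicolons inside the message survive.
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        raise ValueError(f"not a six-field log line: {line!r}")
    return LogRecord(*(part.strip() for part in parts))


def records_for_job(lines: Iterable[str], job_id: str) -> Iterator[LogRecord]:
    """Yield only the records whose name field equals job_id."""
    for line in lines:
        try:
            record = parse_record(line)
        except ValueError:
            continue  # skip blank or wrapped fragments
        if record.name == job_id:
            yield record
```

Called as records_for_job(open("mom_log"), "889.auriga.qut.edu.au"), this 
would return the ALERT entry but not the TMomFinalizeJob3 one, since that 
entry carries a function name rather than the job id in its name field.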


Thanks,
Ashley

-- 
Ashley Wright
3864 9264
a2.wright at qut.edu.au
HPC and Research Support Group
Queensland University of Technology (QUT)


