[torqueusers] Help with Torque 4.2.4 - Nodes O.K., but jobs 'Q' and error 15010 on qrun

João Rodrigues anaryin at gmail.com
Fri Aug 9 21:07:36 MDT 2013

Dear all,

I just installed torque 4.2.4 from scratch on a CentOS cluster (ROCKS) I'm
working on. I followed the instructions in the

The output of running 'pbsnodes -a' is the following:

     state = free
     np = 24
     ntype = cluster
     status =
compute-0-14.local 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT
2012 x86_64,opsys=linux
     mom_service_port = 15002
     mom_manager_port = 15003
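
As a side check, the two ports in this output can be probed from the head
node to see whether pbs_mom is reachable over TCP at all. The following is
only a minimal Python sketch: the function name port_is_reachable is made up
for illustration, and an open port does not by itself prove that the MOM
will accept jobs.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it only tests TCP reachability,
    not whether pbs_mom will accept a job from the server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example use from the head node (host name and ports taken from the
# output above; adjust to the node the job is actually sent to):
#   for port in (15002, 15003):
#       print(port, port_is_reachable("compute-0-14.local", port))
```

A False result would point towards a firewall, a name resolution problem, or
pbs_mom not running on that node.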

When I submit a job, it shows up in 'qstat' but stays Queued ('Q'). Issuing
'qrun' produces the following error message:

qrun: Execution server rejected request MSG=cannot send job to mom,
state=TRNOUT 3.<redacted.host.name>

Issuing 'tracejob' to see what's up gives this in return:

08/09/2013 17:50:26  S    enqueuing into batch, state 1 hop 1
08/09/2013 17:50:26  A    queue=batch
08/09/2013 17:50:39  S    Job Run at request of root@<redacted.host.name>
08/09/2013 17:50:39  S    send of job to compute-0-4.local failed error =
08/09/2013 17:50:39  S    unable to run job, MOM rejected/rc=-1
08/09/2013 17:50:39  S    unable to run job, send to MOM '3232238330' failed
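
One possible reading of the number in the last line (an assumption; the log
does not say so) is that it is the MOM's IPv4 address printed as a 32-bit
integer. A minimal Python sketch to decode it, with mom_id_to_ipv4 as a
made-up helper name:

```python
import ipaddress


def mom_id_to_ipv4(value):
    """Interpret an integer as an IPv4 address in network byte order.

    Assumption: the number in the log is the MOM's address as a 32-bit
    integer. If it is something else, the result is meaningless.
    """
    return str(ipaddress.ip_address(int(value)))


print(mom_id_to_ipv4(3232238330))
```

For the value in the log this gives 192.168.10.250, which can be compared
with the address that compute-0-4.local resolves to on the head node.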

Can anyone offer a hint as to what might be going on? Google turns up
nothing about the TRNOUT state or anything similar.



Disclaimer: I'm not a sysadmin nor IT guy, but I can read.