[torqueusers] mom rejecting job?

Lippert, Kenneth B. Kenneth.Lippert at alcoa.com
Tue Dec 13 14:04:23 MST 2005


Hello again.

I'm back on HP-UX.  I gave up trying to get the HP-UX client to work
with the Linux server, so I made one of the HP-UX machines a server and
set up the HP-UX machines as a cluster separate from the Linux one.

Things are progressing.  I can now submit a job from any of the
machines, but if I request that it run anywhere except on the server,
the job stays queued forever and maui's "checkjob" reports:

job is deferred.  Reason: RMFailure (cannot start job - RMFailure, rc:
15041, msg 'execution server rejected request MSG=sendfailed, STARTING')

I have a separate queue for each machine, and I tie each queue to its
machine by setting

"resources_default.neednodes=local_machine_node_name"

in the queue definition.
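
As a rough sketch of that kind of per-machine queue definition, the qmgr
directives could be generated as below.  The queue name q_hp1, the node
name hp1, and the helper function queue_directives are made-up
placeholders for illustration, not names from the actual setup, and the
enabled/started lines are an assumption about how the queue is opened.

```shell
#!/bin/sh
# Print the qmgr directives for one execution queue pinned to one node.
# The queue and node names passed in are hypothetical placeholders.
queue_directives() {
    queue="$1"
    node="$2"
    printf 'create queue %s queue_type=execution\n' "$queue"
    printf 'set queue %s resources_default.neednodes = %s\n' "$queue" "$node"
    printf 'set queue %s enabled = true\n' "$queue"
    printf 'set queue %s started = true\n' "$queue"
}

# Review the output first; on the server it could then be piped to qmgr:
#   queue_directives q_hp1 hp1 | qmgr
queue_directives q_hp1 hp1
```

The script only prints the directives, so nothing changes on the server
until the output is actually fed to qmgr.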

Thanks for any pointers, and sorry to be a pain.

-k

