[torqueusers] RM Failure
jacksond at clusterresources.com
Fri Mar 4 13:46:43 MST 2005
What TORQUE release are you on? This message indicates that TORQUE is
attempting to start the job, but when it does so, the MOM reports that
the job already exists and is running locally. If this is in fact the
issue, it may already be resolved in the latest release of TORQUE.
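
If it helps to check how many blocked jobs show this same symptom, the
checkjob message quoted below can be picked apart with a short script.
This is only a rough sketch: the regular expression and the function
names are hypothetical, written against the single message in this
thread, and other releases may format the line differently.

```python
import re

# Message text copied from the checkjob output quoted in this thread,
# with the wrapped line rejoined by a single space.
MESSAGE = ("cannot start job - RM failure, rc: 15041, "
           "msg: 'MSG=send failed, JOB_SUBSTATE_RUNNING'")

# Hypothetical pattern: it matches the layout of this one message only
# and is not taken from any TORQUE or scheduler documentation.
RM_FAILURE = re.compile(r"RM failure, rc: (?P<rc>\d+), msg: '(?P<msg>[^']*)'")


def parse_rm_failure(text):
    """Return (rc, msg) from an 'RM failure' message, or None if absent."""
    match = RM_FAILURE.search(text)
    if match is None:
        return None
    return int(match.group("rc")), match.group("msg")


def job_reported_running(text):
    """True if the failure message says the job is already running."""
    parsed = parse_rm_failure(text)
    return parsed is not None and "JOB_SUBSTATE_RUNNING" in parsed[1]


if __name__ == "__main__":
    print(parse_rm_failure(MESSAGE))
    print(job_reported_running(MESSAGE))
```

A message that carries JOB_SUBSTATE_RUNNING fits the case described
above, where the MOM reports that the job is already running locally.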
On Fri, 2005-03-04 at 13:45 -0600, Bobby Brown wrote:
> We started seeing jobs that are blocked when there are plenty of free
> nodes and a checkjob reveals:
> Messages: cannot start job - RM failure, rc: 15041, msg: 'MSG=send
> failed, JOB_SUBSTATE_RUNNING' PE: 1.00 StartPriority: 6234
> Any ideas?