[torqueusers] RM failure, rc: 15085, msg: 'End of File'

Ken Nielson knielson at adaptivecomputing.com
Thu Sep 19 08:59:21 MDT 2013


Clotho,

TORQUE will put a job in the running state even though the job may still
fail in its link-up with the other MOMs. When that happens, the job is
re-queued. With the firewall turned on on one of the MOMs, this is
probably what happened.
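
If you want to rule a firewall in or out quickly, a small connectivity check along these lines may help. This is only a rough sketch, not anything shipped with TORQUE: the function names are made up for illustration, and the port numbers 15002 and 15003 are assumed to be the stock pbs_mom service and manager ports, so adjust them if your site configures the MOMs differently.

```python
import socket

# Assumed pbs_mom default ports (not stated in this thread; adjust for your site):
# 15002 = MOM service port, 15003 = MOM manager port.
ASSUMED_MOM_PORTS = (15002, 15003)


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_nodes(hosts, ports=ASSUMED_MOM_PORTS, timeout=3.0):
    """Map each host to the list of ports that could not be reached."""
    blocked = {}
    for host in hosts:
        failed = [p for p in ports if not port_is_reachable(host, p, timeout)]
        if failed:
            blocked[host] = failed
    return blocked
```

Calling check_nodes(["node01", "node02"]) (hypothetical host names) returns an empty dict when every port answers. Because the failure is in the link-up between MOMs, it is probably worth running the check from the compute nodes as well as from the server host.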




On Thu, Sep 19, 2013 at 3:58 AM, Clotho Tsang <wytsang at clustertech.com> wrote:

> Torque 4.2.2
> Maui 3.3.1
>
>
>
> On 19 September 2013 01:11, Ken Nielson <knielson at adaptivecomputing.com> wrote:
>
>> What version of TORQUE and what scheduler are you using?
>>
>>
>> On Wed, Sep 18, 2013 at 12:54 AM, Clotho Tsang <wytsang at clustertech.com> wrote:
>>
>>> Jobs turn to "R" status and then change back to "Q".
>>> The "checkjob" command shows:
>>>
>>> job is deferred.  Reason:  RMFailure  (cannot start job - RM failure,
>>> rc: 15085, msg: 'End of File')
>>> Holds:    Defer  (hold reason:  RMFailure)
>>>
>>>
>>> Later we found that it was because one of the nodes had its firewall turned on.
>>>
>>>
>>>
>>> --
>>> Clotho Tsang
>>> Senior Software Engineer
>>> Cluster Technology Limited
>>> Email: clotho at clustertech.com
>>> Tel: (852) 2655-6129
>>> Fax: (852) 2994-2101
>>> Website: www.clustertech.com
>>>
>>> _______________________________________________
>>> torqueusers mailing list
>>> torqueusers at supercluster.org
>>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>>
>>>
>>
>>
>> --
>> Ken Nielson
>> +1 801.717.3700 office +1 801.717.3738 fax
>> 1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
>> www.adaptivecomputing.com
>>
>>


-- 
Ken Nielson
+1 801.717.3700 office +1 801.717.3738 fax
1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
www.adaptivecomputing.com