[torqueusers] Torque 4.1.2 does not accept hostname with '-'

Ezell, Matthew A. ezellma at ornl.gov
Sat Oct 20 22:41:14 MDT 2012


>> On RHEL6, if the head node's hostname contains a "-" character,
>> jobs keep running and never stop; checkjob shows the message
>> "cannot start job - RM failure, rc: 15033, msg: 'End of File'"
>> 
>> The problem does not occur if the hostname has no "-".

> Have you had any luck tracking down the issue in the code?  I've been
> looking at it, but I don't see anything jumping out at me.

We found this on our test system.  The problem was in the 4.1.2 "subjob" feature.  We developed patches and sent them to Adaptive.  To get the fix, you can either pull r6794 and r6799 from the Subversion branch '4.1-fixes' or wait until 4.1.3 is released.
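
For anyone who wants to check quickly whether a host matches the conditions reported in this thread, here is a minimal sketch in Python. The function name and the constant are hypothetical, not part of Torque, and the check assumes that only 4.1.2 is affected, since that is the only version named here.

```python
# Hypothetical helper, not part of Torque: flags the combination reported
# in this thread (Torque 4.1.2 plus a '-' in the head node's hostname).
AFFECTED_VERSION = "4.1.2"  # assumption: only the version named in the thread


def matches_reported_condition(hostname: str, torque_version: str) -> bool:
    """Return True if the hostname contains '-' and the version is 4.1.2."""
    return "-" in hostname and torque_version.strip() == AFFECTED_VERSION


if __name__ == "__main__":
    import socket

    # The version string is supplied by hand here; how you obtain it on
    # your own system is up to you.
    print(matches_reported_condition(socket.gethostname(), "4.1.2"))
```

A True result only means the setup looks like the one described above; it does not confirm that the bug is actually being triggered.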

Good luck,
~Matt

---
Matt Ezell
HPC Systems Administrator
Oak Ridge National Laboratory

