[torqueusers] RM Failure - MOM rejected
Gaurav Chopra
gauravchopra at gmail.com
Tue Mar 14 08:49:04 MST 2006
Hi,
I submitted a job on the cluster and it has been deferred. Using tracejob
I get:
03/14/2006 05:06:17 S unable to run job, MOM rejected/rc=1
Using checkjob $PBS_ID:
StartDate: -00:06:36 Tue Mar 14 05:06:18
Total Tasks: 1
Req[0] TaskCount: 1 Partition: ALL
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [NONE]
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 2
PartitionMask: [ALL]
Flags: RESTARTABLE
job is deferred. Reason: RMFailure (cannot start job - RM failure, rc:
15041, msg: 'Execution server rejected request MSG=send failed, STARTING')
Holds: Defer (hold reason: RMFailure)
PE: 1.00 StartPriority: 1
cannot select job 99950 for partition DEFAULT (job hold active)
Please advise.
Gaurav
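
For anyone triaging many deferred jobs, the reason, return code, and message can be pulled out of the checkjob text programmatically. The following is only a minimal sketch: the function name parse_deferral is hypothetical, the sample text is copied from the output quoted above, and the pattern assumes the deferral line keeps the same "Reason: ... rc: ..., msg: '...'" layout, which may differ between scheduler versions.

```python
import re

# Sample text copied from the checkjob output quoted above.
CHECKJOB_OUTPUT = """\
job is deferred. Reason: RMFailure (cannot start job - RM failure, rc:
15041, msg: 'Execution server rejected request MSG=send failed, STARTING')
Holds: Defer (hold reason: RMFailure)
"""


def parse_deferral(text):
    """Return (reason, rc, msg) from a checkjob deferral line, or None.

    Hypothetical helper: assumes the layout seen in the sample above.
    """
    # Collapse whitespace so the pattern can span the wrapped line.
    flat = " ".join(text.split())
    match = re.search(
        r"Reason:\s*(\w+)\s*\(.*?rc:\s*(\d+),\s*msg:\s*'([^']*)'", flat
    )
    if match is None:
        return None
    return match.group(1), int(match.group(2)), match.group(3)


print(parse_deferral(CHECKJOB_OUTPUT))
```

Here the extracted values would be the reason RMFailure, the return code 15041, and the "send failed" message, which point at the execution server (MOM) rather than at the scheduler itself.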