[torqueusers] rc: 15041, msg: 'Execution server rejected request MSG=cannot send job to mom, state=PRERUN'

Guille guillermo.marco at gmail.com
Thu Nov 4 11:45:19 MDT 2010

Hi guys, I'm in trouble. I'm new to TORQUE and I hope you can help me. I wrote
my first script (you can see it below).

#This is an example script test1.sh
#These commands set up the Grid Environment for your job:
#PBS -N test1
#PBS -l nodes=3
#PBS -q pipeline

nohup SOLiD_preprocess_filter_v1.pl -i f -f DEN2348_saet_F3.csfasta \
  -g DEN2348_saet_F3_QV.qual -x n -y y -e 15 -d 10 -n n -a y \
  -o filtering_e15 -v on &

Then I try to run it like this:

qsub -l nodes=4:ppn=2 test1.sh
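
Note that resource requests given on the qsub command line take precedence
over the #PBS directives in the script, so this submission asks for
nodes=4:ppn=2 rather than nodes=3, which is consistent with the
"Total Tasks: 8" reported by checkjob. A minimal sketch of that arithmetic
(task_count is a hypothetical helper written for illustration, not a TORQUE
command, and it only handles the simple nodes=N[:ppn=P] form):

```shell
#!/bin/sh
# Hypothetical helper: compute the task count for a simple
# "nodes=N" or "nodes=N:ppn=P" request (ppn is taken as 1 when omitted).
task_count() {
    spec=$1
    nodes=$(printf '%s\n' "$spec" | sed -n 's/^nodes=\([0-9][0-9]*\).*/\1/p')
    ppn=$(printf '%s\n' "$spec" | sed -n 's/.*:ppn=\([0-9][0-9]*\).*/\1/p')
    echo $(( ${nodes:-0} * ${ppn:-1} ))
}

task_count "nodes=4:ppn=2"   # the request on the qsub command line
task_count "nodes=3"         # the #PBS directive in the script on its own
```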

But my job gets deferred every time. This is what I see when I check the
status with checkjob 22503:

checking job 22503

State: Idle  EState: Deferred
Creds:  user:corona  group:users  class:pipeline  qos:DEFAULT
WallTime: 00:00:00 of 99:23:59:59
SubmitTime: Thu Nov  4 18:16:27
  (Time Queued  Total: 00:22:38  Eligible: 00:00:01)

StartDate: -00:22:36  Thu Nov  4 18:16:29
Total Tasks: 8

Req[0]  TaskCount: 8  Partition: ALL
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]
Flags:       RESTARTABLE

job is deferred.  Reason:  RMFailure  (cannot start job - RM failure, rc:
15041, msg: 'Execution server rejected request MSG=cannot send job to mom,
Holds:    Defer  (hold reason:  RMFailure)
PE:  8.00  StartPriority:  1
cannot select job 22503 for partition DEFAULT (job hold active)

I get this error message. I've searched the mailing list a bit and I know
it's a common error, but I really have no clue how to solve it. If you need
more info to find a solution, please tell me what you need.
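
When passing details like the above along, the key fields can be pulled out
of saved checkjob output with standard tools. A minimal sketch, assuming the
output was saved to a file first (summarize_checkjob and the file name
checkjob.out are hypothetical, and grep -o is assumed to be available, as it
is in GNU and BSD grep):

```shell
#!/bin/sh
# Hypothetical helper: print the state line, the RM return code and the
# hold line from checkjob output saved in the file given as $1.
summarize_checkjob() {
    file=$1
    # e.g. "State: Idle  EState: Deferred"
    grep '^State:' "$file"
    # The deferral message may wrap, so join lines before looking for "rc: N".
    tr '\n' ' ' < "$file" | grep -o 'rc: *[0-9][0-9]*'
    # e.g. "Holds:    Defer  (hold reason:  RMFailure)"
    grep '^Holds:' "$file"
}
```

It could be used as: checkjob 22503 > checkjob.out; summarize_checkjob checkjob.out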
