[torqueusers] mom daemon crashes
Giorgio Padoan
gpadoan at inogs.it
Tue Jul 8 08:03:23 MDT 2008
Hello,
when the PBS server contacts the MOM on a remote node, the MOM daemon on that node crashes.
This is the log:
07/08/2008 15:34:59;0008;PBS_Server;Job;59.headnode.it;Job Modified at request of root at headnode.it
07/08/2008 15:34:59;0001;PBS_Server;Req;;Server could not connect to MOM
07/08/2008 15:34:59;0080;PBS_Server;Req;req_reject;Reject reply code=15070(Server could not connect to MOM), aux=0, type=ModifyJob, from root at headnode.it
07/08/2008 15:43:28;0008;PBS_Server;Job;59.headnode.it;Job Run at request of root at headnode.it
07/08/2008 15:43:28;0008;PBS_Server;Job;59.headnode.it;send of job to sissi6 failed error = 15031
07/08/2008 15:43:28;0001;PBS_Server;Svr;PBS_Server;Batch protocol error (15031) in send_job, child failed in previous commit request for job 59.headnode.it
07/08/2008 15:43:28;0008;PBS_Server;Job;59.headnode.it;unable to run job, MOM rejected/rc=1
07/08/2008 15:43:28;0080;PBS_Server;Req;req_reject;Reject reply code=15041(Execution server rejected request MSG=cannot send job to mom, state=PRERUN), aux=0, type=RunJob, from root at headnode.it
07/08/2008 15:43:28;0040;PBS_Server;Svr;headnode.it;Scheduler sent command new
07/08/2008 15:46:58;0004;PBS_Server;Svr;check_nodes;node sissi6 not detected in 249 seconds, marking node down
07/08/2008 15:46:58;0004;PBS_Server;Svr;check_nodes;node sissi7 not detected in 291 seconds, marking node down
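
For anyone who wants to sift a longer server log for the same failures, here is a minimal Python sketch. It assumes only what is visible in the excerpt above: six semicolon-separated fields per entry and five-digit error codes beginning with 15. The field names (timestamp, event_type, daemon, obj_class, obj_name, message) and the helpers parse_line and error_codes are hypothetical labels chosen for illustration, not names taken from the TORQUE documentation.

```python
import re
from typing import List, NamedTuple, Optional


class LogEntry(NamedTuple):
    # Field names are illustrative guesses based on the excerpt above.
    timestamp: str
    event_type: str
    daemon: str
    obj_class: str
    obj_name: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log line into six fields; return None if it does not fit."""
    # Split at most five times so semicolons inside the message are preserved.
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        return None  # e.g. a wrapped continuation line
    return LogEntry(*parts)


def error_codes(message: str) -> List[int]:
    """Collect five-digit codes starting with 15 (assumed pattern, as in 15031)."""
    return [int(code) for code in re.findall(r"\b(15\d{3})\b", message)]


if __name__ == "__main__":
    sample = (
        "07/08/2008 15:43:28;0008;PBS_Server;Job;59.headnode.it;"
        "send of job to sissi6 failed error = 15031"
    )
    entry = parse_line(sample)
    if entry is not None:
        print(entry.timestamp, entry.obj_name, error_codes(entry.message))
```

Lines that do not split into six fields are skipped rather than guessed at, so the sketch only reports entries it can read unambiguously.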
I have installed:
torque-2.3.1-snap.200806261221
maui-3.2.6p20
mpich2-1.0.7
on a Fedora 8 x86_64 Linux cluster [kernel 2.6.25].
Can you help me?
Thanks in advance.
giorgio padoan
--
-----------------------------------------------------------------
Giorgio Padoan gpadoan [at] inogs.it
The computer whisperer.
GDL-SIEG Supporto informatico e grafica computerizzata
Istituto Nazionale di Oceanografia e Geofisica Sperimentale - OGS
Borgo Grotta Gigante 42/c PHONE +39 40 2140265
34010 - TRIESTE (ITALIA) FAX +39 40 327521
-----------------------------------------------------------------