[torqueusers] mpiexec errors
Gelonia L. Dent
gdent at amnh.org
Mon Apr 27 10:48:25 MDT 2009
Error messages.
We are running Scyld Taskmaster. Since a recent reboot of the head node,
jobs submitted to the scheduler launch and are then rejected with the
following error:
mpiexec: Error: get_hosts: pbs_connect: no error.
Moab seems to be functioning properly:
[root@enyo ~]# mdiag -S
Moab Server 'Scyld' running on enyo:42559 (Mode: NORMAL)
Time(ms) Sched: 0 RMLoad: 1 RMProcess: 0 RMAction: 0
Triggers: 0 User: 0 Idle: 61077 Total: 61078
Load(5m) Sched: 0.00% RMLoad: 0.00% RMProcess: 0.00% RMAction: 0.03%
Triggers: 0.00% User: 0.03% Idle: 99.94%
Load(24h) Sched: 0.00% RMLoad: 0.00% RMProcess: 0.00% RMAction: 0.00%
Triggers: 0.00% User: 0.00% Idle: 100.00%
PollInterval: 00:01:00 (Avg Sched Interval: 00:00:57 Iterations: 1375)
NOTE: scheduler will restart in 1:57:57
Message: profiling enabled (50 of 50 samples/00:30:00 interval)
However,
[root@enyo ~]# momctl -d 3
ERROR: query[0] 'diag3' failed on localhost (errno=0-Success: 5-Input/output error)
Does anyone know how to resolve this problem?