[torqueusers] non-existing job
Garrick Staples
garrick at usc.edu
Wed Jan 23 12:36:05 MST 2008
On Wed, Jan 23, 2008 at 02:39:35PM +0100, Schulz, Henrik alleged:
> Dear all,
>
> I am running TORQUE 2.1.2 together with MAUI 3.2.6p16. On a couple of nodes the following line appears every 45 seconds in my mom_logs:
>
> 01/23/2008 00:00:27;0008; pbs_mom;Job;34422.master;job was terminated
>
> On other nodes job 34422 produces this line:
>
> 01/23/2008 14:33:17;0008; pbs_mom;Job;34422.master;ERROR: received request 'KILL_JOB' from 10.0.0.91:1023 for job '34422.master' (job does not exist locally)
>
> Job 34422 really did exist, but that was about 4 months ago. There may have been an MPI communication problem with this job.
>
> Is there any way to suppress these messages?
momctl -c 34422 -h <momhostname>
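
Since the message shows up on a couple of nodes, the same command can be
repeated per host. A minimal sketch, assuming the affected nodes are called
node01 and node02 (hypothetical names, substitute the hosts whose mom_logs
show the message). The helper build_clear_cmds is also just an illustration,
not part of TORQUE. It only prints the momctl commands so they can be
reviewed before anything is run:

```shell
#!/bin/sh
# Print one "momctl -c <jobid> -h <host>" line per host.
# Nothing is executed here; the output is meant to be reviewed first.
build_clear_cmds() {
    jobid="$1"
    shift
    for host in "$@"; do
        printf 'momctl -c %s -h %s\n' "$jobid" "$host"
    done
}

# node01 and node02 are placeholder hostnames (assumption).
build_clear_cmds 34422 node01 node02
```

If the printed commands look right, piping the output to sh on the server
host should run them one after another.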