[Mauiusers] completed jobs still shown in queue

Bisbal, Prentice PBisbal at LexPharma.com
Wed Mar 1 11:29:53 MST 2006


qdel didn't work for me - something about the job being in an invalid state for that operation. 

All the jobs involved were on a system that was heavily loaded (8 CPUs, all at 99% usage). I suspect the heavy load delayed communication, which in turn caused some sort of message timeout.

Prentice 



-----Original Message-----
From: Stewart.Samuels at sanofi-aventis.com [mailto:Stewart.Samuels at sanofi-aventis.com]
Sent: Wed 3/1/2006 12:45 PM
To: Bisbal, Prentice; mauiusers at supercluster.org
Subject: RE: [Mauiusers] completed jobs still shown in queue
 
We see the same behavior periodically.  We are running torque-1.2.0p1 and maui-3.2.6p11.  Not only is this an annoyance, but it also prevents maui from scheduling jobs on those nodes.  Most of the time you can qdel them.
 
                Stewart

-----Original Message-----
From: mauiusers-bounces at supercluster.org [mailto:mauiusers-bounces at supercluster.org]On Behalf Of Bisbal, Prentice
Sent: Wednesday, March 01, 2006 10:03 AM
To: mauiusers at supercluster.org
Subject: [Mauiusers] completed jobs still shown in queue



I have 4 simple jobs stuck in my queue. The jobs ran to completion, but they are still shown as being in the queue:


$ showq
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

3183                pxxxxxx    Running     1    00:46:01  Wed Mar  1 09:44:58
3184                pxxxxxx    Running     1    00:46:04  Wed Mar  1 09:45:01
3185                pxxxxxx    Running     1    00:46:04  Wed Mar  1 09:45:01
3186                pxxxxxx    Running     1    00:46:04  Wed Mar  1 09:45:01

     4 Active Jobs       4 of   22 Processors Active (18.18%)
                         1 of    7 Nodes Active      (14.29%)

A tracejob shows that these jobs completed and exited without any errors:

$  tracejob 3186

Job: 3186.hw-emperor.lexpharma.com

03/01/2006 09:43:38  S    enqueuing into batch, state 1 hop 1
03/01/2006 09:43:38  S    Job Queued at request of
                          pxxxxxx at hw-underdog.xxxxxxxxx.com owner =
                          pxxxxxx at hw-underdog.xxxxxxxxx.com, job name =
                          PBS_TEST.87, queue = batch
03/01/2006 09:45:02  S    Job Modified at request of
                          maui at hw-emperor.lexpharma.com
03/01/2006 09:45:02  S    Job Run at request of maui at hw-emperor.xxxxxxxxxx.com
03/01/2006 09:45:33  S    Exit_status=0 resources_used.cpupercent=0
                          resources_used.cput=00:00:00 resources_used.mem=5408kb
                          resources_used.vmem=9280kb
                          resources_used.walltime=00:00:30

Any idea why these jobs are still shown in the queue? What is the best way to get rid of them?
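
For what it's worth, the mismatch above (showq says Running while tracejob already records an Exit_status) can be checked mechanically. Below is a minimal Python sketch that works on saved text output only; it does not call showq or tracejob itself. The function names (running_job_ids, exit_status, stale_jobs) are hypothetical, and the parsing assumes the column layout shown in this message, which may differ on other versions.

```python
import re


def running_job_ids(showq_text):
    """Return job IDs that showq-style output lists as Running."""
    ids = []
    for line in showq_text.splitlines():
        fields = line.split()
        # Assumed job line layout: <id> <user> <state> <procs> <remaining> <start...>
        if len(fields) >= 3 and fields[0].isdigit() and fields[2] == "Running":
            ids.append(fields[0])
    return ids


def exit_status(tracejob_text):
    """Return the integer Exit_status found in tracejob-style output, or None."""
    match = re.search(r"Exit_status=(-?\d+)", tracejob_text)
    return int(match.group(1)) if match else None


def stale_jobs(showq_text, tracejob_by_id):
    """Return IDs shown as Running whose trace already records an exit status."""
    return [
        job_id
        for job_id in running_job_ids(showq_text)
        if exit_status(tracejob_by_id.get(job_id, "")) is not None
    ]
```

To use it, save the showq output to a string and build a dict that maps each job ID to its saved tracejob output. stale_jobs then returns the IDs that the scheduler still shows as running even though the server logged an exit.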

Prentice


