[Mauiusers] completed jobs still shown in queue
Bisbal, Prentice
PBisbal at LexPharma.com
Wed Mar 1 14:24:28 MST 2006
Neither, it was just a short shell script. I posted it earlier. Here it
is again, in case you missed it.
#!/bin/bash
echo "Hello from $(uname -n)"
sleep 20
printenv | egrep "PBS_NODENUM|PBS_VNODENUM|PBS_TASKNUM|PBS_O_HOST" | sort
echo " "
exit 0
Prentice
________________________________
From: Stewart.Samuels at sanofi-aventis.com
[mailto:Stewart.Samuels at sanofi-aventis.com]
Sent: Wednesday, March 01, 2006 2:57 PM
To: Bisbal, Prentice; mauiusers at supercluster.org
Subject: RE: [Mauiusers] completed jobs still shown in queue
Just curious, are you using pvm or mpi with these jobs?
Stewart
-----Original Message-----
From: Bisbal, Prentice [mailto:PBisbal at LexPharma.com]
Sent: Wednesday, March 01, 2006 1:30 PM
To: Samuels, Stewart PH/US; mauiusers at supercluster.org
Subject: RE: [Mauiusers] completed jobs still shown in queue
qdel didn't work for me - something about the job being in an
invalid state for that operation.
All the jobs involved were on a system that was very heavily loaded (8
CPUs, all at 99% usage). I suspect the heavy load on the system
caused delays in communication, which in turn caused some sort of message
timeout.
Prentice
-----Original Message-----
From: Stewart.Samuels at sanofi-aventis.com
[mailto:Stewart.Samuels at sanofi-aventis.com]
Sent: Wed 3/1/2006 12:45 PM
To: Bisbal, Prentice; mauiusers at supercluster.org
Subject: RE: [Mauiusers] completed jobs still shown in queue
We see the same behavior periodically. We are running
torque-1.2.0p1 and maui-3.2.6p11. Not only is this an annoyance, but it
also prevents maui from scheduling jobs on those nodes. Most of the
time you can qdel them.
Stewart
-----Original Message-----
From: mauiusers-bounces at supercluster.org
[mailto:mauiusers-bounces at supercluster.org]On Behalf Of Bisbal, Prentice
Sent: Wednesday, March 01, 2006 10:03 AM
To: mauiusers at supercluster.org
Subject: [Mauiusers] completed jobs still shown in queue
I have 4 simple jobs stuck in my queue. The jobs ran to
completion, but they are still shown as being in the queue:
$ showq
ACTIVE JOBS--------------------
JOBNAME   USERNAME   STATE     PROC   REMAINING   STARTTIME
3183      pxxxxxx    Running   1      00:46:01    Wed Mar 1 09:44:58
3184      pxxxxxx    Running   1      00:46:04    Wed Mar 1 09:45:01
3185      pxxxxxx    Running   1      00:46:04    Wed Mar 1 09:45:01
3186      pxxxxxx    Running   1      00:46:04    Wed Mar 1 09:45:01
4 Active Jobs    4 of 22 Processors Active (18.18%)
                 1 of 7 Nodes Active (14.29%)
A tracejob shows that these jobs completed and exited w/o any
errors:
$ tracejob 3186
Job: 3186.hw-emperor.lexpharma.com
03/01/2006 09:43:38  S  enqueuing into batch, state 1 hop 1
03/01/2006 09:43:38  S  Job Queued at request of pxxxxxx at hw-underdog.xxxxxxxxx.com
                        owner = pxxxxxx at hw-underdog.xxxxxxxxx.com, job name = PBS_TEST.87, queue = batch
03/01/2006 09:45:02  S  Job Modified at request of maui at hw-emperor.lexpharma.com
03/01/2006 09:45:02  S  Job Run at request of maui at hw-emperor.xxxxxxxxxx.com
03/01/2006 09:45:33  S  Exit_status=0
                        resources_used.cpupercent=0
                        resources_used.cput=00:00:00
                        resources_used.mem=5408kb
                        resources_used.vmem=9280kb
                        resources_used.walltime=00:00:30
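The mismatch shown here (showq lists a job as Running while tracejob has already logged an Exit_status for it) can be checked for mechanically. What follows is only a rough sketch, not a fix: it works purely on saved text output and does not call any Maui or TORQUE command itself. The function names (running_ids, has_exit_status, stale_ids) and the <id>.trace file naming are hypothetical, made up for this illustration, and the parsing assumes the showq column order seen in this thread.

```shell
#!/bin/bash
# Sketch only: every function reads saved command output and runs no
# scheduler or resource-manager command on its own.

# Print the job ID of each row that showq lists in the Running state.
# Assumes the column order JOBNAME USERNAME STATE ... shown in this thread.
running_ids() {
    awk '$3 == "Running" { print $1 }'
}

# Succeed (exit 0) if the tracejob text on stdin contains an Exit_status
# record, meaning the server logged the job as finished.
has_exit_status() {
    grep -q 'Exit_status='
}

# Hypothetical helper: given a saved showq listing ($1) and a directory ($2)
# holding saved tracejob output as <id>.trace, print the IDs that showq still
# lists as Running although their trace already records an Exit_status.
stale_ids() {
    local showq_file=$1 trace_dir=$2 id
    running_ids < "$showq_file" | while read -r id; do
        if [ -f "$trace_dir/$id.trace" ] && has_exit_status < "$trace_dir/$id.trace"; then
            echo "$id"
        fi
    done
}
```

To try it, one might save the showq output to showq.txt, save the tracejob output for each listed job ID to traces/<id>.trace, and then run: stale_ids showq.txt traces. The result is only a list of candidate job IDs to look at more closely; it says nothing about why they are stuck or how to clear them.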
Any idea why these jobs are still shown in the queue? What is
the best way to get rid of them?
Prentice