[Mauiusers] setres - problem draining batch queues
Paul Szczypka
paul.szczypka at gmail.com
Mon Nov 10 13:05:50 MST 2008
Hello,
I'm having a problem draining my cluster's batch queues using setres.
Adapting one of the examples given in the Cluster Resources documentation, I
attempt to reserve every node and processor in my cluster for the duration of
our scheduled downtime:
# setres -s 18:00:00_11/18 -e 08:00:00_11/19 -n electricityDowntime ALL
I then check the reservations and see that the downtime reservation is there
and is applied to all processors, alongside a few other reservations for
pre-existing jobs.
To test the reservation, I submit a job which requests an excessive amount of
wallclock time (Job 918). It's scheduled to start when the downtime finishes,
which is what I expect:
[root@lphesrv1 spool]# showres
Reservations
ReservationID          Type  S  Start       End          Duration     N/P     StartTime
909                    Job   R  -1:21:00    83:06:39:00  83:08:00:00  1/1     Mon Nov 10 15:40:32
912                    Job   R  -1:09:05    4:12:50:55   4:14:00:00   1/1     Mon Nov 10 15:52:27
913                    Job   R  -00:49:57   4:13:10:03   4:14:00:00   1/1     Mon Nov 10 16:11:35
918                    Job   I  8:14:58:28  91:22:58:28  83:08:00:00  1/1     Wed Nov 19 08:00:00
electricityDowntime.0  User  -  8:00:58:28  8:14:58:28   14:00:00     60/480  Tue Nov 18 18:00:00
17 reservations located
Unfortunately, upon checking the reservations about 10 minutes later, I find
that Job 918 has started even though its reservation now overlaps the downtime
reservation:
[root@lphesrv1 spool]# showres
Reservations
ReservationID          Type  S  Start       End          Duration     N/P     StartTime
909                    Job   R  -1:27:25    83:06:32:35  83:08:00:00  1/1     Mon Nov 10 15:40:32
912                    Job   R  -1:15:30    4:12:44:30   4:14:00:00   1/1     Mon Nov 10 15:52:27
913                    Job   R  -00:56:22   4:13:03:38   4:14:00:00   1/1     Mon Nov 10 16:11:35
918                    Job   R  -00:06:15   83:07:53:45  83:08:00:00  1/1     Mon Nov 10 17:01:42
electricityDowntime.0  User  -  8:00:52:03  8:14:52:03   14:00:00     60/480  Tue Nov 18 18:00:00
17 reservations located
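As a sanity check on the overlap, the arithmetic can be sketched in Python. This is only an illustration, not part of Maui: the helper names are hypothetical, the year 2008 is taken from the date of this post, and the duration format [[DD:]HH:]MM:SS is inferred from the showres columns above rather than from the Maui documentation.

```python
from datetime import datetime, timedelta

def parse_duration(text):
    """Parse a duration assumed to be in the form [[DD:]HH:]MM:SS."""
    parts = [int(p) for p in text.split(":")]
    # Left-pad with zeros so the list is always [days, hours, minutes, seconds].
    days, hours, minutes, seconds = [0] * (4 - len(parts)) + parts
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)

def overlaps(start_a, end_a, start_b, end_b):
    """Two intervals overlap if each one starts before the other ends."""
    return start_a < end_b and start_b < end_a

# Values copied from the second showres listing (year assumed to be 2008).
job_start = datetime(2008, 11, 10, 17, 1, 42)       # Job 918 StartTime
job_end = job_start + parse_duration("83:08:00:00")  # Job 918 Duration

res_start = datetime(2008, 11, 18, 18, 0, 0)         # electricityDowntime.0 StartTime
res_end = res_start + parse_duration("14:00:00")     # electricityDowntime.0 Duration

print("Job 918 would end at:", job_end)
print("Overlaps downtime:", overlaps(job_start, job_end, res_start, res_end))
```

With these numbers, Job 918's requested wallclock time runs far past the start of the downtime, so the two reservations do overlap on whichever node the job occupies.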
Can anyone help me debug or explain this behaviour? I can't find anything
relevant in the logs in my Maui directory, and the PBS logs contain only the
following:
11/10/2008 17:01:42;0100;PBS_Server;Req;;Type StatusJob request received from root@lphesrv1.epfl.ch, sock=14
11/10/2008 17:01:42;0100;PBS_Server;Req;;Type ModifyJob request received from root@lphesrv1.epfl.ch, sock=14
11/10/2008 17:01:42;0008;PBS_Server;Job;918.lphesrv1.epfl.ch;Job Modified at request of root@lphesrv1.epfl.ch
11/10/2008 17:01:42;0100;PBS_Server;Req;;Type RunJob request received from root@lphesrv1.epfl.ch, sock=14
11/10/2008 17:01:42;0008;PBS_Server;Job;918.lphesrv1.epfl.ch;Job Run at request of root@lphesrv1.epfl.ch
I'm using:
[root@lphesrv1 spool]# qmgr -c "p s"|grep pbs_ver
set server pbs_version = 2.3.0-snap.200801151629
[root@lphesrv1 spool]# setres -v
maui client version 3.2.6p20
I can post my maui.cfg if it's relevant.
Thanks,
Paul.
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Paul Szczypka, EPFL SB IPEP LPHE1, BSP 614, CH-1015 Lausanne
paul.szczypka at cern.ch Tel: +41 21 69 30495
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html