[Mauiusers] standing reservation strangeness on Maui 64-bit

Peter Oettl poettl at CSCS.CH
Fri Mar 5 04:18:02 MST 2010


Hi,

we are trying to configure Maui on a separate Torque server, using the same configuration that works on our current setup.
Maui reports that the node is overcommitted. What looks strange to us is that checknode shows only DISK
under "Configured Resources". On our working installation we see PROCS, MEM, SWAP and DISK.
We have already tried several simple reservations, so we suspect that the problem lies with the "Configured Resources".


[root@lrms02 yum.repos.d]# checknode wn101


checking node wn101.lcg.cscs.ch

State:      Idle  (in current state for 00:14:24)
Configured Resources: DISK: 1M
Utilized   Resources: SWAP: 272M
Dedicated  Resources: [NONE]
Opsys:         linux  Arch:      [NONE]
Speed:      1.00  Load:       0.000
Network:    [DEFAULT]
Features:   [lcgpro]
Attributes: [Batch]
Classes:    [other 16:16]

Total Time: 4:08:27:30  Up: 18:13:13 (17.44%)  Active: 00:00:00 (0.00%)

Reservations:
  User '.0.0'(x1)  -00:14:24 ->   INFINITY (  INFINITY)
    Blocked Resources at -00:14:24   Procs: 2/1 (200.00%)
  User '.0.1'(x1)  -00:14:24 ->   INFINITY (  INFINITY)
    Blocked Resources at -00:14:24   Procs: 2/1 (200.00%)
ALERT:  node is overcommitted at time -00:14:24 (P: 12)


And here is the output from the working server:

[root@ce01 ~]# checknode wn10


checking node wn10.lcg.cscs.ch

State:   Running  (in current state for 00:00:00)
Configured Resources: PROCS: 16  MEM: 31G  SWAP: 76G  DISK: 1M
Utilized   Resources: [NONE]
Dedicated  Resources: PROCS: 4  MEM: 8000M
Opsys:         linux  Arch:      [NONE]
Speed:      1.00  Load:       3.300
Network:    [DEFAULT]
Features:   [lcgpro]
Attributes: [Batch]
Classes:    [atlas 16:16][cms 13:16][lhcb 15:16][lcgadmin 16:16][ops 16:16][other 16:16][cscs 16:16]

Total Time:   INFINITY  Up:   INFINITY (98.02%)  Active:   INFINITY (86.39%)

Reservations:
  Job '3509992'(x1)  -1:12:07:47 -> 23:52:13 (2:12:00:00)
  Job '3514824'(x1)  -15:37:52 -> 1:20:22:08 (2:12:00:00)
  Job '3516656'(x1)  -7:47:44 -> 2:04:12:16 (2:12:00:00)
  Job '3516756'(x1)  -7:07:35 -> 2:04:52:25 (2:12:00:00)
  User 'atlas.0.0'(x1)   -INFINITY ->   INFINITY (  INFINITY)
    Blocked Resources at 00:00:00    Procs: 1/1 (100.00%)
  User 'cms.0.0'(x1)   -INFINITY ->   INFINITY (  INFINITY)
    Blocked Resources at 00:00:00    Procs: 0/1 (0.00%)
    Blocked Resources at 2:04:52:25  Procs: 1/1 (100.00%)
  User 'lhcb.0.0'(x1)   -INFINITY ->   INFINITY (  INFINITY)
    Blocked Resources at 00:00:00    Procs: 0/1 (0.00%)
    Blocked Resources at 23:52:13    Procs: 1/1 (100.00%)
  User 'sam_and_sgm.0.0'(x1)   -INFINITY ->   INFINITY (  INFINITY)
    Blocked Resources at 00:00:00    Procs: 1/1 (100.00%)
JobList:  3509992,3514824,3516656,3516756
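
To make the difference between the two outputs explicit, here is a minimal sketch that parses the "Configured Resources" line of a checknode report and lists the resource types the new server is missing. It assumes the line format shown above; the function name configured_resources and the two sample strings are only illustrative, not part of Maui.

```python
import re

def configured_resources(checknode_output):
    """Return {name: value} parsed from the 'Configured Resources:' line."""
    for line in checknode_output.splitlines():
        if line.startswith("Configured Resources:"):
            # Everything after the first colon holds the NAME: value pairs.
            body = line.split(":", 1)[1]
            return dict(re.findall(r"(\w+):\s*(\S+)", body))
    return {}  # line absent, or value was [NONE]

# Sample lines copied from the two checknode outputs above.
broken = (
    "State:      Idle  (in current state for 00:14:24)\n"
    "Configured Resources: DISK: 1M\n"
    "Utilized   Resources: SWAP: 272M\n"
)
working = "Configured Resources: PROCS: 16  MEM: 31G  SWAP: 76G  DISK: 1M\n"

# Resource types reported on the working node but not on the new one.
missing = sorted(set(configured_resources(working)) - set(configured_resources(broken)))
print(missing)
```

For the two outputs in this message, the missing types are PROCS, MEM and SWAP, which matches what we see by eye.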



We're using the following versions of Torque and Maui on the new server:
	torque-server.x86_64		2.3.6-2cri.el5
	maui-server.x86_64		3.2.6p21-snap.1234905291.5.el5

and on the working server:
	maui-server.i386		3.2.6p21-snap.12247061
	torque-server.i386	2.3.6-1cri.slc4

Has anybody run into this problem, or does anyone have a clue what's going on?

Cheers,

  Peter

