[Mauiusers] Maui/SLURM-wiki and consumable resources other than processors

Balle, Susanne susanne.balle at hp.com
Fri Jan 14 14:52:30 MST 2005


Hi 

I am trying to use the "consumable resources" feature in Maui. 

I ran a test to see whether Maui registers the amount of memory used 
when running a job with srun (SLURM), as it does with processors, and 
it doesn't.

I am trying to use the "consumable resources" feature to allow jobs 
to be scheduled more efficiently. I tested this with processors and it 
works as expected: no nodes were overallocated. In the case of 
memory, Maui overallocates my nodes.

As you can see in the output from top below, the job run by user "test" 
is using 36.3% of memory (%MEM). 
Something is wrong with these numbers as well, but the basic idea 
is that the program uses a non-negligible amount of memory. 
This usage is not recorded in the output from "diagnose -n".
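
To give a sense of the scale involved, here is a rough sketch that takes the 
figures from the top output in this message at face value and converts the 
%MEM value into megabytes. The variable names are only illustrative, and the 
calculation assumes %MEM is a percentage of the "Mem: ... av" total.

```python
# Values copied from the top output in this message.
total_mem_kb = 3905352   # "Mem:  3905352k av"
pct_mem = 36.3           # %MEM for PID 6655 (matmut2)

# Memory attributed to the job, assuming %MEM is relative to total_mem_kb.
job_mem_mb = total_mem_kb * pct_mem / 100 / 1024

# Node memory in whole megabytes, for comparison with Maui's MEM figure.
node_mem_mb = total_mem_kb // 1024

print(f"job memory        ~ {job_mem_mb:.0f} MB")
print(f"node memory       ~ {node_mem_mb} MB")
print(f"left, if tracked  ~ {node_mem_mb - job_mem_mb:.0f} MB")
```

Under that assumption the job accounts for about 1384 MB of a 3813 MB node, 
so a scheduler that tracked memory would be expected to show roughly 2429 MB 
available on xc14n16 rather than the full 3813 MB.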

From the output of "diagnose -n" we can see that one processor is in use 
on xc14n16, but the amount of memory in use is not updated.

This point is further highlighted by the output from "checknode xc14n16" 
enclosed below. Only processors are tracked.

Is this a bug, or a limitation in the Maui/SLURM-wiki integration?

What does SLURM need to provide Maui for this to work?

Thanks for any help,

Regards,

Susanne

-----------------------------

Output from top:
----------------
Mem:  3905352k av, 2813060k used, 1092292k free,       0k shrd,  188356k buff
      2239404k active,             165076k inactive
Swap: 6291288k av,       0k used, 6291288k free                  325164k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
 6655 test      25   0 1385M 1.4G   252 R    24.9 36.3   2:44   1 matmut2
    1 root      15   0   528  528   452 S     0.0  0.0   0:44   3 init
    2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0
    3 root      RT   0     0    0     0 SW    0.0  0.0   0:00   1 migration/1
    4 root      RT   0     0    0     0 SW    0.0  0.0   0:00   2 migration/2
    5 root      RT   0     0    0     0 SW    0.0  0.0   0:00   3 migration/3

[root@xc14n16 etc]# diagnose -n
-------------------------------
diagnosing node table (5120 slots)
Name                    State  Procs     Memory         Disk
[snip]
xc14n13                  Idle   2:2     2981:2981    12283:12283      
]                         [NONE]                         [NONE]
xc14n14                  Idle   2:2     2981:2981    12283:12283      
]                         [NONE]                         [NONE]
xc14n15                  Idle   2:2     2981:2981    12283:12283      
]                         [NONE]                         [NONE]
xc14n16               Running   3:4     3813:3813     7867:7867       
]                         [NONE]                         [NONE]
-----                     ---   9:10   12756:12756   44716:44716      
Total Nodes: 4  (Active: 1  Idle: 3  Down: 0)

[root@xc14n16 etc]# checknode xc14n16

checking node xc14n16

State:   Running  (in current state for 00:00:00)
Configured Resources: PROCS: 4  MEM: 3813M  DISK: 7867M
Utilized   Resources: [NONE]
Dedicated  Resources: PROCS: 1
Opsys:        [NONE]  Arch:      [NONE]
Speed:      1.00  Load:       0.000
Features:   [NONE]
Attributes: [Batch]
Classes:    [NONE]
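
The gap between configured and dedicated resources in the checknode output 
above can be picked out mechanically. The following is only a sketch: 
parse_resources is a hypothetical helper written for this message, not part 
of Maui or SLURM, and it assumes the "NAME: value" layout shown above.

```python
import re

def parse_resources(line):
    """Return {name: value} from a checknode line such as
    'Configured Resources: PROCS: 4  MEM: 3813M  DISK: 7867M'.
    A line whose value is '[NONE]' yields an empty dict."""
    _, _, rest = line.partition("Resources:")
    return dict(re.findall(r"(\w+):\s*(\S+)", rest))

# Lines copied from the checknode output in this message.
configured = parse_resources("Configured Resources: PROCS: 4  MEM: 3813M  DISK: 7867M")
dedicated = parse_resources("Dedicated  Resources: PROCS: 1")

# Resources the node advertises but for which nothing is recorded as dedicated.
untracked = [name for name in configured if name not in dedicated]
print("configured but not dedicated:", untracked)
```

For the output above this lists MEM and DISK, which matches the observation 
that only processors are tracked.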


