[torqueusers] memory limit -l mem is not working

Sergey Bulk sergey_bulk at list.ru
Tue Jan 10 07:24:41 MST 2012


James, thank you for the answer.

Unfortunately, it seems that neither pmem nor vmem
works either.

For example, if I run 4 jobs, each requesting 6 CPUs and 48 GB,
on a 48 GB, 24-core node with

#PBS -l pmem=8gb,nodes=node32:ppn=6,vmem=48gb

they all run simultaneously.




On 30 December 2011, 22:14, "Coyle, James J [ITACD]" <jjc at iastate.edu> wrote:
> Sergey,
> 
> There are two options:
> 
> 1) For each queue, set reasonably low defaults for pmem and vmem,
> e.g. for nodes which have 512gb and 32 processor cores, set to
> 512gb/32=16gb
> qmgr -c 'set queue large resources_default.pmem = 16gb'
> qmgr -c 'set queue large resources_default.vmem = 16gb'
> 
> This will force users to specify pmem= and vmem=
> if they want more than this, otherwise they just get
> 16gb for both.
> 
> 2) Write a submit filter which scans for mem= (and maybe vmem=,
>    pmem=) and nodes=N:ppn=M.
>    Then you can alter the job submitted.
> E.g. on a 512GB node with 32 processors (i.e. 16gb per processor),
>  the submit filter could calculate ceiling(mem/mem_per_processor)
>                                  = ceiling(400gb/16gb)
>                                  = 25
>  Then use that value to change the ppn= to (max(8,25))
> in the job request.  This just reserves as many processors as
> needed with each getting their share of the memory, unless
> they already have more processors reserved.
> 
>  - Jim C.
> 
> >-----Original Message-----
> >From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
> >bounces at supercluster.org] On Behalf Of Sergey Bulk
> >Sent: Thursday, December 29, 2011 6:19 AM
> >To: torqueusers at supercluster.org
> >Subject: [torqueusers] memory limit -l mem is not working
> >
> >I have torque 2.5.7-9.el6 from epel repo on SL6.
> >
> >When requesting resources with
> >
> >#PBS -l mem=400gb,nodes=node01:ppn=8
> >
> >torque does not take mem parameter into account.
> >
> >So users can run 2 jobs requesting 800gb of memory in total
> >on a 500gb memory node.
> >
> >How can I address this issue?
> >
> >Thank you,
> >SN
> >_______________________________________________
> >torqueusers mailing list
> >torqueusers at supercluster.org
> >http://www.supercluster.org/mailman/listinfo/torqueusers
> 
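The arithmetic in Jim's option 2 could be sketched roughly as follows. This is only an illustration, not a tested Torque submit filter: the function names (mem_to_gb, adjust_resource_line, filter_script) are made up for this example, the node shape (512 GB, 32 cores) is taken from his message, and it only handles mem= and ppn= appearing on the same "#PBS -l" line with kb/mb/gb/tb suffixes.

```python
import math
import re

# Assumed node shape from the example above: 512 GB and 32 cores.
MEM_PER_CORE_GB = 512 / 32  # 16 GB per core

# Only these size suffixes are handled in this sketch.
UNITS_GB = {"kb": 1 / 1024**2, "mb": 1 / 1024, "gb": 1, "tb": 1024}


def mem_to_gb(value):
    """Convert a size such as '400gb' to gigabytes."""
    m = re.fullmatch(r"(\d+)(kb|mb|gb|tb)", value.lower())
    if m is None:
        raise ValueError(f"unsupported memory size: {value!r}")
    return int(m.group(1)) * UNITS_GB[m.group(2)]


def adjust_resource_line(line):
    """Raise ppn= on a '#PBS -l' line so the reserved cores cover mem=."""
    if not line.startswith("#PBS -l"):
        return line
    # \b keeps this from matching inside pmem= or vmem=.
    mem = re.search(r"\bmem=(\w+)", line)
    ppn = re.search(r"ppn=(\d+)", line)
    if mem is None or ppn is None:
        return line
    needed = math.ceil(mem_to_gb(mem.group(1)) / MEM_PER_CORE_GB)
    # Never lower ppn: keep the request if it already reserves enough cores.
    new_ppn = max(int(ppn.group(1)), needed)
    return line[:ppn.start(1)] + str(new_ppn) + line[ppn.end(1):]


def filter_script(text):
    """Apply adjust_resource_line to every line of a job script."""
    return "".join(
        adjust_resource_line(line) for line in text.splitlines(keepends=True)
    )
```

With the request from the original message, adjust_resource_line("#PBS -l mem=400gb,nodes=node01:ppn=8") gives "#PBS -l mem=400gb,nodes=node01:ppn=25", matching ceiling(400gb/16gb) = 25 above. A real filter would pass the job script through filter_script on its way from standard input to standard output; how that is wired into the site's Torque configuration is left out here.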

