[torqueusers] Memory allocation issues

Aaron Knister aaron at iges.org
Mon Aug 13 14:45:47 MDT 2007


So how about this--

qsub -l nodes=1:ppn=8,mem=6G ./script.sh

When scheduling the job, Torque interprets this as "give the job a node 
with 8 processors and 6 GB of RAM for the job", yet in reality what 
happens is Torque gives me a node with 8 processors, but each 
process gets 6 GB of RAM... this oversubscribes the memory and crashes 
my compute node. Any suggestions?
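If the goal is a hard per-process cap rather than just a scheduling request, something along these lines might work (the pvmem/vmem resource names are the ones from the pbs_resources manpage Garrick points to below; the 6gb figure is only illustrative):

```shell
# Cap each *process* of the job at 6 GB of virtual memory (enforced as a
# per-process limit on the execution node, per the pbs_resources manpage)
qsub -l nodes=1:ppn=8,pvmem=6gb ./script.sh

# Or cap the job's total virtual memory rather than each process
qsub -l nodes=1:ppn=8,vmem=6gb ./script.sh
```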

-Aaron

Garrick Staples wrote:
> On Sun, Aug 12, 2007 at 11:42:09PM -0400, Aaron Knister alleged:
>   
>> I'm having problems with my users crashing compute nodes due to code 
>> bugs and poor coding. Even when they request -l mem=xyz they can still 
>> oversubscribe a node and crash it through memory oversubscription. Other 
>> users are becoming frustrated because there are frequently 2 or 3 jobs 
>> on a compute node at any given time. Is there a way to effectively jail 
>> a process and a subset of processes so that they cannot use more than a 
>> given amount of memory?
>>     
>
> The "mem" resource doesn't set ulimits.  You want the pmem, vmem, or pvmem
> (described in the pbs_resources manpage).
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>   
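Since pvmem is enforced as a per-process resource limit on the execution node, the effect is the same as the shell's ulimit -v. A minimal local sketch of that mechanism (the 512 MB cap and the python3 allocation are illustrative, not Torque-specific):

```shell
# Cap virtual memory for a subshell at 512 MB, then try to allocate 1 GB.
# The oversized allocation fails instead of oversubscribing the machine.
(
  ulimit -v 524288   # limit in kilobytes: 512 MB
  python3 -c 'x = bytearray(1024 * 1024 * 1024)' 2>/dev/null
  echo "exit status under 512MB cap: $?"
)
```

A job exceeding pvmem fails the same way: the over-limit allocation is refused by the kernel, so only the offending process dies rather than the whole node.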

