pmem/mem usage in Torque (was Re: [torqueusers] Online docs missing queue resource "nodes")

Chris Samuel csamuel at vpac.org
Tue Jan 10 18:11:49 MST 2006


On Saturday 07 January 2006 06:30, Garrick Staples wrote:

> Can this be resolved with one of the memory resources?  Perhaps pmem?
> If you request pmem=3000M, then you shouldn't have to worry about
> getting both procs on a 4GB box.

My gut feeling is that this relies on two things, and runs into a third
problem:

a) Everyone specifying memory sizes for jobs correctly
b) Users knowing how much memory they are going to need
c) Some users won't be able to specify pmem at all

The problem with (a) is that a user who doesn't specify an amount may end up 
scheduled onto the same node as you and try and gobble up another 3GB RAM, 
unless you force them to specify an amount of memory, which neatly leads you 
into problem (b).
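
To make (a) concrete, here is a toy Python sketch of the packing arithmetic. 
It is not Torque or scheduler code: the function name fits() and the rule 
that a job with no request is counted as 0 MB are assumptions made purely for 
illustration.

```python
# Toy model only: this is not Torque or scheduler code.
NODE_MEM_MB = 4096  # the 4GB box from the example above


def fits(requests_mb, node_mem_mb=NODE_MEM_MB):
    """Return True if the declared per-process requests fit on the node.

    A request of None means the user specified nothing; this sketch
    assumes such a job is counted as 0 MB when packing the node.
    """
    declared = sum(r or 0 for r in requests_mb)
    return declared <= node_mem_mb


print(fits([3000, 3000]))  # two jobs that both declare 3000M: False
print(fits([3000, None]))  # one declared job plus one unspecified job: True
```

The second case is the problem: the unspecified job is placed next to the 
3000M job even though it may go on to use 3GB itself.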

Some users have no clue about how much memory their process is going to need, 
especially when they've not written the code themselves and are just using 
some random piece of code they've got from a supervisor/website/commercial 
vendor.  If they exceed the limit they've guessed at then (from my memory of 
testing this before) their job gets killed by the MOM for exceeding its 
allowed limit.

The (c) crowd are even harder - they're using some commercial application's 
GUI or web based front end that lets them click a button to submit a job, but 
they don't get to specify how much memory it's going to need.

My thought for a way around this would be to have a compile time option 
(though a config file option would be great too) that changes the mom 
behaviour to treat these limits as advisory and not to kill jobs that exceed 
them.

Then that would give me the freedom to apply default memory limits (probably 
the memory per CPU amount) to the queues that jobs would inherit if they'd 
not specified one and know that if the user did exceed that then their jobs 
would not get mercilessly killed by the mom.
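
As a rough sketch of the behaviour I have in mind (toy Python, not MOM code): 
the advisory flag stands in for the proposed compile time or config file 
option, which does not exist today as far as I know, and the function name 
mom_action() and the 2048 MB queue default are made up for illustration.

```python
# Toy model only: "advisory" is the proposed option, not an existing setting.
QUEUE_DEFAULT_MB = 2048  # hypothetical default, e.g. memory per CPU


def mom_action(used_mb, requested_mb=None, advisory=False):
    """Decide what a hypothetical MOM does with a running job.

    Jobs that did not request memory inherit the queue default.
    """
    limit_mb = requested_mb if requested_mb is not None else QUEUE_DEFAULT_MB
    if used_mb <= limit_mb:
        return "run"
    # Over the limit: enforce it, or only note it when limits are advisory.
    return "warn" if advisory else "kill"


print(mom_action(3000))                 # current behaviour: kill
print(mom_action(3000, advisory=True))  # proposed behaviour: warn
```

With advisory limits the scheduler still gets a number to pack nodes with, 
but a user who guessed too low (or never got to guess) keeps their job.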

cheers,
Chris
-- 
 Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
