pmem/mem usage in Torque (was Re: [torqueusers] Online docs missing
queue resource "nodes")
Chris Samuel
csamuel at vpac.org
Tue Jan 10 18:11:49 MST 2006
On Saturday 07 January 2006 06:30, Garrick Staples wrote:
> Can this be resolved with one of the memory resources? Perhaps pmem?
> If you request pmem=3000M, then you shouldn't have to worry about
> getting both procs on a 4GB box.
Gut feelings are that this relies on three things:
a) Everyone specifying memory sizes for jobs correctly
b) Users knowing how much memory they are going to need
c) All users being able to specify pmem, which some won't be
The problem with (a) is that a user who doesn't specify an amount may end up
scheduled onto the same node as you and try and gobble up another 3GB RAM,
unless you force them to specify an amount of memory, which neatly leads you
into problem (b).
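
As a rough illustration of problem (a), here is a toy model in Python. It is not how Torque or any real scheduler decides placement; the names fits, NODE_MEM_MB and NODE_PROCS are made up for this sketch, and the only assumption is that a job with no memory request is counted as needing nothing.

```python
# Toy model only: this is not Torque's (or any scheduler's) real logic.
NODE_MEM_MB = 4096   # the "4GB box" from the quoted message
NODE_PROCS = 2       # two procs per node


def fits(requests_mb, node_mem_mb=NODE_MEM_MB, node_procs=NODE_PROCS):
    """Return True if all jobs fit on one node, judged by requested pmem only.

    Each entry is one job's per-process memory request in MB, or None if
    the user did not specify one (counted as 0, which is the problem case).
    """
    if len(requests_mb) > node_procs:
        return False
    return sum(r or 0 for r in requests_mb) <= node_mem_mb


# Both users ask for pmem=3000M: 6000 MB does not fit in 4096 MB,
# so the two jobs are kept apart.
both_request = fits([3000, 3000])

# Only one user specifies pmem: the other job looks "free" and is
# allowed onto the same node, whatever it really goes on to use.
one_silent = fits([3000, None])
```

Here both_request comes out False and one_silent comes out True, which is the case where the second job can land next to yours and eat the remaining RAM.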
Some users have no clue about how much memory their process is going to need,
especially when they've not written the code themselves and are just using
some random piece of code they've got from a supervisor/website/commercial
vendor. If they exceed the limit they've guessed at then (from my memory of
testing this before) their job gets killed by the MOM for exceeding its
allowed limit.
The (c) crowd are even harder - they're using some commercial application's GUI
or web-based front end that lets them click a button to submit a job, but
they don't get to specify how much memory it's going to need.
My thought for a way around this would be to have a compile time option
(though a config file option would be great too) that changes the mom
behaviour to treat these limits as advisory and not to kill jobs that exceed
them.
Then that would give me the freedom to apply default memory limits (probably
the memory per CPU amount) to the queues that jobs would inherit if they'd
not specified one, and know that if the user did exceed that then their jobs
would not get mercilessly killed by the mom.
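
A minimal sketch of the behaviour I have in mind, again as toy Python rather than anything in the MOM source. The "advisory" switch is the option being proposed here, not an existing Torque setting, and resolve_limit and mom_action are hypothetical names.

```python
def resolve_limit(requested_mb, queue_default_mb):
    """Use the job's own request, or inherit the queue default if none given."""
    return queue_default_mb if requested_mb is None else requested_mb


def mom_action(used_mb, limit_mb, advisory):
    """What a (hypothetical) MOM would do with a job using used_mb of memory.

    advisory=False models the behaviour described above: over the limit
    means the job is killed. advisory=True models the proposed option:
    the limit is only a hint for scheduling and the job keeps running.
    """
    if used_mb <= limit_mb:
        return "ok"
    return "warn" if advisory else "kill"


# 4 GB node with 2 CPUs, so the queue default is the memory per CPU amount.
default_mb = 4096 // 2

# A user who specified nothing inherits the queue default.
limit = resolve_limit(None, default_mb)

enforced = mom_action(3000, limit, advisory=False)
relaxed = mom_action(3000, limit, advisory=True)
```

With the limit enforced the 3000 MB job is killed; with the advisory option it is only flagged, which is what makes a queue-wide default safe to set for users who can't or won't specify one.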
cheers,
Chris
--
Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia