[torqueusers] Job Allocation on Nodes

Gareth.Williams at csiro.au
Thu Mar 8 13:33:31 MST 2012


> -----Original Message-----
> From: Svancara, Randall [mailto:rsvancara at wsu.edu]
> Sent: Thursday, 8 March 2012 11:41 AM
> To: Torque Users Mailing List
> Subject: Re: [torqueusers] Job Allocation on Nodes
> 
> Hi,
> 
> Basically for the reason you described: to prevent users from
> oversubscribing a node in terms of memory.  I am still working to get
> a better handle on scheduling jobs.  Perhaps I need to look at the
> -l mem flag?  If I say I need five nodes, with 24GB of RAM per node,
> will -l mem=24GB give me five nodes with 1 core and 24GB of RAM?
> At this point I have been using nodes and ppn to regulate how much
> runs on each node, but I admit it is problematic, as there is no
> guarantee that someone else will not use the same node.

Hi Randall,

I'd look at -l vmem rather than mem.  vmem is counted across the whole
job, so for exclusive access to 24GB nodes (because all of each node's
memory would be dedicated to the job) you could use requests like
-l nodes=12:ppn=3,vmem=288GB (12 x 24GB) and
-l nodes=5:ppn=1,vmem=120GB (5 x 24GB).
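
The arithmetic behind those two requests can be sketched in a few lines
of Python.  This is only an illustration, assuming that vmem is counted
across the whole job as described above and that the scheduler is
configured to schedule on memory.  The helper name whole_node_request
and the script name job.sh are hypothetical, not part of Torque.

```python
def whole_node_request(nodes: int, ppn: int, mem_per_node_gb: int) -> str:
    """Build a -l resource string whose vmem covers all memory on every node.

    Assumes vmem is a whole-of-job total, so the value is
    nodes * mem_per_node_gb.
    """
    total_vmem_gb = nodes * mem_per_node_gb
    return f"nodes={nodes}:ppn={ppn},vmem={total_vmem_gb}GB"


if __name__ == "__main__":
    # The two example requests for 24GB nodes; job.sh is a placeholder.
    for nodes, ppn in [(12, 3), (5, 1)]:
        print("qsub -l " + whole_node_request(nodes, ppn, 24) + " job.sh")
```

Whether such a request really keeps other jobs off the nodes depends on
how the site's scheduler accounts for memory, so treat the output as a
starting point to test rather than a guarantee.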

Gareth

> 
> Thanks,
> 
> Randall Svancara
> High Performance Computing Systems Administrator
> Washington State University
> 509-335-3039
> 
> 
> -----Original Message-----
> From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
> bounces at supercluster.org] On Behalf Of Gareth.Williams at csiro.au
> Sent: Wednesday, March 07, 2012 4:31 PM
> To: torqueusers at supercluster.org
> Subject: Re: [torqueusers] Job Allocation on Nodes
> 
> > Perhaps this question has been answered before.  I have users who
> > want to distribute jobs equally amongst nodes.  What I am observing
> > at the moment is that when a user submits a job with nodes=12:ppn=3,
> > the job uses three nodes with 12 cores per node.  Is there a way to
> > make the job use only three cores per node?  How can I prevent this
> > or set up some kind of affinity for following the user's job
> > requirements?
> 
> Hi Randall,
> 
> Why would you want to do such a thing?  If the user submits four of
> these jobs they will align, and you will get worse contention.  I
> would suggest: if you need to spread jobs to access memory, then you
> should schedule memory; and/or if you need to avoid contention, say
> for memory bandwidth, then get the users to request whole nodes (all
> the available ppn) and only run as many processes as their scaling
> permits (they may need custom mpirun options).
> 
> Gareth
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers


