[torqueusers] Job Allocation on Nodes

Svancara, Randall rsvancara at wsu.edu
Wed Mar 7 17:40:33 MST 2012


Basically for the reason you described: to prevent users from oversubscribing a node in terms of memory.  I am still working to get a better handle on scheduling jobs.  Perhaps I need to look at the -l mem flag?  If I say I need five nodes with 24GB of RAM per node, will -l mem=24GB give me five nodes with 1 core and 24GB of RAM each?  At this point I have been using nodes and ppn to regulate how much runs on each node, but I admit it is problematic, as there is no guarantee that someone else will not use the same node.
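For what it's worth, a sketch of what such a request might look like.  Note that in Torque, -l mem is usually interpreted as the total memory for the whole job, while pmem is per-process, so pmem is likely the closer fit here.  The script below is only an illustration (the application name and exact resource semantics are assumptions; they vary by Torque version and scheduler):

```shell
#!/bin/bash
# Hypothetical submission script: five nodes, one process per node.
#PBS -l nodes=5:ppn=1
# 'mem' typically means total memory for the whole job, not per node;
# 'pmem' is per-process, so pmem=24gb asks for 24GB for each process slot.
#PBS -l pmem=24gb
#PBS -N mem_spread_test

cd $PBS_O_WORKDIR
mpirun -np 5 ./my_app   # ./my_app is a placeholder for the real binary
```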


Randall Svancara
High Performance Computing Systems Administrator
Washington State University

-----Original Message-----
From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Gareth.Williams at csiro.au
Sent: Wednesday, March 07, 2012 4:31 PM
To: torqueusers at supercluster.org
Subject: Re: [torqueusers] Job Allocation on Nodes

> Perhaps this question has been answered before.  I have users who want to distribute jobs equally amongst nodes.  What I am observing at the moment is that when a user submits a job with nodes=12:ppn=3, the job uses three nodes with 12 cores per node.  Is there a way to make the job use only three cores per node?  How can I prevent this, or set up some kind of affinity so that the user's job requirements are followed?

Hi Randall,

Why would you want to do such a thing?  If the user submits four such jobs they will align, and you will get worse contention.  I would suggest: if you need to spread jobs to access memory, then you should schedule memory; and/or if you need to avoid contention, say for memory bandwidth, then get the users to request whole nodes (all the available ppn) and only run as many processes as their scaling permits (they may need custom mpirun options).
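The whole-node approach above might look something like the following sketch.  It assumes 12-core nodes and an Open MPI-style launcher; the per-node mapping option (--npernode here) differs between MPI implementations, and ./my_app is a placeholder:

```shell
#!/bin/bash
# Sketch of the whole-node approach: own all 12 cores on each of 3 nodes
# so no other job can land there (assumes 12-core nodes; adjust ppn).
#PBS -l nodes=3:ppn=12

cd $PBS_O_WORKDIR
# Run only 3 ranks per node despite holding all 12 cores, which spreads
# memory-hungry processes out without relying on the scheduler's packing:
mpirun -np 9 --npernode 3 ./my_app
```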

torqueusers mailing list
torqueusers at supercluster.org
