[torqueusers] How do we achieve per node restrictions in torque, using a single queue that users can submit to?

Kamil Marcinkowski kamil at ualberta.ca
Fri Oct 14 12:52:50 MDT 2005


Hello, all

How do we achieve per-node restrictions in TORQUE using a single queue
that users can submit to?
We are using Moab 4.2.0 and torque-1.2.0p6.
We run large multiprocessor shared-memory machines.
We need to be able to specify that one node may take jobs using 4 to 64
processors while another node may only take jobs using 32 to 64
processors, etc.

1) If we create one queue with no restrictions, jobs are scheduled to
run on all nodes, but no per-node limits are enforced.
     Problem: no restrictions.

2) If we create one queue per host, with that host's restrictions, and
limit each queue to schedule only to its corresponding host, we achieve
our per-host restrictions, but now we have multiple queues for our
users to submit to.
     Problem: multiple queues used in user submission.

3) If we do as in (2) and set up a routing queue with each host's queue
as a destination.
     Problem:
         This setup works fine as long as hosts are available to
         execute jobs immediately. If jobs cannot be run immediately,
         they are put into the first queue on the list that will accept
         that particular job. The problem is that when our users put a
         lot of jobs on the system, say before a weekend, all the jobs
         end up in the queue corresponding to one host, while the other
         hosts have only the job they are executing at the moment, which
         they soon finish. They then stand empty, although they are
         capable of executing jobs sitting in another host's queue.
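
To make the failure mode in (3) concrete, here is a toy Python model of
first-fit routing. It is only a sketch under assumptions: the queue
names q_nodeA and q_nodeB, the route() function, and the backlog of ten
32-processor jobs are hypothetical, and it does not reproduce TORQUE's
actual routing code. Only the 4-64 and 32-64 processor limits come from
the description above.

```python
def route(jobs, destinations):
    """Toy first-fit router (not TORQUE's implementation).

    jobs: list of requested processor counts.
    destinations: ordered list of (queue_name, min_procs, max_procs),
    one hypothetical execution queue per host.
    """
    placed = {name: [] for name, _, _ in destinations}
    rejected = []
    for procs in jobs:
        for name, lo, hi in destinations:
            if lo <= procs <= hi:
                # The first destination that accepts the job keeps it.
                placed[name].append(procs)
                break
        else:
            # No destination accepts this job size.
            rejected.append(procs)
    return placed, rejected


# Hypothetical per-host queues, in routing order.
destinations = [("q_nodeA", 4, 64), ("q_nodeB", 32, 64)]
backlog = [32] * 10  # ten 32-processor jobs submitted at once

placed, rejected = route(backlog, destinations)
print({name: len(queue) for name, queue in placed.items()})
```

In this model every job in the backlog lands in q_nodeA, because it is
first in the list and accepts 32-processor jobs, while q_nodeB stays
empty even though its host could run them.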

Is there any way we could work around this problem?

Will the following work?
An execution queue for each mutually exclusive class of job, restricted
to the nodes that may run jobs of that class, and one routing queue to
route to them all.
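
As a rough illustration of what "mutually exclusive classes" could mean
here, the following Python sketch splits the processor range at every
node limit and lists which nodes may run each resulting class. The node
names nodeA and nodeB and the job_classes() helper are hypothetical;
the limits are the 4-64 and 32-64 examples from above. It says nothing
about whether TORQUE or Moab can bind a queue to such a node set, which
is the open question.

```python
def job_classes(node_limits):
    """Split job sizes into non-overlapping classes.

    node_limits: {node_name: (min_procs, max_procs)}, inclusive bounds.
    Returns a list of (min_procs, max_procs, eligible_nodes) tuples.
    """
    # Cut the processor range at every lower bound and just past
    # every upper bound, so no class straddles a node limit.
    cuts = sorted(
        {lo for lo, _ in node_limits.values()}
        | {hi + 1 for _, hi in node_limits.values()}
    )
    classes = []
    for lo, nxt in zip(cuts, cuts[1:]):
        hi = nxt - 1
        # A node is eligible if the whole class fits inside its limits.
        nodes = sorted(
            name
            for name, (nlo, nhi) in node_limits.items()
            if nlo <= lo and hi <= nhi
        )
        if nodes:
            classes.append((lo, hi, nodes))
    return classes


# Hypothetical node names with the limits from the example above.
limits = {"nodeA": (4, 64), "nodeB": (32, 64)}
classes = job_classes(limits)
for lo, hi, nodes in classes:
    print(f"{lo}-{hi} processors: {', '.join(nodes)}")
```

With these limits the sketch yields two classes: 4-31 processors
(nodeA only) and 32-64 processors (nodeA or nodeB). Because a job size
falls into exactly one class, the routing queue's choice would no
longer depend on the order of the destinations.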

Thanks

Kamil


Kamil Marcinkowski
kamil at ualberta.ca
Tel. 780 492-0354
Fax. 780 492-1729
Edmonton, Alberta, CANADA

Westgrid System Administrator
University of Alberta site
Research Computing Support
Academic ICT (formerly CNS)
University of Alberta




