[torqueusers] specific nodes

Lloyd Brown lloyd_brown at byu.edu
Wed Nov 30 13:01:56 MST 2011


Not so much the wrong mailing list as the wrong product.  In the end,
Torque is really about resource management, launching jobs, etc., and
not the scheduling decisions.  It happens to include a scheduler
("pbs_sched"), but it's very, very basic.  If you want anything more,
you're going to have to pair Torque with Moab or Maui.  There are also
other scheduling systems out there that don't use Torque at all.

For such a small/simple cluster, I'd recommend Torque with Maui, but
you'll have to do some investigation.
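
A rough sketch of what that could look like once Maui is doing the
scheduling, reusing the queue and node names from the test quoted below.
It follows the approach described in section 4.1.4 of the Torque
documentation and is untested on this cluster, so treat it as a
starting point: each node gets a property in the nodes file, and each
queue points at that property through resources_default.neednodes.

```
# server_priv/nodes: the trailing word on each line is a node property
zarate-0 np=2 uno
zarate-1 np=2 uno
zarate-2 np=2 dos
zarate-3 np=2 dos

# qmgr: tie each queue to the matching node property
set queue uno resources_default.neednodes = uno
set queue dos resources_default.neednodes = dos
```

As far as the documentation goes, acl_hosts on a queue controls which
hosts may submit jobs to it, not where the jobs run, so those lines
would not restrict placement either way.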


Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu



On 11/30/2011 12:56 PM, Ricardo Román Brenes wrote:
> so wrong mailing list huh?
> 
> sorry to bother
> 
> thanks for your time
> 
> On Wed, Nov 30, 2011 at 1:52 PM, Lloyd Brown <lloyd_brown at byu.edu> wrote:
> 
>     Ricardo,
> 
>     Have you seen section 4.1.4 ("Mapping a Queue to a Subset of Resources")
>     in the Torque documentation?  It might give you some ideas.  However,
>     the short answer to your question, as stated in that section, is this:
>
>     > TORQUE does not currently provide a simple mechanism for mapping
>     > queues to nodes. However, schedulers such as Moab and Maui can
>     > provide this functionality.
> 
> 
>     Lloyd Brown
>     Systems Administrator
>     Fulton Supercomputing Lab
>     Brigham Young University
>     http://marylou.byu.edu
> 
> 
> 
>     On 11/30/2011 12:37 PM, Ricardo Román Brenes wrote:
>     > Hello everyone, thanks for taking the time to read this long post :P
>     >
>     >
>     > The question is about multiple queues with Torque:
>     >
>     >
>     > We have different cluster nodes here with different architectures:
>     > 4 PS-3
>     > 3 CPU+GPU
>     > 2 CPU
>     >
>     > and I want to be able to send jobs to each of the nodes independently
>     > (using Torque). I'm guessing that having several queues, with each
>     > node belonging to one particular queue, and then submitting jobs to
>     > that queue will do the trick:
>     >
>     > say I've got 4 queues
>     > IBMCELL with the 4 PS-3
>     > TESLA with the 3 nodes that have GPUs
>     > XEON with the 5 nodes that have Xeons (3 of which also have
>     > Teslas :P)
>     >
>     > and when I submit a job:
>     > qsub -q IBMCELL a.pbs
>     > it should run on the PS-3s only, but I'm not able to make it work
>     > like that.
>     >
>     > As a test i made 2 queues in the PS3 pbs_server ("uno" and "dos"):
>     >
>     >     #
>     >     # Create queues and set their attributes.
>     >     #
>     >     #
>     >     # Create and define queue uno
>     >     #
>     >     create queue uno
>     >     set queue uno queue_type = Execution
>     >     set queue uno acl_host_enable = False
>     >     set queue uno acl_hosts = zarate-0+zarate-1
>     >     set queue uno enabled = True
>     >     set queue uno started = True
>     >     #
>     >     # Create and define queue dos
>     >     #
>     >     create queue dos
>     >     set queue dos queue_type = Execution
>     >     set queue dos acl_host_enable = False
>     >     set queue dos acl_hosts = zarate-2+zarate-3
>     >     set queue dos enabled = True
>     >     set queue dos started = True
>     >     #
>     >     # Set server attributes.
>     >     #
>     >     set server scheduling = True
>     >     set server acl_hosts = zarate-0
>     >     set server log_events = 511
>     >     set server mail_from = adm
>     >     set server scheduler_iteration = 600
>     >     set server node_check_rate = 150
>     >     set server tcp_timeout = 6
>     >     set server next_job_number = 22
>     >
>     >
>     > and I changed the _nodes_ file in the server_priv directory so it
>     > looks like this (zarate-N are just the hostnames :P):
>     >
>     >
>     >     zarate-0 np=2 uno
>     >     zarate-1 np=2 uno
>     >     zarate-2 np=2 dos
>     >     zarate-3 np=2 dos
>     >
>     >
>     >
>     > but it's not working... when I launch a job:
>     >
>     >     #PBS -N mpi_hello
>     >     /usr/local/bin/mpiexec -n 8 /home/rroman/a.out
>     >
>     >
>     > with the command:
>     >
>     > #PBS -N mpi_hello
>     >
>     >     /usr/local/bin/mpiexec -n 8 /home/rroman/a.out
>     >
>     >
>     > the output file is:
>     >
>     >     zarate-1: hello world from process 2 of 8
>     >     zarate-2: hello world from process 5 of 8
>     >     zarate-2: hello world from process 6 of 8
>     >     zarate-3: hello world from process 0 of 8
>     >     zarate-3: hello world from process 7 of 8
>     >     zarate-1: hello world from process 3 of 8
>     >     zarate-0: hello world from process 4 of 8
>     >     zarate-3: hello world from process 1 of 8
>     >
>     >
>     >
>     > And there it shows that the job is running on ALL the nodes instead of
>     > running only on zarate-0 and zarate-1 as the queue said (according
>     > to me :P)
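
A quick way to check output like this against a queue's intended node
list, as a minimal sketch. The function names hosts_used and stray_hosts
are made up for illustration, and it assumes every output line starts
with "hostname:" as in the listing above.

```shell
#!/bin/sh
# hosts_used: read job output on stdin and print each distinct host
# name (the text before the first colon), one per line, sorted.
hosts_used() {
    cut -d: -f1 | sort -u
}

# stray_hosts ALLOWED...: read job output on stdin and print every host
# that appears in it but is not among the allowed host names given as
# arguments. Prints nothing if the job stayed on the allowed hosts.
stray_hosts() {
    hosts_used | while read -r host; do
        ok=no
        for allowed in "$@"; do
            if [ "$host" = "$allowed" ]; then
                ok=yes
            fi
        done
        if [ "$ok" != yes ]; then
            echo "$host"
        fi
    done
}
```

For example, feeding the output file to "stray_hosts zarate-0 zarate-1"
on stdin would print zarate-2 and zarate-3 for the run shown above.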
>     >
>     >
>     >
>     >
>     > SO! The question is: is it possible to do what I want like this? And if
>     > so, what am I doing wrong! :P
>     >
>     > Thank you Kay!
>     >
>     > -ricardo
>     >
>     >
>     >
>     > _______________________________________________
>     > torqueusers mailing list
>     > torqueusers at supercluster.org
>     > http://www.supercluster.org/mailman/listinfo/torqueusers

