[torqueusers] specific nodes

Ricardo Román Brenes roman.ricardo at gmail.com
Wed Nov 30 13:08:51 MST 2011


Well, I am using Torque + Maui, but even so I can't get Maui to assign the
nodes correctly; a job just runs on all nodes, not just the ones I want ...
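One common way to get this behavior (a sketch, not from the thread's actual config: it assumes the trailing tokens in `server_priv/nodes` are node properties, which is how TORQUE treats them) is to tag nodes with a property and then request that property explicitly at submission time:

```shell
# server_priv/nodes might look like:
#   zarate-0 np=2 uno
#   zarate-1 np=2 uno
# where "uno" is a node property, not a queue binding.
#
# A job that should land only on "uno" nodes then requests the property
# explicitly in its resource list:
qsub -q uno -l nodes=2:ppn=2:uno a.pbs
```

Without the `:uno` property in the resource request, neither pbs_server nor Maui has any reason to restrict the job to those nodes.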

On Wed, Nov 30, 2011 at 2:01 PM, Lloyd Brown <lloyd_brown at byu.edu> wrote:

> Not so much the wrong mailing list, but the wrong product.  In the end
> Torque is really about resource management, launching jobs, etc., but
> not the decision making.  It happens to include a very basic scheduler
> ("pbs_sched"), but it's very, very basic.  If you want anything more,
> you're going to have to look at Moab or Maui to use with Torque, or at
> one of the other scheduling systems out there that don't use Torque at
> all.
>
> For such a small/simple cluster, I'd recommend Torque with Maui, but
> you'll have to do some investigation.
>
>
> Lloyd Brown
> Systems Administrator
> Fulton Supercomputing Lab
> Brigham Young University
> http://marylou.byu.edu
>
>
>
> On 11/30/2011 12:56 PM, Ricardo Román Brenes wrote:
> > so wrong mailing list huh?
> >
> > sorry to bother
> >
> > thanks for your time
> >
> > On Wed, Nov 30, 2011 at 1:52 PM, Lloyd Brown <lloyd_brown at byu.edu> wrote:
> >
> >     Ricardo,
> >
> >     Have you seen section 4.1.4 ("Mapping a Queue to a Subset of
> Resources")
> >     in the Torque documentation?  It might give you some ideas.  However,
> >     the short answer to your question, as seen in that section is this:
> >
> >     > TORQUE does not currently provide a simple mechanism for mapping
> >     queues to nodes. However, schedulers such as Moab and Maui can
> >     provide this functionality.
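The workaround that same documentation section goes on to describe (a sketch, assuming the queue names double as node properties in `server_priv/nodes`) is to give each queue a default node-property requirement via `resources_default.neednodes`, so jobs inherit it even when the submitter doesn't ask:

```shell
# qmgr sketch: jobs submitted to "uno" inherit a default requirement for
# nodes carrying the "uno" property, and likewise for "dos".  Queue and
# property names follow the thread's config; adjust to your site.
qmgr -c "set queue uno resources_default.neednodes = uno"
qmgr -c "set queue dos resources_default.neednodes = dos"
```

With that in place, `qsub -q uno a.pbs` should only be placed on nodes listed with the `uno` property.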
> >
> >
> >     Lloyd Brown
> >     Systems Administrator
> >     Fulton Supercomputing Lab
> >     Brigham Young University
> >     http://marylou.byu.edu
> >
> >
> >
> >     On 11/30/2011 12:37 PM, Ricardo Román Brenes wrote:
> >     > Hello everyone, thanks for taking the time to read this long post :P
> >     >
> >     >
> >     > The question is about multiple queues with Torque:
> >     >
> >     >
> >     > We have here different cluster nodes with different architectures:
> >     > 4 PS-3
> >     > 3 CPU+GPU
> >     > 2 CPU
> >     >
> >     > and I want to be able to send jobs to each of the nodes
> >     > independently (using Torque). I'm guessing that having several
> >     > queues, with each node belonging to a particular queue, and then
> >     > submitting jobs to that queue, will do the trick:
> >     >
> >     > say I've got three queues:
> >     > IBMCELL with the 4 PS-3s
> >     > TESLA with the 3 nodes that have GPUs
> >     > XEON with the 5 nodes that have Xeons (3 of which also have
> >     > Teslas :P)
> >     >
> >     > and when I submit a job:
> >     > qsub -q IBMCELL a.pbs
> >     > it should run on the PS-3s only, but I'm not able to make it work
> >     > like that.
> >     >
> >     > As a test I made 2 queues on the PS3 pbs_server ("uno" and "dos"):
> >     >
> >     >     #
> >     >     # Create queues and set their attributes.
> >     >     #
> >     >     #
> >     >     # Create and define queue uno
> >     >     #
> >     >     create queue uno
> >     >     set queue uno queue_type = Execution
> >     >     set queue uno acl_host_enable = False
> >     >     set queue uno acl_hosts = zarate-0+zarate-1
> >     >     set queue uno enabled = True
> >     >     set queue uno started = True
> >     >     #
> >     >     # Create and define queue dos
> >     >     #
> >     >     create queue dos
> >     >     set queue dos queue_type = Execution
> >     >     set queue dos acl_host_enable = False
> >     >     set queue dos acl_hosts = zarate-2+zarate-3
> >     >     set queue dos enabled = True
> >     >     set queue dos started = True
> >     >     #
> >     >     # Set server attributes.
> >     >     #
> >     >     set server scheduling = True
> >     >     set server acl_hosts = zarate-0
> >     >     set server log_events = 511
> >     >     set server mail_from = adm
> >     >     set server scheduler_iteration = 600
> >     >     set server node_check_rate = 150
> >     >     set server tcp_timeout = 6
> >     >     set server next_job_number = 22
> >     >
> >     >
> >     > and I changed the _nodes_ file in the server_priv directory so it
> >     > looks like this ("zarate" is just the hostname :P):
> >     >
> >     >
> >     >     zarate-0 np=2 uno
> >     >     zarate-1 np=2 uno
> >     >     zarate-2 np=2 dos
> >     >     zarate-3 np=2 dos
> >     >
> >     >
> >     >
> >     > but it's not working... when I launch a job with this script:
> >     >
> >     >     #PBS -N mpi_hello
> >     >     /usr/local/bin/mpiexec -n 8 /home/rroman/a.out
> >     >
> >     > the output file is:
> >     >
> >     >     zarate-1: hello world from process 2 of 8
> >     >     zarate-2: hello world from process 5 of 8
> >     >     zarate-2: hello world from process 6 of 8
> >     >     zarate-3: hello world from process 0 of 8
> >     >     zarate-3: hello world from process 7 of 8
> >     >     zarate-1: hello world from process 3 of 8
> >     >     zarate-0: hello world from process 4 of 8
> >     >     zarate-3: hello world from process 1 of 8
> >     >
> >     >
> >     >
> >     > And that shows the job is running on ALL the nodes instead of
> >     > running only on zarate-0 and zarate-1 as the queue dictates
> >     > (according to me :P)
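A likely culprit worth noting: in TORQUE, a queue's `acl_hosts` (with `acl_host_enable`) restricts which hosts may *submit* to the queue, not where its jobs execute, so the configuration above never constrains placement. A sketch of a job script that does pin execution, assuming the trailing `uno`/`dos` tokens in `server_priv/nodes` are node properties:

```shell
#!/bin/sh
# Request 2 nodes x 2 cores, restricted to nodes with the "uno" property.
# Queue name, paths, and hostnames mirror the thread; adjust to your site.
#PBS -N mpi_hello
#PBS -q uno
#PBS -l nodes=2:ppn=2:uno

# If this mpiexec is not TORQUE-aware, point it at the hosts TORQUE
# actually allocated instead of letting it discover hosts on its own:
/usr/local/bin/mpiexec -machinefile $PBS_NODEFILE -n 4 /home/rroman/a.out
```

The key difference from the thread's script is the `-l nodes=...:uno` resource request: without it, the scheduler is free to place the job anywhere, regardless of queue.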
> >     >
> >     >
> >     >
> >     >
> >     > SO! The question is: is it possible to do what I want like this?
> >     > And if so, what am I doing wrong? :P
> >     >
> >     > Thank you Kay!
> >     >
> >     > -ricardo
> >     >
> >     >
> >     >
> >     > _______________________________________________
> >     > torqueusers mailing list
> >     > torqueusers at supercluster.org
> >     > http://www.supercluster.org/mailman/listinfo/torqueusers
> >
> >
> >
> >

