[torqueusers] specific nodes
Ricardo Román Brenes
roman.ricardo at gmail.com
Wed Nov 30 13:08:51 MST 2011
Well, I am using Torque+Maui, but even so I can't get Maui to assign the
nodes correctly; a job just runs on all nodes, not just the ones I want ...
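
One thing that may be worth trying with Maui, going by the neednodes
approach that section 4.1.4 of the Torque documentation describes: give each
queue a default node property instead of relying on acl_hosts (which, as far
as I understand, restricts the hosts allowed to submit to a queue, not where
its jobs execute). A minimal sketch in qmgr, assuming the node properties
"uno" and "dos" from the nodes file quoted below; untested on this cluster:

```
# Assumption: nodes carry the properties "uno" / "dos" in server_priv/nodes.
# Jobs submitted to a queue then request that property by default.
set queue uno resources_default.neednodes = uno
set queue dos resources_default.neednodes = dos
```

With that in place, "qsub -q uno a.pbs" should only be eligible for zarate-0
and zarate-1, provided the scheduler honors the property. pbs_server most
likely needs a restart after the nodes file is edited.
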
On Wed, Nov 30, 2011 at 2:01 PM, Lloyd Brown <lloyd_brown at byu.edu> wrote:
> Not so much the wrong mailing list, but the wrong product. In the end
> Torque is really about resource management, launching jobs, etc., but
> not the decision making. They happen to include a very basic scheduler
> ("pbs_sched"), but it's very, very basic. If you want anything more,
> you're going to have to look at Moab or Maui, to use with Torque. Or
> there are other scheduling systems out there as well, that don't use
> Torque.
>
> For such a small/simple cluster, I'd recommend Torque with Maui, but
> you'll have to do some investigation.
>
>
> Lloyd Brown
> Systems Administrator
> Fulton Supercomputing Lab
> Brigham Young University
> http://marylou.byu.edu
>
>
>
> On 11/30/2011 12:56 PM, Ricardo Román Brenes wrote:
> > so wrong mailing list huh?
> >
> > sorry to bother
> >
> > thanks for your time
> >
> > On Wed, Nov 30, 2011 at 1:52 PM, Lloyd Brown <lloyd_brown at byu.edu> wrote:
> >
> > Ricardo,
> >
> > Have you seen section 4.1.4 ("Mapping a Queue to a Subset of Resources")
> > in the Torque documentation? It might give you some ideas. However,
> > the short answer to your question, as seen in that section, is this:
> >
> > > TORQUE does not currently provide a simple mechanism for mapping
> > > queues to nodes. However, schedulers such as Moab and Maui can
> > > provide this functionality.
> >
> >
> > Lloyd Brown
> > Systems Administrator
> > Fulton Supercomputing Lab
> > Brigham Young University
> > http://marylou.byu.edu
> >
> >
> >
> > On 11/30/2011 12:37 PM, Ricardo Román Brenes wrote:
> > > Hello everyone, and thanks for taking the time to read this long post :P
> > >
> > >
> > > The question is about multiple queues with Torque:
> > >
> > >
> > > We have different cluster nodes here with different architectures:
> > > 4 PS-3
> > > 3 CPU+GPU
> > > 2 CPU
> > >
> > > and I want to be able to send jobs to each of the nodes independently
> > > (using Torque). I'm guessing that having several queues, with each
> > > node belonging to one queue in particular, and then submitting jobs
> > > to that queue will do the trick:
> > >
> > > say I have 4 queues:
> > > IBMCELL with the 4 PS-3
> > > TESLA with the 3 nodes that have GPUs
> > > XEON with the 5 nodes that have Xeons (3 of which also have
> > > Teslas :P)
> > >
> > > and when I submit a job:
> > > qsub -q IBMCELL a.pbs
> > > it should run on the PS-3 nodes only, but I'm not able to make it work
> > > like that.
> > >
> > > As a test I made 2 queues on the PS3 pbs_server ("uno" and "dos"):
> > >
> > > #
> > > # Create queues and set their attributes.
> > > #
> > > #
> > > # Create and define queue uno
> > > #
> > > create queue uno
> > > set queue uno queue_type = Execution
> > > set queue uno acl_host_enable = False
> > > set queue uno acl_hosts = zarate-0+zarate-1
> > > set queue uno enabled = True
> > > set queue uno started = True
> > > #
> > > # Create and define queue dos
> > > #
> > > create queue dos
> > > set queue dos queue_type = Execution
> > > set queue dos acl_host_enable = False
> > > set queue dos acl_hosts = zarate-2+zarate-3
> > > set queue dos enabled = True
> > > set queue dos started = True
> > > #
> > > # Set server attributes.
> > > #
> > > set server scheduling = True
> > > set server acl_hosts = zarate-0
> > > set server log_events = 511
> > > set server mail_from = adm
> > > set server scheduler_iteration = 600
> > > set server node_check_rate = 150
> > > set server tcp_timeout = 6
> > > set server next_job_number = 22
> > >
> > >
> > > and I changed the nodes file in the server_priv directory so it
> > > looks like this (the zarate-N names are just the hostnames :P):
> > >
> > >
> > > zarate-0 np=2 uno
> > > zarate-1 np=2 uno
> > > zarate-2 np=2 dos
> > > zarate-3 np=2 dos
> > >
> > >
> > >
> > > but it's not working... when I launch a job:
> > >
> > > #PBS -N mpi_hello
> > > /usr/local/bin/mpiexec -n 8 /home/rroman/a.out
> > >
> > >
> > > with the command:
> > >
> > > #PBS -N mpi_hello
> > >
> > > /usr/local/bin/mpiexec -n 8 /home/rroman/a.out
> > >
> > >
> > > the output file is:
> > >
> > > zarate-1: hello world from process 2 of 8
> > > zarate-2: hello world from process 5 of 8
> > > zarate-2: hello world from process 6 of 8
> > > zarate-3: hello world from process 0 of 8
> > > zarate-3: hello world from process 7 of 8
> > > zarate-1: hello world from process 3 of 8
> > > zarate-0: hello world from process 4 of 8
> > > zarate-3: hello world from process 1 of 8
> > >
> > >
> > >
> > > And there it shows that the job is running on ALL the nodes instead of
> > > running only on zarate-0 and zarate-1 as the queue said (according
> > > to me :P)
> > >
> > >
> > >
> > >
> > > SO! The question is: is it possible to do what I want like this? And if
> > > so, what am I doing wrong? :P
> > >
> > > Thank you Kay!
> > >
> > > -ricardo
> > >
> > >
> > >
> > > _______________________________________________
> > > torqueusers mailing list
> > > torqueusers at supercluster.org
> > > http://www.supercluster.org/mailman/listinfo/torqueusers
>
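
To tell whether the scheduler or mpiexec is spreading the job over all
nodes, it may help to print what Torque actually allocated from inside the
job script. A small sketch in shell; list_allocated_hosts is a hypothetical
helper name, and it assumes the standard PBS_NODEFILE variable that Torque
sets for a running job:

```shell
#!/bin/sh
# Print each distinct host in a Torque node file with its slot count.
# Assumption: the file lists one hostname per allocated processor slot.
list_allocated_hosts() {
    nodefile="$1"
    sort "$nodefile" | uniq -c | awk '{ print $2 " slots=" $1 }'
}

# Inside a job, Torque exports PBS_NODEFILE; outside a job nothing is printed.
if [ -n "${PBS_NODEFILE:-}" ]; then
    list_allocated_hosts "$PBS_NODEFILE"
fi
```

If this prints only the queue's nodes while the hello-world output still
lists all four hosts, mpiexec is probably ignoring Torque's allocation;
whether it accepts a machinefile-style option depends on the MPI
implementation in use. The job script quoted above also has no
"#PBS -l nodes=..." request, so the allocation may be much smaller than the
8 processes it starts.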