[torqueusers] Curious question

Gus Correa gus at ldeo.columbia.edu
Tue Mar 30 11:55:13 MDT 2010


Hi Fernando

Fernando Campos wrote:
> Hi all!!
> 
> I'm running a TORQUE cluster with the "pbs_sched" scheduler. What is the
> policy for node allocation?

As far as I know, the standard pbs_sched job policy is just first in,
first out (FIFO).

> Can it be configured anywhere in Torque?

There are a few things that can be tuned on your Torque server
in the ${TORQUE}/sched_priv/sched_config file.
I found it useful to reduce the job starvation limit
(default 24h, I think), to prevent large parallel jobs from
being bypassed forever by serial jobs and small parallel ones.
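
For reference, as far as I remember the stock sched_config shipped with
pbs_sched controls this with the two settings below. The 12:00:00 value
is only an illustrative choice, not a recommendation, and you should
check the exact spelling against your own file:

```
# Treat long-waiting jobs as starving (stock setting, as I recall it)
help_starving_jobs	true	ALL
# Wait time after which a job counts as starving (stock value: 24:00:00)
max_starve: 12:00:00
```

pbs_sched has to reread the file before the change takes effect;
restarting it is the simple way.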

For more job control you need to install the Maui scheduler,
and use it instead of pbs_sched.

> Does this
> depend on the order of the nodes in the "nodes" file? Should I then
> put the better-performing nodes at the beginning?
> 

I think Torque picks the nodes in the reverse of the order in which
you list them in the nodes file.
Hence, if you want the better-performing nodes to be used first,
put them at the bottom of the nodes file.
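
To make the ordering concrete, here is a toy Python model of what I
described above. It is not Torque code: the function names and the node
names are made up for illustration, and it simply assumes that
candidates are tried from the last line of the nodes file upward.

```python
def candidate_order(nodes_file_text):
    """Return node names in the order this toy model tries them:
    last line of the nodes file first (an assumption, not Torque code)."""
    names = [line.split()[0]
             for line in nodes_file_text.splitlines()
             if line.strip()]          # skip blank lines
    return list(reversed(names))


def pick_node(nodes_file_text, busy=()):
    """Pick the first candidate that is not busy, or None if all are."""
    for name in candidate_order(nodes_file_text):
        if name not in busy:
            return name
    return None


# Hypothetical nodes file: slower nodes on top, faster ones at the bottom.
nodes_file = """\
slow01 np=4
slow02 np=4
fast01 np=8
fast02 np=8
"""

print(candidate_order(nodes_file))
print(pick_node(nodes_file, busy={"fast02"}))
```

With the fast nodes listed last, this model tries them first and falls
back to the slow ones only when the fast ones are busy.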

I hope this helps.
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

> Thanks!
> 
> Fer.
> 
> 
> 
> On Tue, Mar 30, 2010 at 16:54, Charles Johnson
> <charles.johnson at accre.vanderbilt.edu> wrote:
> 
>     On Mar 30, 2010, at 9:40 AM, Ken Nielson wrote:
> 
>      > Glen Beane wrote:
>      >>
>      >>
>      >> On Tue, Mar 30, 2010 at 10:07 AM, Glen Beane
>      >> <glen.beane at gmail.com> wrote:
>      >>
>      >>
>      >>
>      >>    On Tue, Mar 30, 2010 at 9:55 AM, Charles Johnson
>      >>    <charles.johnson at accre.vanderbilt.edu> wrote:
>      >>
>      >>        About the nodes file ... I have always been under the
>      >>        impression that the nodes file gives an ordering to nodes
>      >>        selected for jobs, i.e., nodes at the top of the list are
>      >>        considered before nodes at the bottom. We are currently in a
>      >>        down time for refreshing hardware, and the whole cluster is
>      >>        quiescent. As a test of hardware we submitted a single job
>      >>        suitable for any one of several hundred nodes at the top of
>      >>        the nodes file. The job ran on a node roughly halfway down
>      >>        the nodes file. Again, there were no other jobs on the cluster.
>      >>
>      >>        I am curious as to why? Any ideas?
>      >>
>      >>        We are using torque 2.4.5 and moab 5.3.6
>      >>
>      >>
>      >>
>      >>    This is really a Moab question, since Moab selects the node that
>      >>    the job will run on; it has nothing to do with the order of the
>      >>    nodes in the TORQUE node file. I think the default for Moab might
>      >>    be "last fit", so as it scans the available nodes it will select
>      >>    the last one it finds that satisfies the requirements for the
>      >>    job.  There is a "first fit" and "best fit" option if I remember
>      >>    correctly.
>      >>
>      >>    With the fifo scheduler, then yes, I think the job would run on
>      >>    the first available node in the node list.
>      >>
>      >>
>      >> actually, I was thinking of LASTAVAILABLE, not "last fit", so the
>      >> definition would be different than what I stated.  This is the
>      >> correct definition: "This algorithm is a best fit in time algorithm
>      >> that minimizes the impact of reservation based node-time
>      >> fragmentation."  I think this might be the default, but I don't
>      >> remember for sure.
>      >>
>     ------------------------------------------------------------------------
>      >>
>      >> _______________________________________________
>      >> torqueusers mailing list
>      >> torqueusers at supercluster.org
>      >> http://www.supercluster.org/mailman/listinfo/torqueusers
>      >>
>      > If you were to run TORQUE without a scheduler and start the job
>      > manually, TORQUE would choose the last node in the nodes file to
>      > run on.
>      >
>      > Ken Nielson
>      > Adaptive Computing
> 
> 
>     Thanks to all who replied. Clarity reigns. :)
> 
>     ~Cheers--
> 
>     Charles
>     --
>     Charles Johnson
>     Advanced Computing Center for Research and Education
>     Office: 615-343-2776
>     Cell: 615-478-5743
> 
> 
> 
> 
> -- 
> ---------------------------------------------------------------------------------------------------------
> Fernando Campos Del Pozo
> Departamento de Física Teórica
> Facultad de Ciencias / Módulo 15 (C-XI) / Despacho 512
> Universidad Autónoma de Madrid
> Tlf.: +34-914974893
> e-mail: fernando.campos at uam.es
> 
> 


