[torqueusers] linking queues to different nodes

Dr. Christian Mück-Lichtenfeld cml at uni-muenster.de
Mon Jan 29 08:54:31 MST 2007


Jackie,

  thanks for your reply (and to David and Thomas for theirs as well).
What I have learned so far is that there is apparently no easy way
of assigning nodes to a queue. I have now installed MAUI and it
works properly, but I still have not succeeded in assigning nodes
to specific queues.

E.g. a job that is submitted with "qsub -q xeon2" to the following
queue:

Queue xeon2
        queue_type = Execution
        total_jobs = 1
        state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:1 Exiting:0 
        acl_host_enable = False
        acl_hosts = lyra04,lyra03,lyra02,lyra01
        resources_max.nodect = 2
        resources_min.nodect = 2
        resources_default.neednodes = xeon
        resources_default.nodes = 2
        mtime = Mon Jan 29 11:25:51 2007
        resources_assigned.nodect = 2
        enabled = True
        started = True

runs on two nodes (lyra71 and lyra75) that have the property 'amd':

Node lyra71
        state = free
        np = 1
        properties = amd
        ntype = cluster
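
A minimal sketch of how this mismatch could be checked automatically, assuming node listings in the format printed above. The function names are hypothetical, the lyra75 entry is filled in from the description in this mail (its listing is not shown), and lyra01 is taken from my earlier message:

```python
def parse_node_properties(listing):
    """Map node name -> set of properties, from 'Node ...' listings as shown above."""
    props = {}
    current = None
    for line in listing.splitlines():
        line = line.strip()
        if line.startswith("Node "):
            current = line.split()[1]
            props[current] = set()
        elif current and line.startswith("properties ="):
            value = line.split("=", 1)[1]
            props[current].update(p.strip() for p in value.split(",") if p.strip())
    return props


def misplaced_nodes(required, assigned, props):
    """Return the assigned nodes that do not carry the required property."""
    return [n for n in assigned if required not in props.get(n, set())]


# Sample data: lyra01 and lyra71 as listed in this thread; lyra75 is assumed
# to look like lyra71 (it is only described as 'amd' in the text).
LISTING = """
Node lyra01
        state = free
        np = 4
        properties = xeon
        ntype = cluster
Node lyra71
        state = free
        np = 1
        properties = amd
        ntype = cluster
Node lyra75
        state = free
        np = 1
        properties = amd
        ntype = cluster
"""

props = parse_node_properties(LISTING)
# Queue xeon2 has resources_default.neednodes = xeon, but the job ran here:
print(misplaced_nodes("xeon", ["lyra71", "lyra75"], props))
```

For the job described above, both assigned nodes are reported, because neither carries the 'xeon' property that the queue asks for.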

I found a (disturbing) description of how to set up
standing reservations in MAUI and added the following
lines to maui.cfg:

SRCFG[xeon2]    TASKCOUNT=1 RESOURCES=PROCS:2
SRCFG[xeon2]    STARTTIME=0:00:00 ENDTIME=24:00:00
SRCFG[xeon2]    PERIOD=INFINITY
SRCFG[xeon2]    HOSTLIST=lyra0[1234]
SRCFG[xeon2]    CLASSLIST=xeon2
SRCFG[xeon2]    NODEFEATURES=xeon


Are these lines correct, or have I seriously misunderstood
something by using SRCFG to assign nodes
to a queue?
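
One part that can be checked offline is which hosts the HOSTLIST expression selects. The sketch below assumes that Maui reads HOSTLIST as a regular expression and uses Python's re module as a stand-in, so Maui's own matching rules may differ. The helper name is hypothetical, and the node names are ones mentioned in this thread:

```python
import re

# Pattern copied from the SRCFG[xeon2] HOSTLIST line above.
HOSTLIST = "lyra0[1234]"

# A few node names from this thread (xeon nodes lyra01-04, amd nodes lyra71/75).
NODES = ["lyra01", "lyra02", "lyra03", "lyra04", "lyra71", "lyra75"]


def hosts_matching(pattern, names):
    """Return the names that match the pattern in full (assumed regex semantics)."""
    rx = re.compile(pattern)
    return [n for n in names if rx.fullmatch(n)]


reserved = hosts_matching(HOSTLIST, NODES)
print(reserved)
```

Under that assumption the pattern selects lyra01 to lyra04 and none of the amd nodes.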

diagnose -r gives me the following output (at the end, after
printing the reservations made for the running jobs):

.....
xeon2.0.0                  User DEF   -00:13:06    INFINITY     INFINITY    4    4    8
    Flags: STANDINGRES
    ACL: RES==xeon2.0= CLASS==xeon2+ 
    CL:  RES==xeon2.0 
    Task Resources: PROCS: 2
    Attributes (HostList='lyra0[1234]')
    Active PH: 0.00/1.80 (0.00%)
    SRAttributes (TaskCount: 1  StartTime: 00:00:00  EndTime: 1:00:00:00  Days: ALL)

Active Reserved Processors: 17
WARNING:  reservation table is corrupt:  active procs reserved does not equal active procs detected (17 != 14)


The last line worries me a bit (I have marked some nodes
as offline in TORQUE).


Thank you for reading this far (this 'newbie' stuff).

Christian

On Wednesday 24 January 2007 19:24, scoggins wrote:
> Christian,
>
> I have a similar problem.  I have nodes with 2 different purposes.
> What I have found is this:
>
> set up the properties in the node file to be distinctly different -
> just like you have.
>
> setup the queues to be execution queues with no acl's set - you don't
> need them.
>
> setup the queues to be called by the users via qsub -q <queue-name>
>
>   And inside their script have them request -l nodes=<count>:<property tag>,
> and if the machine arch is set to amd and/or xeon you can add arch=amd
> or arch=xeon.
>
> I had struggled with the resources just the other day and found them
> to be very frustrating.  The manual is not very clear on how things
> work and all of this is too cumbersome.
>
> Thanks
>
> Jackie
>
> On Jan 23, 2007, at 11:27 PM, Dr. Christian Mück-Lichtenfeld wrote:
> > Dear torque-users,
> >
> >   after struggling a few days with the new torque installation
> > on our cluster, I have chosen to put my question to this forum.
> > I have not found a direct answer to my problem in the archives.
> >
> > My problem is that I have two sorts of nodes ("amd" and "xeon"),
> > which I want to use from different queues ("amdNN" and "xeonNN").
> > I know that this was discussed in the documentation, so I have set
> > the "acl_hosts" fields and have defined the resources 'neednodes',
> > but it still does not work: If I submit a job to the queue "amd2",
> > the job runs on two xeon nodes (lyra01+lyra02), not on 'lyra71' and
> > 'lyra72' as I want it to.
> > The "acl_hosts" setting in the queue definition obviously does not
> > do what it should according to the documentation.
> >
> > The amd2 queue is configured like this:
> >  Queue amd2
> >         queue_type = Execution
> >         total_jobs = 1
> >         state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:1 Exiting:0
> >         acl_host_enable = False
> >         acl_hosts = lyra89,lyra79,lyra88,lyra78,lyra87,lyra77,lyra86,lyra76,lyra85,lyra75,lyra94,lyra84,lyra74,lyra93,lyra83,lyra73,lyra92,lyra82,lyra72,lyra91,lyra81,lyra71,lyra90,lyra80
> >         resources_max.nodect = 2
> >         resources_min.nodect = 2
> >         resources_default.neednodes = amd
> >         resources_default.nodes = 2
> >         mtime = Sat Jan 20 19:46:39 2007
> >         resources_assigned.nodect = 2
> >         enabled = True
> >         started = True
> >
> > Here are the two different node configurations:
> > Qmgr: l n lyra01
> > Node lyra01
> >         state = free
> >         np = 4
> >         properties = xeon
> >         ntype = cluster
> >
> > Qmgr: l n lyra71
> > Node lyra71
> >         state = free
> >         np = 1
> >         properties = amd
> >         ntype = cluster
> >
> > I am using torque 2.1.6 with x86-64 Linux (SuSE).
> > I have not installed the maui scheduler.
> >
> > Best regards,
> >
> > Christian Mueck-Lichtenfeld
> >
> >
> >
> > --
> > -------------------------------------------------------------------
> > Dr. Christian Mück-Lichtenfeld
> > Westfaelische Wilhelms-Universität
> > Organisch-Chemisches Institut
> > Corrensstrasse 40
> > D-48149 Münster, Germany
> >
> > cml at uni-muenster.de    Tel +49 251 83 33301    Fax +49 251 83 36506
> > http://www.uni-muenster.de/Chemie/OC/research/grimme/group/cml.html
> > -------------------------------------------------------------------
> >       support the progress of Quantum Chemistry with your PC
> >                     http://qah.uni-muenster.de
> > -------------------------------------------------------------------
> > "There cannot be a crisis next week. My schedule is already full."
> > (Henry Kissinger)
> >
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers


