[torqueusers] linking queues to different nodes
Dr. Christian Mück-Lichtenfeld
cml at uni-muenster.de
Mon Jan 29 08:54:31 MST 2007
Jackie,
thanks for your reply (and to David and Thomas for theirs as well).
What I have learned so far is that there is apparently no easy way
of assigning nodes to a queue. I have now installed MAUI and it
works properly, but I still have not succeeded in assigning nodes
to specific queues.
E.g. a job that is submitted with "qsub -q xeon2" to the following
queue:
Queue xeon2
    queue_type = Execution
    total_jobs = 1
    state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:1 Exiting:0
    acl_host_enable = False
    acl_hosts = lyra04,lyra03,lyra02,lyra01
    resources_max.nodect = 2
    resources_min.nodect = 2
    resources_default.neednodes = xeon
    resources_default.nodes = 2
    mtime = Mon Jan 29 11:25:51 2007
    resources_assigned.nodect = 2
    enabled = True
    started = True
runs on two nodes (lyra71 and lyra75) with the property 'amd':
Node lyra71
    state = free
    np = 1
    properties = amd
    ntype = cluster
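For comparison, the node property can also be requested explicitly at
submit time instead of relying only on the queue default. The following
is a minimal sketch, not a tested recipe for this cluster: job.sh is a
hypothetical job script name, and the queue name, node count, and
property are taken from the xeon2 listing. It assumes the usual TORQUE
request syntax -l nodes=<count>:<property>, and it only calls qsub if
the command is installed.

```shell
#!/bin/sh
# Assumed values: queue, node count, and property come from the xeon2
# queue listing; job.sh is a hypothetical job script name.
QUEUE=xeon2
PROPERTY=xeon
NODES=2
JOB_SCRIPT=job.sh

# -l nodes=<count>:<property> requests <count> nodes carrying that property.
SUBMIT_CMD="qsub -q $QUEUE -l nodes=$NODES:$PROPERTY $JOB_SCRIPT"
echo "$SUBMIT_CMD"

# Only submit when TORQUE's qsub is actually available on this machine.
if command -v qsub >/dev/null 2>&1; then
    if JOBID=$($SUBMIT_CMD); then
        # exec_host is filled in once the job is running; the hosts listed
        # there can be compared with each node's 'properties' line.
        qstat -f "$JOBID" | grep exec_host || echo "no exec_host yet (job not running)"
    fi
fi
```

The hosts reported in exec_host can then be checked against the node
properties, for example with "pbsnodes lyra71".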
I found some (disturbing) description of setting up
standing reservations in MAUI and added the following
lines to maui.cfg:
SRCFG[xeon2] TASKCOUNT=1 RESOURCES=PROCS:2
SRCFG[xeon2] STARTTIME=0:00:00 ENDTIME=24:00:00
SRCFG[xeon2] PERIOD=INFINITY
SRCFG[xeon2] HOSTLIST=lyra0[1234]
SRCFG[xeon2] CLASSLIST=xeon2
SRCFG[xeon2] NODEFEATURES=xeon
Are these lines correct, or have I seriously misunderstood
something by using SRCFG to assign nodes to a queue?
diagnose -r gives me the following output (at the end, after
printing the reservations made for the running jobs).
.....
xeon2.0.0 User DEF -00:13:06 INFINITY INFINITY 4 4 8
    Flags: STANDINGRES
    ACL: RES==xeon2.0= CLASS==xeon2+
    CL: RES==xeon2.0
    Task Resources: PROCS: 2
    Attributes (HostList='lyra0[1234]')
    Active PH: 0.00/1.80 (0.00%)
    SRAttributes (TaskCount: 1 StartTime: 00:00:00 EndTime: 1:00:00:00 Days: ALL)
Active Reserved Processors: 17
WARNING: reservation table is corrupt: active procs reserved does not equal active procs detected (17 != 14)
The last line worries me a bit (I have marked some nodes
as offline in TORQUE).
Thank you for reading this far (this 'newbie' stuff).
Christian
On Wednesday 24 January 2007 19:24, scoggins wrote:
> Christian,
>
> I have a similar problem. I have nodes with 2 different purposes.
> What I have found is this:
>
> set up the properties in the node file to be distinctly different -
> just like you have.
>
> set up the queues as execution queues with no ACLs set - you don't
> need them.
>
> set up the queues to be called by the users via qsub -q <queue-name>
>
> And inside their script have them request -l nodes=<count>:<property tag>,
> and if the machine arch is set to amd or xeon you can add arch=amd
> or arch=xeon.
>
> I had struggled with the resources just the other day and found them
> to be very frustrating. The manual is not very clear on how things
> work and all of this is too cumbersome.
>
> Thanks
>
> Jackie
>
> On Jan 23, 2007, at 11:27 PM, Dr. Christian Mück-Lichtenfeld wrote:
> > Dear torque-users,
> >
> > after struggling a few days with the new torque installation
> > on our cluster, I have chosen to put my question to this forum.
> > I have not found a direct answer to my problem in the archives.
> >
> > My problem is that I have two sorts of nodes ("amd" and "xeon"),
> > which I want to use from different queues ("amdNN" and "xeonNN").
> > I know that this was discussed in the documentation, so I have set
> > the "acl_hosts" fields and have defined the resources 'neednodes',
> > but it still does not work: if I submit a job to the queue "amd2",
> > the job runs on two xeon nodes (lyra01 + lyra02), not on 'lyra71'
> > and 'lyra72' as I want it to.
> > The "acl_hosts" setting in the queue definition obviously does not
> > do what it should according to the documentation.
> >
> > The amd2 queue is configured like this:
> > Queue amd2
> >     queue_type = Execution
> >     total_jobs = 1
> >     state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:1 Exiting:0
> >     acl_host_enable = False
> >     acl_hosts = lyra89,lyra79,lyra88,lyra78,lyra87,lyra77,lyra86,lyra76,
> >                 lyra85,lyra75,lyra94,lyra84,lyra74,lyra93,lyra83,lyra73,
> >                 lyra92,lyra82,lyra72,lyra91,lyra81,lyra71,lyra90,lyra80
> >     resources_max.nodect = 2
> >     resources_min.nodect = 2
> >     resources_default.neednodes = amd
> >     resources_default.nodes = 2
> >     mtime = Sat Jan 20 19:46:39 2007
> >     resources_assigned.nodect = 2
> >     enabled = True
> >     started = True
> >
> > Here are the two different node configurations:
> > Qmgr: l n lyra01
> > Node lyra01
> >     state = free
> >     np = 4
> >     properties = xeon
> >     ntype = cluster
> >
> > Qmgr: l n lyra71
> > Node lyra71
> >     state = free
> >     np = 1
> >     properties = amd
> >     ntype = cluster
> >
> > I am using torque 2.1.6 with x86-64 Linux (SuSE).
> > I have not installed the maui scheduler.
> >
> > Best regards,
> >
> > Christian Mueck-Lichtenfeld
> >
> >
> >
> > --
> > -------------------------------------------------------------------
> > Dr. Christian Mück-Lichtenfeld
> > Westfaelische Wilhelms-Universität
> > Organisch-Chemisches Institut
> > Corrensstrasse 40
> > D-48149 Münster, Germany
> >
> > cml at uni-muenster.de Tel +49 251 83 33301 Fax +49 251 83 36506
> > http://www.uni-muenster.de/Chemie/OC/research/grimme/group/cml.html
> > -------------------------------------------------------------------
> > support the progress of Quantum Chemistry with your PC
> > http://qah.uni-muenster.de
> > -------------------------------------------------------------------
> > "There cannot be a crisis next week. My schedule is already full."
> > (Henry Kissinger)