[Mauiusers] How to set Group and Class/Queue?

Mike Renfro renfro at tntech.edu
Mon Jun 5 08:12:05 MDT 2006


任彦亮 wrote:
> Dear everybody,
>     My machine has the following setup:
>     GROUP:  USERS GUEST
>     CLASS/QUEUE:  DELL RCMM DAWN

> usera belongs to group GUEST and userb belongs to group USERS. I want to
> restrict usera (group GUEST) to submitting jobs only to class/queue RCMM,
> while userb (group USERS) can submit jobs to all classes/queues (DELL, RCMM, DAWN).
> How can I set this up in maui.cfg?

If you're using Torque, I think you can handle that with the acl_groups 
and acl_group_enable attributes on each queue you want to restrict, and 
not make any changes to Maui:

   http://www.clusterresources.com/torquedocs21/4.1queueconfig.shtml
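
A minimal sketch of that approach, assuming the Torque queue and group names are exactly as written in the question (DELL, RCMM, DAWN, USERS, GUEST) and that RCMM should stay open to everyone. The two restricted queues accept only group USERS, so members of GUEST are left with RCMM. As far as I know, Torque checks the group the job runs under, which is the submitter's primary group unless one is requested explicitly:

```
set queue DELL acl_group_enable = True
set queue DELL acl_groups = USERS
set queue DAWN acl_group_enable = True
set queue DAWN acl_groups = USERS
```

These lines are meant to be entered in qmgr, in the same form as the queue definitions further down.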

But I have a related question for the list. I may have overcomplicated 
my setup, but here are the financial and technical policies I'm working 
with (apologies for the length, but I don't know how to make it shorter):

1. Six different classes of systems: three different types of P4 
single-CPU systems and three different types of Xeon systems. I made 
six separate queues and a routing queue, since I want to 
ensure that anyone who starts a multi-CPU job gets allocated nodes from 
one and only one class of system. Though the separate queues may not be 
strictly required, the node segregation is, since three of those system 
classes get rebooted on a consistent schedule for non-cluster work.

2. There are three budget lines that purchased these six sets of 
systems. One of those lines is for general student usage, and I want all 
of my users to have an equal shot at using those resources, so everyone 
would get an equal priority in those queues. The remaining two lines are 
from specific research groups, who want to ensure that their students 
get priority on the systems they purchased, but have no problems with 
users outside their group using idle resources. Preemption is not 
required, just that their students get a higher priority on jobs in 
those queues.

So far, the closest thing I've found for solving part 2 is from June 
2005 on job/queue affinity [1]. From the description, it would seem that 
by setting standing reservations with QOS affinities, I can effectively 
set default queues for different types of jobs, and if those queues are 
full, those jobs will spill over to less-desirable queues. I've set up 
queue affinities for the research groups' systems and the 
university-funded systems as follows:

=====
# ChE faculty bought ch226-11...ch226-19 in Spring 2006
GROUPCFG[dvisco] QDEF=pe1855-che
GROUPCFG[icarpen] QDEF=pe1855-che
GROUPCFG[vsubramanian] QDEF=pe1855-che
SRCFG[che_nodes] DAYS=[ALL]
SRCFG[che_nodes] ACCESS=DEDICATED
SRCFG[che_nodes] PERIOD=DAY
SRCFG[che_nodes] DEPTH=2
SRCFG[che_nodes] CLASSLIST=pe1855-che,DEFAULT,pe1850-cee
SRCFG[che_nodes] HOSTLIST=ch226-11,ch226-12,ch226-13,ch226-14,ch226-15,ch226-16,ch226-17,ch226-18,ch226-19
SRCFG[che_nodes] QOSLIST=pe1855-che,DEFAULT-,pe1850-cee-

# CEE faculty bought ch226-8 in Spring 2006
GROUPCFG[fhossain] QDEF=pe1850-cee
GROUPCFG[jliu] QDEF=pe1850-cee
GROUPCFG[sclick] QDEF=pe1850-cee
SRCFG[cee_nodes] DAYS=[ALL]
SRCFG[cee_nodes] ACCESS=DEDICATED
SRCFG[cee_nodes] PERIOD=DAY
SRCFG[cee_nodes] DEPTH=2
SRCFG[cee_nodes] CLASSLIST=pe1850-cee,DEFAULT,pe1855-che
SRCFG[cee_nodes] HOSTLIST=ch226-8
SRCFG[cee_nodes] QOSLIST=pe1850-cee,DEFAULT-,pe1855-che-

# TAF bought ch226-1...ch226-7 in Spring 2002...2006
GROUPCFG[users] QDEF=DEFAULT
SRCFG[taf_nodes] DAYS=[ALL]
SRCFG[taf_nodes] ACCESS=DEDICATED
SRCFG[taf_nodes] PERIOD=DAY
SRCFG[taf_nodes] DEPTH=2
SRCFG[taf_nodes] CLASSLIST=pe1850-cee,DEFAULT,pe1855-cee
SRCFG[taf_nodes] HOSTLIST=ch226-1,ch226-2,ch226-3,ch226-4,ch226-5,ch226-6,ch226-7
SRCFG[taf_nodes] QOSLIST=DEFAULT,pe1850-cee-,pe1855-che-
=====

And the relevant parts for Torque:

=====
create queue long
set queue long queue_type = Route
set queue long route_destinations = pe2650
set queue long route_destinations += pe1850
set queue long route_destinations += pe1855-che
set queue long route_destinations += pe1850-cee
set queue long enabled = True
set queue long started = True
set server default_queue = long
=====

But when I submit a job from one of my users who is a member of a 
research group, I get the right group and QOS, but the job goes right to 
the first open queue, ignoring the affinities I thought I had set up. 
Granted, my user's primary group is 'users', but he has a secondary 
group membership in one of the research groups, and I'm explicitly 
setting his group via qsub:

=====
abcdefghij21@ch208a:~$ qsub -W group_list=dvisco ./sleep.sh
1474.ch208a.cae.tntech.edu
abcdefghij21@ch208a:~$ checkjob 1474


checking job 1474

State: Running
Creds:  user:abcdefghij21  group:dvisco  class:pe2650  qos:pe1855-che
WallTime: 00:00:00 of 1:00:00
SubmitTime: Mon Jun  5 08:53:11
   (Time Queued  Total: 00:00:01  Eligible: 00:00:01)

StartTime: Mon Jun  5 08:53:12
Total Tasks: 1

Req[0]  TaskCount: 1  Partition: DEFAULT
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
Allocated Nodes:
[ch226-5:1]


IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]
Flags:       RESTARTABLE

Reservation '1474' (00:00:00 -> 1:00:00  Duration: 1:00:00)
PE:  1.00  StartPriority:  100
=====
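
When repeating this test with different settings, the class and QOS that Maui reports can be pulled out of the checkjob output with a small shell helper. This is only a sketch: cred_field is a hypothetical function name, and it assumes the "Creds:" line keeps the key:value layout shown above.

```shell
# Hypothetical helper: print the value of one credential field
# (user, group, class or qos) from checkjob output read on stdin.
cred_field() {
    awk -v key="$1" '
        # Only the line starting with "Creds:" carries the credentials.
        $1 == "Creds:" {
            for (i = 2; i <= NF; i++) {
                # Each remaining field looks like key:value.
                split($i, kv, ":")
                if (kv[1] == key) print kv[2]
            }
        }'
}
```

For the job above, "checkjob 1474 | cred_field class" would print pe2650, and "checkjob 1474 | cred_field qos" would print pe1855-che.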

What did I screw up on affinities, or do I have a fundamental 
misunderstanding of how they work?

[1] http://www.clusterresources.com/pipermail/mauiusers/2005-June/001591.html

-- 
Mike Renfro  / R&D Engineer, Center for Manufacturing Research,
931 372-3601 / Tennessee Technological University -- renfro at tntech.edu

