[torqueusers] defining queues by user defined node features
P Spencer Davis
psdavis at bsu.edu
Mon Sep 17 07:44:30 MDT 2007
I tried shutting down Maui and running the default pbs_sched instead,
with no change in behavior. I've set resources_available.nodes to x86 or
x86-64 in the execution queues, thinking that the routing queue would
then route the 32-bit requests to short or long and the 64-bit jobs to
short-64 or long-64 depending on the wall time requested, but that has
no effect. At this point I have no idea what I am doing wrong. Any ideas?
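
One possible reading of the behavior, sketched under stated assumptions
rather than taken from the Torque source: if a routing queue walks
route_destinations in order and places the job in the first queue whose
resources_min/resources_max limits admit the resources the job requested,
and if resources_default values are only applied after the job lands in a
queue, then the node property never takes part in the routing decision.
The function names below are hypothetical; the limits and the destination
order are copied from the queue definitions quoted further down.

```python
# Hypothetical model of routing-queue destination selection; not Torque code.

def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

# Limits copied from the quoted queue definitions. resources_default.*
# entries are left out on purpose: the assumption is that defaults do not
# influence routing.
QUEUES = {
    "short": {"min": {"cput": "00:00:00"},
              "max": {"cput": "24:00:00", "walltime": "24:00:00"}},
    "long": {"min": {"cput": "24:00:01", "walltime": "24:00:01"}, "max": {}},
    "long-64": {"min": {"walltime": "24:00:00"}, "max": {}},
    "short-64": {"min": {}, "max": {"walltime": "24:00:00"}},
}
ROUTE_DESTINATIONS = ["short", "long", "long-64", "short-64"]

def admits(limits, request):
    """Assumed rule: only resources the job requests are checked."""
    for name, value in request.items():
        low = limits["min"].get(name)
        high = limits["max"].get(name)
        if low is not None and to_seconds(value) < to_seconds(low):
            return False
        if high is not None and to_seconds(value) > to_seconds(high):
            return False
    return True

def route(request):
    """Return the first destination that admits the request, else None."""
    for name in ROUTE_DESTINATIONS:
        if admits(QUEUES[name], request):
            return name
    return None

# A 64-bit request is routed purely on walltime in this model.
print(route({"walltime": "01:00:00", "nodes": "x86-64"}))
print(route({"walltime": "48:00:00", "nodes": "x86-64"}))
```

In this model the 64-bit queues are never reached, because short and long
come first in route_destinations and no queue carries a min/max limit that
mentions the node property. Whether the real server behaves this way is
exactly the open question here.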
Thanks,
Spencer
P Spencer Davis wrote:
> Hello,
> I'm running v 2.1.6 of PBS as a resource manager with v 3.2.6p19 of
> the Maui scheduler. All the compute nodes are running RHEL 4 with the
> 2.6.9-55 kernel. The cluster is heterogeneous: 32 of the nodes are
> 32-bit dual processor, and the other 32 are 64-bit dual processor. The
> nodes file in server_priv is configured as follows (edited for brevity):
> ...
> n31 np=2 x86
> n32 np=2 x86-64
> ...
>
> with the idea being that submitting a job with nodes=x86-64 will select
> a 64-bit node. This worked fine until I created a routing queue with a
> short and a long execution queue; now the jobs are routed in a haphazard
> way. I tried creating short and long queues with the following properties:
> Queue short-64
> queue_type = Execution
> total_jobs = 0
> state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0
> Exiting:0
> resources_max.walltime = 24:00:00
> resources_default.neednodes = x86-64
> resources_default.nodes = x86-64
> mtime = Fri Sep 14 14:25:56 2007
> enabled = True
> started = True
> and they work fine as long as I submit jobs directly to them, but if the
> job is submitted to the default routing queue, it will only be routed by
> cpu or walltime.
> Any insight is appreciated,
> Spencer
> Here are my queue definitions:
> Queue short
> queue_type = Execution
> Priority = 20
> max_queuable = 62
> total_jobs = 4
> state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:4
> Exiting:0
> from_route_only = True
> resources_max.cput = 24:00:00
> resources_max.walltime = 24:00:00
> resources_min.cput = 00:00:00
> resources_default.neednodes = x86
> resources_default.nodes = x86
> mtime = Fri Sep 14 14:27:28 2007
> resources_assigned.mem = 16777216b
> resources_assigned.nodect = 4
> enabled = True
> started = True
>
> Queue routing
> queue_type = Route
> total_jobs = 0
> state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0
> Exiting:0
> resources_default.walltime = 00:10:00
> mtime = Fri Sep 14 14:06:20 2007
> route_destinations = short,long,long-64,short-64
> route_held_jobs = True
> route_waiting_jobs = True
> route_retry_time = 120
> route_lifetime = 604800
> enabled = True
> started = True
>
> Queue long-64
> queue_type = Execution
> total_jobs = 0
> state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0
> Exiting:0
> resources_min.walltime = 24:00:00
> resources_default.neednodes = x86-64
> mtime = Fri Sep 14 14:42:06 2007
> enabled = True
> started = True
>
> Queue bsu-research
> queue_type = Execution
> Priority = 80
> total_jobs = 0
> state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0
> Exiting:0
> from_route_only = False
> acl_group_enable = True
> acl_groups = ccnstaff
> mtime = Tue Aug 21 12:34:26 2007
> enabled = True
> started = True
>
> Queue long
> queue_type = Execution
> Priority = 20
> max_queuable = 62
> total_jobs = 0
> state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0
> Exiting:0
> acl_host_enable = False
> from_route_only = True
> resources_min.cput = 24:00:01
> resources_min.walltime = 24:00:01
> resources_default.neednodes = x86
> mtime = Fri Sep 14 14:01:39 2007
> resources_assigned.mem = 0b
> resources_assigned.nodect = 0
> enabled = True
> started = True
>
> Queue short-64
> queue_type = Execution
> total_jobs = 0
> state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0
> Exiting:0
> resources_max.walltime = 24:00:00
> resources_default.neednodes = x86-64
> resources_default.nodes = x86-64
> mtime = Fri Sep 14 14:25:56 2007
> enabled = True
> started = True
>
> My server configuration:
> Server ccncluster.bsu.edu
> server_state = Active
> scheduling = True
> total_jobs = 4
> state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:4
> Exiting:0
> managers =
> operators =
> default_queue = routing
> log_events = 511
> mail_from = adm
> resources_default.mem = 4mb
> resources_assigned.mem = 16777216b
> resources_assigned.nodect = 4
> scheduler_iteration = 600
> node_check_rate = 150
> tcp_timeout = 6
> node_pack = False
> pbs_version = 2.1.6
>
>
> And the Maui configuration:
>
> # maui.cfg 3.2.6p19
>
> SERVERHOST somehost.nowhere.net
> # primary admin must be first in list
> ADMIN1 00notreal
>
> # Resource Manager Definition
>
> #RMCFG[SOMEHOST] TYPE=PBS@RMNMHOST@
> RMCFG[base] TYPE=PBS
>
> # Allocation Manager Definition
>
> AMCFG[bank] TYPE=NONE
>
> # full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
> # use the 'schedctl -l' command to display current configuration
>
> RMPOLLINTERVAL 00:00:30
>
> SERVERPORT 42559
> SERVERMODE NORMAL
>
> # Admin: http://supercluster.org/mauidocs/a.esecurity.html
>
>
> LOGFILE maui.log
> LOGFILEMAXSIZE 10000000
> LOGLEVEL 3
>
> # Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html
>
> QUEUETIMEWEIGHT 1
>
> # FairShare: http://supercluster.org/mauidocs/6.3fairshare.html
>
> #FSPOLICY PSDEDICATED
> #FSDEPTH 7
> #FSINTERVAL 86400
> #FSDECAY 0.80
>
> # Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html
>
> # NONE SPECIFIED
>
> # Backfill: http://supercluster.org/mauidocs/8.2backfill.html
>
> BACKFILLPOLICY NONE
> RESERVATIONPOLICY CURRENTHIGHEST
>
> # Maui Feature polices
>
> ENABLEMULTIREQJOBS TRUE
>
> # Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html
>
> NODEALLOCATIONPOLICY MINRESOURCE
>
> # QOS: http://supercluster.org/mauidocs/7.3qos.html
>
> # QOSCFG[hi] PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
> # QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE
>
> # Standing Reservations: http://supercluster.org/mauidocs/7.1.3standingreservations.html
>
> # SRSTARTTIME[test] 8:00:00
> # SRENDTIME[test] 17:00:00
> # SRDAYS[test] MON TUE WED THU FRI
> # SRTASKCOUNT[test] 20
> # SRMAXTIME[test] 0:30:00
>
> # Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html
>
> # USERCFG[DEFAULT] FSTARGET=25.0
> # USERCFG[john] PRIORITY=100 FSTARGET=10.0-
> # GROUPCFG[staff] PRIORITY=1000 QLIST=hi:low QDEF=hi
> # CLASSCFG[batch] FLAGS=PREEMPTEE
> # CLASSCFG[interactive] FLAGS=PREEMPTOR
>
>