[torqueusers] complex scheduling with pbs_sched

Michelangelo D'Agostino mdagost at berkeley.edu
Thu Feb 16 19:00:01 MST 2006


Hey everybody.  I'm trying to set up a rather convoluted scheduling 
system.  I have 3 nodes, each with two processors, and many users.  What 
I'd like to be able to do is allow each user to run up to 6 jobs, one on 
each processor.  If that same user submits a seventh job, it waits for 
one of the first six to finish. 

But if another user comes along and submits jobs (up to 6), I want those 
jobs to start running right away on some of the processors that the 
first user is using.  So far, I've only been able to spread 6 jobs out 
over the 6 processors, regardless of whose they are, and any more jobs 
wait until one of those is finished.
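
(The closest knob I've found so far is the max_user_run queue 
attribute, which, if I understand the docs, caps the number of running 
jobs any one user can have from a queue.  Something like

    qmgr -c 'set queue batch max_user_run = 6'

would presumably handle the "six jobs per user" half, but not the part 
about a second user's jobs starting right away on busy processors.)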

Any help would be greatly appreciated.  Here's the relevant 
configuration information (I think...)

Michelangelo

qmgr -c 'p s':

#
# Create queues and set their attributes.
#
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_default.nodes = 1:ppn=1
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server operators = mdagost@icecube.berkeley.edu
set server operators += root@icecube.berkeley.edu
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server resources_default.nodes = 1
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server default_node = 1
set server node_pack = False
set server pbs_version = 2.0.0p7

resource_group:

mdagost         50      root    6
itaboada        51      root    6
hardtke         52      root    6
kurt            53      root    6
...etc
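
(As I understand the resource_group file format, the columns are: the 
name, a unique numeric group id, the parent group, and the number of 
shares -- though these only matter with fair_share turned on.)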

sched_config:

# This is the config file for the scheduling policy
# FORMAT:  option: value prime_option
#       option          - the name of what we are changing defined in config.h
#       value           - can be boolean/string/numeric depending on the option
#       prime_option    - can be prime/non_prime/all ONLY FOR SOME OPTIONS

# Round Robin -
#       run a job from each queue before running second job from the
#       first queue.

round_robin: False      all


# By Queue -
#       run jobs by queues.
#       If it is not set, the scheduler will look at all the jobs on
#       the server as one large queue, and ignore the queues set
#       by the administrator
#       PRIME OPTION

by_queue: True          prime
by_queue: True          non_prime


# Strict Fifo -
#       run jobs in strict fifo order.  If one job can not run
#       move onto the next queue and do not run any more jobs
#       out of that queue even if some jobs in the queue could
#       be run.
#       If it is not set, it could very easily starve jobs that
#       request large amounts of resources.
#       PRIME OPTION

strict_fifo: false      ALL


#
# fair_share - schedule jobs based on usage and share values
#       PRIME OPTION
#
fair_share: false       ALL

# Help Starving Jobs -
#       Jobs which have been waiting a long time will
#       be considered starving.  Once a job is considered
#       starving, the scheduler will not run any jobs
#       until it can run all of the starving jobs. 
#       PRIME OPTION

help_starving_jobs: true        ALL

#
# sort_queues - sort queues by the priority attribute
#       PRIME OPTION
#
sort_queues: true       ALL

#
# load_balancing - load balance between timesharing nodes
#       PRIME OPTION
#
load_balancing: true    ALL
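# (if I'm reading the docs right, load balancing only applies to nodes
# declared as timesharing, e.g. "node1:ts" in the server's nodes file,
# so it may be a no-op on plain cluster nodes)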

# sort_by:
# key:
#       to sort the jobs on one key, specify it by sort_by
#       If multiple sorts are necessary, set sort_by to multi_sort
#       specify the keys in order of sorting

# if round_robin or by_queue is set, the jobs will be sorted in their
# respective queues.  If not the entire server will be sorted.
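
# for example, a multi_sort might look like this (left commented out):
# sort_by: multi_sort
# key: shortest_job_first
# key: smallest_memory_first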


# different sorts - defined in globals.c
# no_sort shortest_job_first longest_job_first smallest_memory_first
# largest_memory_first high_priority_first low_priority_first multi_sort
# fair_share large_walltime_first short_walltime_first
#
#       PRIME OPTION
sort_by: shortest_job_first     ALL

# filter out prolific debug messages
# 256 are DEBUG2 messages
#       NO PRIME OPTION
log_filter: 256
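# (these appear to be bit flags -- if DEBUG is 128 as I believe, then
# log_filter: 384 would filter both DEBUG and DEBUG2 messages)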

# all queues starting with this value are dedicated time queues
# i.e. dedtime or dedicatedtime would be dedtime queues
#       NO PRIME OPTION
dedicated_prefix: ded

# this defines how long before a job is considered starving.  If a job has
# been queued for this long, it will be considered starving
#       NO PRIME OPTION
max_starve: 24:00:00

# The following three config values are meaningless with fair share turned off

# half_life - the half life of usage for fair share
#       NO PRIME OPTION
half_life: 24:00:00

# unknown_shares - the number of shares for the "unknown" group
#       NO PRIME OPTION
unknown_shares: 10

# sync_time - the amount of time between syncing the usage information to disk
#       NO PRIME OPTION
sync_time: 1:00:00



