[torqueusers] Memory request ignored

Matthias Meinke m.meinke at aia.rwth-aachen.de
Thu Aug 11 06:58:47 MDT 2011


Hi,

we have just moved from Grid Engine to Torque on a small cluster with about 
300 CPU cores, so I am not yet an experienced Torque user.

Right now I have the following question:

Although every node has at most 32 GByte of physical memory, a job that 
requests -lmem=90GB or -lpmem=90GB (which I specified purely for testing) is 
accepted and executed in one of the configured queues. I expected such a job 
to be queued but never executed, because no node can provide the requested 
resources.
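
To make the expectation concrete, here is a minimal sketch of the check I 
assumed would be applied to a single-node job. It only illustrates the idea 
and is not how Maui or Torque implement it: the function names, the node list 
and the binary unit multipliers are all assumptions of mine.

```python
import re

# Assumed binary multipliers for size suffixes such as "90GB" or "32gb".
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_size(text):
    """Convert a size string such as '90GB' into a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*([a-zA-Z]*)", text.strip())
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.groups()
    unit = unit.lower() or "b"  # a bare number is treated as bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(number) * _UNITS[unit]


def can_ever_run(requested, node_memories):
    """Return True if at least one node has enough physical memory."""
    need = parse_size(requested)
    return any(parse_size(mem) >= need for mem in node_memories)


# Hypothetical node sizes; on our cluster none exceeds 32 GByte.
nodes = ["32GB", "32GB", "16GB"]
print(can_ever_run("90GB", nodes))  # the test request from above
print(can_ever_run("16GB", nodes))  # a request that fits on one node
```

With a check like this, the 90GB request would stay queued forever, which is 
the behaviour I expected from the scheduler.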

The configuration I am using is:

maui-3.3.1.tar.gz
torque-3.0.2.tar.gz

What I found in the mailing list is a suggestion to specify the option

RESOURCELIMITPOLICY MEM:ALWAYS:CANCEL


which I also added to maui.cfg, but it did not stop the scheduling of jobs 
that request more memory than is available. Is this the default behaviour, 
or did I miss something in the configuration?

I am grateful for any hints.

The pbs_server configuration is:

qmgr -c 'print server'
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = aia256
set server managers = root@aia256
set server operators = root@aia256
set server default_queue = default
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server mom_job_sync = True
set server keep_completed = 300
set server allow_node_submit = True
set server next_job_number = 104
# Create and define queue default
#
create queue default
set queue default queue_type = Route
set queue default route_destinations = four
set queue default enabled = True
set queue default started = True
# Create and define queue four
#
create queue four
set queue four queue_type = Execution
set queue four from_route_only = True
set queue four resources_max.ncpus = 16
set queue four resources_max.nodect = 4
set queue four resources_min.ncpus = 1
set queue four resources_min.nodect = 1
set queue four resources_default.ncpus = 1
set queue four resources_default.nodect = 1
set queue four enabled = True
set queue four started = True




