[torqueusers] routing queue not assigning jobs as per expected attributes

Davis, Phillip Spencer psdavis at bsu.edu
Sat Sep 29 00:22:02 MDT 2007


Mr. DuChene,
I had a similar problem a few weeks back. Separate queues for 32-bit and 64-bit nodes proved unworkable, but defining individual node properties in the Maui config file gave me a mostly workable setup. I've seen suggestions on the list that you should request the jobs with -l nodes arch=[ia64,x86_64,whatever], and I was reading today that arch is a definable setting in the pbs_server's nodes file, as opposed to something the pbs_moms report back to the pbs_server, so you might try setting arch= with qmgr or in your server_priv/nodes file. I had gotten the impression from the list that arch= should just work without any setup, and had assumed that my problems were just due to running an older version of the resource manager. If it works, please let me know; I was going to explore those settings while testing the 2.1.9 version of Torque...
 
                             Spencer

________________________________

From: torqueusers-bounces at supercluster.org on behalf of Steven DuChene
Sent: Fri 9/28/2007 5:03 PM
To: torqueusers at supercluster.org
Subject: [torqueusers] routing queue not assigning jobs as per expected attributes



I am struggling with a multiple-architecture setup where some nodes are
x86_64 and another piece is an ia64 shared-memory system. I have set
attributes on the nodes of either x86_64 or ia64 and I have two queues
set up, one for each group of nodes.
I also have a routing queue set up as the default queue.

The attributes on the nodes I set using:

qmgr -c "set node oscarnode1 properties += x86_64"

or

qmgr -c "set node oscar_a450 properties += ia64"

When I submit jobs I do:

qsub -l nodes=1:ia64 myjob_ia64.pbs

or

qsub -l nodes=1:x86_64 myjob_x86_64.pbs

where both of these job scripts do a very simple call to hostname.

When I submit either of the above jobs it ends up going to the x86_64
queue. If it is a job asking for x86_64 attribute nodes, it runs right
through and I get the expected output. If I ask for nodes with an
attribute of ia64, it still gets sent to the x86_64 queue, but the job
just stalls since that queue does not have any resources with attributes
of ia64. I was thinking this might be a problem with my Moab
configuration, but Doug Wightman from CRI said I might not have my
routing queue set up properly and suggested I ask about that over here
on the torqueusers mailing list.
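
One way to confirm where a job actually landed is to look at its full
status. The sketch below pulls the queue and the requested node spec out
of qstat -f output. The helper name job_placement is made up for
illustration, and it assumes the usual indented "name = value" layout
that qstat -f prints.

```shell
# Hypothetical helper: read `qstat -f` style output on stdin and report
# which queue the job sits in and which node spec it asked for.
job_placement() {
  awk -F' = ' '
    $1 ~ /^[ \t]*queue$/                { q = $2 }
    $1 ~ /^[ \t]*Resource_List\.nodes$/ { n = $2 }
    END { printf "queue=%s nodes=%s\n", q, n }
  '
}

# On a live system (the job id here is a placeholder):
#   qstat -f 123.server | job_placement
```

A job that requested ia64 but reports the x86_64 execution queue would
point at the routing step rather than at the scheduler.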

My qmgr "print server" output looks like the following:

# Create queues and set their attributes.
#
#
# Create and define queue batchx86
#
create queue batchx86
set queue batchx86 queue_type = Execution
set queue batchx86 acl_host_enable = False
set queue batchx86 acl_hosts = oscarnode4
set queue batchx86 acl_hosts += oscarnode3
set queue batchx86 acl_hosts += oscarnode2
set queue batchx86 acl_hosts += oscarnode1
set queue batchx86 resources_default.walltime = 01:00:00
set queue batchx86 enabled = True
set queue batchx86 started = True
#
# Create and define queue batchia64
#
create queue batchia64
set queue batchia64 queue_type = Execution
set queue batchia64 acl_host_enable = False
set queue batchia64 acl_hosts = oscar_a450
set queue batchia64 resources_default.walltime = 01:00:00
set queue batchia64 enabled = True
set queue batchia64 started = True
#
# Create and define queue route
#
create queue route
set queue route queue_type = Route
set queue route route_destinations = batchx86
set queue route route_destinations += batchia64
set queue route enabled = True
set queue route started = True
#
# Set server attributes.
#
set server default_queue = route
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server pbs_version = 2.1.9
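
As far as I understand routing queues, the server tries the
route_destinations in order and places the job in the first queue that
will accept it, with acceptance decided by queue limits and enabled ACLs
rather than by node properties. Under that assumption, a quick check is
to list which execution queues in a dump like the one above have no
admission limits at all. The function name unrestricted_queues is
hypothetical, and the check only looks at resources_min, resources_max,
and enabled ACL flags.

```shell
# Hypothetical helper: read `qmgr -c "print server"` output on stdin and
# print every non-route queue that sets no resources_min/resources_max
# limit and has no acl_*_enable flag set to True.
unrestricted_queues() {
  awk '
    $1 == "create" && $2 == "queue" { order[++n] = $3 }
    $1 == "set" && $2 == "queue" {
      if ($4 ~ /^resources_(min|max)\./)              limited[$3] = 1
      if ($4 ~ /^acl_.*_enable$/ && $6 == "True")     limited[$3] = 1
      if ($4 == "queue_type" && $6 == "Route")        route[$3] = 1
    }
    END {
      for (i = 1; i <= n; i++)
        if (!(order[i] in limited) && !(order[i] in route)) print order[i]
    }
  '
}

# On a live system:
#   qmgr -c "print server" | unrestricted_queues
```

If the first route destination shows up in that list, it would accept
any job routed to it, which would match the behaviour described here.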

Does this look like it should correctly send jobs asking for nodes with
ia64 attributes to the execution queue that has those resources?

Any hints or informational pointers would be most appreciated.
--
Steven A. DuChene
_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers




