[torqueusers] My Torque settings were incorrect. Please help.

notinh notien notinhnotien7 at hotmail.com
Tue Aug 7 18:38:32 MDT 2007

Hi, all.  My PBS system has jobs running on the compute nodes, and since 
this is my only cluster, I changed my TORQUE setup on it, creating a 
routing default queue instead of a single execution default queue.

After the change I ran qterm -t quick and restarted the PBS server, and it 
still ran fine.  However, after I modified the queue configuration a bit 
more and ran qterm -t quick a couple more times (restarting each time), I 
got the error message below.  At that point I could no longer start the PBS 
server; I had to revert to my original configuration and remove serverdb 
from server_priv.

/etc/init.d/pbs_server start
Starting TORQUE Server: pbs_server: svr_func.c:222: set_resc_assigned: 
Assertion `pjob->ji_qhdr->qu_qs.qu_type == 1' failed.

Anyway, my goal is to restrict one user to a specific set of nodes (I 
modified the nodes file to add the appropriate node properties: mats or 
single).  I also want to set aside a set of nodes for jobs that require 
only 1 CPU; all other jobs should go to the regular queue and use the 
remaining nodes.

Could something in my configuration have prevented the PBS server from 
starting?  Is the configuration correct for these goals?  Is anything wrong, 
or do you have any recommendations for improvement?

During initial testing, I submitted a job that requires 1 node and 2 CPUs 
(my nodes are dual-processor), but the job always ended up in the single 
queue, where it sat queued and was delayed.  I want jobs that require 2 or 
more CPUs to end up in the reg queue and run.

Thank you very much for all your help.

#create queue default
set queue default queue_type = Route
set queue default route_destinations = Mats
set queue default route_destinations += single
set queue default route_destinations += reg
set queue default kill_delay = 90
set queue default enabled = True
set queue default started = True

#Create queue reserve for Mats
create queue Mats
set queue Mats queue_type = Execution
set queue Mats resources_default.neednodes = mats

#host_enable = false to have nodes mapped to queue
set queue Mats acl_hosts = "node11,node12,node13,node16"
set queue Mats acl_host_enable = False
set queue Mats acl_users  = Mats
set queue Mats acl_user_enable = True
set queue Mats enabled = True
set queue Mats started = True

#Create queue for single cpu jobs
#create queue single
set queue single queue_type = Execution
set queue single resources_default.neednodes = single

#host_enable = false to have nodes mapped to queue
set queue single acl_hosts = "node01,node02,node03,node04"
set queue single acl_host_enable = False
set queue single acl_users = Susan
set queue single acl_users += Ben
set queue single acl_user_enable = True
set queue single resources_max.ncpus = 1
#set queue single resources_default.ncpus = 1
set queue single enabled = True
set queue single started = True

#Create queue for the rest of the other jobs
#create queue reg
set queue reg queue_type = Execution
set queue reg acl_users = Susan
set queue reg acl_users += Ben
set queue reg acl_user_enable = True
set queue reg resources_min.ncpus = 2
#set queue reg resources_min.nodect = 1
set queue reg resources_max.ncpus = 16
#set queue reg resources_max.nodect = 8
#set queue reg resources_default.ncpus = 2
#set queue reg resources_default.nodes = 1
set queue reg enabled = True
set queue reg started = True

# Set server attributes.
set server scheduling = True
set server managers = root@mars.myhost.com
set server operators = root@mars.myhost.com
set server default_queue = default
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server resources_default.walltime = 168:00:00
set server scheduler_iteration = 60
set server node_ping_rate = 300
set server node_check_rate = 600
set server tcp_timeout = 6
set server node_pack = False
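
One way to reason about the symptom described above (a 2-CPU job landing in the single queue) is to model the route order from this configuration. The following is only a rough Python sketch, not TORQUE code. It assumes that a routing queue tries its route_destinations in order and hands the job to the first queue that accepts it, and that an ncpus limit is only checked when the job explicitly requests ncpus (for example, a job submitted as nodes=1:ppn=2 would carry no ncpus value). Both points are assumptions to verify against the TORQUE documentation; the Queue class and route function are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Queue:
    """Hypothetical stand-in for an execution queue's admission rules."""
    name: str
    acl_users: Optional[Set[str]] = None   # None means no user ACL
    min_ncpus: Optional[int] = None
    max_ncpus: Optional[int] = None

    def accepts(self, user: str, ncpus: Optional[int]) -> bool:
        if self.acl_users is not None and user not in self.acl_users:
            return False
        # ASSUMPTION: ncpus limits are skipped when the job did not
        # request ncpus explicitly (ncpus is None).
        if ncpus is not None:
            if self.min_ncpus is not None and ncpus < self.min_ncpus:
                return False
            if self.max_ncpus is not None and ncpus > self.max_ncpus:
                return False
        return True


# Same order as route_destinations in the configuration: Mats, single, reg.
ROUTE = [
    Queue("Mats", acl_users={"Mats"}),
    Queue("single", acl_users={"Susan", "Ben"}, max_ncpus=1),
    Queue("reg", acl_users={"Susan", "Ben"}, min_ncpus=2, max_ncpus=16),
]


def route(user: str, ncpus: Optional[int] = None) -> Optional[str]:
    """Return the name of the first destination that accepts the job."""
    for queue in ROUTE:
        if queue.accepts(user, ncpus):
            return queue.name
    return None


print(route("Susan"))           # no explicit ncpus: prints "single"
print(route("Susan", ncpus=2))  # explicit ncpus=2: prints "reg"
print(route("Mats", ncpus=1))   # prints "Mats"
```

Under these assumptions, a job that expresses its CPU count only through the nodes specification is never rejected by resources_max.ncpus = 1, so it stops at single because single is listed before reg. If that matches the real behaviour, requesting ncpus explicitly or limiting the queues on a resource the job actually carries would be the things to test.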
