[torqueusers] Unable to run sequential job
Simard, Jonathan
jsimard at teraxion.com
Tue Feb 12 11:00:07 MST 2013
Hello,
I'm unable to run jobs sequentially even though I set the max_running setting to one.
Multiple jobs run at the same time, whereas I would like each job to wait for the previous one to finish before it starts:
lumerical at XXX:~/Simulation> qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
107.XXX                   STDIN            lumerical              0 R test
108.XXX                   STDIN            lumerical              0 R test
109.XXX                   STDIN            lumerical              0 R test
110.XXX                   STDIN            lumerical              0 R test
111.XXX                   STDIN            lumerical              0 R test
112.XXX                   STDIN            lumerical              0 R test
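To make the comparison with max_running easier, the running jobs can be tallied per queue. This is only a sketch: count_running is a hypothetical helper name, and it assumes the default six-column qstat layout shown above (Job id, Name, User, Time Use, S, Queue).

```shell
#!/bin/sh
# count_running: hypothetical helper, not part of Torque.
# Reads qstat-style output on stdin and prints "<queue> <count>" for jobs
# whose state column (S) is R. Assumes the default six-column layout.
count_running() {
    # NR > 2 skips the header line and the dashed separator line.
    awk 'NR > 2 && $5 == "R" { n[$6]++ }
         END { for (q in n) print q, n[q] }'
}

# Demonstration on the first rows of the listing above.
count_running <<'EOF'
Job id Name User Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
107.XXX STDIN lumerical 0 R test
108.XXX STDIN lumerical 0 R test
109.XXX STDIN lumerical 0 R test
EOF
```

On a live system the same function could be fed directly with `qstat | count_running`.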
If I try to start a job while resources are in contention, the job stays in the queue and does not start automatically after the resources are freed:
lumerical at XXX:~/Simulation/Jonathan> qrun 105
qrun: Resource temporarily unavailable MSG=job allocation request exceeds currently available cluster nodes, 1 requested, 0 available 106.XXX.teraxion
lumerical at XXX:~/Simulation/Jonathan> qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
102.XXX                   STDIN            lumerical       00:00:00 C test
103.XXX                   STDIN            lumerical       00:00:00 C test
104.XXX                   STDIN            lumerical              0 R test
105.XXX                   STDIN            lumerical              0 Q test
106.XXX                   STDIN            lumerical              0 Q test
lumerical at XXX:~/Simulation/Jonathan> qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
102.XXX                   STDIN            lumerical       00:00:00 C test
103.XXX                   STDIN            lumerical       00:00:00 C test
104.XXX                   STDIN            lumerical       00:00:00 C test
105.XXX                   STDIN            lumerical              0 Q test
106.XXX                   STDIN            lumerical              0 Q test
I am using Torque 4.2.0. Here is the server configuration:
XXX:~ # qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue test
#
create queue test
set queue test queue_type = Execution
set queue test max_queuable = 10
set queue test max_running = 2
set queue test resources_default.nodes = 1
set queue test resources_default.walltime = 01:00:00
set queue test enabled = True
set queue test started = True
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch max_running = 1
set queue batch resources_max.ncpus = 24
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 01:00:00
set queue batch enabled = True
set queue batch started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = XXX
set server managers = lumerical at XXX.Teraxion
set server operators = lumerical at XXX.Teraxion
set server default_queue = test
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 300
set server job_stat_rate = 45
set server poll_jobs = True
set server keep_completed = 300
set server next_job_number = 113
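For reference, the dump above lists max_running = 2 on queue test and max_running = 1 on queue batch. A minimal sketch of lowering the limit on test to one and reading it back, assuming the commands are run by a Torque manager on the server host (the guard only skips the commands where qmgr is not installed):

```shell
#!/bin/sh
# Sketch only: assumes manager rights on the pbs_server host.
if command -v qmgr >/dev/null 2>&1; then
    # Lower the per-queue running-job limit on queue "test" to one.
    qmgr -c 'set queue test max_running = 1'
    # Read the attribute back to confirm the change.
    qmgr -c 'list queue test max_running'
fi
```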
Thanks for your help.
Jonathan