[torqueusers] Re: Scalability issues with pbs_sched_cc

Dwight Kelly dkelly at apago.com
Mon Nov 29 10:09:44 MST 2004


> We have experienced problems when submitting large numbers of small
> jobs to our system. We have about 35 nodes and when we submit say
> 10,000 jobs that average 5 minutes each the system struggles to keep
> all nodes busy. I haven't had time to investigate though.

We have also noticed this behavior with the FIFO scheduler. I lowered the
"scheduler_iteration" server attribute from its default of 600 seconds to
200 and got some improvement. I can also force the scheduler to run jobs
by submitting a new job.
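
As a rough back-of-the-envelope check on why a shorter interval might help,
here is a minimal sketch. It assumes (this is only a guess at the behavior,
not something confirmed from the scheduler source) that a node freed by a
finished job is refilled only at the next scheduling cycle. The name
node_utilization is made up for illustration, and the 300-second job length
is the 5-minute average from the quoted message.

```python
import math


def node_utilization(job_seconds: float, iteration_seconds: float) -> float:
    """Busy fraction of one node if jobs can start only on a scheduler cycle.

    Assumption: cycles fire every `iteration_seconds`, a job starts exactly
    on a cycle, and the node then sits idle until the first cycle at or
    after the job's end.
    """
    # Number of whole cycles that elapse before the node can be refilled.
    cycles_per_job = math.ceil(job_seconds / iteration_seconds)
    return job_seconds / (cycles_per_job * iteration_seconds)


if __name__ == "__main__":
    for interval in (600, 200):
        print(interval, node_utilization(300, interval))
    # prints:
    # 600 0.5
    # 200 0.75
```

Under that assumption a node is busy about half the time at 600 seconds and
about three quarters of the time at 200, which is at least consistent with
"some improvement". Real numbers will differ, since job lengths vary.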

It appears that the scheduler iterates over the queued jobs trying to
run them. After a certain number of passes it gives up and waits some
amount of time before trying to schedule again. However, if a new job
is submitted, it immediately tries to schedule the pending jobs. This
behavior is most apparent if you have a lot of short-runtime jobs queued.
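
The behavior described above can be mimicked with a toy model of a single
node. This is a sketch of what I think I am seeing, not the scheduler's
actual logic: it assumes a scheduling pass happens at every multiple of the
iteration interval and also at each "nudge" time, where a nudge stands in
for somebody submitting a new job. The names drain_time and nudge_times are
hypothetical.

```python
def drain_time(n_jobs, job_seconds, interval, nudge_times=()):
    """Seconds for one node to finish n_jobs queued at time 0.

    Assumption: a job can start only on a scheduling pass. Passes occur at
    every multiple of `interval` and at each time listed in `nudge_times`.
    All times are whole seconds.
    """
    nudges = sorted(nudge_times)
    free_at = 0  # time at which the node next becomes idle
    for _ in range(n_jobs):
        # First timer-driven pass at or after the node becomes free
        # (integer ceiling division).
        next_tick = -(-free_at // interval) * interval
        # First submission-driven pass at or after that moment, if any.
        next_nudge = next((t for t in nudges if t >= free_at), None)
        start = next_tick if next_nudge is None else min(next_tick, next_nudge)
        free_at = start + job_seconds
    return free_at


if __name__ == "__main__":
    print(drain_time(4, 300, 600))                          # 2100
    print(drain_time(4, 300, 200))                          # 1500
    print(drain_time(4, 300, 600, nudge_times=(300, 900)))  # 1200
```

In this model four 5-minute jobs take 2100 seconds with a 600-second
interval, 1500 seconds with a 200-second interval, and the ideal 1200
seconds when a submission happens to arrive each time the node goes idle.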

---
Dwight Kelly
Apago, Inc.  4080 McGinnis Ferry Rd  Suite 601 Alpharetta, GA 30005
voice:(770) 619-1884  fax:(770) 619-1885
email: dkelly at apago.com web: http://www.apago.com


