[torqueusers] Submission number limits?

Brock Palen brockp at umich.edu
Thu May 8 09:11:28 MDT 2008


What is the true limit in such a situation?  We have some users who
submit a few thousand jobs (nowhere near 140k) and we don't notice much of an
issue (we use Moab).
Is it disk bound on the server?  Could it be sped up with a 15k RPM drive
for /var/spool/torque/server_priv/jobs/ ?

What about solid state drives?  Our jobs directory has never been
over a gig, I think.  With 609 jobs right now (very low for us) it's
8.5 MB.
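
For a rough sense of scale, the figures above can be extrapolated. This is only a sketch: it assumes the directory grows linearly with the number of queued jobs and that 1 MB = 10^6 bytes, neither of which is confirmed here. The helper names are made up for illustration.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (follows no symlinks)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def extrapolate_bytes(current_bytes, current_jobs, target_jobs):
    """Scale the observed per-job footprint to target_jobs (linear assumption)."""
    per_job = current_bytes / current_jobs
    return per_job * target_jobs


# Figures from this message: 8.5 MB for 609 queued jobs.
observed_bytes = 8.5e6  # assumption: 1 MB = 10**6 bytes
per_job = observed_bytes / 609
projected = extrapolate_bytes(observed_bytes, 609, 140_000)

print(f"per job: {per_job / 1e3:.1f} KB")
print(f"140,000 jobs: {projected / 1e9:.2f} GB")
```

On those assumptions that is about 14 KB per job and just under 2 GB for 140,000 jobs. To measure a live server instead, dir_size_bytes("/var/spool/torque/server_priv/jobs") would give the current figure, given read access to that directory.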

Or is the problem in Torque itself?

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985



On May 7, 2008, at 4:15 PM, Nate Woody wrote:
> Jeremy,
>
> I've done similar things, though I've always used Maui.
> One thing I have had problems with when doing this is
> the rate of job submission.  Getting all 140000 submitted in
> less than an hour works out to something like 40/s, and I've never
> successfully submitted jobs that quickly (though perhaps others can
> speak otherwise to that).  I assume that you've got the submission
> locked in a tight loop in a shell script or something, and it might
> be worth it to put a pause in between each submission.  That's
> going to suck for 140000 jobs, but it might be worth seeing if
> you're able to get the jobs in that way.
>
> Best,
> Nate
>
>
> ----- Start Original Message -----
> Sent: Wed, 7 May 2008 14:53:04 -0500 (CDT)
> From: "Jeremy Mann" <jeremy at biochem.uthscsa.edu>
> To: torqueusers at supercluster.org
> Subject: [torqueusers] Submission number limits?
>
>> Good afternoon all, I have one user that wants to submit roughly 140,000
>> jobs to our queue. We tried it last week and it never worked. It took
>> nearly an hour to submit all of them, then the PBS scheduler would stop
>> responding and give:
>>
>> 05/02/2008 14:39:50;0100; pbs_sched;Req;;Leaving schedule
>>
>> 05/02/2008 14:39:50;0080; pbs_sched;Svr;main;brk point 760373248
>> 05/02/2008 14:39:53;0100; pbs_sched;Req;;Entering Schedule
>> 05/02/2008 14:42:53;0002; pbs_sched;Svr;toolong;alarm call
>>
>> The jobs are quite small and they run for about a minute. Now we're
>> thinking about breaking them up into 100 or 1000 job chunks.
>>
>> I'm curious if the number of jobs being submitted, in our case 140,000, is
>> too large for PBS/Torque to handle.
>>
>> Torque 2.1.2 x86_64 and the built in scheduler (not MAUI)
>>
>> -- 
>> Jeremy Mann
>> jeremy at biochem.uthscsa.edu
>>
>> University of Texas Health Science Center
>> Bioinformatics Core Facility
>> http://www.bioinformatics.uthscsa.edu
>> Phone: (210) 567-2672
>>
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>
>
> ----- End Original Message -----
>
>
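
The pause Nate suggests and the 100 or 1000 job chunks Jeremy mentions could be combined roughly as below. This is a sketch, not a tested recipe: "chunks" is read here as submitting in batches, the pause lengths are arbitrary placeholders, and the function names are made up for illustration. The only external command used is qsub, and its output is assumed to be the job ID.

```python
import subprocess
import time


def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submit_with_qsub(script_path):
    """Submit one job script with qsub and return whatever qsub prints."""
    result = subprocess.run(
        ["qsub", script_path], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def submit_throttled(scripts, submit=submit_with_qsub, chunk_size=1000,
                     pause_between_jobs=0.1, pause_between_chunks=60.0,
                     sleep=time.sleep):
    """Submit scripts in chunks, pausing after each job and between chunks.

    The default pause values are placeholders, not recommendations.
    """
    job_ids = []
    chunks = list(chunked(list(scripts), chunk_size))
    for index, chunk in enumerate(chunks):
        for script in chunk:
            job_ids.append(submit(script))
            sleep(pause_between_jobs)
        # No need to wait after the final chunk.
        if index < len(chunks) - 1:
            sleep(pause_between_chunks)
    return job_ids
```

Something like submit_throttled(list_of_script_paths) would then replace the tight loop. Note that even a 0.1 s pause per job adds up to nearly four hours of sleeping across 140,000 submissions, before counting the pauses between chunks.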


