[torqueusers] Submission number limits?

Brock Palen brockp at umich.edu
Thu May 8 11:26:09 MDT 2008


We use qsub right now.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985



On May 8, 2008, at 11:41 AM, Nate Woody wrote:
> Brock,
>
> Out of curiosity, do you submit through Moab (msub) or Torque (qsub)?
>
> Best,
> Nate
>
>
> ----- Start Original Message -----
> Sent: Thu, 8 May 2008 11:11:28 -0400
> From: Brock Palen <brockp at umich.edu>
> To: "Nate Woody" <Nate.A.Woody at runbox.com>
> Subject: Re: [torqueusers] Submission number limits?
>
>> What is the true limit in such a situation?  We have some users who
>> submit a few thousand jobs (nowhere near 140k), and we don't notice
>> much of an issue (we use Moab).
>> Is it disk bound on the server?  Could it be sped up with a 15k RPM
>> drive for /var/spool/torque/server_priv/jobs/ ?
>>
>> What about solid state drives?  I don't think our jobs directory has
>> ever been over a gig.  With 609 jobs right now (very low for us) it's
>> 8.5 MB.
>>
>> Or is the problem in Torque itself?
>>
>>
>> On May 7, 2008, at 4:15 PM, Nate Woody wrote:
>>> Jeremy,
>>>
>>> I've done similar things, though I've always used Maui.
>>> One thing I have had problems with when doing this is
>>> the rate of job submission.  Getting all 140,000 submitted in
>>> less than an hour works out to something like 40 jobs/s, and I've
>>> never successfully submitted jobs that quickly (though perhaps
>>> others can speak otherwise to that).  I assume that you've got the
>>> submission locked in a tight loop in a shell script or something,
>>> and it might be worth putting a pause between each submission.
>>> That's going to be painful for 140,000 jobs, but it might be worth
>>> seeing if you're able to get the jobs in that way.
>>>
>>> Best,
>>> Nate
>>>
>>>
>>> ----- Start Original Message -----
>>> Sent: Wed, 7 May 2008 14:53:04 -0500 (CDT)
>>> From: "Jeremy Mann" <jeremy at biochem.uthscsa.edu>
>>> To: torqueusers at supercluster.org
>>> Subject: [torqueusers] Submission number limits?
>>>
>>>> Good afternoon all. I have one user who wants to submit roughly
>>>> 140,000 jobs to our queue. We tried it last week and it never
>>>> worked. It took nearly an hour to submit all of them, then the PBS
>>>> scheduler would stop responding and give:
>>>>
>>>> 05/02/2008 14:39:50;0100; pbs_sched;Req;;Leaving schedule
>>>>
>>>> 05/02/2008 14:39:50;0080; pbs_sched;Svr;main;brk point 760373248
>>>> 05/02/2008 14:39:53;0100; pbs_sched;Req;;Entering Schedule
>>>> 05/02/2008 14:42:53;0002; pbs_sched;Svr;toolong;alarm call
>>>>
>>>> The jobs are quite small and they run for about a minute. Now we're
>>>> thinking about breaking them up into 100 or 1000 job chunks.
>>>>
>>>> I'm curious if the number of jobs being submitted, in our case
>>>> 140,000, is too large for PBS/Torque to handle.
>>>>
>>>> We run Torque 2.1.2 on x86_64 with the built-in scheduler
>>>> (pbs_sched, not Maui).
>>>>
>>>> -- 
>>>> Jeremy Mann
>>>> jeremy at biochem.uthscsa.edu
>>>>
>>>> University of Texas Health Science Center
>>>> Bioinformatics Core Facility
>>>> http://www.bioinformatics.uthscsa.edu
>>>> Phone: (210) 567-2672
>>>>
>>>> _______________________________________________
>>>> torqueusers mailing list
>>>> torqueusers at supercluster.org
>>>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>>>
>>>
>>> ----- End Original Message -----
>>>
>>
>>
>
> ----- End Original Message -----
>
>
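
The two ideas raised in this thread, pausing between submissions and breaking the run into chunks, can be sketched roughly as follows in Python. This is only an illustration, not something any poster actually ran: the function names, the default delay of 0.1 s, the chunk size of 1000 and the 60 s pause between chunks are all assumptions to be tuned, and the only external command used is plain `qsub <script>`.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def qsub(script_path: str) -> str:
    """Submit one job script with qsub and return the job ID it prints."""
    result = subprocess.run(
        ["qsub", script_path],
        check=True,            # raise if qsub exits with a non-zero status
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def submit_throttled(
    scripts: Iterable[str],
    submit: Callable[[str], str] = qsub,
    delay: float = 0.1,          # assumed pause after each submission (s)
    chunk_size: int = 1000,      # assumed number of jobs per chunk
    chunk_pause: float = 60.0,   # assumed longer pause after a full chunk (s)
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit scripts one at a time, pausing so the server can keep up."""
    job_ids = []
    for count, script in enumerate(scripts, start=1):
        job_ids.append(submit(script))
        if count % chunk_size == 0:
            # End of a chunk: give the server and scheduler a longer break.
            sleep(chunk_pause)
        else:
            # Ordinary submission: short pause instead of a tight loop.
            sleep(delay)
    return job_ids
```

A call such as `submit_throttled(f"job_{i}.sh" for i in range(140000))` (the file names are hypothetical) would walk through the whole set. With these placeholder defaults the 0.1 s delay alone adds about 3.9 hours for 140,000 jobs, and the chunk pauses add more on top, so the values are a starting point rather than a recommendation. The `submit` and `sleep` parameters are there so the loop can be tried out without a live Torque server.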


