[torqueusers] Submission number limits?
Nate.A.Woody at runbox.com
Thu May 8 09:41:34 MDT 2008
Out of curiosity, do you submit through Moab (msub) or Torque (qsub)?
----- Start Original Message -----
Sent: Thu, 8 May 2008 11:11:28 -0400
From: Brock Palen <brockp at umich.edu>
To: "Nate Woody" <Nate.A.Woody at runbox.com>
Subject: Re: [torqueusers] Submission number limits?
> What is the true limit in such a situation? We have some users who
> submit a few thousand (not near 140k) and we don't notice much of an
> issue (we use Moab).
> Is it disk bound on the server? Could it be sped up with a 15k RPM
> drive for /var/spool/torque/server_priv/jobs/ ?
> What about solid state drives? Our jobs directory never has been
> over a gig I think. With 609 jobs right now (very low for us) its
> or is the problem in torque?
> Brock Palen
> Center for Advanced Computing
> brockp at umich.edu
> On May 7, 2008, at 4:15 PM, Nate Woody wrote:
> > Jeremy,
> > I've done similar things like this though I've always used Maui.
> > One thing I have had problems with when doing things like this is
> > the rate of job submission. Getting all 140000 submitted in
> > less than an hour works out to something like 40/s, and I've never
> > successfully submitted jobs that quickly (though perhaps others can
> > speak otherwise to that). I assume that you've got the submission
> > locked in a tight loop in a shell script or something, and it might
> > be worth it to put a pause in between each submission. That's
> > going to suck for 140000 jobs, but it might be worth seeing if
> > you're able to get the jobs in that way.
> > Best,
> > Nate
> > ----- Start Original Message -----
> > Sent: Wed, 7 May 2008 14:53:04 -0500 (CDT)
> > From: "Jeremy Mann" <jeremy at biochem.uthscsa.edu>
> > To: torqueusers at supercluster.org
> > Subject: [torqueusers] Submission number limits?
> >> Good afternoon all, I have one user that wants to submit roughly 140,000
> >> jobs to our queue. We tried it last week and it never worked. It took
> >> nearly an hour to submit all of them, then the PBS scheduler would stop
> >> responding and give:
> >> 05/02/2008 14:39:50;0100; pbs_sched;Req;;Leaving schedule
> >> 05/02/2008 14:39:50;0080; pbs_sched;Svr;main;brk point 760373248
> >> 05/02/2008 14:39:53;0100; pbs_sched;Req;;Entering Schedule
> >> 05/02/2008 14:42:53;0002; pbs_sched;Svr;toolong;alarm call
> >> The jobs are quite small and they run for about a minute. Now we're
> >> thinking about breaking them up into 100 or 1000 job chunks.
> >> I'm curious if the number of jobs being submitted, in our case 140,000,
> >> is too large for PBS/Torque to handle.
> >> Torque 2.1.2 x86_64 and the built in scheduler (not MAUI)
> >> --
> >> Jeremy Mann
> >> jeremy at biochem.uthscsa.edu
> >> University of Texas Health Science Center
> >> Bioinformatics Core Facility
> >> http://www.bioinformatics.uthscsa.edu
> >> Phone: (210) 567-2672
> >> _______________________________________________
> >> torqueusers mailing list
> >> torqueusers at supercluster.org
> >> http://www.supercluster.org/mailman/listinfo/torqueusers
> > ----- End Original Message -----
----- End Original Message -----
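The two ideas raised in the thread (a pause between submissions, and breaking the 140,000 jobs into chunks of 100 or 1000) could be combined roughly as below. For scale, 140000 jobs in 3600 seconds is about 39 submissions per second, which is the "something like 40/s" mentioned above. This is only a sketch under stated assumptions: the names submit_throttled, chunk_size, pause_per_job and pause_per_chunk are hypothetical, the pause values are placeholders rather than tuned numbers, and it assumes qsub prints the new job id on standard output. It has not been run against Torque 2.1.2.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def qsub(script_path: str) -> str:
    """Submit one job script with qsub and return whatever it prints.

    Assumption: qsub writes the new job id to standard output.
    """
    result = subprocess.run(
        ["qsub", script_path], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def submit_throttled(
    scripts: Iterable[str],
    submit: Callable[[str], str] = qsub,
    chunk_size: int = 1000,          # hypothetical; the thread suggests 100 or 1000
    pause_per_job: float = 0.1,      # hypothetical short pause after each job
    pause_per_chunk: float = 60.0,   # hypothetical longer pause after each chunk
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit scripts one at a time, pausing so the server is not flooded."""
    job_ids = []
    for index, script in enumerate(scripts, start=1):
        job_ids.append(submit(script))
        if index % chunk_size == 0:
            # A full chunk has gone in: give the server and scheduler time to catch up.
            sleep(pause_per_chunk)
        else:
            sleep(pause_per_job)
    return job_ids
```

The submit and sleep arguments are injectable so the pacing can be checked without a live pbs_server. A real run would pass the list of job script paths and leave the defaults in place. At 0.1 s per job, 140,000 jobs would take roughly four hours of pauses alone, so the values would need adjusting to whatever rate the server actually tolerates.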