[torqueusers] Submission number limits?

Nate Woody Nate.A.Woody at runbox.com
Thu May 8 09:41:34 MDT 2008


Brock,

Out of curiosity, do you submit through Moab (msub) or Torque (qsub)?  

Best,
Nate


----- Start Original Message -----
Sent: Thu, 8 May 2008 11:11:28 -0400
From: Brock Palen <brockp at umich.edu>
To: "Nate Woody" <Nate.A.Woody at runbox.com>
Subject: Re: [torqueusers] Submission number limits?

> What is the true limit in such a situation?  We have some users who
> submit a few thousand jobs (nowhere near 140k) and we don't notice much
> of an issue (we use Moab).
> Is it disk-bound on the server?  Could it be sped up with a 15k RPM
> drive for /var/spool/torque/server_priv/jobs/ ?
> 
> What about solid state drives?  Our jobs directory has never been
> over a gig, I think.  With 609 jobs right now (very low for us) it's
> 8.5MB.
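
A rough back-of-the-envelope sketch of the spool-size question above, using only the two figures quoted (609 jobs, 8.5MB). It assumes the directory grows linearly with job count and that "MB" means 10^6 bytes; neither assumption is confirmed in the thread, and the function name is purely illustrative.

```python
def estimate_spool_bytes(observed_bytes: float, observed_jobs: int, target_jobs: int) -> float:
    """Scale an observed jobs-directory size linearly to a larger job count.

    Linear scaling is an assumption; real per-job file sizes vary with the
    job script and its attributes.
    """
    per_job = observed_bytes / observed_jobs
    return per_job * target_jobs


# Figures from the message: 609 jobs occupy 8.5 MB (taken here as 8.5e6 bytes).
per_job_kb = 8.5e6 / 609 / 1e3
projected_gb = estimate_spool_bytes(8.5e6, 609, 140_000) / 1e9

print(f"about {per_job_kb:.1f} kB per job")
print(f"about {projected_gb:.2f} GB for 140,000 jobs")
```

Under those assumptions the directory would land near 2 GB for 140,000 jobs, so raw capacity is unlikely to be the constraint; the number of small files and writes is the more plausible cost, though the thread does not measure it.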
> 
> Or is the problem in Torque?
> 
> Brock Palen
> www.umich.edu/~brockp
> Center for Advanced Computing
> brockp at umich.edu
> (734)936-1985
> 
> 
> 
> On May 7, 2008, at 4:15 PM, Nate Woody wrote:
> > Jeremy,
> >
> > I've done things like this, though I've always used Maui.
> > One thing I have had problems with is the rate of job submission.
> > Getting all 140000 submitted in less than an hour works out to
> > something like 40/s, and I've never successfully submitted jobs
> > that quickly (though perhaps others can speak otherwise to that).
> > I assume that you've got the submission locked in a tight loop in
> > a shell script or something, and it might be worth it to put a
> > pause in between each submission.  That's going to suck for 140000
> > jobs, but it might be worth seeing if you're able to get the jobs
> > in that way.
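
A minimal sketch of the pause-between-submissions idea above, written in Python rather than shell. The 0.5 s default pause, the function names, and the injectable `submit`/`sleep` hooks are illustrative assumptions, not anything the thread specifies; the only external command used is plain `qsub <script>`, which prints the job id on stdout.

```python
import subprocess
import time
from typing import Callable, Iterable, List, Optional


def required_rate(n_jobs: int, seconds: float) -> float:
    """Average submissions per second needed to queue n_jobs in `seconds`."""
    return n_jobs / seconds


def qsub(script: str) -> str:
    """Run `qsub <script>` and return the job id it prints on stdout."""
    result = subprocess.run(
        ["qsub", script], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def submit_throttled(
    scripts: Iterable[str],
    pause_s: float = 0.5,  # assumed value; tune for your server
    submit: Optional[Callable[[str], str]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit scripts one at a time, sleeping pause_s between submissions."""
    if submit is None:
        submit = qsub
    job_ids: List[str] = []
    for i, script in enumerate(scripts):
        if i > 0:
            sleep(pause_s)  # pause before every submission except the first
        job_ids.append(submit(script))
    return job_ids


# 140000 jobs in one hour is just under 39 submissions per second.
print(f"{required_rate(140_000, 3600):.1f} jobs/s")
```

With a 0.5 s pause, the sleeps alone add up to roughly 19.4 hours for 140000 jobs, which is the cost alluded to above. Passing a different `submit` callable lets the loop be dry-run without touching the queue.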
> >
> > Best,
> > Nate
> >
> >
> > ----- Start Original Message -----
> > Sent: Wed, 7 May 2008 14:53:04 -0500 (CDT)
> > From: "Jeremy Mann" <jeremy at biochem.uthscsa.edu>
> > To: torqueusers at supercluster.org
> > Subject: [torqueusers] Submission number limits?
> >
> >> Good afternoon all, I have one user that wants to submit roughly
> >> 140,000 jobs to our queue. We tried it last week and it never worked.
> >> It took nearly an hour to submit all of them, then the PBS scheduler
> >> would stop responding and give:
> >>
> >> 05/02/2008 14:39:50;0100; pbs_sched;Req;;Leaving schedule
> >>
> >> 05/02/2008 14:39:50;0080; pbs_sched;Svr;main;brk point 760373248
> >> 05/02/2008 14:39:53;0100; pbs_sched;Req;;Entering Schedule
> >> 05/02/2008 14:42:53;0002; pbs_sched;Svr;toolong;alarm call
> >>
> >> The jobs are quite small and they run for about a minute. Now we're
> >> thinking about breaking them up into 100 or 1000 job chunks.
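
One way the chunking idea above might look, as a sketch only. The thread does not say whether each chunk would be submitted as a separate wave or bundled into a single job that runs its tasks in sequence, so this shows just the splitting step; the task names are hypothetical placeholders.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical task names; the thread does not say what the jobs are called.
tasks = [f"task_{i:06d}" for i in range(140_000)]

chunks_of_1000 = list(chunked(tasks, 1000))
chunks_of_100 = list(chunked(tasks, 100))

print(len(chunks_of_1000), "chunks of 1000")
print(len(chunks_of_100), "chunks of 100")
```

If each chunk were bundled into one job and the tasks really take about a minute each, a 1000-task chunk would run for roughly 1000 minutes (about 16.7 hours) and a 100-task chunk for about 100 minutes, which may matter for walltime limits.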
> >>
> >> I'm curious if the number of jobs being submitted, in our case
> >> 140,000, is too large for PBS/Torque to handle.
> >>
> >> Torque 2.1.2 x86_64 with the built-in scheduler (not Maui)
> >>
> >> -- 
> >> Jeremy Mann
> >> jeremy at biochem.uthscsa.edu
> >>
> >> University of Texas Health Science Center
> >> Bioinformatics Core Facility
> >> http://www.bioinformatics.uthscsa.edu
> >> Phone: (210) 567-2672
> >>
> >> _______________________________________________
> >> torqueusers mailing list
> >> torqueusers at supercluster.org
> >> http://www.supercluster.org/mailman/listinfo/torqueusers
> >>
> >
> > ----- End Original Message -----
> >
> >
> 
> 

----- End Original Message -----

