[torqueusers] maui + torque job start rate

Josh Butikofer josh at clusterresources.com
Wed Apr 1 07:11:46 MDT 2009


First of all, what is the average size of these jobs? Are they single-node jobs, or is there a good mix of parallel and single-node jobs? A parallel job will take a bit longer to start up because the mother superior has to contact the sisters, etc.

Yeah, Moab's ASYNCSTART option really does help. There are a few other options that can also give a speed boost. In our best tests, Moab & TORQUE can start 50 jobs/sec. I haven't tried the same benchmark with Maui. I'll look through my benchmark setup to see if there are more options/tweaks that Maui can take advantage of.

Josh Butikofer
Cluster Resources, Inc.
#############################

----- "Stijn De Weirdt" <stijn.deweirdt at ugent.be> wrote:

> hi all,
> 
> (this is a crosspost to both maui and torque users list)
> 
> we are having issues with the job start rate using maui+torque.
> starting a job takes on average 2 seconds, which is slow for what our
> users are dumping in our queues.
> 
> with a job start i mean the following cycle
> 04/01 10:01:08 MRMJobStart(374900,Msg,SC)
> 04/01 10:01:08 MPBSJobStart(374900,gengar,Msg,SC)
> 04/01 10:01:08 MPBSJobModify(374900,Resource_List,Resource,node088.gengar.gent.vsc)
> 04/01 10:01:10 MPBSJobModify(374900,Resource_List,Resource,1)
> 04/01 10:01:10 INFO:     job '374900' successfully started
> 04/01 10:01:10 INFO:     command sent to server
> 04/01 10:01:10 INFO:     response received from server
> 
> i've already tried to follow the "large cluster" tuning tips to see if
> it helps, but no real result. (the only tip that might solve the
> problem is the asyncstart option from moab ;). (we have a 200 node,
> 8 core/node cluster (i actually don't think this is "large"))
> 
> anyway, before i dig in the code looking for options, i'm wondering
> what other people are seeing as minimal start time, so i know if it is
> possible at all.
> 
> many thanks,
> 
> stijn
> -- 
> The system will shutdown in 5 minutes.
> 
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
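
A rough way to check the figures discussed above is to measure the start latency directly from the Maui log. The sketch below is Python and assumes every log line carries the same "MM/DD HH:MM:SS" prefix and the same MRMJobStart( / "successfully started" messages as the quoted excerpt. The helper names (parse_ts, start_latencies) and the default year are made up for illustration and are not part of Maui or TORQUE.

```python
import re
from datetime import datetime

# Sample input: the log excerpt quoted in the original message.
LOG = """\
04/01 10:01:08 MRMJobStart(374900,Msg,SC)
04/01 10:01:08 MPBSJobStart(374900,gengar,Msg,SC)
04/01 10:01:08 MPBSJobModify(374900,Resource_List,Resource,node088.gengar.gent.vsc)
04/01 10:01:10 MPBSJobModify(374900,Resource_List,Resource,1)
04/01 10:01:10 INFO:     job '374900' successfully started
"""

TS = r"(\d\d/\d\d \d\d:\d\d:\d\d)"
START_RE = re.compile(rf"^{TS} MRMJobStart\((\d+),")
DONE_RE = re.compile(rf"^{TS} INFO:\s+job '(\d+)' successfully started")


def parse_ts(text, year=2009):
    # The log prefix has no year, so one is supplied (assumption).
    return datetime.strptime(f"{year}/{text}", "%Y/%m/%d %H:%M:%S")


def start_latencies(lines):
    """Map job id -> seconds between MRMJobStart and 'successfully started'."""
    pending = {}
    result = {}
    for line in lines:
        m = START_RE.match(line)
        if m:
            pending[m.group(2)] = parse_ts(m.group(1))
            continue
        m = DONE_RE.match(line)
        if m and m.group(2) in pending:
            began = pending.pop(m.group(2))
            result[m.group(2)] = (parse_ts(m.group(1)) - began).total_seconds()
    return result


latencies = start_latencies(LOG.splitlines())
for job, secs in latencies.items():
    # Timestamps have one-second resolution, so a fast start can show as 0.
    rate = f"{1 / secs:.1f} jobs/sec if serialized" if secs > 0 else "under 1 s"
    print(f"job {job}: {secs:.0f} s to start ({rate})")
```

For the quoted excerpt this reports a 2 second gap for job 374900, i.e. about 0.5 jobs/sec if starts are serialized, which is the number to compare against the 50 jobs/sec figure mentioned for Moab and TORQUE. Run over a longer log, the same dictionary gives a per-job distribution instead of a single sample.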

