[torqueusers] Submission script: Requesting cpus rather than nodes with cpus.

Oded Ben-Ozer oded.ben-ozer at opal-systems.co.il
Mon Aug 11 08:53:04 MDT 2008


I would use the CONTIGUOUS NODEALLOCATIONPOLICY, give the MPI jobs higher
priority (via QOS or CLASS), and enable "Backfill Chunking"
<http://www.clusterresources.com/products/maui/docs/8.2backfill.shtml#config>.
Just remember to enforce realistic walltimes on all jobs so that backfill
works efficiently.
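
As a rough illustration of the settings mentioned above, a maui.cfg fragment
might look something like the sketch below. The class name "mpi" and every
numeric value are placeholders assumed for the example, not recommendations;
check the parameter semantics (in particular the units of BFCHUNKSIZE)
against the Maui documentation for your version before using any of it.

```
# maui.cfg fragment -- a sketch only; "mpi" and all numbers are assumed values

# Allocate nodes in contiguous blocks instead of by CPU load
NODEALLOCATIONPOLICY  CONTIGUOUS

# Raise the priority of jobs submitted to a (hypothetical) "mpi" class
CREDWEIGHT            1
CLASSWEIGHT           1
CLASSCFG[mpi]         PRIORITY=1000

# Enable backfill with chunking, so that freed resources are held back
# for a short window and small jobs do not immediately refill them
BACKFILLPOLICY        FIRSTFIT
BFCHUNKSIZE           4
BFCHUNKDURATION       00:05:00
```

The same priority boost could be attached to a QOS (QOSCFG with QOSWEIGHT)
instead of a class, depending on how the MPI users are already grouped.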


On Tue, Jul 29, 2008 at 12:37 AM, Rob Lines <rlinesseagate at gmail.com> wrote:

> We are running into a problem with our nodes, as we have a wide mix of users:
> some have single-cpu jobs and need to run 1000s of them at a time, while
> others have only one job that needs a larger number of cpus. The single
> jobs get scheduled, and the multi-node/cpu jobs take forever to get
> scheduled because they can't find the required number of nodes with X cpus
> available.  In our case we have 48 nodes, 45 of them with Infiniband for
> MPI, and our MPI jobs are 40 to 64 cores.  We would like to have a way to
> just ask for 40 or 64 cores.  The 64-core job fails when we ask for 64
> nodes, so the workaround has been to ask for 16 nodes with ppn=4, but we
> hardly ever end up with 16 completely empty nodes, as we have some
> single-cpu jobs that have run for a week or longer. We also had our
> NODEALLOCATIONPOLICY set to CPULOAD, but that results in single-cpu jobs
> being spread out across lots of nodes, so it takes a while before those
> nodes become free.
>
> So we are looking for a way to request cpus (cores) rather than cpus per
> machine, because the simulations could just as easily be spread out across
> all the IB nodes.  I could not find anything in the docs on how to do that.
>
> We are using Torque with the Maui scheduler.  If anyone has a suggestion
> for a good configuration for a cluster that has a wide mix of job types and
> also has some applications running outside of Torque, or can point me at
> one that would work, I would appreciate it.


-- 
Oded Ben Ozer
+972 544 825290