[Mauiusers] Cluster Size Detection

Chris Samuel csamuel at vpac.org
Sat Feb 23 20:56:55 MST 2008


----- "Jim Kusznir" <jkusznir at gmail.com> wrote:

> When I first brought up my cluster of 24 8-core systems, maui
> mis-detected the size of the system and rejected jobs that requested
> more than a certain number of cores.

This is the age-old problem of nodes=n being used to specify the
number of cpus and then being checked against the number of nodes.
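
A rough Python sketch of that mismatch, using the 24 nodes and 8 cores
per node from Jim's description. The function names are made up for
illustration; this is a toy model, not the actual Torque or Maui logic.

```python
# Hypothetical model of the check described above; not Torque or Maui source.
NODES = 24           # physical nodes in the cluster
CORES_PER_NODE = 8   # cores on each node


def admit_by_node_count(n, node_limit=NODES):
    """Accept a nodes=n request only if n does not exceed the node limit."""
    return n <= node_limit


def admit_by_core_count(n, nodes=NODES, cores_per_node=CORES_PER_NODE):
    """Accept the same request when n is read as a number of cpus."""
    return n <= nodes * cores_per_node


# A user asks for nodes=32, meaning 32 cpus out of the 192 available.
print(admit_by_node_count(32))   # False: 32 is more than 24 nodes
print(admit_by_core_count(32))   # True: 32 is well under 192 cores
```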

> I had to change a config file to let users use the entire cluster.

Would that be nodect?

> That is working; however, it also lets users submit
> jobs larger than the entire cluster, so they just
> sit in the queue until the user finds me and I
> explain where they went wrong.

Yup, that's what happens - I've long argued that Torque
and Maui/Moab should have a new parameter (perhaps cpus=n)
to disambiguate the situation.
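
As a hedged illustration only: assume the node limit was raised to the
total core count (192) and that only the nodes=n value is checked. The
cpus=n check below is the suggested parameter, which does not exist;
all names here are hypothetical.

```python
# Hypothetical model; names and limits are assumptions, not Torque/Maui code.
NODES = 24
CORES_PER_NODE = 8
TOTAL_CORES = NODES * CORES_PER_NODE  # 192


def admit_with_raised_limit(nodes, ppn=1, node_limit=TOTAL_CORES):
    """Check only the nodes=n value against a limit raised to the core count.

    ppn is deliberately ignored, which is the weakness being illustrated.
    """
    return nodes <= node_limit


def admit_with_cpus(cpus, total_cores=TOTAL_CORES):
    """Check an unambiguous cpu count (the suggested cpus=n) against real cores."""
    return cpus <= total_cores


# nodes=100:ppn=8 asks for 800 cores, yet passes the raised node limit.
print(admit_with_raised_limit(100, ppn=8))  # True
# The same job expressed as a cpu count is rejected up front.
print(admit_with_cpus(100 * 8))             # False
# A job that exactly fills the cluster is still accepted.
print(admit_with_cpus(192))                 # True
```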

> It would be preferred if I could re-enable the admission control, but
> have it accurately represent the size of the cluster.  Is this
> possible?

I don't believe so. :-(

> I think this is a maui problem, but if it's actually a torque problem,
> please let me know and I'll ask on the torque list...I'm not yet
> familiar enough with where the dividing line is....

It's more of a Torque issue; I've CC'd this to the
Torque Users & Development lists in case there's
something that I've missed (or something on the
cards).

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

