[torquedev] Re: [Mauiusers] Cluster Size Detection
csamuel at vpac.org
Sat Feb 23 20:56:55 MST 2008
----- "Jim Kusznir" <jkusznir at gmail.com> wrote:
> When I first brought up my cluster of 24 8-core systems, maui
> mis-detected the size of the system and rejected jobs that requested
> more than a certain number of cores.
This is the age-old problem of nodes=n being used to specify
the number of cpus and then being checked against the number of nodes.
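To make the mismatch concrete, here is a toy sketch in Python of the kind
of check I mean. It is not Torque or Maui code; the function name is made
up for illustration, and the cluster shape (24 nodes of 8 cores) is the
one from Jim's message.

```python
# Illustrative only: a toy model of the admission check, not Torque/Maui source.
NODES = 24           # nodes in the cluster (from the original report)
CORES_PER_NODE = 8   # cores per node (from the original report)


def naive_admit(requested_n: int, node_count: int = NODES) -> bool:
    """Treat nodes=n as a node count and reject anything above node_count."""
    return requested_n <= node_count


# A user who means "64 cpus" writes nodes=64 and is rejected, even though
# the cluster has 24 * 8 = 192 cores in total.
total_cores = NODES * CORES_PER_NODE
print(total_cores)       # 192
print(naive_admit(64))   # False
print(naive_admit(24))   # True
```

The request is expressed in one unit (cpus) and checked in another
(nodes), which is why a 64-cpu job is refused on a 192-core cluster.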
> I had to change a config file to let users use the entire cluster.
Would that be nodect?
> That is working; however, it also lets users submit
> jobs larger than the entire cluster, so they just
> sit in the queue until the user finds me and I
> explain where they went wrong.
Yup, that's what happens - I've long argued that Torque
and Maui/Moab should have a new parameter (perhaps cpus=n)
to disambiguate the situation.
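As a rough sketch of what I have in mind, assuming a separate cpus=n
request existed (it is only a suggestion here, not a parameter I know to
exist in Torque or Maui), each request could be checked against the limit
that matches its unit. The function and constants below are hypothetical
and use Jim's 24 x 8 cluster as the example.

```python
# Hypothetical sketch: cpus=n is the proposed parameter, not an existing one.
from typing import Optional

NODES = 24           # nodes in the cluster (from the original report)
CORES_PER_NODE = 8   # cores per node (from the original report)


def admit(nodes: Optional[int] = None, cpus: Optional[int] = None) -> bool:
    """Check a node request against the node count and a cpu request
    against the total core count, so the two are never confused."""
    if nodes is not None and nodes > NODES:
        return False
    if cpus is not None and cpus > NODES * CORES_PER_NODE:
        return False
    return True


print(admit(cpus=192))   # True: exactly the whole cluster
print(admit(cpus=193))   # False: larger than the cluster
print(admit(nodes=25))   # False: more nodes than exist
```

With the units kept apart, admission control could stay enabled and still
reflect the real size of the cluster.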
> It would be preferred if I could re-enable the admission control, but
> have it accurately represent the size of the cluster. Is this
I don't believe so. :-(
> I think this is a maui problem, but if it's actually a torque problem,
> please let me know and I'll ask on the torque list...I'm not yet
> familiar enough with where the dividing line is....
It's more of a Torque issue; I've CC'd this to the
Torque Users & Development lists in case there's
something that I've missed (or something on the
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency