[torqueusers] Node np parameter adjusted automatically since
widyono at seas.upenn.edu
Wed Jun 21 09:24:55 MDT 2006
> We're trying to get to an easier initial configuration.
I understand and agree with your reasoning; I'm just trying to clean up the
file handling implementation (as I understand it from this conversation).
> For the first time setup, I don't know if we can "assume" a proper setup.
But for a first time setup there wouldn't be a nodes file so there is no need
to alter anything. I don't have a problem with this logic:
  no nodes file exists? Set it up for them, using your algorithm for
    determining np.
  nodes file exists? Make a nodes.suggested file, OR move the nodes file to
    nodes.previous and make a new nodes file.
Changing an existing manually-created nodes file in situ is not kosher from a
sysadmin perspective (mine). At the least, make a backup if a nodes file
already exists. How to determine manually-created vs. auto? Easy: have your
"create/modify the nodes file" step add a header (does pbs_server allow
comments in the nodes file? If not, supporting them might be handy for the
sysadmin/queue mgr).
Other than that, I'm completely fine with whatever algorithm you end up using
to calculate np.
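
The file handling logic above could be sketched roughly as follows. This is
Python for brevity, not pbs_server code; the function name, the header text and
the ".previous" suffix handling are illustrative assumptions, and it assumes
comment lines are acceptable in the nodes file, which is exactly the open
question raised above.

```python
import os
import shutil

# Hypothetical marker line. Whether pbs_server tolerates comment lines in the
# nodes file is an assumption here, not something confirmed in this thread.
AUTO_HEADER = "# auto-generated nodes file"


def write_nodes_file(path, entries):
    """Write a nodes file without silently clobbering a hand-made one.

    entries maps hostname -> np. Returns which case was taken:
    "created", "regenerated" or "backed-up".
    """
    body = "".join(f"{host} np={np}\n" for host, np in sorted(entries.items()))

    if not os.path.exists(path):
        # First-time setup: nothing to protect, just create the file.
        action = "created"
    else:
        with open(path) as f:
            first_line = f.readline().rstrip("\n")
        if first_line == AUTO_HEADER:
            # We wrote this file earlier, so rewriting it is fair game.
            action = "regenerated"
        else:
            # Manually created: never edit in situ, keep the admin's copy.
            shutil.move(path, path + ".previous")
            action = "backed-up"

    with open(path, "w") as f:
        f.write(AUTO_HEADER + "\n" + body)
    return action
```

The alternative mentioned above (writing nodes.suggested and leaving the
existing file alone) would only change the "backed-up" branch.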
> That particular
> config seems redundant to me anyways. If MOM already reads the number of
> CPUs, and advertises it to pbs_server, why shouldn't the config be automatic?
I agree, if no manual configuration exists, then pbs_server should know what
to do given the provided resource information. That's what we did in
Clubmask; new node comes up? just push it into the database.
> I'm thinking along the lines of a tri-state value:
> Unset means "set np=ncpus if (np==1 && ncpus>np)"
> True is more strict with "set np=ncpus if (np!=ncpus)"
> False completely disables the feature.
> The default would be unset.
> Does that sound reasonable?
Sure. Again, I'm talking about the logic dealing with file handling, not the
values contained therein, and I'm only concerned with previously existing
nodes file configuration being altered without the user's consent, or worse,
without their knowledge.
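
For reference, the tri-state quoted above could be expressed as the sketch
below. It is only an illustration in Python; the parameter name auto_np and the
use of None for "unset" are assumptions, not an existing pbs_server setting.

```python
from typing import Optional


def adjust_np(np: int, ncpus: int, auto_np: Optional[bool] = None) -> int:
    """Return the np value to use for a node.

    auto_np is a hypothetical tri-state setting:
      None  (unset): set np=ncpus only if np == 1 and ncpus > np
      True         : set np=ncpus whenever np != ncpus
      False        : never touch np
    """
    if auto_np is False:
        return np
    if auto_np is True:
        return ncpus if np != ncpus else np
    # Unset: only override what looks like an untouched default of np=1.
    return ncpus if (np == 1 and ncpus > np) else np
```

With the default (unset), a node listed with np=2 that reports 4 CPUs keeps
np=2, so an explicit choice by the admin is left alone.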
If you agree with my logic surrounding file handling, I could take a stab at
coding the patch -- but no guarantees on the quality. I'm a sysadmin, not a
doctor! I mean, programmer. Not a programmer.
On another tack, if we don't want pbs_server in the role of file management,
then let's not use a file at all.
Finally, if the file is only intended for caching purposes at pbs_server
startup (as opposed to required initial configuration by the sysadmin), then
that should be clearly documented (and put in the CHANGES as a new
purpose/way of thinking about the nodes file).