[torqueusers] Non-cumulative pbsnodes -o command

John Wang jwang at dataseekonline.com
Thu Feb 14 11:02:17 MST 2008


I have two clusters, one running torque 2.0.0p7 and the other torque
2.2.1.   Both versions exhibit the non-cumulative "pbsnodes -o" behaviour,
requiring that all nodes known to be down be specified on the command
line.   The 2.0.0p7 man page for pbsnodes actually states that this is the
intended behaviour.   The man page reads as follows:

> "It is important that all the nodes known to be down are given as arguments on
> the command line.   This is because nodes which are not listed are assumed to
> be UP and will be indicated as such if they were previously marked DOWN."

The 2.0.0p7 installation is a legacy install that predates the current
support staff so the history of that installation is unknown but the 2.2.1
installation was compiled from source files obtained from Cluster Resources
with default options.
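One way to live with this behaviour is to query the current offline set and feed it back into every new "pbsnodes -o" call. The helper below is a hypothetical sketch, not part of TORQUE; it assumes "pbsnodes -l" prints one "<node> <state>" pair per line, as TORQUE 2.x does.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with TORQUE): read "pbsnodes -l"
# output on stdin and print the names of nodes whose state contains
# "offline", so they can be re-listed on the next "pbsnodes -o" call.
extract_offline() {
    awk '$2 ~ /offline/ { print $1 }'
}

# Intended usage (sketch; marks newnode1/newnode2 offline while
# preserving the nodes that are already offline):
#   pbsnodes -o $(pbsnodes -l | extract_offline) newnode1 newnode2
```

This only re-asserts the offline flag; it does not preserve any per-node note set with -N, which would have to be tracked separately.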


On 2/13/08 4:51 PM, "Garrick Staples" <garrick at usc.edu> wrote:

> On Wed, Feb 13, 2008 at 04:41:10PM -0600, John Wang alleged:
>> This is probably a fairly basic beef about Torque but it has been bugging
>> the hell out of me.
>> When using the command "pbsnodes -o -N 'message' node1 node2 ..." to drain
>> nodes, we have to list all the nodes that we want to stay drained in that
>> single command, i.e. even if a node was previously drained for another
>> reason, issuing a "pbsnodes -o" command without specifying the previously
>> drained nodes results in those nodes being marked online instead of offline.
>> This is truly ridiculous; it's like playing a game of bonk the gopher.   It
>> is conceivable to have more nodes that you would want drained than can be
>> specified on a single line, and there should be no reason for us to have to
>> independently track what nodes should be offline.
>> As near as I can tell, most people avoid this with creative workarounds such
>> as shutting down the pbs_mom daemon on the nodes to be offline or by
>> creating very specific reservations.   I'd imagine that there may be more
>> such creative workarounds spurring more diversity in operational practices.
>> So the question is, how do you work around this ridiculous behaviour at your
>> site and is there any valid technical reason for pbsnodes to work in this
>> fashion?
> I have never seen or heard of these behaviours that you are describing.  In
> fact, I rarely ever specify more than one node at time.
> What version of torque are you using?
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers

