[torqueusers] downing a node via qmgr
Garrick Staples
garrick at usc.edu
Wed Sep 21 13:02:28 MDT 2005
On Wed, Sep 21, 2005 at 12:22:03PM -0400, Stewart.Samuels at sanofi-aventis.com alleged:
> I have just experienced strange behaviour with qmgr. We currently
> have a node which is rebooting itself constantly. To take the system
> out of the cluster to diagnose the problem, I have specified the
> following command:
>
> qmgr -c 's n node-name state=down'
>
> For a few moments, once the qmgr command is issued, subsequent
> "pbsnodes -a" commands show node-name "down". But for some reason, it
> then shows the node as "free" again.
As was already pointed out, use pbsnodes -o to mark the node offline
and pbsnodes -c to clear the offline mark once it is fixed.
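A minimal sketch of that suggestion, assuming pbsnodes is on the PATH and
is run with operator or manager privileges. The function names and the
PBSNODES variable are hypothetical, introduced only for illustration; the
-o and -c flags are the ones mentioned above. Unlike a manually set
state=down, the offline mark is meant to stay until an administrator
clears it:

```shell
# PBSNODES is a hypothetical override so a different path (or a stub
# for a dry run) can be substituted; it defaults to plain pbsnodes.
PBSNODES="${PBSNODES:-pbsnodes}"

# Mark a node offline so no new jobs are scheduled on it.
offline_node() {
    "$PBSNODES" -o "$1"
}

# Clear the offline mark once the node has been repaired.
online_node() {
    "$PBSNODES" -c "$1"
}
```

For example, run "offline_node node-name" before diagnosing the machine
and "online_node node-name" afterwards; "pbsnodes -l" lists the nodes
that are currently down or offline.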
> When there is such a failure (this has occurred a few times in our
> cluster), is there a way (other than qmgr) of temporarily removing
> nodes that does not require deleting the node from the
> server_priv/nodes file and restarting pbs_server with the
> "-t create" argument?
"-t create" is only used once, when pbs_server is initially installed.
It should never be used once the server database has been created,
because it reinitializes that database and discards the existing
configuration.
--
Garrick Staples, Linux/HPCC Administrator
University of Southern California