[torqueusers] downing a node via qmgr

Garrick Staples garrick at usc.edu
Wed Sep 21 13:02:28 MDT 2005


On Wed, Sep 21, 2005 at 12:22:03PM -0400, Stewart.Samuels at sanofi-aventis.com alleged:
> I have just experienced strange behaviour with qmgr.  We currently
> have a node which is rebooting itself constantly.  To take the system
> out of the cluster to diagnose the problem, I issued the
> following command:
> 
> 	qmgr -c 's n node-name state=down'
> 
> For a few moments after the qmgr command is issued, subsequent
> "pbsnodes -a" commands show node-name as "down".  But for some reason,
> the node then shows as "free" again.

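As was already pointed out, use pbsnodes instead: "pbsnodes -o node-name"
marks the node offline, and "pbsnodes -c node-name" clears that state
once the node is fixed.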
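
A minimal sketch of that workflow, wrapped in two small shell helpers.
The function names (offline_node, clear_node) and the PBSNODES override
variable are made up for illustration; only the pbsnodes -o and -c
flags come from the advice above.

```shell
# Command used to talk to pbs_server. Overriding it (for example with
# PBSNODES="echo pbsnodes") previews the commands instead of running them.
PBSNODES="${PBSNODES:-pbsnodes}"

# Mark a node offline so the scheduler stops placing jobs on it.
offline_node() {
    $PBSNODES -o "$1"
}

# Clear the offline mark once the node has been repaired.
clear_node() {
    $PBSNODES -c "$1"
}

# Example (node-name is a placeholder for the real host):
#   offline_node node-name
#   ... diagnose and fix the node ...
#   clear_node node-name
```

Running "pbsnodes -a" after offline_node should show the node as
offline, and that mark should stay in place until it is cleared.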
 

> When there is such a failure (this has occurred a few times in our
> cluster), is there a way (other than qmgr) to temporarily remove a
> node without deleting it from the server_priv/nodes file and
> restarting pbs_server with the "-t create" argument?

"-t create" is used only once, when pbs_server is first installed, to
initialize the server database.  It should never be used once that
database exists.

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California