[torqueusers] taking node offline w/o killing running job

Garrick Staples garrick at usc.edu
Mon Jan 9 12:27:28 MST 2006

On Mon, Jan 09, 2006 at 02:02:59PM -0500, Caird, Andrew J alleged:
> > Use 'pbsnodes -c nodename', all it does is clears the offline bit.
> This sets the node state to free:
> qmgr -c 'p n mor153'
> set node mor153 state = free
> set node mor153 properties = myrinet
> set node mor153 ntype = cluster
> set node mor153 status = opsys=linux
> set node mor153 status += uname=Linux mor153 2.6.9-22.0.1.ELsmp #1 SMP Tue Oct 18 18:39:27 EDT 2005 i686
> set node mor153 status += sessions=20069
> set node mor153 status += nsessions=1
> set node mor153 status += nusers=1
> set node mor153 status += idletime=2081510
> set node mor153 status += totmem=1554284kb
> set node mor153 status += availmem=1442688kb
> set node mor153 status += physmem=1554284kb
> set node mor153 status += ncpus=2
> set node mor153 status += loadave=2.00
> set node mor153 status += netload=918193721
> set node mor153 status += state=free
> set node mor153 status += jobs=6068.morpheus.engin.umich.edu
> set node mor153 status += rectime=1136833218
> where it is really in use:
> # qstat -an1 | grep mor153
> 6068.mor  xxxxxxx  yyyyyy zzzzzz --   1  --  --  120:0 R 71:01 mor153/1+mor153/0
> Before setting it offline ("qmgr -c 's n mor153 state=offline'") and
> then running "pbsnodes -c mor153" it was "state = busy", which I would
> prefer.

"free" is the absence of all state bits.  "in use" != "busy".

Just use pbsnodes -o/-c.  Don't use qmgr and you won't overwrite the
other state bits.

When you used qmgr to manually set the node's state to offline, you
overwrote all state bits, wiping out the busy.  Then 'pbsnodes -c'
removed the offline bit, leaving it with 0 state bits.
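
To make the difference concrete, here is a toy model of the state bits
in plain shell arithmetic.  It is only a sketch: the bit values (1 for
offline, 2 for busy) and the state_names helper are invented for
illustration and are not what pbs_server uses internally.

```shell
# Toy model of node state bits. The numeric values are assumptions
# made for this example, not the real pbs_server constants.
OFFLINE=1
BUSY=2

# Print the names of the bits set in $1, or "free" when none are set.
state_names() {
  names=""
  if [ $(( $1 & $OFFLINE )) -ne 0 ]; then names="offline"; fi
  if [ $(( $1 & $BUSY )) -ne 0 ]; then names="${names:+$names,}busy"; fi
  echo "${names:-free}"
}

state=$BUSY    # the node starts out busy

# qmgr 'state = offline': plain assignment replaces every bit.
assigned=$OFFLINE
# 'pbsnodes -c' then removes the offline bit, leaving no bits at all.
after_assign=$(( $assigned & ~$OFFLINE ))

# 'pbsnodes -o' (state += offline): only the offline bit is added.
incremented=$(( $state | $OFFLINE ))
# 'pbsnodes -c' (state -= offline): only the offline bit is removed.
after_incr=$(( $incremented & ~$OFFLINE ))

echo "assign then clear:    $(state_names "$after_assign")"
echo "increment then clear: $(state_names "$after_incr")"
```

The first line comes out as "free" and the second as "busy", which is
the behaviour described above.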

Is the node reporting itself as busy?  Look for the state inside of the
status in 'pbsnodes -a nodename'.  If so, then just wait a minute and
the server will pick it up again.
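
A rough way to pull that value out with standard tools, as a sketch:
the reported_state function is a made-up helper, and the sample text
assumes the status line is a comma-separated list of key=value pairs
(the values are borrowed from the qmgr output quoted above).

```shell
# Print the state= value embedded in the "status = ..." line of
# pbsnodes-style output read from stdin. Hypothetical helper.
reported_state() {
  sed -n 's/^ *status = //p' | tr ',' '\n' | sed -n 's/^state=//p'
}

# Sample input; the layout is an assumption about what pbsnodes prints.
sample='mor153
     state = free
     properties = myrinet
     ntype = cluster
     status = opsys=linux,ncpus=2,loadave=2.00,state=free,rectime=1136833218'

printf '%s\n' "$sample" | reported_state
```

On a live server the same function could be fed from
'pbsnodes -a mor153 | reported_state'.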

Really, just use pbsnodes for this:
  pbsnodes -o mor153
  pbsnodes -c mor153

If you must use qmgr, use the INCR/DECR operators, += and -= (this is
what 'pbsnodes -o/-c' does internally):
  s n mor153 state+=offline
  s n mor153 state-=offline

Garrick Staples, Linux/HPCC Administrator
University of Southern California