[torqueusers] getting torque/ pbs to reboot a node periodically.

Bogdan Costescu Bogdan.Costescu at iwr.uni-heidelberg.de
Tue Dec 9 13:20:03 MST 2008


> First, you need to drain the nodes by marking them offline.  Then 
> you need to mark them for reboot using the node note.  Then a script 
> can reboot nodes when it finds them offline, without a job, and 
> marked for reboot.

I've recently done something similar (rebooting a node once whatever jobs 
are running on it have finished) using pbs_python in only a few lines of 
Python code. There is no extra script looking for the node note. The 
Python script polls the state of the node until it is only "offline", 
does whatever is needed to reboot the node, and clears the "offline" 
state as soon as the node goes into state "down".
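
For illustration, here is a rough sketch of that polling loop. It is not 
my actual script: rather than guess at pbs_python calls, it shells out to 
the pbsnodes command ("pbsnodes -o" to set offline, "pbsnodes -c" to 
clear it) and assumes the usual "state = ..." line in the output of 
"pbsnodes <node>". The function names, the 30 second interval and the 
reboot callable are all made up for the example.

```python
import subprocess
import time


def parse_states(pbsnodes_output):
    """Return the set of states from the 'state = a,b' line of pbsnodes output."""
    for line in pbsnodes_output.splitlines():
        key, sep, value = line.partition("=")
        # Split on the first '=' only, so 'status = ...,state=free' is ignored.
        if sep and key.strip() == "state":
            return {s.strip() for s in value.split(",")}
    return set()


def node_states(node):
    """Ask the server for the node's current states (assumes pbsnodes is in PATH)."""
    result = subprocess.run(["pbsnodes", node], capture_output=True,
                            text=True, check=True)
    return parse_states(result.stdout)


def wait_for(node, predicate, get_states=node_states, interval=30,
             sleep=time.sleep):
    """Poll the node until predicate(states) is true, then return the states."""
    while True:
        states = get_states(node)
        if predicate(states):
            return states
        sleep(interval)


def drain_and_reboot(node, reboot, run=subprocess.check_call,
                     get_states=node_states, interval=30, sleep=time.sleep):
    """Drain a node, reboot it once idle, and put it back into service.

    'reboot' is a hypothetical callable supplied by the caller; how the
    node is actually rebooted (ssh, IPMI, ...) is site specific.
    """
    # Drain: no new jobs get scheduled on an offline node.
    run(["pbsnodes", "-o", node])
    # Running jobs show up as extra states (e.g. job-exclusive); wait
    # until "offline" is the only state left.
    wait_for(node, lambda s: s == {"offline"}, get_states, interval, sleep)
    reboot(node)
    # Once the server reports the node as down, the reboot is under way
    # and the offline mark can be cleared.
    wait_for(node, lambda s: "down" in s, get_states, interval, sleep)
    run(["pbsnodes", "-c", node])
```

It would be called as something like drain_and_reboot("node01", 
reboot=my_reboot), where my_reboot is whatever your site uses to restart 
a machine. Clearing "offline" while the node is still "down" means it 
starts taking jobs again as soon as pbs_mom reports back after the 
reboot.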

-- 
Bogdan Costescu

IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
Phone: +49 6221 54 8240, Fax: +49 6221 54 8850
E-mail: bogdan.costescu at iwr.uni-heidelberg.de

