[torqueusers] Modifying Torque to allow nodes to be turned off

Joshua Bernstein jbernstein at penguincomputing.com
Thu Apr 3 17:06:25 MDT 2008


Hey Jeff,

	Good to catch you on the list instead of on the phone ;-)

> Good afternoon,
> 
> Myself, as well as many others, have been thinking about how to
> modify job schedulers to allow nodes to be turned off when they
> haven't been used for a while but still have them as available
> resources. Since I'm more familiar with PBS than anything else,
> I thought I would run this by the list to get some reaction and
> perhaps some help.
>... 
> So with that said, does this look to be a fairly easy mod that can be
> made to torque? Do you think it's something that should be done?

This is actually something I've been playing with already. In fact at 
SuperComputing this past November I was demoing something that shut down 
the nodes when they weren't in use.

The trick, of course, is to still allow jobs to be accepted by the 
scheduler. If a node is otherwise marked down, then a job wouldn't be 
accepted. What I did as a quick hack was to use a qmgr variable to 
tell TORQUE that I had more processors than pbs_mom reported, so that 
jobs are still accepted even when the nodes are down. 
This is done via:

set server resources_available.nodect = ???
set queue batch resources_available.nodect = ???
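
For anyone who wants to script this, here is a rough Python sketch that 
builds the two qmgr invocations above. The function name and the 
node_count parameter are my own placeholders, not part of TORQUE; the 
value to use depends on your cluster. It only assumes that qmgr accepts a 
single directive via its -c option.

```python
def nodect_commands(node_count, queue="batch"):
    """Build argv lists for the two qmgr directives shown above.

    node_count is a caller-supplied placeholder for the '???' value;
    queue defaults to 'batch', the queue name used in the example.
    """
    if node_count < 1:
        raise ValueError("node_count must be a positive integer")

    directives = [
        f"set server resources_available.nodect = {node_count}",
        f"set queue {queue} resources_available.nodect = {node_count}",
    ]
    # One qmgr call per directive; nothing is executed here.
    return [["qmgr", "-c", directive] for directive in directives]
```

Each returned list can be handed to subprocess.run() on the pbs_server 
host by a user with manager rights.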

I've written a secondary daemon that monitors the TORQUE queue and 
powers nodes up and down based on thresholds set by the user. The demo 
also included the nodes being connected to a monitored PDU so we could 
show a power savings versus workload. If you'd like more information or 
more detail about any of this, I'd be happy to share.
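
To give a feel for the kind of decision such a daemon makes, here is a 
minimal Python sketch of threshold-based power planning. It is not the 
daemon described above. The node states, the idle_seconds and min_powered 
thresholds, and the one-node-per-queued-job simplification are all 
assumptions made for illustration, and the actual queue polling and power 
control (IPMI, PDU, etc.) are left out.

```python
def plan_power_actions(queued_jobs, nodes, now, idle_seconds=600.0, min_powered=1):
    """Decide which nodes to power on or off.

    nodes maps a node name to a dict with:
      'state'      : 'busy', 'idle', or 'off' (hypothetical labels)
      'idle_since' : timestamp the node went idle (used for idle nodes only)
    Assumes, for simplicity, that each queued job needs one node.
    """
    idle = sorted(name for name, info in nodes.items() if info["state"] == "idle")
    off = sorted(name for name, info in nodes.items() if info["state"] == "off")
    busy_count = sum(1 for info in nodes.values() if info["state"] == "busy")

    if queued_jobs > 0:
        # Work is waiting: wake only as many nodes as idle ones cannot cover.
        shortfall = max(0, queued_jobs - len(idle))
        return {"power_on": off[:shortfall], "power_off": []}

    # Nothing queued: shut down nodes idle past the threshold,
    # but never drop below min_powered powered-on nodes.
    powered = busy_count + len(idle)
    spare = max(0, powered - min_powered)
    expired = [name for name in idle
               if now - nodes[name]["idle_since"] >= idle_seconds]
    return {"power_on": [], "power_off": expired[:spare]}
```

A real daemon would call something like this on a timer, feeding it the 
queue depth and node states, and then act on the returned lists.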

-Joshua Bernstein
Software Engineer
Penguin Computing


