[torqueusers] Using Torque as a meta-scheduler over several clusters; any advice?
mej at lbl.gov
Mon Feb 6 13:21:10 MST 2012
On Monday, 06 February 2012, at 17:56:54 (+0000),
Coyle, James J [ITACD] wrote:
> Currently, we use a different head node for each cluster we deploy.
> The head node runs the server and scheduler. I've been asked if it is possible to
> schedule several clusters from a single login node.
> From a hardware standpoint, one could use virtual machines to accomplish
> this, but the idea was to direct all the jobs from a single login node
> to make it easier for users.
> What is being described sounds a lot like a grid.
> Has anyone done this?
We have a single instance of TORQUE and Moab governing roughly 20
clusters, but you can also use Moab in a grid scenario as
master/slaves or equal peers.
Our "supercluster" clusters all share common interactive nodes,
login gateway, NFS- and Lustre-based storage, and master node.
> Can schedulers run on the "head node" of each cluster with a single pbs_server
> running on the master head node to interact with the users?
> Is the way to do this to set specific properties on the nodes from specific clusters
> (cluster1, cluster2, ...) to use MAUI and to have need_nodes set for different queues
> small_cluster1 ... ?
That's certainly one way to do it. We have one or more queues for
each cluster, and the ACLs are set up within TORQUE and Moab to
restrict access to those users/groups who own each cluster.
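
As a rough illustration of the queue-per-cluster approach, here is a
minimal sketch that prints qmgr directives for one execution queue per
cluster. It is not our actual configuration: the cluster names, the
small_<cluster> queue naming, and the group names are all hypothetical,
and it assumes each node already carries its cluster name as a property
in the server's nodes file (for example "node01 np=8 cluster1"). The
Moab-side ACLs are not covered.

```python
# Hypothetical mapping of cluster/node property -> groups that own the cluster.
CLUSTERS = {
    "cluster1": ["groupa"],
    "cluster2": ["groupb", "groupc"],
}


def queue_commands(cluster, groups):
    """Return qmgr directives for one execution queue tied to one cluster."""
    queue = f"small_{cluster}"  # hypothetical naming scheme
    cmds = [
        f"create queue {queue} queue_type=execution",
        # Steer jobs in this queue onto nodes tagged with the cluster property.
        f"set queue {queue} resources_default.neednodes = {cluster}",
        # Restrict submission to the groups that own the cluster.
        f"set queue {queue} acl_group_enable = true",
    ]
    for i, group in enumerate(groups):
        op = "=" if i == 0 else "+="  # first group sets the list, the rest append
        cmds.append(f"set queue {queue} acl_groups {op} {group}")
    cmds += [
        f"set queue {queue} enabled = true",
        f"set queue {queue} started = true",
    ]
    return cmds


def all_commands(clusters):
    """Collect the directives for every cluster, in insertion order."""
    cmds = []
    for cluster, groups in clusters.items():
        cmds.extend(queue_commands(cluster, groups))
    return cmds


if __name__ == "__main__":
    for line in all_commands(CLUSTERS):
        print(line)
```

The printed lines could be reviewed and then fed to qmgr on the
pbs_server host. Verify the attribute names against the documentation
for your TORQUE version before applying anything.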
> Has someone "rolled their own" meta-scheduler?
You're likely to spend a lot more money doing this than it would cost
for a Moab license. I don't know of any meta-scheduling packages
which currently exist, but Moab's architecture supports a wide variety
of functionality in this vein.
Michael Jennings <mej at lbl.gov>
Senior HPC Systems Engineer
High-Performance Computing Services
Lawrence Berkeley National Laboratory
Bldg 50B-3209E W: 510-495-2687
MS 050B-3209 F: 510-486-8615