[torqueusers] Hardening of Opteron Scyld GigE Cluster

Joshua Bernstein jbernstein at penguincomputing.com
Thu Oct 25 15:25:53 MDT 2007


Hi Gordon, always nice to see a new customer on the list!

> I'm getting : Opteron Scyld GigE Cluster   from Penguin
> 
> Running Scyld and Torque.
> When I read about it, it says that all the messaging is done via demons
> _server to _mom, so there is no need to worry about things like rsh or scp
> being needed.
> 
> But then I see things like this, in archives:
> ****************************************
> .....> It appears to be some kind of permissions error
> .............
> My guess would have to be that PBS is trying to copy those jobs back via rcp
> or scp and that side of things hasn't been set up correctly..
> Certainly with our Torque builds I always use:
>         ./configure  --with-scp
> 
> to make sure it doesn't try and use rcp (even though rcp is just a symlink
> to scp on our boxes).

> As part of hardening, like the folks above, I get rid of things like rcp.
> And I make sure the net parameters can't do forward and redirect, along with
> many other things.

As of Scyld ClusterWare 4.1.4 (which your cluster will likely ship 
with), TORQUE is configured in the traditional way, which means that a 
pbs_mom runs on every compute node in the cluster.

For non-MPI jobs, TORQUE simply asks the pbs_mom on the compute node 
assigned to the job to fork the job and execute it on that node.
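As a rough illustration of the non-MPI case, a minimal job script might 
look like the sketch below. The job name and the resource request are 
placeholders I made up for the example, not values from your cluster. 
The fallbacks on PBS_O_WORKDIR and PBS_JOBID are only there so the 
script also runs outside of TORQUE.

```shell
#!/bin/sh
#PBS -N hello_serial        # job name (hypothetical)
#PBS -l nodes=1:ppn=1       # one processor on one node (placeholder request)
#PBS -j oe                  # merge stderr into the stdout file

# TORQUE sets PBS_O_WORKDIR to the directory qsub was run from.
# Fall back to the current directory when run by hand.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# The pbs_mom on the assigned node forks this script, so this reports
# the node the job actually landed on.
node="$(uname -n)"
echo "job ${PBS_JOBID:-interactive} running on ${node}"
```

You would submit this with qsub (for example, "qsub hello.sh" if that is 
what you named the file); no rsh is involved in starting it.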

For MPI jobs linked against the MPICH libraries that come with Scyld, 
RSH and other associated commands aren't used.

In fact, RSH is disabled by default on Scyld clusters. You only need to 
enable it for applications that absolutely depend on it.

> But I'm worried I'll break the "communication" between server and compute
> nodes
> 
> Thoughts?

Understand that users do not log in to Scyld compute nodes. Instead, 
they launch jobs either through TORQUE or with Scyld commands such 
as bpsh and beorun.
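For the bpsh route, a small wrapper along these lines shows the idea. 
It is only a sketch: the function name run_on_node is made up for the 
example, node 0 is just an assumed node number, and the wrapper bails 
out if bpsh is not on the PATH (that is, if it is not run on a Scyld 
master node).

```shell
#!/bin/sh
# run_on_node NODE COMMAND [ARGS...]
# Hypothetical helper: run COMMAND on compute node NODE through bpsh.
run_on_node() {
    node="$1"
    shift
    if command -v bpsh >/dev/null 2>&1; then
        # bpsh takes the node number followed by the command to run there.
        bpsh "$node" "$@"
    else
        echo "bpsh not found; run this on the Scyld master node" >&2
        return 127
    fi
}

# Ask node 0 for its name, without logging in to it.
run_on_node 0 uname -n || echo "skipped: bpsh unavailable or node down"
```

Nothing here goes through rsh or ssh, so removing those from the nodes 
should not affect this way of launching work.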

If you have any questions after you receive your cluster, please don't 
hesitate to contact Penguin's tech support team: support at 
penguincomputing.com

-Joshua Bernstein
Software Engineer
Penguin Computing

