[torqueusers] VNC service on cluster nodes with torque
Gareth.Williams at csiro.au
Tue Jun 18 05:28:15 MDT 2013
> -----Original Message-----
> From: Chris Hunter [mailto:chris.hunter at yale.edu]
> Sent: Saturday, 15 June 2013 2:43 AM
> To: torqueusers at supercluster.org
> Subject: [torqueusers] VNC service on cluster nodes with torque
> I am working on a method to provide easy remote VNC access to cluster
> nodes using regular torque "qsub" job submission. The objective is to
> allow cluster users to run GUI based software (eg. matlab, visit,
> paraview, etc.) on cluster nodes via remote access. We have run into
> several practical difficulties not directly related to torque.
> For example, users submit jobs on a public-facing server but the VNC
> service starts on a cluster node on a private network. We need to
> create a network path between the remote user and the cluster node,
> using the public-facing server as the intermediate bridge. We have
> found no good method to automate this as part of the job submission. We
> tested various port forwarding schemes that require manual intervention
> but nothing that is fully automated.
> Another issue is VNC traffic is unencrypted. We would like to encrypt
> traffic (ie. using a SSH tunnel) to the remote user. However,
> installing and configuring the required VNC & SSH client software on a
> remote PC is a support nightmare (eg. supporting unmanaged windows, mac
> & linux desktop & laptops).
> Does anyone manage a cluster where users submit jobs to use VNC?
> Are you able to fully automate the VNC setup and connection?
> Any advice on avoiding common pitfalls? Is a fully automated VNC
> service unrealistic?
> chris hunter
> chris.hunter at yale.edu
Our site does a couple of different things in this space.
- We allow VNC sessions on the cluster head/login node and have many concurrent sessions, with a custom system for limiting the resource usage of individual processes. We have a helper script that sets up, starts, or finds a user's VNC session, just to make things easier. We could scale this to multiple head nodes but have not needed to (yet).
- We set sshd's X11UseLocalhost option to 'no', so that users who come into the head node with ssh X forwarding get a forwarded display bound to an address other nodes can reach.
- In both cases users get a DISPLAY that is network accessible within the cluster and has an xauth entry in the user's (shared) home directory.
- We recommend that users who need a DISPLAY in a batch job use the qsub option '-v DISPLAY' (performance in this case seems much better than with 'qsub -X', but it relies on the DISPLAY being network accessible and on xauth-based security).
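For reference, the sshd change above corresponds to these settings on the head node (option names as in OpenSSH's sshd_config; values illustrative):

```
# /etc/ssh/sshd_config on the head/login node
X11Forwarding yes
# Bind forwarded X11 displays to the wildcard address rather than
# loopback, so the DISPLAY sshd allocates is reachable from compute nodes
X11UseLocalhost no
```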
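A minimal sketch of what the session helper script does (our real script is more involved; the vncserver invocation and list-output parsing assume TigerVNC's wrapper):

```shell
#!/bin/sh
# Hypothetical sketch of a VNC session helper: reuse the user's existing
# session on this login node, or start a new one. Assumes TigerVNC.

parse_first_display() {
    # Expects "vncserver -list" output on stdin; prints the first
    # display found, e.g. ":1"
    awk '/^:[0-9]+/ {print $1; exit}'
}

start_or_find_session() {
    existing=$(vncserver -list 2>/dev/null | parse_first_display)
    if [ -n "$existing" ]; then
        echo "Reusing VNC display $existing on $(hostname)"
    else
        vncserver -geometry 1600x900
    fi
}

# Only attempt this where vncserver is actually installed.
if command -v vncserver >/dev/null 2>&1; then
    start_or_find_session
fi
```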
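On the submission side, the '-v DISPLAY' recommendation amounts to something like this (a sketch; the job script name and resource request are made up):

```shell
#!/bin/sh
# Sketch: build the qsub command for a batch job that inherits the
# caller's DISPLAY. Assumes DISPLAY already points at a network-accessible
# X display with a matching xauth cookie in the shared home directory.

build_submit_cmd() {
    # '-v DISPLAY' exports the current value of DISPLAY into the job
    echo "qsub -v DISPLAY -l nodes=1:ppn=1 $1"
}

# Shown rather than executed here; on the cluster you would run the
# printed command directly.
build_submit_cmd matlab_job.sh
```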
Our documentation on this is at: https://wiki.csiro.au/display/ASC/Quick+Start+Guide+for+Linux and pointers therein.
This setup has proved to work well for low-end visualization and for running software that effectively needs a DISPLAY - nice for integrating HPC with other workflows. The batch sessions can be explicitly interactive (-I) or not. It allows users to do a fair amount of development work on the head node and burst to a compute node when more resources are needed (and we schedule so that some batch resource is usually available for development work at short notice).

Our users are very geographically dispersed; VNC significantly moderates the interactivity issues of long-latency connections, but users probably value the session persistence even more. We have a license for RealVNC, which comes with support for encrypting the client/server VNC connection (there may be other options with similar capability).
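On the original question of bridging and encryption without a product like RealVNC: the common alternative is an SSH tunnel from the client through the public-facing server. A sketch (host names, display number, and ports are placeholders):

```shell
#!/bin/sh
# Sketch: build the ssh command a remote user would run to tunnel a VNC
# display on a private compute node through the public login node.
# All host names here are placeholders.

tunnel_cmd() {
    node=$1; login=$2; display_no=$3; local_port=$4
    # VNC listens on TCP port 5900 + display number
    echo "ssh -N -L ${local_port}:${node}:$((5900 + display_no)) ${login}"
}

# The user runs the printed command on their own machine, then points a
# VNC viewer at localhost:5901.
tunnel_cmd node042 login.example.org 2 5901
```

Automating this end-to-end is the hard part the original post describes: the job has to report back which node and display it got before the client-side tunnel can be built.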
For high-end visualization we have a setup with VirtualGL and TurboVNC, with a canned batch job that starts a relatively short-term VNC visualization session. The high-end viz setup could be considered to have scalability issues, but we haven't actually had that much demand for it so far. Most needs have been met with software graphics rendering, but there are certainly niche use cases, and it is good to have this capability for high-end viz with HPC data/workflow integration.
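The canned batch job is essentially a job script along these lines (a sketch, not our actual script; the TurboVNC path, resource request, and application are illustrative):

```shell
#!/bin/sh
#PBS -N viz-session
#PBS -l nodes=1:ppn=4,walltime=04:00:00
# Sketch of a short-term visualization job: start a TurboVNC server on
# the allocated node, then run the application under VirtualGL so OpenGL
# rendering uses the node's GPU. Paths and options are illustrative.

/opt/TurboVNC/bin/vncserver :1 -geometry 1920x1080
export DISPLAY=:1
# vglrun redirects GLX rendering to the node's GPU
vglrun paraview
# Clean up when the application exits
/opt/TurboVNC/bin/vncserver -kill :1
```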
One of our Australian partner centres has a more extensive but fundamentally similar high-end viz setup. They have a desktop tool to simplify setup (we are working on something similar, but ours will probably never make it outside our organization). See: https://www.massive.org.au/userguide/cluster-instructions/using-the-massive-desktop and https://github.com/CVL-dev/cvl-fabric-launcher . If you explore further there you will see they are applying the same ideas to a cloud platform (though perhaps most of the value in that CVL project is in the tools/software delivery/integration for specific end-user communities).
Cheers and thanks for sharing,