[torqueusers] connect to specific nodes within a cluster
jdsmit at sandia.gov
Tue Apr 28 09:12:59 MDT 2009
> Thanks for your answer, Jerry. It seems the problem is related to your
> comment "Are these nodes allocated to a job of yours?"
> In effect, I've realized that if I've got a job allocated to a node, I can
> simply access it through
>> ssh nodename
> Thus, more specifically, the problem is how to access a node on which I
> have no active jobs allocated. If a job is interrupted with qdel, I've
> noticed that some related processes may keep running on the slave nodes.
> Thus it would be useful for me as a user to also access these nodes and
> kill these leftover processes manually.
The administrators of this cluster need to have some process cleanup
happen in the epilogue to make sure that "leftover" user processes are
purged at job end.
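
As a rough illustration of that kind of epilogue cleanup, a site might start
from something like the sketch below. It assumes nodes are allocated to one
user at a time (otherwise it would also kill that user's other jobs on the
node) and relies on TORQUE passing the job owner's user name as the second
epilogue argument. The cleanup_user function and the DRY_RUN switch are
names invented for this example, not part of TORQUE.

```shell
#!/bin/sh
# Sketch of an epilogue cleanup step; pbs_mom runs the epilogue as root.
# Assumption: $1 is the job id and $2 is the job owner's user name.

# Kill every process owned by the given user on this node.
# DRY_RUN defaults to 1 (hypothetical safety switch): only print the command.
cleanup_user() {
    user=$1
    # Refuse an empty name or root, so a bad argument cannot take down the node.
    if [ -z "$user" ] || [ "$user" = "root" ]; then
        echo "refusing to clean up user '$user'" >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: pkill -KILL -u $user"
    else
        pkill -KILL -u "$user"
    fi
}

# Act only when called with epilogue-style arguments.
if [ "$#" -ge 2 ]; then
    cleanup_user "$2" || true
fi
```

Setting DRY_RUN=0 in the epilogue's environment would switch from printing
to actually killing. A production version would probably first check that
the user has no other job still running on the node.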
Allowing a user access to a node not running a job of theirs is a
security risk IMHO, as you could ssh to a node running someone else's
job, and possibly access data not belonging to you. Or, on a
non-nefarious note, you as a user not assigned to the node could
"accidentally" kill the wrong process and affect the other user's job.
> Actually my ``remaining'' processes have just finished, but this would
> still be useful in the near future.
>>> Hello all,
>>> (I'm new to cluster usage)
>>> If I log into a torque cluster, e.g.:
>>>> ssh -Y myusername@cluster.domain.org
>>> and this cluster has nodes with the names:
>>> How could I, after I have logged in to the cluster, connect to a
>>> node to see the active processes on it? I would like to monitor specific
>>> processes (and probably kill them) on specific nodes. I've tried
>>> several options:
>>> [myusername@master00 ~]ssh -Y cluster01
>>> [myusername@master00 ~]ssh -Y myusername@cluster01
>>> [myusername@master00 ~]ssh -Y myusername@cluster01.domain.org
>>> without success. All of them ask for a password, and my login
>>> password is not accepted.
>> It all depends on how security and access are set up for that specific
>> cluster. What is the access model: ssh, rsh, etc.? Is it PAM-based, or
>> /etc/security/access.conf? Do they use ssh keys? Are these nodes
>> allocated to a job of yours?
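
One way a user could probe the ssh question from the head node is sketched
below. It assumes OpenSSH, whose BatchMode option makes ssh fail instead of
prompting for a password, so a successful exit suggests that key-based or
host-based login is in place. The function names can_login and report_login
are made up for this example.

```shell
#!/bin/sh
# Return success if we can log in to the node without any password prompt.
can_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# Print a one-line summary for a node.
report_login() {
    if can_login "$1"; then
        echo "$1: non-interactive login works"
    else
        echo "$1: login refused or password required"
    fi
}
```

For example, `report_login cluster01` run on the head node would show
whether that node accepts a login without a password. A failure does not say
why access was denied; that still depends on the PAM or access.conf policy
the administrators have chosen.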