Anthony R Fletcher arif at mail.nih.gov
Fri Mar 23 11:30:38 MDT 2007

I started to look at the torque PAM module. It works just fine whilst
the job is running on the selected nodes.

However, we need to clean up the nodes afterwards. We currently use an
epilogue script which runs on the job's master node as root. The script
becomes the user and logs on to each of the slave nodes, killing old
processes and deleting old files.
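
To make the current approach concrete, here is a rough sketch of the
kind of cleanup the epilogue does, written in Python. It is not our
actual script. The function names, the scratch directory and the exact
remote commands (rm and pkill) are assumptions for illustration only;
the node list is assumed to come from the job's node file, one hostname
per line, possibly with repeats.

```python
import shlex


def unique_slaves(node_lines, master):
    """Return slave hostnames in first-seen order, excluding the master."""
    seen = []
    for line in node_lines:
        host = line.strip()
        if host and host != master and host not in seen:
            seen.append(host)
    return seen


def cleanup_commands(node_lines, master, user, scratch_dir):
    """Build one 'su' command per slave node (hypothetical helper).

    Each command becomes the job's user and runs ssh to the slave, where
    it deletes the user's scratch directory and then kills the user's
    remaining processes. The files are removed first because pkill also
    ends the user's own ssh session.
    """
    remote = "rm -rf {d}; pkill -u {u}".format(
        d=shlex.quote(scratch_dir), u=shlex.quote(user)
    )
    commands = []
    for host in unique_slaves(node_lines, master):
        ssh = "ssh {h} {r}".format(h=shlex.quote(host), r=shlex.quote(remote))
        commands.append(["su", "-", user, "-c", ssh])
    return commands
```

Each returned list could be handed to subprocess.run() by root on the
master node. Every one of these commands relies on the user still being
able to ssh to the slave after the job has ended.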

With the PAM module in place, the epilogue no longer works because the
job has finished and the 'user' can no longer log on to the slaves.
Root on the master node cannot log on to the slaves either.

Is there an alternative solution? Can the epilogue script be made to run
on all the nodes after the job finishes, and not just on the master node?
Can the PAM module be configured to allow the user to log on whilst the
epilogue script runs?


