[torqueusers] epilogue and node access policy

Seth T Graham sether at fnal.gov
Tue Sep 10 09:40:23 MDT 2013


In our epilogue, we use this command:

/usr/sbin/lsof |/bin/grep ${JOBID}|/bin/awk '{print $2}'|/bin/sort -u

This lists all PIDs for a job, which you can then feed into a loop to kill them off.

The JOBID is passed to the epilogue as $1. 
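
As a rough sketch of how the pieces above could fit together (not our actual epilogue): the function names job_pids and kill_job_pids are made up for illustration, and it assumes the job's processes hold open files whose names contain the job ID, which is what the lsof/grep approach relies on. Since it matches on the job ID rather than the user, other jobs of the same user on the node should be left alone.

```shell
#!/bin/sh
# Sketch of per-job cleanup for a TORQUE epilogue.
# Assumption: every process of the job has at least one lsof line
# that mentions the job ID (for example its spool files).

# job_pids JOBID
# Reads lsof-style output on stdin and prints the unique PIDs (column 2)
# of the lines that contain JOBID as a fixed string.
job_pids() {
    grep -F -- "$1" | awk '{print $2}' | sort -u
}

# kill_job_pids JOBID  (hypothetical helper)
# Sends SIGTERM to every PID that lsof associates with JOBID.
kill_job_pids() {
    jobid=$1
    # Refuse to run without a job ID: an empty pattern matches every line.
    [ -n "$jobid" ] || return 1
    for pid in $(lsof 2>/dev/null | job_pids "$jobid"); do
        # Do not signal the epilogue shell itself.
        [ "$pid" = "$$" ] && continue
        kill -TERM "$pid" 2>/dev/null
    done
}

# In the epilogue itself you would call it with the first argument:
#   kill_job_pids "$1"
```

grep -F is used here because job IDs contain dots, which a plain grep would treat as "any character". You may want to follow the SIGTERM with a SIGKILL after a short wait for processes that ignore it.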



On Sep 10, 2013, at 4:02 AM, Sakhile Masoka <sakhile.harvey at gmail.com>
 wrote:

> I have epilogues set on the nodes to clean up stray processes after a job completes, matching on "userid". I have also implemented ENFORCENODEACCESS SINGLEUSER in Moab, meaning jobs of the same user can be scheduled to the same node if resources are still available. This helps with users running many single-task jobs with small memory requirements.
> 
> The issue now is that a user says his jobs (single tasks) are cancelled when one of them finishes. I can see how that happens, since the epilogue cleans up all processes on the node that belong to that user.
> 
> Is there a way to work around this issue (config suggestions, etc.)?
> 
> Otherwise I'll have to disable epilogues and work with prologues alone. 
> 
> Regards
> Sakhile Masoka 
> Sys Admin, CHPC
