[torqueusers] epilogue and node access policy

Sakhile Masoka sakhile.harvey at gmail.com
Tue Sep 10 10:18:01 MDT 2013


I have this command in my epilogue:

user_procs=`/bin/ps -e -o pid= -o user= | /bin/grep -e "$2" | \
while read pid owner.....

The issue is that if one user is running multiple jobs on one node, my
epilogue will kill all of them, because it matches processes by user ($2).
I need a way to link the job ID ($1) to the processes on the node. But even
with that, processes can start other processes, which will make tracking
difficult.

I was under the assumption that Moab would either assign jobs to the same
node only if they will end at the same time, or not execute epilogues on
that node while other jobs belonging to the same user are still running.

I will look at reaver and see what it does.... #hopeful
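
In the meantime, here is a rough sketch of the "link the job ID to its processes" idea, in case it helps anyone else. It assumes the job's processes still carry PBS_JOBID in their environment (TORQUE sets it for the job, and children inherit it unless they clear it) and that the epilogue runs as root so it can read /proc/<pid>/environ. The function name pids_for_job and the optional second argument (an alternate proc root, only there to make testing easy) are my own invention, not part of TORQUE. Treat it as a starting point rather than a finished epilogue.

```shell
#!/bin/sh
# Sketch only: print the PIDs whose environment contains PBS_JOBID=<jobid>.
# pids_for_job is a hypothetical helper; $2 (proc root) defaults to /proc.
pids_for_job() {
    jobid=$1
    procroot=${2:-/proc}
    for envfile in "$procroot"/[0-9]*/environ; do
        # Skip processes that have exited or that we cannot read.
        [ -r "$envfile" ] || continue
        # environ is NUL-separated; convert to lines and match one whole line.
        if tr '\0' '\n' 2>/dev/null < "$envfile" | grep -qxF "PBS_JOBID=$jobid"; then
            piddir=${envfile%/environ}
            echo "${piddir##*/}"
        fi
    done
}

# Possible use inside the epilogue, where $1 is the job ID
# (left commented out so that sourcing this file kills nothing):
#
# for pid in $(pids_for_job "$1"); do
#     [ "$pid" = "$$" ] && continue   # never signal the epilogue itself
#     kill -TERM "$pid"
# done
```

Because it matches on the job ID rather than the user, other jobs from the same user on the node should be left alone. It will miss any process that was started with a scrubbed environment, so it does not fully solve the tracking problem.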



On Tue, Sep 10, 2013 at 5:40 PM, Seth T Graham <sether at fnal.gov> wrote:

>
> In our epilogue, we use this command:
>
> /usr/sbin/lsof |/bin/grep ${JOBID}|/bin/awk '{print $2}'|/bin/sort -u
>
> Which will list all pids for a job, which you can then feed into a loop to
> kill them off.
>
> The JOBID is passed to the epilogue as $1.
>
>
>
> On Sep 10, 2013, at 4:02 AM, Sakhile Masoka <sakhile.harvey at gmail.com>
>  wrote:
>
> > I have epilogues set on nodes to clean up stray processes after a job
> > completes, matching on "userid". I have also implemented on Moab
> > ENFORCENODEACCESS SINGLEUSER, meaning jobs of the same user can be
> > scheduled to the same node if resources are still available. This helps
> > with users running many single-task jobs with small memory
> > requirements.
> >
> > The issue now is that a user says his jobs (single tasks) are
> > cancelled when one finishes. I can see how that happens, since the
> > epilogue cleans up all processes on the node that belong to that user.
> >
> > Is there a way to work around this issue? Config suggestions, etc.?
> >
> > Otherwise I'll have to disable epilogues and work with prologues alone.
> >
> > Regards
> > Sakhile Masoka
> > Sys Admin, CHPC
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
>