[torqueusers] Semaphores limit per job/user in torque?

Andrew Savchenko bircoph at gmail.com
Tue Sep 24 12:45:26 MDT 2013


On Tue, 24 Sep 2013 11:26:53 -0600 Mark Moore wrote:
> > Let's consider that a user may run multiple jobs on the same node.
> > IPC semaphores are not connected to any pid, thus it is not safe to
> > remove a user's semaphores in the epilogue while some other jobs of
> > this user are still running on this node.
> 
> Our cluster is not configured for shared_node, so we don't have this
> problem.
> 
> 'ipcs -p' will return LSPID (last PID to send to this semaphore) and
> LRPID (last PID to receive from this semaphore). A simple check of
> these against the active process table should get what's needed. Yes,
> a race condition could exist if the system quickly re-assigns a PID
> to a new job.

ipcs -p reports such PIDs only for shared memory segments, not for IPC
semaphores. And we already deal with shm leaks successfully using the
kernel.shm_rmid_forced = 1 sysctl.
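
For what it's worth, on a non-shared node an epilogue sweep does not
need ipcs at all: the semaphore sets owned by the job's uid can be
enumerated and removed directly with semctl(). A rough sketch
(hypothetical helper, Linux-specific, nothing that torque ships):

    #define _GNU_SOURCE        /* for SEM_INFO/SEM_STAT, struct seminfo */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>

    /* glibc requires the caller to define union semun */
    union semun {
        int val;
        struct semid_ds *buf;
        unsigned short *array;
        struct seminfo *__buf;
    };

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <uid>\n", argv[0]);
            return 1;
        }
        uid_t target = (uid_t)atoi(argv[1]);

        struct seminfo si;
        /* SEM_INFO returns the index of the highest used kernel slot */
        int maxidx = semctl(0, 0, SEM_INFO, (union semun){ .__buf = &si });
        if (maxidx < 0) {
            perror("semctl(SEM_INFO)");
            return 1;
        }

        for (int i = 0; i <= maxidx; i++) {
            struct semid_ds ds;
            /* SEM_STAT takes a kernel index and returns the real semid */
            int id = semctl(i, 0, SEM_STAT, (union semun){ .buf = &ds });
            if (id < 0)
                continue;                   /* unused slot */
            if (ds.sem_perm.uid == target)
                semctl(id, 0, IPC_RMID);    /* remove the stale set */
        }
        return 0;
    }

Of course this still has the problem described above when nodes are
shared between jobs of the same user.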

> > That's why I asked about IPC namespace isolation: if torque could
> > use one per job, the stale semaphores would be gone together with
> > the isolated namespace once the job finishes. LXC works this way,
> > and since torque can already use cpusets, I was hoping it could use
> > namespaces for job isolation too. It looks like this feature is not
> > here yet.
> 
> Interesting, I hadn't thought of that. My concern would be too much
> overhead introduced by building containers for each job.

Torque already uses cpusets, and I doubt that an IPC namespace per job
would be any more demanding. Nobody is asking for full container-like
isolation.
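
The kernel-side part amounts to a single unshare(CLONE_NEWIPC) call
before the job's top process is started: when the last process in the
namespace exits, all its SysV objects go away with it. A minimal
wrapper sketch (hypothetical, not an existing torque feature), just to
show the scale of the work:

    #define _GNU_SOURCE        /* for unshare() and CLONE_NEWIPC */
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
            return 1;
        }
        /* Needs CAP_SYS_ADMIN, i.e. must be done from the root-owned
         * MOM, just like the existing cpuset setup. */
        if (unshare(CLONE_NEWIPC) < 0) {
            perror("unshare(CLONE_NEWIPC)");
            return 1;
        }
        /* Everything exec'ed from here lives in a private IPC
         * namespace; its semaphores and shm segments vanish when the
         * namespace does. */
        execvp(argv[1], &argv[1]);
        perror("execvp");
        return 1;
    }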

Best regards,
Andrew Savchenko