[torqueusers] Problems with acl_group_enable and acl_group_sloppy

Garrick Staples garrick at clusterresources.com
Mon Sep 25 18:07:38 MDT 2006


On Mon, Sep 25, 2006 at 10:17:45AM +0200, Walter de Jong alleged:
> 
> 
> Garrick Staples wrote:
> >On Thu, Sep 21, 2006 at 10:59:09AM +0200, Bas van der Vlies alleged:
> >>Tested with torque versions
> >> 2.1.1
> >> 2.1.2
> >>
> >>I have the following queue definition:
> >> create queue q_genetics
> >> set queue q_genetics queue_type = Execution
> >> set queue q_genetics resources_default.ncpus = 1
> >> set queue q_genetics resources_default.nodes = 1
> >> set queue q_genetics resources_default.walltime = 00:01:00
> >> set queue q_genetics acl_group_enable = True
> >> set queue q_genetics acl_groups = gaussian
> >> set queue q_genetics acl_group_sloppy = True
> >> set queue q_genetics enabled = True
> >> set queue q_genetics started = True
> >>
> >>We have a large cluster with many groups and secondary groups. If I am
> >>a member of the group, the submission succeeds; if I am not a member,
> >>it hangs the pbs_server and the server stops responding. As mentioned
> >>before, we have many groups.
> >>
> >>I have examined the source and I think the check is wrong for
> >>acl_group_sloppy.  It fetches a group, finds out whether the user is a
> >>member of that group, then moves on to the next group, and so on. This
> >>takes a lot of time.
> >>
> >>To make it faster, we should examine only the groups that are allowed
> >>for the queue. We get the allowed groups from the queue and find out
> >>whether the user is a member.  I made a patch for it and it does not
> >>hang the pbs_server anymore ;-)
> >
> >Turns out there is a downside to this patch: it doesn't work with
> >multiple groups that share the same gid.  Those of us who rely on this
> >behaviour (including me) will now have to explicitly list all of the
> >extra groups in the ACL.
> >
> >Is this a problem with anyone else?
> 
> I think listing all groups (as you suggest) is the correct way to go.
> 
> Having multiple groups with the same gid is possible in unix but doesn't
> seem correct or the intended way to do things. I say this for 2 reasons:
>  * the struct group has a (one) gr_name and a gr_gid, not a list of
>    gr_names for one gid
>  * the getgrgid() function returns the group that has the gid
>    that you are looking for. There is no standard unix (or posix)
>    function to return the next group that has the same gid.


Fair enough.  The changes have been committed to 2.1-fixes and trunk.
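
For anyone following along, here is a rough sketch of the idea behind
the patch: look up only the groups named in the queue's ACL instead of
walking every group on the system. This is Python for illustration,
not the actual TORQUE C code. The function name, the user names, the
gid values, and the fake group database are all made up, and treating
a matching primary gid as membership is my assumption.

```python
import grp
from collections import namedtuple

# Stand-in for the fields of struct group that matter here.
Group = namedtuple("Group", "gr_name gr_gid gr_mem")

# Hypothetical group database; only "gaussian" comes from the queue
# definition in this thread, the gid and member are invented.
FAKE_DB = {"gaussian": Group("gaussian", 500, ["bas"])}


def fake_getgrnam(name):
    """Behave like grp.getgrnam(): raise KeyError for unknown names."""
    return FAKE_DB[name]


def user_in_acl_groups(user, primary_gid, acl_groups, getgrnam=grp.getgrnam):
    """Return True if user belongs to any group listed in acl_groups.

    Only the ACL's groups are looked up, so the cost grows with the
    length of the ACL, not with the number of groups on the system.
    """
    for name in acl_groups:
        try:
            entry = getgrnam(name)
        except KeyError:
            # The ACL names a group this host does not know; skip it.
            continue
        # Assumption: primary-group match or secondary membership counts.
        if entry.gr_gid == primary_gid or user in entry.gr_mem:
            return True
    return False


if __name__ == "__main__":
    print(user_in_acl_groups("bas", 100, ["gaussian"], fake_getgrnam))
    print(user_in_acl_groups("walter", 100, ["gaussian"], fake_getgrnam))
```

With a real group database you would leave getgrnam at its default
(grp.getgrnam); the fake lookup is only there so the sketch runs the
same way on any host.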

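The same-gid limitation discussed above can be sketched the same way.
A gid lookup hands back a single entry, so finding every name that
shares a gid needs a scan over the whole group database. Again this is
illustrative Python with an invented database; "gaussian2", the member
names, and gid 500 are hypothetical, and returning the first match is
what this sketch does, not something the standard promises.

```python
from collections import namedtuple

# Stand-in for the fields of struct group that matter here.
Group = namedtuple("Group", "gr_name gr_gid gr_mem")

# Hypothetical database in which two group names share gid 500.
GROUPS = [
    Group("gaussian", 500, ["alice"]),
    Group("gaussian2", 500, ["bob"]),
]


def getgrgid_like(gid, groups=GROUPS):
    """Mimic a getgrgid()-style lookup: one gid in, one entry out."""
    for entry in groups:
        if entry.gr_gid == gid:
            return entry  # this sketch returns the first match
    raise KeyError(gid)


def names_for_gid(gid, groups=GROUPS):
    """Collect every group name with this gid; needs a full scan."""
    return [entry.gr_name for entry in groups if entry.gr_gid == gid]


if __name__ == "__main__":
    print(getgrgid_like(500).gr_name)
    print(names_for_gid(500))
```

This is why an ACL that names only one of the same-gid groups no
longer covers members of the others, and each name has to be listed.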

