[torquedev] patch to add gssapi/krb5 support to Torque

Alex Rolfe arolfe at MIT.EDU
Fri Jul 7 18:59:24 MDT 2006


Garrick Staples <garrick at clusterresources.com> writes:
[rearranged a little]

> Your gssapi patch sans FE5 bits:
> svn://www.clusterresources.com/torque/branches/gssapi

I'll start working from this so things don't get too out-of-sync.

> This will completely replace the ruserok() stuff?  Including data
> stageout with scp/mom_rcp?

The change I made was to bypass the call to ruserok.  As I understand
it, ruserok checks whether the user can rcp without a password, based
on hosts.equiv and the user's .rhosts.  In a gssapi world, you can scp
using the credentials forwarded with the job instead of relying on
rhosts.  While it's possible that the scp would fail (because you're
not allowed to connect even with credentials), I don't know of an
equivalent call to check for this.

So, if I understand your question correctly, stageout with scp still
works because scp uses gssapi authentication.

> Let's get rid of the cronjobs.  pbs_server has an internal scheduling
> mechanism that can call functions at requested times, and we can add a
> check in pbs_mom's main loop?

Sounds good to me.  I'll work on this first, since I think it's the most
annoying "feature."

I'm curious if there's a better solution for the clientrenew script.
This is started by the mom before it executes the job and it runs in the
background, keeping the job's credentials and AFS tokens renewed.  This
task is harder than just renewing credentials because of the tokens.
For process A to renew tokens for process B, they must be in the same
process authentication group.  This prevents a single mom process from
renewing for multiple jobs.

One note, though: if pbs_server goes down long enough for credentials
to expire, then those jobs will fail.  This isn't necessarily the case
with a cron job.  I don't think this is a reason to stay with the cron
job, but it's something we need to keep in mind.
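
To make the timing concern concrete, here is a rough sketch (in Python
rather than Torque's C, purely for brevity) of the kind of check a
periodic task or main-loop pass might perform.  The names JobCred,
RENEW_MARGIN, classify and scan are hypothetical and are not taken from
the patch; the 10-minute margin is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass
class JobCred:
    """Hypothetical record of one job's forwarded credential."""
    job_id: str
    expires_at: float  # expiry time, in seconds since the epoch


# Assumed safety margin: renew once less than this much lifetime remains.
RENEW_MARGIN = 600.0


def classify(cred, now, margin=RENEW_MARGIN):
    """Return "expired", "renew" or "ok" for one credential at time `now`."""
    if now >= cred.expires_at:
        # Too late: this is the case hit if the daemon was down too long.
        return "expired"
    if cred.expires_at - now <= margin:
        return "renew"
    return "ok"


def scan(creds, now):
    """One pass over all tracked credentials, as a periodic task might do."""
    return {c.job_id: classify(c, now) for c in creds}
```

If nothing calls scan() for longer than a credential's remaining
lifetime, that credential comes back as "expired" rather than "renew",
which is the failure mode described above.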

> With creds and principals added to the connection struct, I wonder if it
> would make sense to make those linked lists.  Perhaps one day we'll want
> to support other kinds of credentials?  kx509 certs?  ssh passphrases?
> Maybe a pointer to a generic cred struct so we can plugin future
> security mechanisms?
>
> Is TM supported?  Do creds get forwarded to sister MOMs?

No.  I didn't try to do this since I didn't want to deal with that
complexity yet.  If someone else wants to work on this, that'd be very helpful.

> PAM support?  Would it make sense to move the authn/authz down to PAM?

I'm not sure how this would work, but I'm open to suggestions.

> Another thought... it would be nice to disable this feature at run-time
> to allow distributors to build with it.  Then configure can enable it by
> default, but run-time would be disabled by default.

> All client connections would require kerb princs?  This will complicate
> everyone's homegrown queue status CGIs?  What about globus?

Perhaps the correct solution is to make the credentials forwarding
optional (only happens if they're there) with the fallback to iff.  So
if you build with gssapi support but a client doesn't have credentials,
everything works as if you didn't have gssapi support.  The potential
drawback is that a job may be queued but fail later (execution or
stageout) if it turns out that it needed credentials.
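
The fallback idea can be sketched roughly as follows (Python used only
for illustration; Torque itself is C).  The function names choose_auth
and job_may_fail_later are hypothetical, and the strings "gssapi" and
"iff" are just labels for the two paths, not identifiers from the code.

```python
from typing import Optional


def choose_auth(gssapi_built: bool, client_creds: Optional[bytes]) -> str:
    """Pick the authentication path for one client connection.

    Credentials are used only when the build supports gssapi AND the
    client actually presented some; otherwise fall back to iff.
    """
    if gssapi_built and client_creds:
        return "gssapi"
    return "iff"


def job_may_fail_later(auth: str, job_needs_creds: bool) -> bool:
    """True when a job was accepted without credentials it will need."""
    return auth != "gssapi" and job_needs_creds
```

With this shape, a status-only client with no credentials behaves
exactly as in a non-gssapi build, while job_may_fail_later() flags the
drawback case: a job queued over iff that needs credentials at
execution or stageout time.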

Josh Butikofer <josh at clusterresources.com> writes:

> I haven't had a chance to look over the patch yet, but I was wondering
> if the functionality you've added handles renewing Kerberos
> credentials on the server and MOM if a job runs longer than the
> lifetime of the credential? If not, would you anticipate this feature
> would be difficult to add?

That is the functionality that I added. 


Alex

