[torquedev] renewing credentials

Garrick Staples garrick at clusterresources.com
Wed Mar 7 14:55:07 MST 2007


On Wed, Mar 07, 2007 at 10:51:01PM +0100, Sergio Gelato alleged:
> * Garrick Staples [2007-03-06 13:39:49 -0700]:
> > On Tue, Mar 06, 2007 at 09:14:57PM +0100, Sergio Gelato alleged:
> > > I don't think so. It's quite easy for a job to do a
> > > 	(while kinit -Rf; do sleep 30000; done) &
> > > or equivalent (e.g., Russ Allbery's krenew) on each node. Indeed it would 
> > > be nice for pbs_mom to set that up on the user's behalf and to clean up at 
> > > the end of the job. Isn't this what the prologue and epilogue scripts
> > > are for?
> > 
> > I thought the pro/epilog bits were no longer necessary.  When the gssapi
> > patch was originally submitted, I was the one that rejected the idea of
> > pro/epilog scripts managing the key renewals.
> > 
> > I had thought the pbs_mom bits required to handle this were already
> > checked in to the gssapi branch.
> 
> Maybe I'm misreading the code, but my impression is that the only
> renewals at the moment are done while the job sits waiting in the queue.
> Specifically, the only call to pbsgss_renew_creds() is from
> renew_job_credentials(), which is only mentioned in req_quejob() as
>  set_task(WORK_Timed,time((time_t *)0)+3600*3,renew_job_credentials,jobidcopy)
> and this appears in an #ifndef PBS_MOM block.
> 
> There is no question that credentials must be periodically refreshed
> by pbs_server while the job is queued. (This, I believe, is happening.
> In an ugly way, with hard-coded calls to kinit and a fixed 3-hour refresh 
> rate, but we can tidy that up later on.) Once the job starts executing, 
> however, it could in principle take over that responsibility. Which
> doesn't mean that it should...

Can pbs_server simply ensure that the ticket's lifetime is long enough
before the job executes, thereby eliminating renewals at MOM?
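
A rough sketch of that check, assuming the server already knows the
ticket's end time and the job's requested walltime. The function name,
its parameters, and the safety margin are all hypothetical; nothing like
this exists in the gssapi branch as far as this thread shows.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper, not TORQUE code.
 * Returns true if a ticket expiring at ticket_endtime would still be
 * valid after a job that starts at `now` runs for walltime_secs, with
 * margin_secs left over (e.g. for copying output files at job end).
 */
bool ticket_covers_job(time_t now, time_t ticket_endtime,
                       long walltime_secs, long margin_secs)
{
    if (walltime_secs < 0 || margin_secs < 0)
        return false;               /* nonsensical request */
    if (ticket_endtime <= now)
        return false;               /* ticket already expired */

    /* difftime() gives the remaining lifetime in seconds as a double. */
    double remaining = difftime(ticket_endtime, now);
    return remaining >= (double)walltime_secs + (double)margin_secs;
}
```

If this returned false, the server would have to obtain a longer-lived
ticket or hold the job. That only works when site policy allows tickets
as long as the longest walltime.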

Is it necessarily implied that MOM even has network connectivity to a
KDC to get a renewal?  (I'm rather Kerberos-illiterate.)

 
> At least for the kinds of credentials I'm familiar with, the refreshing
> doesn't require superuser privileges. And where AFS is involved it needs
> to happen in the same PAG as the user's job. Which is the better place to
> do it: pbs_mom, or a separate daemon (say, krenew) that's launched
> by prologue.user and terminated by epilogue.user?

I will continue to veto usurping pro/epilog scripts for this purpose :)

pbs_mom must directly do whatever is necessary: either internally, by
launching an external daemon, or by fork()ing off a process; but not
through pro/epilog.
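
To make the fork() option concrete, here is a minimal sketch using only
POSIX calls. The function name is hypothetical and this is not pbs_mom
code. It simply runs a command in a child process and reports its exit
status. A real MOM would also have to switch to the job owner's uid,
point KRB5CCNAME at the job's ccache, join the job's PAG where AFS is
involved, and avoid blocking in waitpid(); none of that is shown.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not TORQUE code.
 * Runs argv[0] (searched on PATH) in a child process and waits for it.
 * Returns the child's exit status, or -1 if fork()/waitpid() failed or
 * the child was killed by a signal. An exec failure shows up as 127.
 */
int run_renewal_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;              /* real error, give up */
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A renewal attempt would then look like

    char *renew[] = { "kinit", "-R", NULL };
    int rc = run_renewal_command(renew);

with a non-zero rc meaning the ticket could not be renewed.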
 

> OK, so req_cpyfile() may need the credentials at the end of the job in
> order to copy the output files. At the moment it reuses the job's
> ccache and sets up a new AFS PAG for itself. Since it reuses the job's
> ccache it doesn't really care how the credentials have been refreshed.
> That doesn't help much in answering my question.

