[torqueusers] Max Connections

Garrick Staples garrick at clusterresources.com
Thu Dec 7 11:14:34 MST 2006

On Thu, Dec 07, 2006 at 11:11:32AM -0500, nathaniel.x.woody at gsk.com alleged:
> On a follow up question here, I've had issues in the past with trying to 
> increase the Max Connections setting (or even at 4 for that matter).  As 
> the number of connections goes up, I'll often get failures on calls, 
> typically with some sort of a gripe from pbs_iff (allegedly, I will get a 
> stdout print statement from pbs_iff).  This is always associated with a 
> threaded situation where a number of calls are attempted simultaneously. 
> I've always had to synchronize my calls into the API.  My understanding 
> has always been that this stems from some sort of a race condition 
> associated with setting the server_name property.  It also means I end up 
> wrapping any API calls with a fair amount of defensive coding (for 
> instance, I have a C++ wrapper around the API that handles making the 
> connections for me and ensuring that a valid connection is retrieved and 
> being very patient with waiting for it).  Have other people experienced 
> this and is the solution for libtorque to not try and retain the current 
> server_name, or have I totally missed the mark here?

The PBS client API wasn't coded to be thread safe.  There are probably a
dozen race conditions in there.
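
Given that, serializing every call into the client library behind one
process-wide lock, as described in the quoted message, is probably the
safest option for a threaded caller. Below is a minimal sketch of that
pattern. Note that unsafe_connect(), g_server_name and the handle counter
are hypothetical stand-ins for a non-thread-safe client call and its
shared state; they are not libtorque functions, and a real wrapper would
call the actual API inside the locked section instead.

```cpp
#include <cassert>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a client call that is NOT thread safe.
// It mutates shared globals without any locking of its own.
static std::string g_server_name;  // shared, unprotected state
static int g_next_handle = 0;      // shared, unprotected counter

int unsafe_connect(const std::string& server) {
    g_server_name = server;
    return ++g_next_handle;
}

// One process-wide lock that every call into the library goes through.
static std::mutex g_api_mutex;

// Run any callable while holding the API lock and return its result.
template <typename Fn>
auto with_api_lock(Fn&& fn) -> decltype(fn()) {
    std::lock_guard<std::mutex> guard(g_api_mutex);
    return fn();
}

// Thread-safe wrapper: the unsafe call only ever runs under the lock.
int safe_connect(const std::string& server) {
    return with_api_lock([&] { return unsafe_connect(server); });
}

// Start n threads that each connect once through the wrapper and
// return how many distinct handles were handed out.
int run_concurrent_connects(int n) {
    assert(n > 0);
    std::vector<int> handles(n, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&handles, i] {
            handles[i] = safe_connect("server-a");
        });
    }
    for (auto& t : workers) {
        t.join();
    }
    std::set<int> unique(handles.begin(), handles.end());
    return static_cast<int>(unique.size());
}
```

Calling run_concurrent_connects(8) should hand each of the eight threads
its own handle, because only one thread at a time is ever inside
unsafe_connect(). The cost is that all calls are serialized, so this
trades throughput for correctness.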

