[torqueusers] pbs_mom segfaulting

Jan Lindheim lindheim at cacr.caltech.edu
Thu Jan 29 17:17:11 MST 2009


On Tue, Jan 27, 2009 at 04:49:29PM -0800, Joshua Bernstein wrote:
> Hi Jan,
> 
> Jan Lindheim wrote:
> >After upgrading to TORQUE 2.3.6 recently, we have seen pbs_mom
> >segfaulting and jobs getting stuck.  This is on an Opteron system, running
> >SLES9.1.  Has anybody else reported instability with pbs_mom lately?
> 
> I've personally had problems with 2.3.6 and other versions producing a 
> SEGV. You might want to read through the thread here:
> 
> http://www.clusterresources.com/pipermail/torquedev/2008-December/001276.html
> 
> I have an RPM of version 2.4.0 that I can send you; it contains the fix I
> proposed in the aforementioned post. I'd be curious to see if that fixes
> your issue. Ping me off list and I'd be happy to send you the RPM.
> 
> -Joshua Bernstein
> Senior Software Engineer
> Penguin Computing
> 

Thanks Joshua!
I will be happy to try this on a test cluster.
I will have to hold off on testing on the production cluster for now.

Jan Lindheim

