[torqueusers] pbs_mom segfaulting
Jan Lindheim
lindheim at cacr.caltech.edu
Thu Jan 29 17:17:11 MST 2009
On Tue, Jan 27, 2009 at 04:49:29PM -0800, Joshua Bernstein wrote:
> Hi Jan,
>
> Jan Lindheim wrote:
> >After upgrading to torque 2.3.6 recently, we have seen pbs_mom
> >segfaulting and jobs getting stuck. This is on an Opteron system, running
> >SLES9.1. Has anybody else reported instability with pbs_mom lately?
>
> I've personally had problems with 2.3.6 and other versions producing a
> SEGV. You might want to read through the thread here:
>
> http://www.clusterresources.com/pipermail/torquedev/2008-December/001276.html
>
> I have an RPM of version 2.4.0 I can send you that contains the fix I
> proposed in the aforementioned post. I'd be curious to see if that fixes
> your issue. Ping me off list and I'd be happy to send you the RPM.
>
> -Joshua Bernstein
> Senior Software Engineer
> Penguin Computing
>
Thanks Joshua!
I will be happy to try this on a test cluster.
I will have to hold off on testing on the production cluster for now.
Jan Lindheim