[torquedev] pbs_mom crashing

Josh Butikofer josh at clusterresources.com
Mon Jul 20 19:56:22 MDT 2009


No one has been able to capture a core dump when this happens, so we still don't understand when or how the crash occurs. That's why it wasn't fixed in 2.3.7.
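
For anyone who can leave a node waiting for the next crash, here is a minimal sketch of one way to make sure a core file can be written at all. It is an assumption about a typical Linux setup, not a documented TORQUE procedure: the pbs_mom path below is a guess at a default install location, and where the core file lands depends on the system's core pattern and the daemon's working directory.

```bash
#!/bin/bash
# Assumed install path (hypothetical default); adjust to your TORQUE prefix.
PBS_MOM_BIN="${PBS_MOM_BIN:-/usr/local/sbin/pbs_mom}"

# Raise the soft core-file size limit as far as the hard limit allows.
# Child processes started from this shell inherit the new limit.
enable_core_dumps() {
    ulimit -S -c "$(ulimit -H -c)"
}

enable_core_dumps
echo "core file size limit is now: $(ulimit -S -c)"

# Next steps, left as comments because they depend on the local setup:
#
# 1. Restart pbs_mom from this same shell so it inherits the limit:
#      "$PBS_MOM_BIN"
#
# 2. After a segfault, pull a backtrace from the core file with gdb:
#      gdb "$PBS_MOM_BIN" /path/to/core -batch -ex "bt full"
```

If the reported limit is still 0, the hard limit is 0 as well and has to be raised as root (or in the init script that starts pbs_mom) before a core can be produced.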

Josh Butikofer
Cluster Resources, Inc.
#############################

----- "Oliver Baltzer" <obaltzer at flagstonere.bm> wrote:

> Hi all,
> 
> there was a thread last month with this subject discussing an issue with
> pbs_mom arbitrarily segfaulting. I am observing the same problem, in
> particular with sequential jobs submitted with:
> 
> qsub -l walltime=8640000
> 
> It does not appear to be reliably reproducible so I cannot provide a
> good test case yet.
> 
> I was hoping it was fixed in 2.3.7; however, this does not appear to be
> the case.
> 
> Are there any leads on how this segfault is caused? I am currently
> waiting for it to happen again and will be able to provide detailed
> logs. Please let me know what else you might need to track down this
> issue.
> 
> Cheers,
> Oliver
> 
> _______________________________________________
> torquedev mailing list
> torquedev at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torquedev

