[torqueusers] pbs_sched crash

Denise Berendes denise at nsstc.uah.edu
Fri Apr 28 10:49:53 MDT 2006


The license file was accessible only to root.  I changed the permissions; please try again.

Denise


Alexander Saydakov wrote:
> Wonderful. Thanks. I will give it a try.
> 
> 
> -----Original Message-----
> From: torqueusers-bounces at supercluster.org
> [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Garrick Staples
> Sent: Thursday, April 27, 2006 8:48 PM
> To: torqueusers at supercluster.org
> Subject: Re: [torqueusers] pbs_sched crash
> 
> On Wed, Mar 22, 2006 at 11:04:08AM -0800, Alexander Saydakov alleged:
> 
>>Last night pbs_sched crashed leaving our 70+ nodes idle all night long :(
>>
>>#0  0x1013c8e in pbs_rescquery (c=0, resclist=0x9fbff484, num_resc=1,
>>    available=0x9fbff498, allocated=0x9fbff494, reserved=0x9fbff490,
>>    down=0x9fbff48c) at ./../Libifl/pbsD_resc.c:218
>>218           *(available + i) = *(reply->brp_un.brp_rescq.brq_avail + i);
> 
> 
> I just checked in this fix for 2.1.0; you can patch your 2.0.0 if you
> want.  It might even help with the memory leak.
> 
> Index: src/lib/Libifl/pbsD_resc.c
> ===================================================================
> RCS file: /usr/local/nfs/src/cvs_repository/torque/src/lib/Libifl/pbsD_resc.c,v
> retrieving revision 1.3
> diff -u -r1.3 pbsD_resc.c
> --- src/lib/Libifl/pbsD_resc.c  23 Mar 2006 02:01:50 -0000      1.3
> +++ src/lib/Libifl/pbsD_resc.c  28 Apr 2006 03:44:23 -0000
> @@ -209,7 +209,7 @@
>    
>    reply = PBSD_rdrpy(c);
> 
> -  if (rc == PBSE_NONE)
> +  if (((rc = connection[c].ch_errno) == PBSE_NONE))
>      {
>      /* copy in available and allocated numbers */
> 
> 
> 
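For readers who want to see the pattern in isolation, here is a minimal sketch of what the one-line patch appears to do: refresh rc from the connection's error field after reading the reply, and copy the counts only if that says success. This is not the TORQUE source. Every mock_* name, the MOCK_ERR_* codes, and the idea that a failed read returns NULL are assumptions made for illustration; the thread itself only shows the crashing line and the diff.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_ERR_NONE     0  /* stand-in for PBSE_NONE */
#define MOCK_ERR_PROTOCOL 1  /* hypothetical failure code */
#define MOCK_MAX_RESC     4  /* hypothetical reply capacity */

/* Hypothetical stand-ins for the connection table and the reply structure. */
struct mock_conn  { int ch_errno; };
struct mock_reply { int avail[MOCK_MAX_RESC]; };

static struct mock_conn  mock_connection[2];
static struct mock_reply mock_good_reply = { { 7, 0, 0, 0 } };

/* Simulated reply reader: on failure it records the error on the
 * connection and returns NULL, so the caller must not touch the reply. */
static struct mock_reply *mock_read_reply(int c, int fail)
{
    if (fail) {
        mock_connection[c].ch_errno = MOCK_ERR_PROTOCOL;
        return NULL;
    }
    mock_connection[c].ch_errno = MOCK_ERR_NONE;
    return &mock_good_reply;
}

/* Copy the "available" counts only when the connection reports success. */
static int mock_rescquery(int c, int fail, int num_resc, int *available)
{
    struct mock_reply *reply;
    int rc;
    int i;

    assert(num_resc >= 0 && num_resc <= MOCK_MAX_RESC);

    reply = mock_read_reply(c, fail);

    /* Refresh rc from the connection after reading the reply; a stale
     * rc could still say "no error" while reply is unusable. */
    if ((rc = mock_connection[c].ch_errno) == MOCK_ERR_NONE) {
        for (i = 0; i < num_resc; i++)
            *(available + i) = *(reply->avail + i);
    }

    return rc;
}
```

Calling mock_rescquery(0, 0, 1, avail) fills avail[0] from the mock reply and returns MOCK_ERR_NONE. Calling it with fail set to 1 returns the error code and leaves avail untouched instead of dereferencing a NULL reply.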




