[torquedev] pbs_mom crashing

Glen Beane glen.beane at gmail.com
Wed Jul 22 05:36:44 MDT 2009


Can you run pbs_mom under valgrind and trigger the crash?  That might
catch the stack corruption, if that is what is happening.

On Tue, Jul 21, 2009 at 11:42 PM, Oliver Baltzer<obaltzer at flagstonere.bm> wrote:
> Hi Glen,
>
> Glen Beane wrote:
>>
>> the code in job_free referenced by that stack trace looks like this:
>>
>>   /* remove any malloc working attribute space */
>>
>>   for (i = 0;i < (int)JOB_ATR_LAST;i++)
>>     {
>>     job_attr_def[i].at_free(&pj->ji_wattr[i]);  /* this is line 509!! */
>>     }
>>
> The core dump tells me i = 1:
>
> (gdb) frame 6
> #6  0x0000000000426814 in job_free (pj=0x5d7ae0) at job_func.c:509
> 509         job_attr_def[i].at_free(&pj->ji_wattr[i]);
> (gdb) print i
> $18 = 1
>
> Though this is misleading, because free_str is called with
> attr=0x5d7d50, which is the address for i = 0:
>
> (gdb) print &pj->ji_wattr[0]
> $21 = (attribute *) 0x5d7d50
> (gdb) print &pj->ji_wattr[1]
> $22 = (attribute *) 0x5d7d70
>
> So I am not really sure what the value of i actually is. If it is in
> fact i = 1, then there must be some sort of stack corruption between
> job_free and free_str that shifts the pointer passed to free_str back
> by 32 bytes (0x5d7d70 - 0x5d7d50 = 0x20). That would cause a double
> free of pj->ji_wattr[0].at_val.at_str rather than a free of
> pj->ji_wattr[1].at_val.at_str.
>
> I hope this helps. I am going to try to dig into it some more tomorrow.
>
> Cheers,
> Oliver
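
For anyone following along, the address arithmetic in the quoted gdb
session can be checked with a small C sketch. It is only an illustration:
attribute_stub and attr_index are hypothetical names, and the stub is just
padded to the 32-byte stride seen between &pj->ji_wattr[0] and
&pj->ji_wattr[1]; it does not reproduce TORQUE's real attribute layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TORQUE's attribute struct. Only the 32-byte
 * distance between consecutive elements matters here (assumed from the
 * gdb output: 0x5d7d70 - 0x5d7d50 = 0x20), not the actual fields. */
typedef struct {
    unsigned char pad[32];
} attribute_stub;

/* Map an element address back to its array index, given the address of
 * element 0 and the element size. Returns -1 if the address lies before
 * the array or is not aligned to an element boundary. */
ptrdiff_t attr_index(uintptr_t base, uintptr_t elem, size_t stride)
{
    if (stride == 0 || elem < base || (elem - base) % stride != 0) {
        return -1;
    }
    return (ptrdiff_t)((elem - base) / stride);
}
```

With base 0x5d7d50 and a stride of sizeof(attribute_stub), the pointer
seen in free_str (0x5d7d50) maps to index 0, while 0x5d7d70 maps to
index 1. That matches the mismatch described above: the loop counter
reads 1, but the pointer that was passed belongs to element 0.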

