[torquedev] [Bug 98] Allocation of incorrect pointer in src/scheduler.cc/samples/fifo/job_info.c: update_job_comment causes random crash.

bugzilla-daemon at supercluster.org bugzilla-daemon at supercluster.org
Thu Nov 11 04:36:10 MST 2010


http://www.clusterresources.com/bugzilla/show_bug.cgi?id=98

--- Comment #17 from Stephen Usher <steve at earth.ox.ac.uk> 2010-11-11 04:36:10 MST ---
Hmm... After installing that patch I now get a completely different crash:

(gdb) where
#0  0xffffe430 in __kernel_vsyscall ()
#1  0xf75aece1 in raise () from /lib/libc.so.6
#2  0xf75b0632 in abort () from /lib/libc.so.6
#3  0x08049d63 in catch_abort (sig=11) at pbs_sched.c:225
#4  <signal handler called>
#5  0x0804d8d2 in update_job_comment (pbs_sd=775369784, jinfo=0x3832312c,
    comment=0xfffe4e78 "Not Running - PBS Error: Resource temporarily unavailable REJHOST=everest MSG=cannot allocate node 'everest' to job - node not currently available (nps needed/free: 1/-1,  joblist: 128863.newton:0,128"...)
    at job_info.c:695
#6  0x0804c7c0 in run_update_job (pbs_sd=775369784, sinfo=0x7477656e, 
    qinfo=0x303a6e6f, jinfo=0x3832312c) at fifo.c:702
#7  0x3832312c in ?? ()
#8  0x2e373438 in ?? ()
#9  0x7477656e in ?? ()
#10 0x303a6e6f in ?? ()
#11 0x3832312c in ?? ()
#12 0x2e323338 in ?? ()
#13 0x7477656e in ?? ()
#14 0x303a6e6f in ?? ()
#15 0xfffe0029 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Notice that it's not even getting as far as copying the comment!
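
One thing that may help when reading the trace above: the "addresses" in frames #7 to #14, and the jinfo/sinfo/qinfo arguments, all look like printable ASCII. The sketch below decodes them, assuming a 32-bit little-endian x86 process (suggested by the 0xf75... libc addresses, but still an assumption). The helper name word_to_ascii is made up for this illustration and is not part of TORQUE.

```python
import struct

def word_to_ascii(word: int) -> str:
    """Render a 32-bit value as the 4 bytes it occupies in little-endian memory.

    Non-printable bytes are shown as '.'.
    """
    raw = struct.pack("<I", word)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

# Values copied from frames #7..#14 of the backtrace, in order.
frames = [
    0x3832312c, 0x2e373438, 0x7477656e, 0x303a6e6f,
    0x3832312c, 0x2e323338, 0x7477656e, 0x303a6e6f,
]

decoded = "".join(word_to_ascii(w) for w in frames)
print(decoded)

# pbs_sd is printed in decimal by gdb; it decodes the same way.
print(word_to_ascii(775369784))
```

Under that assumption the frames read as job-list text of the same form as the joblist in the comment string, and pbs_sd (775369784 = 0x2e373438) is four characters of the same text. That would be consistent with the stack having been overwritten by the job list before update_job_comment was entered, though the trace alone does not show where the overwrite happens.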


