[torquedev] [Bug 98] Allocation of incorrect pointer in src/scheduler.cc/samples/fifo/job_info.c: update_job_comment causes random crash.
bugzilla-daemon at supercluster.org
Thu Nov 11 04:36:10 MST 2010
http://www.clusterresources.com/bugzilla/show_bug.cgi?id=98
--- Comment #17 from Stephen Usher <steve at earth.ox.ac.uk> 2010-11-11 04:36:10 MST ---
Hmm.. After installing that patch I now get a totally different crash:
(gdb) where
#0 0xffffe430 in __kernel_vsyscall ()
#1 0xf75aece1 in raise () from /lib/libc.so.6
#2 0xf75b0632 in abort () from /lib/libc.so.6
#3 0x08049d63 in catch_abort (sig=11) at pbs_sched.c:225
#4 <signal handler called>
#5 0x0804d8d2 in update_job_comment (pbs_sd=775369784, jinfo=0x3832312c,
    comment=0xfffe4e78 "Not Running - PBS Error: Resource temporarily unavailable REJHOST=everest MSG=cannot allocate node 'everest' to job - node not currently available (nps needed/free: 1/-1, joblist: 128863.newton:0,128"...)
    at job_info.c:695
#6 0x0804c7c0 in run_update_job (pbs_sd=775369784, sinfo=0x7477656e,
qinfo=0x303a6e6f, jinfo=0x3832312c) at fifo.c:702
#7 0x3832312c in ?? ()
#8 0x2e373438 in ?? ()
#9 0x7477656e in ?? ()
#10 0x303a6e6f in ?? ()
#11 0x3832312c in ?? ()
#12 0x2e323338 in ?? ()
#13 0x7477656e in ?? ()
#14 0x303a6e6f in ?? ()
#15 0xfffe0029 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Notice that it's not even getting as far as copying the comment!
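
The bogus frame addresses (#7 to #14) and the jinfo/sinfo/qinfo arguments all look like printable ASCII when read byte by byte, which would be consistent with a string having been written over the stack. Below is a minimal sketch for checking that. It assumes a little-endian 32-bit process (which the 0x0804.... code addresses suggest), and decode_words is a hypothetical helper written for this illustration, not anything in TORQUE.

```c
#include <stddef.h>
#include <stdint.h>

/* Frame addresses #7 to #14, copied from the backtrace above. */
const uint32_t frame_words[] = {
    0x3832312c, 0x2e373438, 0x7477656e, 0x303a6e6f,
    0x3832312c, 0x2e323338, 0x7477656e, 0x303a6e6f
};

/*
 * Hypothetical helper (not part of TORQUE): unpack each 32-bit word into
 * four characters, least significant byte first, which is the order a
 * little-endian machine stores them in memory. Non-printable bytes are
 * shown as '.'. The caller must supply room for 4 * n + 1 bytes in out.
 */
void decode_words(const uint32_t *words, size_t n, char *out)
{
    size_t i;
    int b;

    for (i = 0; i < n; i++) {
        for (b = 0; b < 4; b++) {
            unsigned char c = (unsigned char)((words[i] >> (8 * b)) & 0xff);
            out[4 * i + b] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
        }
    }
    out[4 * n] = '\0';
}
```

Calling decode_words(frame_words, 8, out) with a 33-byte buffer leaves ",128847.newton:0,128832.newton:0" in out. That has the same shape as the joblist text in the comment argument, and jinfo, sinfo and qinfo hold the same words (",128", "newt", "on:0"), so those pointers may simply be pieces of that string.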