[torqueusers] pbs_sched crash

Alexander Saydakov saydakov at yahoo-inc.com
Wed Mar 22 12:04:08 MST 2006

Last night pbs_sched crashed, leaving our 70+ nodes idle all night long :(


-rw-------  1 root  wheel  1612468224 Mar 21 23:07 pbs_sched.core


Note the size: about 1.6 GB!


We are running TORQUE 2.0.0p7.


> gdb pbs_sched pbs_sched.core

GNU gdb 4.18 (FreeBSD)

This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at line 2627 in elfstab_build_psymtabs

Deprecated bfd_read called at line 933 in fill_symbuf


Core was generated by `pbs_sched'.

Program terminated with signal 11, Segmentation fault.

Reading symbols from /usr/lib/libkvm.so.2...done.

Reading symbols from /usr/lib/libc.so.4...done.

Reading symbols from /usr/libexec/ld-elf.so.1...done.

#0  0x1013c8e in pbs_rescquery (c=0, resclist=0x9fbff484, num_resc=1,
    available=0x9fbff498, allocated=0x9fbff494, reserved=0x9fbff490,
    at ./../Libifl/pbsD_resc.c:218

218           *(available + i) = *(reply->brp_un.brp_rescq.brq_avail + i);
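
Line 218 dereferences both the reply and the caller's output array without any check that I can see in the frame above. I have not confirmed from the core which pointer was bad, so the following is only a minimal sketch of the kind of guard I would expect around that copy. The names rescq_reply and copy_avail are made up for illustration and are not the actual Libifl types or functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for the reply fields read at pbsD_resc.c:218.
 * The real structure in Libifl is different; this only models the
 * "count plus array" shape needed for the example. */
struct rescq_reply {
    int  num;    /* number of entries the server actually returned */
    int *avail;  /* per-resource "available" counts, may be NULL */
};

/* Copy num_resc "available" values out of the reply.
 * Returns 0 on success, -1 if the reply or the buffers are unusable,
 * instead of dereferencing a bad pointer and segfaulting. */
int copy_avail(const struct rescq_reply *reply, int *available, int num_resc)
{
    int i;

    if (reply == NULL || reply->avail == NULL || available == NULL)
        return -1;

    /* Refuse to read past what the server sent back. */
    if (num_resc < 0 || reply->num < num_resc)
        return -1;

    for (i = 0; i < num_resc; i++)
        available[i] = reply->avail[i];

    return 0;
}
```

With a check like this the caller gets an error code back when the reply is missing or short, and the scheduler could log it and carry on instead of dying.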


