Bug 101 - record_jobinfo pbs_server segfault
Status: RESOLVED FIXED
Product: TORQUE
Component: pbs_server
Version: 2.5.x
Platform: PC Linux
Importance: P5 critical
Assigned To: Glen
Reported: 2010-12-03 20:29 MST by Martin Siegert
Modified: 2010-12-06 10:00 MST (History)


Attachments
torque-2.5.3 svr_recov patch (502 bytes, patch)
2010-12-04 20:28 MST, Martin Siegert

Description Martin Siegert 2010-12-03 20:29:54 MST
This is with torque-2.5.3.
After setting record_job_info = True, pbs_server segfaults:

(gdb) c
Continuing.
Detaching after fork from child process 9544.

Program received signal SIGSEGV, Segmentation fault.
0x000000000044fdac in escape_xml (in=0x0, out=0x7fffbe60473d "", size=4083)
    at u_xml.c:207
207       int len = strlen(in);
(gdb) where
#0  0x000000000044fdac in escape_xml (in=0x0, out=0x7fffbe60473d "", size=4083)
    at u_xml.c:207
#1  0x0000000000442f7e in attr_to_str (out=0x7fffbe604730 "\t\t<neednodes>", 
    size=4096, at_def=0x661d60, attr=
        {at_flags = 1, at_type = 6, at_val = {at_long = 299521440, at_ll =
299521440, at_char = -96 '�', at_str = 0x11da55a0 "���\021", at_arst =
0x11da55a0, at_size = {atsv_num = 299521440, atsv_shift = 192, atsv_units = 0},
at_list = {ll_prior = 0x11da55a0, ll_next = 0x11e6a4c0, ll_struct = 0x0},
at_jinfo = 0x11da55a0, at_short = 21920}}, XML=1) at svr_recov.c:493
#2  0x000000000040de9e in record_jobinfo (pjob=0x11da5830) at job_func.c:1416
#3  0x000000000040e090 in job_purge (pjob=0x11da5830) at job_func.c:1495
#4  0x00000000004268f6 in on_job_exit (ptask=0x11e66620) at req_jobobit.c:1572
#5  0x0000000000445294 in dispatch_task (ptask=0x11e66620) at svr_task.c:206
#6  0x000000000041ca00 in next_task () at pbsd_main.c:963
#7  0x000000000041cd73 in main_loop () at pbsd_main.c:1119
#8  0x000000000041dc54 in main (argc=1, argv=0x7fffbe609b68)
    at pbsd_main.c:1748
(gdb)

- Martin
Comment 1 Martin Siegert 2010-12-04 20:28:34 MST
Created an attachment (id=66)
torque-2.5.3 svr_recov patch

Safeguards similar to those at line 384 are needed before calling escape_xml.
The attached patch appears to solve the problem.
Comment 2 Ken Nielson 2010-12-06 10:00:44 MST
This patch has been committed to 2.5-fixes, 3.0 and trunk.
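
For readers without the attachment: the backtrace shows escape_xml being called with in=0x0 and crashing in strlen(in). The committed patch is not reproduced here. What follows is only a minimal sketch of the kind of safeguard Comment 1 describes. The names escape_xml_sketch and append_str_attr_sketch, their signatures, and their return codes are hypothetical stand-ins and are not taken from the TORQUE sources.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for escape_xml() in u_xml.c; the real function
 * may differ. Returns 0 on success, -1 on NULL input or if out is too
 * small. On failure, out is left as an empty string. */
static int escape_xml_sketch(const char *in, char *out, size_t size)
{
    size_t used = 0;

    if (out == NULL || size == 0)
        return -1;
    out[0] = '\0';

    /* The guard that was missing in the crash: never call strlen(NULL). */
    if (in == NULL)
        return -1;

    for (; *in != '\0'; in++) {
        const char *rep;
        char one[2] = { *in, '\0' };
        size_t len;

        switch (*in) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:   rep = one;      break;
        }

        len = strlen(rep);
        if (used + len + 1 > size) {   /* +1 for the terminating NUL */
            out[0] = '\0';
            return -1;
        }
        memcpy(out + used, rep, len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}

/* Hypothetical caller-side guard: an unset string attribute (NULL value)
 * is written as an empty string instead of being passed on for escaping. */
static int append_str_attr_sketch(const char *value, char *out, size_t size)
{
    if (value == NULL) {
        if (out != NULL && size > 0)
            out[0] = '\0';
        return 0;
    }
    return escape_xml_sketch(value, out, size);
}
```

Checking in both places is deliberate in this sketch: the caller-side check treats an unset attribute as "nothing to write", while the check inside the escaping routine keeps any other caller from hitting the same crash.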