[torquedev] failing job init is fubar
garrick at clusterresources.com
Wed Mar 7 16:11:49 MST 2007
Turns out that any failure to initialize a job when pbs_server is
restarting is entirely mishandled and generally causes it to segfault.
An easy way to trigger this is to create a temp execution queue, submit
a job to that queue, stop pbs_server, remove the queue state file, and
start pbs_server again. Trying to re-enqueue the job into a nonexistent
queue fails the job init.
<- returns PBSE_UNKQUE
<- returns after completely free()ing the job struct
<- returns void
has no idea anything went wrong and continues to access pjob and blows up.
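
For anyone picking this up, here is a minimal sketch of one way the call chain above could propagate the failure, so the outermost caller stops touching pjob once it has been freed. This is not the actual pbs_server code: every demo_* name, the struct layout, and the DEMO_* values are hypothetical stand-ins, and the queue list is just an array of names. The idea is that the middle step returns a status instead of void and NULLs the caller's pointer when it frees the job.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; not the real TORQUE types or error codes. */
#define DEMO_OK     0
#define DEMO_UNKQUE 1   /* plays the role of PBSE_UNKQUE */

struct demo_job {
    char queue[32];     /* name of the queue the job was submitted to */
};

/* Innermost step: fails if the job's queue is not among the known queues. */
static int demo_enqueue(const struct demo_job *pjob,
                        const char *const *queues, size_t nqueues)
{
    for (size_t i = 0; i < nqueues; i++) {
        if (strcmp(pjob->queue, queues[i]) == 0)
            return DEMO_OK;
    }
    return DEMO_UNKQUE;
}

/* Middle step: on failure it frees the job, NULLs the caller's pointer,
 * and returns the error code instead of returning void. */
static int demo_init_job(struct demo_job **ppjob,
                         const char *const *queues, size_t nqueues)
{
    int rc = demo_enqueue(*ppjob, queues, nqueues);
    if (rc != DEMO_OK) {
        free(*ppjob);
        *ppjob = NULL;
    }
    return rc;
}

/* Outermost step: recovers one job at startup.
 * Returns 1 if the job was kept, 0 if it had to be dropped. */
int demo_recover_job(const char *queue_name,
                     const char *const *queues, size_t nqueues)
{
    struct demo_job *pjob = calloc(1, sizeof(*pjob));
    if (pjob == NULL)
        return 0;
    /* calloc zeroed the buffer, so the copy stays NUL-terminated. */
    strncpy(pjob->queue, queue_name, sizeof(pjob->queue) - 1);

    if (demo_init_job(&pjob, queues, nqueues) != DEMO_OK)
        return 0;       /* pjob is already freed; do not touch it */

    printf("recovered job in queue %s\n", pjob->queue);
    free(pjob);
    return 1;
}
```

Calling demo_recover_job("tmpq", queues, 1) with queues containing only "batch" returns 0 without dereferencing the freed job, which is the case that currently segfaults. Whether the real fix should drop the job, hold it, or move it to a default queue is a separate design question this sketch does not try to answer.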
So, um, this might be fun for someone else to fix :)