Bugzilla – Bug 187
segfault in job_abt after dealing with array dependencies
Last modified: 2012-04-26 00:04:17 MDT
This code in job_abt:

  if (pjob->ji_wattr[JOB_ATR_depend].at_flags & ATR_VFLAG_SET)
    {
    strcpy(jobid, pjob->ji_qs.ji_jobid);
    depend_on_term(pjob);
    pjob = find_job(jobid);
    }

  /* update internal array bookeeping values */
  if ((pjob->ji_arraystruct != NULL) &&
      (pjob->ji_is_array_template == FALSE))
    {
    ...
    }

is causing a seg fault for us in torque 4.0.1, r6023: find_job is changing pjob to be NULL, and the following conditional statement then crashes when it dereferences pjob. Strangely, the code within the conditional statement has several checks for pjob being NULL, while the condition itself does not.

This patch:

Index: src/server/job_func.c
===================================================================
--- src/server/job_func.c (revision 6023)
+++ src/server/job_func.c (working copy)
@@ -526,10 +526,14 @@
     strcpy(jobid, pjob->ji_qs.ji_jobid);
     depend_on_term(pjob);
     pjob = find_job(jobid);
+    if (pjob == NULL){
+      log_event(PBSEVENT_JOB, PBS_EVENTCLASS_JOB, jobid, "lost job after setting up dependencies.");
     }
+    }
   /* update internal array bookeeping values */
-  if ((pjob->ji_arraystruct != NULL) &&
+  if ((pjob != NULL) &&
+      (pjob->ji_arraystruct != NULL) &&
       (pjob->ji_is_array_template == FALSE))
     {
     job_array *pa = get_jobs_array(&pjob);

resolves the crash. It seems like the code was intended to have this check, but it was lost/missed/deleted.
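For anyone who wants to see the failure mode outside of the server, here is a minimal standalone sketch of the same pattern: save the id, call something that may remove the job, look the job up again, and check the result before dereferencing it. Everything in it (struct sketch_job, job_table, and the sketch_* functions) is a made-up stand-in for illustration, not TORQUE code; has_array and is_template only loosely mirror ji_arraystruct and ji_is_array_template.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_JOBS 4
#define ID_LEN   32

/* Hypothetical stand-in for the server's job struct; only what the sketch needs. */
struct sketch_job {
    char id[ID_LEN];
    int  in_use;       /* 0 once the job has been removed from the table */
    int  has_array;    /* loosely mirrors ji_arraystruct != NULL */
    int  is_template;  /* loosely mirrors ji_is_array_template */
};

static struct sketch_job job_table[MAX_JOBS];

/* Add a job to the table; returns its slot, or NULL if the table is full. */
struct sketch_job *sketch_add_job(const char *id, int has_array, int is_template)
{
    assert(id != NULL && strlen(id) < ID_LEN);
    for (size_t i = 0; i < MAX_JOBS; i++) {
        if (!job_table[i].in_use) {
            strcpy(job_table[i].id, id);
            job_table[i].in_use = 1;
            job_table[i].has_array = has_array;
            job_table[i].is_template = is_template;
            return &job_table[i];
        }
    }
    return NULL;
}

/* Look a job up by id; returns NULL if it is no longer in the table. */
struct sketch_job *sketch_find_job(const char *id)
{
    assert(id != NULL);
    for (size_t i = 0; i < MAX_JOBS; i++) {
        if (job_table[i].in_use && strcmp(job_table[i].id, id) == 0)
            return &job_table[i];
    }
    return NULL;
}

/* Stand-in for a dependency hook that may remove the job as a side effect. */
void sketch_depend_on_term(struct sketch_job *pjob, int removes_job)
{
    if (removes_job)
        pjob->in_use = 0;
}

/*
 * Returns 1 if array bookkeeping would run, 0 if it is skipped,
 * and -1 if the job disappeared during the dependency step.
 */
int sketch_abort(struct sketch_job *pjob, int removes_job)
{
    char jobid[ID_LEN];

    strcpy(jobid, pjob->id);            /* save the id before pjob can go stale */
    sketch_depend_on_term(pjob, removes_job);
    pjob = sketch_find_job(jobid);      /* may now be NULL */

    if (pjob == NULL)
        return -1;                      /* the guard the original condition lacked */

    if (pjob->has_array && !pjob->is_template)
        return 1;

    return 0;
}
```

Without the pjob == NULL test, the last condition would dereference a null pointer whenever the hook removes the job, which is the same shape as the crash reported above. The sketch returns early rather than folding the test into the condition as the patch does; either form avoids the dereference.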