Bugzilla – Bug 82
Unnecessary code in send_job
Last modified: 2011-03-29 20:10:53 MDT
You need to log in before you can comment on or make changes to this bug.
This can and probably should be removed: if (move_type != MOVE_TYPE_Exec) /* not if sending to MOM */ job_purge(jobp); This is a forked child and therefore it doesn't matter at all if the job gets purged. The process ends after few lines anyway. Plus the job variable is being accessed in error handling for PBSD_commit.
I'm planning on committing this patch to trunk, 2.5-fixes, and 2.4-fixes - hopefully within the next 24hours
job_purge() changes and deletes files on disk. It doesn't matter if it is called from a child process; it is still an important function. There might be some bug or not. I don't know. I'm not commenting on that. I'm just saying that the requirement for job_purge is not invalidated by being called from a child process.
(In reply to comment #2) > job_purge() changes and deletes files on disk. It doesn't matter if it is > called from a child process; it is still an important function. > > There might be some bug or not. I don't know. I'm not commenting on that. I'm > just saying that the requirement for job_purge is not invalidated by being > called from a child process. this is a good point, we need to make sure that the files associated with the job get cleaned up, but the parent process must call job_purge() at some point, otherwise the job would persist in the parent process's linked lists indefinitely (until a reboot, at which time the job would not be recovered because the job_purge called by the child removed the files).
(In reply to comment #3) > (In reply to comment #2) > > job_purge() changes and deletes files on disk. It doesn't matter if it is > > called from a child process; it is still an important function. > > > > There might be some bug or not. I don't know. I'm not commenting on that. I'm > > just saying that the requirement for job_purge is not invalidated by being > > called from a child process. > > this is a good point, we need to make sure that the files associated with the > job get cleaned up, but the parent process must call job_purge() at some point, > otherwise the job would persist in the parent process's linked lists > indefinitely (until a reboot, at which time the job would not be recovered > because the job_purge called by the child removed the files). not making this change until this can be investigated further. It definitely sounds like a bug since the job could be referenced after the job_purge.
(In reply to comment #3) > (In reply to comment #2) > > job_purge() changes and deletes files on disk. It doesn't matter if it is > > called from a child process; it is still an important function. > > > > There might be some bug or not. I don't know. I'm not commenting on that. I'm > > just saying that the requirement for job_purge is not invalidated by being > > called from a child process. > > this is a good point, we need to make sure that the files associated with the > job get cleaned up, but the parent process must call job_purge() at some point, > otherwise the job would persist in the parent process's linked lists > indefinitely (until a reboot, at which time the job would not be recovered > because the job_purge called by the child removed the files). The real job_purge is called from post_movejob (and only if the move succeeded of course).
(In reply to comment #5) > (In reply to comment #3) > > (In reply to comment #2) > > > job_purge() changes and deletes files on disk. It doesn't matter if it is > > > called from a child process; it is still an important function. > > > > > > There might be some bug or not. I don't know. I'm not commenting on that. I'm > > > just saying that the requirement for job_purge is not invalidated by being > > > called from a child process. > > > > this is a good point, we need to make sure that the files associated with the > > job get cleaned up, but the parent process must call job_purge() at some point, > > otherwise the job would persist in the parent process's linked lists > > indefinitely (until a reboot, at which time the job would not be recovered > > because the job_purge called by the child removed the files). > > The real job_purge is called from post_movejob (and only if the move succeeded > of course). right net_move calls send_job with a post_movejob as the argument for the post_func parameter. In send_job the parent sets up a work task to run post_movejob when the child exits. The job_purge in the child is redundant, and the parent is the right place to do it. The only other place send_job() is called is in req_runjob()->svr_startjob()->svr_strtjob2() to send the job to a mom. We don't want the job to get purged in this case, and svr_strtjob2 sets up its own post function to do any required cleanup. This change looks valid to me.
checked into trunk I'll wait a bit before making the change in 2.5-fixes and 2.4-fixes
fixed in 2.4-fixes and 2.5-fixes as well. this still needs to make it into the 3.0 branch
should be fixed in all active branches