[torquedev] Double free and touches of freed memory inside pbs_server

Ken Nielson knielson at adaptivecomputing.com
Mon Aug 9 11:33:30 MDT 2010


TORQUE 2.5.0 introduced a new function in req_modifyjob named modify_job. In previous versions of TORQUE, req_modifyjob called relay_to_mom directly and returned on success. It skipped the call to reply_ack, so there were no problems in earlier versions. Because of the checkpoint work and other things happening in modify_job, req_modifyjob now monitors the return code of modify_job and branches to different error routines based on it. I added a new error code, PBSE_RELAYED_TO_MOM, to let req_modifyjob know the job went to the mom so that it returns without calling reply_ack.
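A minimal sketch of that control flow, as a toy model rather than the attached patch. Every toy_ name, the struct, and the numeric return codes are assumptions made for illustration, and the free is replaced by a counter so the ownership hand-off is easy to check:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes; the real PBSE_RELAYED_TO_MOM value is not shown here. */
enum { TOY_OK = 0, TOY_RELAYED_TO_MOM = 1 };

/* Hypothetical stand-in for struct batch_request. */
struct toy_request {
    int release_count;  /* times the reply path released this request */
};

/* Models reply_ack -> reply_send -> free of the request (counted, not freed). */
static void toy_reply_ack(struct toy_request *req)
{
    assert(req != NULL);
    req->release_count++;
}

/* Models modify_job: a running job is relayed to the MOM. */
static int toy_modify_job(struct toy_request *req, int job_running)
{
    assert(req != NULL);
    return job_running ? TOY_RELAYED_TO_MOM : TOY_OK;
}

/* Models the callback that runs when the MOM answers. */
static void toy_post_modify_req(struct toy_request *req)
{
    toy_reply_ack(req);
}

/* Models the patched req_modifyjob. */
static int toy_req_modifyjob(struct toy_request *req, int job_running)
{
    int rc = toy_modify_job(req, job_running);

    if (rc == TOY_RELAYED_TO_MOM)
        return rc;          /* the callback now owns the reply */

    toy_reply_ack(req);     /* no relay: reply and release here */
    return rc;
}
```

In this model a queued job is released once by toy_req_modifyjob, and a running job is released once by toy_post_modify_req, so each request is released exactly once on either path.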

I have attached the patch. I think this is better suited to the problem than modifying batch_request to handle the rq_refcount element.
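For comparison, the reference-counting alternative could look roughly like the sketch below. This is not Eygene's patch and not the TORQUE batch_request layout; the struct, the toy_ helpers, and the global counter are assumptions used only to show the idea of a refcount field deferring the free until the last holder lets go:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical request carrying a reference count, in the spirit of rq_refcount. */
struct toy_request {
    int refcount;
};

/* Counts real calls to free() so the demo can be checked. */
static int toy_frees = 0;

/* Allocate a request owned by one holder. */
static struct toy_request *toy_alloc(void)
{
    struct toy_request *r = malloc(sizeof *r);
    if (r != NULL)
        r->refcount = 1;
    return r;
}

/* A second holder (for example a scheduled callback) takes a reference. */
static void toy_hold(struct toy_request *r)
{
    r->refcount++;
}

/* Drop one reference; only the last release frees the memory. */
static void toy_release(struct toy_request *r)
{
    assert(r->refcount > 0);
    if (--r->refcount == 0) {
        free(r);
        toy_frees++;
    }
}

/* Two release calls, one actual free. Returns -1 if allocation fails. */
static int toy_refcount_demo(void)
{
    struct toy_request *r = toy_alloc();
    if (r == NULL)
        return -1;
    toy_frees = 0;
    toy_hold(r);     /* relay path keeps the request for the callback */
    toy_release(r);  /* immediate reply path */
    toy_release(r);  /* callback after the MOM answers: frees here */
    return toy_frees;
}
```

The cost of this approach is that every path that stores the pointer has to remember to take a reference, which is part of why a return-code change is the smaller fix here.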

This also explains why the other functions which call relay_to_mom do not have a problem. Their code has not been modified.


Ken 

----- Original Message -----
From: "Ken Nielson" <knielson at adaptivecomputing.com>
To: "Torque Developers mailing list" <torquedev at supercluster.org>
Sent: Monday, August 9, 2010 10:26:10 AM
Subject: Re: [torquedev] Double free and touches of freed memory inside pbs_server

Anytime we call relay_to_mom we have the potential to hand the batch_request buffer from process_request to a task. req_modifyjob is interesting because it does not always have to talk to a mom to modify a job, only if the job is running. The code path for req_modifyjob ends up calling reply_ack every time, which calls reply_send, and reply_send frees the batch_request. If relay_to_mom is called because the job is running, a task to start post_modify_req is scheduled. post_modify_req also calls reply_ack. Whoever calls reply_ack last causes the double free.
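The sequence can be modeled with a small toy program. This is only a sketch: the toy_ functions and the struct are hypothetical stand-ins for the real pbs_server routines, and the free is replaced by a counter so the demonstration does not itself invoke undefined behavior:

```c
#include <assert.h>

/* Hypothetical stand-in for struct batch_request; not the TORQUE definition. */
struct toy_request {
    int release_count;  /* times the reply path released this request */
    int relayed;        /* set when a MOM callback still holds the pointer */
};

/* Models reply_ack -> reply_send -> free of the request (counted, not freed). */
static void toy_reply_ack(struct toy_request *req)
{
    req->release_count++;
}

/* Models relay_to_mom scheduling post_modify_req for later. */
static void toy_relay_to_mom(struct toy_request *req)
{
    req->relayed = 1;
}

/* Models the callback that runs once the MOM answers. */
static void toy_post_modify_req(struct toy_request *req)
{
    assert(req->relayed);
    toy_reply_ack(req);
}

/* Models the 2.5.0 path: reply_ack runs whether or not the request was relayed.
   Returns how many times the request was released. */
static int toy_buggy_modify(int job_running)
{
    struct toy_request req = {0, 0};

    if (job_running)
        toy_relay_to_mom(&req);

    toy_reply_ack(&req);              /* first release, always */

    if (req.relayed)
        toy_post_modify_req(&req);    /* second release: the double free */

    return req.release_count;
}
```

A release count of 2 for a running job corresponds to the double free; a queued job is released only once.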

The reason this does not happen for every call to relay_to_mom is that each function schedules a different callback routine and does not always call reply_ack. We may have a memory leak in some cases, because some of the routines do not call reply_ack and a call to relay_to_mom is not guaranteed to happen.

Ken

----- Original Message -----
From: "Garrick Staples" <garrick at usc.edu>
To: torquedev at supercluster.org
Sent: Thursday, August 5, 2010 1:11:01 PM
Subject: Re: [torquedev] Double free and touches of freed memory inside pbs_server

Reading your description makes it sound like a problem with all server->mom
requests, but it actually only happens with modify_job()? If so, what is
special about modify_job()?


On Thu, Aug 05, 2010 at 09:26:40PM +0400, Eygene Ryabinkin alleged:
> Good day.
> 
> It looks like I dug up a case where pbs_server frees the memory,
> then touches it and then frees it again.  I experienced it with
> 2.5.1, but it looks like most versions should have this problem.
> 
> Here's what happens:
>  - modifyjob request comes in, process_request() will allocate
>    new request with alloc_br();
>  - then dispatch_request() will call req_modifyjob() that in turn
>    will call modify_job(), which in some cases (when job attributes
>    are to be changed) will call relay_to_mom();
>  - relay_to_mom() will insert this request (allocated with alloc_br())
>    into task_list_event (by calling issue_Drequest());
>  - modify_job() will do its job and req_modifyjob() will call
>    reply_ack() that will invoke reply_send();
>  - reply_send() sends the reply and calls free_br() on our request;
>    _but_ the same request was pushed to the task_list_event, so
>    once the MOM will reply, pbs_server will touch the freed memory
>    chunk and will free it once again.
> 
> Since there can be modifications of multiple jobs per one client's
> request (via req_modifyarray()) and it is rather hard to make a proper
> deep copy of a request (at least, it is hard for me), I ended up with a
> simple refcounting patch.  It works in the sense that pbs_server stopped
> dumping core (because glibc detects double frees on CentOS 5.5 and calls
> abort()), but pbs_server for 2.5.1 was responding to requests like
> 'qstat -Bf' very slowly (with and without my patch), so I rolled
> back to 2.4.9 on our production infrastructure.
> 
> The patch is attached, and it would be very good if someone were
> able to evaluate both the patch and the logic above.
> 
> Meanwhile, I will try to backport the patch for 2.4.9 and use it
> on our production systems.
> 
> Thanks!
> -- 
> Eygene Ryabinkin, Russian Research Centre "Kurchatov Institute"


> _______________________________________________
> torquedev mailing list
> torquedev at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torquedev


-- 
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

Life is Good!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2.5.2-doublefree.patch
Type: text/x-patch
Size: 1207 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/torquedev/attachments/20100809/dda37fe6/attachment.bin 