Bugzilla – Bug 73
Reported By Stuart Barkley
Last modified: 2010-07-28 20:25:03 MDT
Problem 1: pbs_server crash

For a while I was seeing pbs_server crash each time moab was restarted. I was experimenting with the moab REMAPCLASS and REMAPCLASSLIST configurations. With REMAPCLASS disabled, pbs_server did not crash.

Guess: I had something queued that was being remapped upon moab restart, and this would crash pbs_server (the jobs do not get remapped). After restarting pbs_server things were okay and new jobs were remapped correctly.

Additional note: There was a large array job with both running (~1500) and queued (~1000) tasks. There may have been some confusion when remapping of the queued tasks was attempted.

Some more notes: I've just seen another instance of this problem. If I quickly submit several jobs which need to be remapped, pbs_server will die. If there is only a single job needing to be remapped when moab restarts, pbs_server does not die and the remapping happens. It looks like pbs_server dies if multiple remaps happen either too quickly or simultaneously. Queuing a single array job does not crash pbs_server; I see the individual tasks get remapped over time.
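A minimal sketch of the "submit several jobs quickly" scenario described above, for anyone trying to reproduce it. It assumes TORQUE's `qsub` is on the PATH and that the target queue is one moab is configured to remap; the queue name `batch`, the job body `sleep 60`, and both helper names are hypothetical and not taken from this report. By default it only returns the planned submissions without running anything.

```python
import shutil
import subprocess


def build_submissions(count, queue="batch", script="sleep 60\n"):
    """Return (argv, stdin_text) pairs, one per job to submit.

    The queue name and job script are placeholders (assumptions);
    substitute a queue that REMAPCLASS would actually remap.
    """
    return [(["qsub", "-q", queue], script) for _ in range(count)]


def submit_burst(count, queue="batch", dry_run=True):
    """Submit `count` jobs back to back with no pause between them.

    With dry_run=True (the default), or when qsub is not installed,
    nothing is submitted and only the plan is returned.
    """
    plan = build_submissions(count, queue)
    if dry_run or shutil.which("qsub") is None:
        return plan
    for argv, stdin_text in plan:
        # qsub reads the job script from stdin when no script file is given
        subprocess.run(argv, input=stdin_text, text=True, check=True)
    return plan
```

Calling `submit_burst(5, dry_run=False)` on a test system and then restarting moab would approximate the multi-job case; calling it with a count of 1 approximates the single-job case that did not crash.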
The crash has a patch that is being reviewed by Glen and me. David
Created an attachment (id=43): Patch
Fix has been checked in to 2.5
(In reply to comment #3)
> Fix has been checked in to 2.5

I merged the fix into trunk as well.