Bugzilla – Bug 73
Reported By Stuart Barkley
Last modified: 2010-07-28 20:25:03 MDT
Problem 1: pbs_server crash:
For a while I was seeing pbs_server crash each time moab was
restarted. I was playing with moab REMAPCLASS and REMAPCLASSLIST
configurations. With REMAPCLASS disabled, pbs_server did not crash.
Guess: I had something queued which was being remapped upon moab
restart, and that would crash pbs_server (the jobs do not get remapped).
After restarting pbs_server things were okay and new jobs were
Additional note: There was a large array job with both running (~1500)
and queued (~1000) tasks. There may have been some confusion when
remapping of the queued tasks was attempted.
Some more notes: I've just seen another instance of this problem. If
I quickly submit several jobs which need to be remapped, pbs_server
will die. If there is only a single job needing to be remapped when
moab restarts, pbs_server does not die and the remapping happens.
It looks like pbs_server dies if multiple remaps happen either too
quickly or simultaneously.
Queuing a single array job does not crash pbs_server. I see the
individual tasks get remapped over time.
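The reproduction described above could be scripted roughly as follows. This is only a sketch under assumptions not stated in the report: the queue name `batch` stands in for whichever class is configured for remapping, the job count and the 30-second pause are arbitrary, and the helper names `build_repro_commands` and `run_repro` are made up for illustration. It defaults to a dry run that only prints the commands.

```python
import subprocess


def build_repro_commands(n_jobs=5, queue="batch"):
    """Return the shell commands for one reproduction attempt, in order.

    `queue` is an assumed name for the class that moab remaps.
    """
    # Submit several small jobs back to back so that more than one job
    # is waiting to be remapped when moab comes back up.
    submits = [f"echo 'sleep 60' | qsub -q {queue}" for _ in range(n_jobs)]
    # Recycle moab, wait an arbitrary 30 seconds for it to start
    # remapping, then check whether a pbs_server process still exists.
    return submits + ["mschedctl -R", "sleep 30", "pgrep -x pbs_server"]


def run_repro(n_jobs=5, queue="batch", dry_run=True):
    """Print the commands; execute them only when dry_run is False.

    Returns None in dry-run mode, otherwise True if pbs_server is still
    running after the restart and False if it is gone.
    """
    result = None
    for cmd in build_repro_commands(n_jobs, queue):
        print(cmd)
        if not dry_run:
            result = subprocess.run(cmd, shell=True)
    if dry_run:
        return None
    # pgrep exits with status 0 when a matching process was found.
    return result.returncode == 0


if __name__ == "__main__":
    run_repro()
```

Calling `run_repro(n_jobs=1, dry_run=False)` on a test system would correspond to the single-job case, which reportedly does not crash.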
The crash has a patch that is being reviewed by Glen and me.
Created an attachment (id=43)
Fix has been checked in to 2.5
(In reply to comment #3)
> Fix has been checked in to 2.5
I merged the fix into trunk as well.