Bugzilla – Bug 35
Pbs_server writes the wrong PID number to $pbs_home/server_priv/server.lock
Last modified: 2009-12-04 16:40:22 MST
When pbs_server starts, it writes the wrong PID number to the
$pbs_home/server_priv/server.lock file. The PID number written to this file is
the pbs_server PID number minus 1. This prevents the /etc/init.d/pbs script
from properly stopping the server; only the scheduler is stopped.
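The off-by-one symptom is what you would expect if the PID were recorded before the final fork: on Linux the forked child's PID is typically (though not guaranteed to be) one greater than the parent's. A minimal, hypothetical sketch of that buggy ordering, using a temporary file as a stand-in for server.lock rather than TORQUE's actual code:

```python
import os
import tempfile

fd, lock_path = tempfile.mkstemp()   # stand-in for server_priv/server.lock
os.close(fd)

# Buggy ordering (illustrative): record getpid() *before* the final fork,
# so the lock file names the soon-to-exit parent, not the daemon.
with open(lock_path, "w") as f:
    f.write(str(os.getpid()))

daemon = os.fork()                   # the child plays the daemon's role
if daemon == 0:
    os._exit(0)                      # stand-in for the daemon's main loop
os.waitpid(daemon, 0)

recorded = int(open(lock_path).read())
# `recorded` is the pre-fork (parent) PID, not the daemon's PID
print(recorded, daemon)
```

An init script that does `kill $(cat server.lock)` would therefore signal the wrong (already-exited) PID and leave the daemon running.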
[root@fn1 ~]# ps -ef | grep pbs_server
root 18669 1 0 12:02 ? 00:00:00 /usr/torque/sbin/pbs_server
root 19016 744 0 16:47 pts/1 00:00:00 grep pbs_server
[root@fn1 ~]# cat /var/spool/torque/server_priv/server.lock
It appears that pbs_server writes out this lock file before it forks itself to
put itself into the background, and the bug appears at least as far back as
later 2.3.x versions. Some of this code was modified for "high availability"
mode where multiple pbs_servers could be monitoring the same lock file.
I am going to propose a solution to the TORQUE developers mailing list for
comments, and we should get this fixed in 2.3 and 2.4 branches (as well as
Actually, I take my comment back. The bug is not in the 2.3.x branch; it
appeared in 2.4.x.
As far as I can tell, at least when not running in HA mode, the code looks like
it should do the right thing: fork, create a new session, and write the session
ID (which should be the same as the PID of the newly forked process) to the
lock file.
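For comparison, here is a minimal sketch of the intended ordering described above: fork first, create a new session, then write the PID. This again uses a temporary file as a stand-in for server.lock and hypothetical structure, not TORQUE's actual code:

```python
import os
import tempfile

fd, lock_path = tempfile.mkstemp()   # stand-in for server_priv/server.lock
os.close(fd)

daemon = os.fork()
if daemon == 0:
    os.setsid()                      # new session; session ID == our own PID
    with open(lock_path, "w") as f:
        f.write(str(os.getpid()))    # written *after* the final fork
    os._exit(0)                      # stand-in for the daemon's main loop
os.waitpid(daemon, 0)

recorded = int(open(lock_path).read())
print(recorded == daemon)            # the lock file now names the daemon
```

With this ordering, the PID in the lock file is the one that stays running, so the init script can stop the server by signaling that PID.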
I'll probably add some debugging output to my local build to see if I can track
this down.
I looked into this and I have fixed it. For normal mode, the problem is that
the PID for the server wasn't updated after the last fork, so it still had the
one from before the fork.
For high availability mode (with --enable-high-availability configured), the
problem was that it didn't write anything to the lock file at all.
Both of these problems have been corrected in a patch I created that is being
reviewed for check-in.
David, could you post the patch here after it has been reviewed for check-in?
Sure, once we clear the patch I will post it here.
Created an attachment (id=21)
This is the patch to fix this bug.