Bug 161 - pbs_server, init.d hangs when creating serverdb.
Status: NEW
Product: TORQUE
Component: pbs_server
Version: 3.0.x
Platform: PC Linux
Importance: P5 normal
Assigned To: David Beer
Reported: 2011-10-08 16:06 MDT by Steve Traylen
Modified: 2011-10-08 16:06 MDT

Description Steve Traylen 2011-10-08 16:06:21 MDT
Hi,

This was previously reported at
http://www.supercluster.org/pipermail/torqueusers/2011-May/012828.html
and
https://bugzilla.redhat.com/show_bug.cgi?id=744138

Steps to reproduce:

1. /etc/init.d/pbs_server stop
2. rm -f /var/torque/server_priv/serverdb 
3. /etc/init.d/pbs_server start

The init.d script is left stuck in a "sleep 1" loop:

    $PBS_DAEMON -d $PBS_HOME -t create &
    while [ ! -r $PBS_SERVERDB ]; do
        sleep 1
    done
    killproc pbs_server
    RET=$?

In particular, the serverdb is never actually written to disk until
pbs_server is killed, so the loop never terminates.


It would be better if pbs_server closed its file handle on the
serverdb, or otherwise made sure the file was written out, once it
had been created.

Following on from the comments in the mail thread, the init.d script
should either succeed or, if the serverdb file cannot be created,
fail with a message telling you what to do and a return code of 5.
As a trivial fix, changing the above code to:

   $PBS_DAEMON -d $PBS_HOME -t create &
   sleep 5
   killproc pbs_server
   RET=$?

does the job, but it does not handle the error case very well.
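
One way to cover the error case as well might be to bound the wait rather than loop forever, and to check for the file again after pbs_server has been killed. The following is only a rough sketch under those assumptions: wait_for_file is a hypothetical helper, not something in the shipped init.d script, and the 5 second limit and the wording of the error message are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not part of the shipped init.d script):
# wait until a file becomes readable, giving up after a timeout.
#   $1 = path to wait for
#   $2 = maximum number of seconds to wait (defaults to 10)
# Returns 0 if the file is readable, 1 if the timeout expired first.
wait_for_file() {
    file=$1
    remaining=${2:-10}
    while [ ! -r "$file" ]; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Sketch of how the create step might use it. Variable names are the
# ones from the snippet quoted in this report; the message text is a
# placeholder.
#
#   $PBS_DAEMON -d $PBS_HOME -t create &
#   wait_for_file "$PBS_SERVERDB" 5      # bounded, unlike the original loop
#   killproc pbs_server
#   RET=$?
#   if [ ! -r "$PBS_SERVERDB" ]; then
#       echo "serverdb was not created in $PBS_HOME" >&2
#       RET=5
#   fi
```

Because the serverdb only seems to reach disk once pbs_server is killed, the readability check that decides the return code is done after killproc rather than before it.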

Steve.