[torqueusers] unknown reason: the pbs_server daemon was killed
and cannot start
Garrick Staples
garrick at usc.edu
Thu Nov 13 00:39:30 MST 2008
On Thu, Nov 13, 2008 at 12:50:09PM +0800, Weiguang Chen alleged:
> Hi,
> Our torque version is 2.3.13.
> Today the "qstat" command could not be executed normally, and I found:
> [root@node1 init.d]# qstat
> Cannot connect to default server host 'node1' - check pbs_server daemon.
> qstat: cannot connect to server node1 (errno=111)
>
> I then checked for the pbs_server daemon and found:
> [root@node1 init.d]# ps -ef|grep pbs
> root 3079 1 0 Sep16 ? 00:01:07 /usr/local/sbin/pbs_sched
> root 16571 5229 0 12:38 pts/21 00:00:00 grep pbs
>
> The pbs_server daemon was killed for an unknown reason,
> and when I tried to restart it, a problem occurred:
> [root@node1 init.d]# /usr/local/sbin/pbs_server
> pbs_server: svr_func.c:222: set_resc_assigned: Assertion
> `pjob->ji_qhdr->qu_qs.qu_type == 1' failed.
> Aborted
> What is the problem?
> What can I do?
You might also just back up your entire $PBS_SERVER_HOME/server_priv directory,
and rebuild torque with the assert() call at src/server/svr_func.c:222
(the one that fails in set_resc_assigned) commented out.
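
For the backup step, something along these lines should do. It is only a
sketch: the function name backup_server_priv is made up for illustration, and
the fallback path /var/spool/torque and the destination /root are assumptions,
so adjust both to match your installation.

```shell
#!/bin/sh
# Sketch: archive the whole server_priv directory before rebuilding torque.
# Assumptions (not from the original post): PBS_SERVER_HOME falls back to
# /var/spool/torque when unset, and archives are written to /root by default.
backup_server_priv() {
    home="${1:-${PBS_SERVER_HOME:-/var/spool/torque}}"
    dest="${2:-/root}"

    if [ ! -d "$home/server_priv" ]; then
        echo "no server_priv directory under $home" >&2
        return 1
    fi

    archive="$dest/server_priv-$(date +%Y%m%d-%H%M%S).tar.gz"

    # -C keeps the paths inside the archive relative to PBS_SERVER_HOME
    tar -czf "$archive" -C "$home" server_priv || return 1

    # Print the archive path so the caller can record or verify it
    echo "$archive"
}
```

Run it as "backup_server_priv" to use the defaults, or as
"backup_server_priv /path/to/pbs_home /path/to/backups" to name both
directories. Check the result with "tar -tzf" on the printed path before you
change anything else.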
--
Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California
Revoke LDS Church 501(c)(3) Status - http://lds501c3.wordpress.com/