[torqueusers] Question about the difference between a node where pbs_server is run and a compute node
Ken Nielson
knielson at adaptivecomputing.com
Wed Apr 28 09:05:19 MDT 2010
On 04/28/2010 03:56 AM, Bas van der Vlies wrote:
> Hello,
>
> We just installed version 2.4.7 and are experiencing some serious problems
> with executing programs on the server. I noticed that the server
> is using '/tmp/.torque-unix' and the clients use 'pbs_iff'.
>
> The following test on the pbs_server node will completely hang pbs_server.
> Here is some pseudocode:
> p = pbs_connect( pbs_default() )
>
> After this we cannot do anything on any compute node or on the server:
> - qstat, qsub, .....
>
> On a compute node this is no problem at all. So I expect that /tmp/.torque-unix
> is causing the problem.
>
> Is this a known problem or a bug?
>
> Regards
>
>
>
There are several things we need to look at. The first one Garrick
already addressed. Is the MOM running on the same node as the pbs_server?
If not, is there evidence in the log files that the server and the
client are communicating?
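
One quick check that may help narrow this down is to probe the Unix
domain socket mentioned above directly, without going through the
TORQUE client library. The following is only a minimal sketch: the
function name probe_unix_socket and the 2 second timeout are
illustrative choices, not part of TORQUE, and it assumes that
/tmp/.torque-unix is a stream socket.

```python
import os
import socket


def probe_unix_socket(path, timeout=2.0):
    """Try to connect to a Unix domain stream socket.

    Returns one of: 'missing', 'ok', 'timeout', 'error'.
    Hypothetical helper for illustration; not part of TORQUE.
    """
    if not os.path.exists(path):
        return "missing"
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        # 'ok' only means the kernel accepted the connection into the
        # listen queue; it does not prove the server is still
        # processing requests.
        return "ok"
    except socket.timeout:
        return "timeout"
    except OSError:
        # For example: connection refused because nothing is listening.
        return "error"
    finally:
        s.close()


if __name__ == "__main__":
    # Path taken from the original report.
    print(probe_unix_socket("/tmp/.torque-unix"))
```

A result of 'missing' or 'error' on the server node would suggest that
pbs_server is not listening on that socket at all, whereas 'ok' followed
by a hanging qstat would point at the server being stuck after the
connection is made.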
Ken