[torqueusers] Question about the difference between a node where pbs_server is run and a compute node

Ken Nielson knielson at adaptivecomputing.com
Wed Apr 28 09:05:19 MDT 2010


On 04/28/2010 03:56 AM, Bas van der Vlies wrote:
> Hello,
>
>    We just installed version 2.4.7 and are experiencing some serious
> problems with executing programs on the server. I noticed that the server
> is using '/tmp/.torque-unix' and the clients use 'pbs_iff'.
>
> The following test, run on the pbs_server node, completely hangs pbs_server.
> Here is some pseudocode:
>     p = pbs_connect( pbs_default() )
>
> After this we cannot do anything on any compute node or on the server:
>      - qstat, qsub, .....
>
> On a compute node this is no problem at all. So I expect that
> /tmp/.torque-unix is causing the problem.
>
> Is this a known problem or a bug?
>
> Regards
>
>
>    
There are several things we need to look at. The first one Garrick 
already addressed. Is the MOM running on the same node as the pbs_server?

If not, is there evidence in the log files that the server and the 
client are communicating?

Ken
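
To narrow down whether the Unix domain socket mentioned in the question is
involved, a small probe along the following lines might help. This is only a
sketch and not part of TORQUE: the function name probe_unix_socket and the
2-second timeout are made up for illustration, and the only thing taken from
the thread is the path '/tmp/.torque-unix'. It uses nothing but the Python
standard library and does not speak the PBS protocol.

```python
import socket


def probe_unix_socket(path, timeout=2.0):
    """Try to connect to a Unix domain socket and report what happened.

    Hypothetical helper for illustration only. It checks reachability of
    the socket, not whether the process behind it answers requests.
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)  # avoid blocking forever on a stuck listener
    try:
        s.connect(path)
        return "connected"
    except FileNotFoundError:
        # No socket file at this path at all.
        return "missing"
    except ConnectionRefusedError:
        # Socket file exists but nothing is listening on it (stale file).
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Anything else (permissions, full backlog, ...) is reported as-is.
        return "error: %s" % exc
    finally:
        s.close()


if __name__ == "__main__":
    # Path as quoted in the original question.
    print(probe_unix_socket("/tmp/.torque-unix"))
```

Note that "connected" only means the connection was accepted into the
listener's queue. A hung pbs_server could still produce that result, so the
server and MOM log files remain the better source of evidence.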

