[torqueusers] Question about the difference between a node where pbs_server is run and a compute node

Garrick garrick at usc.edu
Wed Apr 28 08:49:10 MDT 2010


The unix socket file is just a faster replacement for TCP and privileged
ports when the client is on the same machine as the server. I wrote that
code years ago in the 2.1.0 days.

The server is listening on unix, tcp, and udp sockets. The clients and  
moms select the method.
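
To illustrate what "the client selects the method" can look like, here is a
minimal Python sketch, not the actual TORQUE client code. It tries the local
unix socket first and falls back to TCP. The function name connect_to_server
and the timeout parameter are hypothetical; only the /tmp/.torque-unix path
comes from this thread.

```python
import socket


def connect_to_server(unix_path, host, port, timeout=5.0):
    """Try the local unix socket first, then fall back to TCP.

    Illustrative only: this is an assumed selection order, not
    TORQUE's real connection logic.
    """
    # A unix-domain socket only works on the same machine as the server.
    if hasattr(socket, "AF_UNIX"):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        # A timeout keeps the client from blocking forever on a stuck server.
        sock.settimeout(timeout)
        try:
            sock.connect(unix_path)
            return "unix", sock
        except OSError:
            # Socket file missing, connection refused, or timed out.
            sock.close()
    # Fall back to a plain TCP connection to the server.
    sock = socket.create_connection((host, port), timeout=timeout)
    return "tcp", sock
```

A call such as connect_to_server("/tmp/.torque-unix", "localhost", 15001)
would return "unix" on the server node and "tcp" elsewhere, assuming the
server listens on port 15001 at your site.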

I don't know why yours is hanging. Run strace on the server, or
attach a debugger and see where it is stuck.

HPCC/Linux Systems Admin

On Apr 28, 2010, at 2:56 AM, Bas van der Vlies <basv at sara.nl> wrote:

> Hello,
>
>  We just installed version 2.4.7 and are experiencing some serious
> problems with executing programs on the server. I noticed that the
> server is using '/tmp/.torque-unix' and the clients 'pbs_iff'.
>
> The following test on the pbs_server node will completely hang
> pbs_server. Here is some pseudo code:
>   p = pbs_connect( pbs_default() )
>
> After this we cannot do anything on any compute node or on the server:
>    - qstat, qsub, .....
>
> On a compute node this is no problem at all. So I expect that
> /tmp/.torque-unix is causing the problem.
>
> Is this a known problem or a bug?
>
> Regards
>
>
> -- 
> ********************************************************************
> *  Bas van der Vlies                    e-mail: basv at sara.nl       *
> *  SARA - Academic Computing Services   Amsterdam, The Netherlands *
> ********************************************************************
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers

