[torqueusers] Re: error in pbs_iff: cannot read reply from pbs_server

Guilherme Menegon Arantes garantes at iq.usp.br
Thu Sep 18 14:23:47 MDT 2008

> On Wed, Sep 17, 2008 at 03:38:38PM -0300, Guilherme Menegon Arantes wrote:
> > 
> > Dear Torque users,
> > 
> > My Torque installation works fine, but when I submit a large number
> > of jobs in a row (say more than 10 or 15), I get the following error
> > message:
> > 
> > pbs_iff: cannot read reply from pbs_server
> > No Permission.
> > qsub: cannot connect to server node5 (errno=15007)
> > 
> > where node5 is my Torque server. The same error appears for qsub,
> > qstat and pbsnodes every time a large number of jobs is submitted.
> > Checking the server logs, I see errors like:
> > 
> > 09/17/2008 09:58:33;0080;PBS_Server;Req;req_reject;Reject reply code=15019(Invalid credential MSG=cannot authenticate user), aux=0, type=AuthenticateUser, from garantes@node5.full_server_name
> > 
> > where the server's full domain name is omitted here but appears
> > in the logs. I am running Torque 2.3.0, and the error occurs with
> > either the default pbs_sched or Maui (3.2.6p19) as the scheduler.
>
> If you haven't figured this out already, check the permissions on
> pbs_iff on all your systems.  Make sure that it has the setuid bit set.

The setuid bit was already set. Thanks anyway.
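
For anyone who wants to run the same check on every node, here is a minimal sketch in Python. The path /usr/local/sbin/pbs_iff is only an assumption about a default source install and may differ on your systems; the function names are illustrative and not part of Torque.

```python
import os
import stat

# Assumed install location of pbs_iff; adjust for your nodes.
PBS_IFF_PATH = "/usr/local/sbin/pbs_iff"


def mode_has_setuid(mode):
    """Return True if the permission bits in `mode` include setuid."""
    return bool(mode & stat.S_ISUID)


def describe_file(path=PBS_IFF_PATH):
    """Return (has_setuid, owner_uid, octal_mode) for the file at `path`."""
    st = os.stat(path)
    perms = stat.S_IMODE(st.st_mode)  # permission bits only, no file type
    return mode_has_setuid(st.st_mode), st.st_uid, oct(perms)
```

Calling describe_file() on a node returns whether the setuid bit is set, the numeric owner (0 means root) and the octal mode, which should be enough to compare nodes against each other.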

One more data point: this behaviour is not reproducible every time. For
instance, I was able to submit 100 jobs in a row without any problem
today... Strange.
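
Since the failure is intermittent, counting how often it happens in a burst may help. The following is only a sketch: it runs a given command repeatedly and records the runs that exit with a non-zero status. The function name submit_burst and the job script name in the usage note are hypothetical.

```python
import subprocess


def submit_burst(command, count):
    """Run `command` `count` times back to back.

    Returns a list of (index, returncode, stderr) tuples, one for each
    run that exited with a non-zero status.
    """
    failures = []
    for i in range(count):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((i, result.returncode, result.stderr.strip()))
    return failures
```

For example, submit_burst(["qsub", "job.sh"], 100) would submit a hypothetical job.sh 100 times in a row and return the runs that failed together with their error text, which can then be matched against the server log timestamps.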

Let me know if anyone needs further logs or info about my cluster.

Kind regards,



Guilherme Menegon Arantes, PhD       São Paulo, Brasil
