[torqueusers] Torque-1.1.0p5 problems

Jones, Wesley wesley_jones at nrel.gov
Tue Dec 7 14:34:54 MST 2004


I am having trouble with the December 3rd Torque-1.1.0p5.

I have 102 AMD Opteron nodes, with Torque compiled in 32-bit mode.

I am getting a few of the following, but not very often:
WARNING;!!! unable to contact node node094

Every 30 seconds or so, pbs is not able to process queries from pbsnodes,
qstat, etc.:

pbs_iff: cannot connect to head:15001 - fatal error, errno=99 (Cannot assign
requested address)
No Permission.
qstat: cannot connect to server head (errno=15007)
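
On Linux, errno 99 is EADDRNOTAVAIL. One possible cause, not confirmed
anywhere in this report, is that the client ran out of usable local ports
because many short-lived connections to the server are still in TIME_WAIT.
The Python sketch below is one way to check that on the head node. It
assumes a Linux /proc/net/tcp layout; the function name
states_for_remote_port is made up for illustration, and 15001 is simply the
port from the error above.

```python
from collections import Counter

# Linux TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED",
    "06": "TIME_WAIT",
    "08": "CLOSE_WAIT",
    "0A": "LISTEN",
}


def states_for_remote_port(proc_net_tcp_lines, port):
    """Count sockets by TCP state whose remote port equals `port`.

    `proc_net_tcp_lines` is an iterable of lines in /proc/net/tcp format.
    (Hypothetical helper, not part of Torque.)
    """
    counts = Counter()
    for line in proc_net_tcp_lines:
        fields = line.split()
        # Skip the header line and anything too short to be a socket entry.
        if len(fields) < 4 or ":" not in fields[2]:
            continue
        try:
            # rem_address is HEXADDR:HEXPORT
            remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        except ValueError:
            continue
        if remote_port == port:
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts
```

Calling states_for_remote_port(open("/proc/net/tcp"), 15001) while the
errors are occurring would show whether a large number of sockets are
sitting in TIME_WAIT. A small count would argue against this explanation.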

I think that it occurs once every 30 seconds (maybe caused by Maui),
when the log file gets 500 of the following lines (they show up within 3
seconds):
12/07/2004 13:42:02;0100;PBS_Server;Req;;Type AuthenticateUser request
received from root at head.atipacluster, sock=12
12/07/2004 13:42:02;0100;PBS_Server;Req;;Type StatusJob request received
from root at head.atipacluster, sock=11
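
To confirm how large these bursts are and how often they happen, the
request lines can be counted per second. The following is a minimal Python
sketch, assuming each record sits on a single line in the server log in the
format quoted above (the wrapping here comes from the mail client). The
function names count_requests and busiest_seconds are made up for
illustration.

```python
import re
from collections import Counter

# Matches request records in the format quoted above, for example:
# 12/07/2004 13:42:02;0100;PBS_Server;Req;;Type StatusJob request received ...
RECORD = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2});"
    r"[^;]*;PBS_Server;Req;;Type (?P<kind>\S+) request received"
)


def count_requests(lines):
    """Count request records per (timestamp, request type)."""
    counts = Counter()
    for line in lines:
        match = RECORD.match(line)
        if match:
            counts[(match.group("stamp"), match.group("kind"))] += 1
    return counts


def busiest_seconds(counts, top=5):
    """Return the `top` timestamps with the most requests of any type."""
    per_second = Counter()
    for (stamp, _kind), n in counts.items():
        per_second[stamp] += n
    return per_second.most_common(top)
```

Feeding the open server log file to count_requests and printing
busiest_seconds of the result would show whether the bursts recur roughly
every 30 seconds and which request types dominate them.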

When I have more than 102 nodes I get some of the following:

Inappropriate ioctl for device (25) in stream_eof, connection to node015
dropped

Wes



