[torqueusers] Torque-1.1.0p5 problems
Jones, Wesley
wesley_jones at nrel.gov
Tue Dec 7 14:34:54 MST 2004
I am having trouble with Torque-1.1.0p5 (dated December 3rd).
I have 102 AMD Opteron nodes, with Torque compiled in 32-bit mode.
Occasionally, though not very often, I get the following:
WARNING;!!! unable to contact node node094
About every 30 seconds, PBS is unable to process queries from pbsnodes,
qstat, etc.:
pbs_iff: cannot connect to head:15001 - fatal error, errno=99 (Cannot assign requested address)
No Permission.
qstat: cannot connect to server head (errno=15007)
I think this occurs once every 30 seconds (possibly caused by Maui),
when the log file receives 500 lines like the following within about 3
seconds:
12/07/2004 13:42:02;0100;PBS_Server;Req;;Type AuthenticateUser request received from root at head.atipacluster, sock=12
12/07/2004 13:42:02;0100;PBS_Server;Req;;Type StatusJob request received from root at head.atipacluster, sock=11
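
To check whether the failures really line up with these bursts, the request lines can be counted per second and per request type. The following is only a rough sketch, not part of Torque: the names LINE_RE and count_requests are made up for illustration, and it assumes each log entry sits on a single line in the actual server log (the wrapping above comes from the mail client).

```python
import re
from collections import Counter

# Captures the timestamp and the request type from a server request line.
# Assumes the semicolon-separated layout shown in the sample lines above.
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2});\d+;PBS_Server;Req;;Type (\w+) request"
)


def count_requests(lines):
    """Return (per_second, per_type) Counters for request log lines."""
    per_second = Counter()
    per_type = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a request line; skip it
        timestamp, req_type = m.groups()
        per_second[timestamp] += 1
        per_type[req_type] += 1
    return per_second, per_type


# The two sample lines from this message, joined back onto single lines.
sample = [
    "12/07/2004 13:42:02;0100;PBS_Server;Req;;Type AuthenticateUser "
    "request received from root at head.atipacluster, sock=12",
    "12/07/2004 13:42:02;0100;PBS_Server;Req;;Type StatusJob "
    "request received from root at head.atipacluster, sock=11",
]

per_second, per_type = count_requests(sample)
for ts, n in sorted(per_second.items()):
    print(ts, n)
```

Passing an open file object for the server log in place of sample would show which seconds carry the burst, and whether the timestamps of the errno=99 failures fall inside them.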
When I have more than 102 nodes I get some of the following:
Inappropriate ioctl for device (25) in stream_eof, connection to node015 dropped
Wes