[torqueusers] connection time out problem

Lloyd Brown lloyd_brown at byu.edu
Tue Nov 30 13:21:31 MST 2010


As far as I know, showq and showstate are Moab/Maui commands, and they
talk to the scheduler, not to pbs_server.  I'd look at the process load
of whichever scheduler you're using.  For example, we use Moab, and
when the load average on our scheduler host goes up, we can usually
trace it to the Moab process.  Often this is accompanied by longer
scheduling iterations (we have the iteration interval set at 1 minute,
but sometimes see iterations take 45-55 seconds, even now).  To check
the scheduling iteration length, I'd recommend something like these:

mdiag -S
grep 'scheduling time' /opt/moab/log/moab.log
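
To correlate the timeouts with load, it may help to log how long a
client command takes alongside the load average.  The following is only
a rough sketch: the function name "probe" and the log path are made up
for illustration, and reading /proc/loadavg assumes a Linux host (run it
on the scheduler host if you want the scheduler's load).

```shell
#!/bin/sh
# Hypothetical helper: run a command, discard its output, and print one line:
#   <timestamp> rc=<exit status> elapsed=<seconds>s load=<1-minute load average>
probe() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    rc=$?
    end=$(date +%s)
    load=$(cut -d' ' -f1 /proc/loadavg)   # Linux-specific
    echo "$(date '+%Y-%m-%d %H:%M:%S') rc=$rc elapsed=$((end - start))s load=$load"
}

# Example (assumed log path): probe showq once a minute
# while true; do probe showq >> /tmp/showq_probe.log; sleep 60; done
```

Lines with a nonzero rc or a large elapsed value should line up with the
10-15 minute windows, and the load column shows whether the host was
busy at the time.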

If the timeouts are legitimate, and you simply want to extend the
timeout of the Moab client commands, you can put something like
"CLIENTTIMEOUT   00:03:00" in /etc/moab.cfg on the client hosts.
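
A quick way to confirm whether a client host already has the setting is
sketched below.  The function name is hypothetical, and the default path
is just the one mentioned above; adjust it if your moab.cfg lives
elsewhere.

```shell
#!/bin/sh
# Hypothetical check: print the CLIENTTIMEOUT line from a Moab config file,
# or report that none is set. The path defaults to /etc/moab.cfg.
show_client_timeout() {
    cfg=${1:-/etc/moab.cfg}
    grep -i '^[[:space:]]*CLIENTTIMEOUT' "$cfg" || echo "CLIENTTIMEOUT not set in $cfg"
}

# Example: show_client_timeout            (checks /etc/moab.cfg)
#          show_client_timeout /opt/moab/moab.cfg   (an assumed alternate path)
```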

Lloyd



On 11/30/10 1:16 PM, Abhishek Gupta wrote:
> We are experiencing a connection timeout problem when we execute showq or
> showstate. The problem does not occur all the time; it appears for
> 10-15 minutes, and then everything seems to be normal again. We checked
> our system during these timeouts and didn't see anything running that
> might interrupt the connection to the PBS server.
> Where should I look to understand the problem? Any ideas?
> Thanks,
> Abhi.


-- 


Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu



