[torqueusers] Slow response of torque when jobs are running
jbernstein at penguincomputing.com
Mon Dec 7 21:56:05 MST 2009
I've gotta believe this is a name resolution issue.
Can you check to make sure the hostname in TORQUE's server_name file
resolves quickly with getent?
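
If you want a number rather than a feeling, here is a minimal Python sketch
that times one forward lookup of whatever name is in server_name. It goes
through the system resolver via socket.getaddrinfo, so it should be slowed
down by the same things that slow getent, but I have not verified that the
two take identical paths on every platform. The function names are mine, and
the file location is an assumption: pass in wherever your install keeps
server_name.

```python
import socket
import time


def time_lookup(hostname):
    """Time one forward lookup through the system resolver.

    Returns (elapsed_seconds, addresses); addresses is empty if the
    lookup fails.
    """
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(hostname, None)
        # Each entry ends with a sockaddr tuple whose first item is the address.
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        addresses = []
    return time.perf_counter() - start, addresses


def check_server_name(path):
    """Read the hostname stored in a server_name file and time its lookup.

    `path` is supplied by the caller; this sketch does not assume where
    TORQUE was installed.
    """
    with open(path) as handle:
        hostname = handle.read().strip()
    elapsed, addresses = time_lookup(hostname)
    return hostname, elapsed, addresses
```

Call check_server_name() with the path to your server_name file, on both the
server and a client. A lookup that takes seconds rather than milliseconds, or
that comes back with no addresses, would point at name resolution.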
On Dec 7, 2009, at 7:15 PM, "Garrick Staples" <garrick at usc.edu> wrote:
> On Tue, Dec 08, 2009 at 01:39:38AM +0000, Luc Vereecken alleged:
>> Hi Chris,
>> I attach a strace -T output of qstat. The output looked like a normal
>> qstat output with jobnumbers and running times etc, so nothing
>> The strace reveals that it all goes awry when accessing the
>> /tmp/.torque-unix. Major time is lost on a poll (line 78) and a read
>> (line 90), all other times look like normal timings.
>> That reminds me that there is something like a no-unix-sockets option
>> in configure, iirc.
> What you want is an strace of the _server_ while doing a qstat.
> qstat is just going to wait for a response from the server. Your
> strace shows exactly that.
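
Once there is a server-side trace (for example strace -T -p <pid of
pbs_server> -o <file> while a qstat is running), the slow calls can be picked
out mechanically instead of by eye. strace -T appends the time spent in each
syscall as <seconds> at the end of the line, and the sketch below just ranks
lines by that figure. The function name and regular expression are mine, and
the pattern only handles plain lines or lines with a bare numeric pid prefix,
not "[pid N]" prefixes or unfinished/resumed call pairs.

```python
import re

# Matches e.g.  poll([...], 1, 300000) = 1 <4.871203>
# with an optional leading numeric pid, as written by strace -f -o.
_TIMED_CALL = re.compile(r"^(?:\d+\s+)?(\w+)\(.*<(\d+\.\d+)>\s*$")


def slowest_calls(lines, top=5):
    """Return [(seconds, line_number, syscall)] sorted slowest first."""
    timed = []
    for number, line in enumerate(lines, start=1):
        match = _TIMED_CALL.match(line)
        if match:
            timed.append((float(match.group(2)), number, match.group(1)))
    return sorted(timed, reverse=True)[:top]


# Invented sample lines for illustration only; not taken from the real trace.
sample = [
    'socket(PF_FILE, SOCK_STREAM, 0) = 3 <0.000012>',
    'poll([{fd=3, events=POLLIN}], 1, 300000) = 1 <4.871203>',
    'read(3, "x", 262144) = 1 <2.500000>',
    '--- SIGCHLD (Child exited) ---',
]
for seconds, number, name in slowest_calls(sample, top=2):
    print(f"line {number}: {name} took {seconds:.6f}s")
```

To use it on a real trace, pass the open file object instead of the sample
list, since it only needs an iterable of lines.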