[torqueusers] performance problem on x86_64
Douglas Wightman
wightman at clusterresources.com
Thu Oct 6 15:39:17 MDT 2005
Although my own system is MUCH smaller, it is x86_64 (a mix of Debian and
Fedora). I see no slowdown at all with any client commands. (Our other
x86_64 systems include Fedora 4 and CentOS 4; no reports of slowdown there
either.) Just FYI.
- Douglas
On Thu, 2005-10-06 at 13:35 -0700, Garrick Staples wrote:
> On Thu, Oct 06, 2005 at 10:43:43AM -0700, Garrick Staples alleged:
> > I'm getting plagued by a strange performance problem in x86_64 TORQUE. It's
> > driving me nuts.
> >
> > Multiple quick stats of jobs or nodes are very slow when run on any x86_64
> > host. The examples below work fine if I run them from any 32-bit host. And it
> > seems to happen only when a lot of single-node jobs are in the queue (running or
> > idle). (Dave, I think you've seen this happen on TeraGrid.)
>
> I've found that the problem is inside of pbs_iff, but I can't figure out
> why. This is cleaned up slightly with the attached patch:
>
> # ./pbs_iff -t hpc-pbs 15001; strace -r ./pbs_iff -t hpc-pbs 15001
> ...
> 0.000000 socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
> 0.000000 fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
> 0.000000 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> 0.000000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, "\1\0\0\0\0\0\0\0", 8) = 0
> 0.000000 bind(3, {sa_family=AF_INET, sin_port=htons(1023), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
> 0.000000 connect(3, {sa_family=AF_INET, sin_port=htons(15001), sin_addr=inet_addr("10.125.0.205")}, 16) = -1 EINPROGRESS (Operation now in progress)
> 0.000000 select(1024, NULL, [3], NULL, {5, 0}) = 1 (out [3], left {2, 0})
> 3.000000 getsockopt(3, SOL_SOCKET, SO_ERROR, [17179869184], [4]) = 0
>
>
> Note that the select() call takes 3 seconds! Every time it fails, it is
> always precisely 3 seconds.
>
> I also tried removing the O_NONBLOCK and SO_REUSEADDR bits, but that
> didn't affect it either. I'm thinking this is a Linux (RHEL3) bug.
>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
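
For anyone who wants to time the same sequence outside of pbs_iff, here is a
minimal sketch in C of the pattern visible in the strace output above: a
non-blocking connect(), a select() with a timeout, then getsockopt(SO_ERROR).
The function name timed_connect and its parameters are hypothetical and are
not taken from the TORQUE source. The bind() to a reserved port and the
SO_REUSEADDR call are left out so that it runs without root, which means it
may not reproduce the stall if the stall depends on those steps.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not from pbs_iff): connect to ip:port on a
 * non-blocking socket, wait in select() for at most timeout_sec seconds,
 * and store the time spent inside select() in *select_secs.
 * Returns 0 on success, or -1 with errno set. */
int timed_connect(const char *ip, int port, int timeout_sec, double *select_secs)
{
    struct sockaddr_in addr;
    struct timeval tv, t0, t1;
    fd_set wset;
    int sock, flags, rc, saved;
    int so_error = 0;                  /* SO_ERROR is returned as an int */
    socklen_t len = sizeof(so_error);  /* option length is a socklen_t */

    *select_secs = 0.0;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        errno = EINVAL;                /* not a dotted-quad IPv4 address */
        return -1;
    }

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;

    /* Same F_GETFL / F_SETFL sequence as in the trace. */
    flags = fcntl(sock, F_GETFL);
    if (flags < 0 || fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    rc = connect(sock, (struct sockaddr *)&addr, sizeof(addr));
    if (rc < 0 && errno != EINPROGRESS)
        goto fail;

    if (rc < 0) {
        /* Connection is in progress: wait until the socket is writable. */
        FD_ZERO(&wset);
        FD_SET(sock, &wset);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        gettimeofday(&t0, NULL);
        rc = select(sock + 1, NULL, &wset, NULL, &tv);
        gettimeofday(&t1, NULL);
        *select_secs = (double)(t1.tv_sec - t0.tv_sec)
                     + (double)(t1.tv_usec - t0.tv_usec) / 1e6;

        if (rc == 0)
            errno = ETIMEDOUT;         /* select() ran out of time */
        if (rc <= 0)
            goto fail;

        /* Writable does not mean connected; ask the socket for the result. */
        if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0)
            goto fail;
        if (so_error != 0) {
            errno = so_error;
            goto fail;
        }
    }

    close(sock);
    return 0;

fail:
    saved = errno;                     /* close() may overwrite errno */
    close(sock);
    errno = saved;
    return -1;
}
```

Calling this in a loop against the server address and port from the trace
(10.125.0.205, 15001) on both a 64-bit and a 32-bit host, and printing
select_secs each time, should show whether the 3-second wait follows the host
or something specific to pbs_iff.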