[Mauiusers] maui performance problem

Garrick Staples garrick at usc.edu
Wed Feb 9 16:40:23 MST 2005


I have a user who has submitted 1000 jobs, most of them with 1000
dependencies each.  This makes the job information very large.  On each
scheduler iteration, Maui "downloads" all of that information from
pbs_server, and it is very slow about it.

This is the production environment, so I'm not up to the latest revs yet:
torque-1.1.0p4-snap.1099003850
maui-3.2.6p10-snap.1095450030

strace shows pbs_server is doing very large writes, and maui is spinning its
wheels doing *single byte* reads!  This process takes about 2 minutes, and
pbs_server and maui are both unresponsive during this time.

Of course the usual cascading happens: nodes start dropping connections and
pbs_server starts marking nodes down, jobs can't be submitted, users start
throwing things at me, etc.

pbs_server...
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 62595) = 62595
write(10, "4+5004PBS_O_HOME=/home/rcf-78/sd"..., 64695) = 64695
...

maui...
read(7, "s", 1)                         = 1
read(7, "c", 1)                         = 1
read(7, ".", 1)                         = 1
read(7, "e", 1)                         = 1
read(7, "d", 1)                         = 1
read(7, "u", 1)                         = 1
read(7, "@", 1)                         = 1
read(7, "h", 1)                         = 1
read(7, "p", 1)                         = 1
read(7, "c", 1)                         = 1
read(7, "-", 1)                         = 1
read(7, "p", 1)                         = 1
read(7, "b", 1)                         = 1
read(7, "s", 1)                         = 1
read(7, ".", 1)                         = 1
read(7, "u", 1)                         = 1
read(7, "s", 1)                         = 1
read(7, "c", 1)                         = 1
read(7, ".", 1)                         = 1
...
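
To illustrate why the read pattern above is so expensive, here is a minimal
sketch that pushes one 62595-byte payload (the write size seen in the
pbs_server trace) through a pipe and counts how many read calls it takes to
drain it one byte at a time versus in large chunks.  This is not Maui's or
TORQUE's actual code; the helper names drain() and transfer() and the 65536
chunk size are assumptions chosen for illustration only.

```python
import os
import threading


def drain(fd, chunk_size):
    """Read fd until EOF using os.read(fd, chunk_size).

    Returns (data, calls), where calls counts every os.read() invocation,
    including the final one that returns b"" at EOF.
    """
    parts = []
    calls = 0
    while True:
        chunk = os.read(fd, chunk_size)
        calls += 1
        if not chunk:
            break
        parts.append(chunk)
    return b"".join(parts), calls


def transfer(payload, chunk_size):
    """Send payload through a pipe and drain it with the given chunk size."""
    r, w = os.pipe()

    def writer():
        # A separate thread writes so a full pipe buffer cannot deadlock us;
        # closing the file object signals EOF to the reader.
        with os.fdopen(w, "wb") as f:
            f.write(payload)

    t = threading.Thread(target=writer)
    t.start()
    try:
        return drain(r, chunk_size)
    finally:
        os.close(r)
        t.join()


if __name__ == "__main__":
    payload = b"x" * 62595  # same size as one write in the trace above
    _, single = transfer(payload, 1)
    _, chunked = transfer(payload, 65536)
    print("1-byte reads:  %d calls" % single)
    print("64 KiB reads:  %d calls" % chunked)
```

With a 1-byte read size, every byte of job data costs one system call, so
the number of calls grows linearly with the size of the job information.
Reading into a larger buffer and parsing from that buffer would cut the call
count by orders of magnitude for the same data.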

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California