[Mauiusers] Maui crash `diagnose -p`
Simon Robbins
robbins at physik.uni-wuppertal.de
Thu Jan 19 03:52:13 MST 2006
Hello,
I'm running on a 64-bit machine (AMD Opteron) with CentOS
3.5.3, Maui 3.2.6p14 and torque 2.0.0p5, although I
also saw this problem with Maui 3.2.6p13 compiled in
both 32-bit and 64-bit mode.
When a user runs `diagnose -p`, the Maui scheduler
crashes. Here is the backtrace from GDB:
(gdb) bt
#0 0x0000002a95884656 in _IO_default_xsputn_internal () from /lib64/tls/libc.so.6
#1 0x0000002a9587981c in _IO_padn_internal () from /lib64/tls/libc.so.6
#2 0x0000002a9585c22a in vfprintf () from /lib64/tls/libc.so.6
#3 0x0000002a9587aa05 in vsprintf () from /lib64/tls/libc.so.6
#4 0x0000002a95862efa in sprintf () from /lib64/tls/libc.so.6
#5 0x00000000004c79e3 in MJobGetStartPriority (J=0x2a4680d0, PIndex=0, Priority=0x7fbff751b0, Mode=0,
Buffer=0x7fbff75c60 "diagnosing job priority information (partition: ALL)\n\nJob", ' ' <repeats 20 times>, "PRIORITY* Cred(Class) FS(Group) Res(Proc)\n", ' ' <repeats 13 times>, "Weights -------- 1( 1) 1( 100) 1( 10)\n"...) at MPriority.c:1466
#6 0x00000000004196e7 in UIDiagnosePriority (
Buffer=0x7fbff75c60 "diagnosing job priority information (partition: ALL)\n\nJob", ' ' <repeats 20 times>, "PRIORITY* Cred(Class) FS(Group) Res(Proc)\n", ' ' <repeats 13 times>, "Weights -------- 1( 1) 1( 100) 1( 10)\n"..., BufSize=0x3fef1f0, P=0xd71040) at UserI.c:5260
#7 0x000000000041af85 in UIDiagnose (RBuffer=0xb1e12ba "13 0 ALL [NONE]\n",
SBuffer=0x7fbff75c60 "diagnosing job priority information (partition: ALL)\n\nJob", ' ' <repeats 20 times>, "PRIORITY* Cred(Class) FS(Group) Res( Proc)\n", ' ' <repeats 13 times>, "Weights -------- 1( 1) 1( 100) 1( 10)\n"..., FLAGS=5, Auth=0x7fbff753f0 "root",
SBufSize=0x3fef1f0) at UserI.c:6194
#8 0x000000000040cbe5 in UIProcessCommand (S=0x2020202020202020) at OUserI.c:443
#9 0x2e30202020293030 in ?? ()
........... (many many similar lines) ...........
#38009 0x3732363734310a29 in ?? ()
Cannot access memory at address 0x7fc0000000
(gdb)
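A note on reading the trace: the "addresses" from frame #8 onwards (S=0x2020202020202020, 0x2e30202020293030, ...) look like ASCII text rather than real pointers, which suggests that the sprintf in frame #4 wrote past the end of a buffer on the stack and overwrote the saved return addresses. Here is a minimal sketch in C for checking this. The helper name addr_to_ascii is hypothetical (it is not part of Maui or GDB), and it assumes a little-endian machine such as x86-64.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Maui or GDB): write the 8 bytes of a
 * 64-bit value into out in little-endian order, i.e. the order in which
 * they sit in memory on x86-64. Non-printable bytes are shown as '.'.
 * out must have room for 8 characters plus the terminating NUL. */
void addr_to_ascii(uint64_t addr, char out[9])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)((addr >> (8 * i)) & 0xff);
        out[i] = isprint(byte) ? (char)byte : '.';
    }
    out[8] = '\0';
}
```

Calling addr_to_ascii(0x2020202020202020ULL, buf) gives eight spaces, and the value from frame #9 decodes to "00)   0.", which is the kind of text that appears in the priority table being written to Buffer.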
IMPORTANT: I have changed the maximum number of jobs:
include/msched.h: line 443
#define MMAX_JOB 32768
(I've also seen this with 16384 jobs.) Without this
change I don't see this problem.
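I have not traced which buffer is involved, so the following is only an assumption: the pattern that would fit the trace is an unchecked sprintf appending one line per job into a fixed-size buffer that is large enough for the default job limit but not for a larger one. A minimal sketch in C of a bounded append that refuses to write past the end instead. The name append_bounded and its signature are hypothetical and not taken from the Maui source.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the Maui source: append formatted
 * text to buf (total capacity cap, current string length *len) without
 * ever writing past the end of the buffer.
 * Returns 0 on success, or -1 if the text did not fit; on failure the
 * existing contents and *len are left unchanged. */
int append_bounded(char *buf, size_t cap, size_t *len, const char *fmt, ...)
{
    assert(buf != NULL && len != NULL && *len < cap);

    va_list ap;
    va_start(ap, fmt);
    /* vsnprintf never writes more than (cap - *len) bytes, including
     * the terminating NUL, and returns the length it wanted to write. */
    int n = vsnprintf(buf + *len, cap - *len, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= cap - *len) {
        buf[*len] = '\0';   /* discard the truncated partial write */
        return -1;
    }
    *len += (size_t)n;
    return 0;
}
```

With a 16-byte buffer, a first append of a 10-character padded field succeeds and a second one returns -1 rather than overrunning, so the caller can stop and report a truncated listing.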
Best regards,
Simon Robbins.