[Mauiusers] Maui crash `diagnose -p`

Simon Robbins robbins at physik.uni-wuppertal.de
Thu Jan 19 03:52:13 MST 2006


Hello,

I'm running on a 64-bit (AMD Opteron) machine with CentOS 
3.5.3, Maui 3.2.6p14 and torque 2.0.0p5, although I 
also saw this problem with Maui 3.2.6p13 compiled in 
both 32-bit and 64-bit mode.

When a user runs `diagnose -p` the Maui scheduler 
crashes.  I give the output from GDB here:

(gdb) bt
#0  0x0000002a95884656 in _IO_default_xsputn_internal () from /lib64/tls/libc.so.6
#1  0x0000002a9587981c in _IO_padn_internal () from /lib64/tls/libc.so.6
#2  0x0000002a9585c22a in vfprintf () from /lib64/tls/libc.so.6
#3  0x0000002a9587aa05 in vsprintf () from /lib64/tls/libc.so.6
#4  0x0000002a95862efa in sprintf () from /lib64/tls/libc.so.6
#5  0x00000000004c79e3 in MJobGetStartPriority (J=0x2a4680d0, PIndex=0, Priority=0x7fbff751b0, Mode=0,
    Buffer=0x7fbff75c60 "diagnosing job priority information (partition: ALL)\n\nJob", ' ' <repeats 20 times>, "PRIORITY*   Cred(Class)    FS(Group)   Res(Proc)\n", ' ' <repeats 13 times>, "Weights   --------       1(    1)     1(  100)     1(   10)\n"...) at MPriority.c:1466
#6  0x00000000004196e7 in UIDiagnosePriority (
    Buffer=0x7fbff75c60 "diagnosing job priority information (partition: ALL)\n\nJob", ' ' <repeats 20 times>, "PRIORITY*   Cred(Class)    FS(Group)   Res(Proc)\n", ' ' <repeats 13 times>, "Weights   --------       1(    1)     1(  100)     1(   10)\n"..., BufSize=0x3fef1f0, P=0xd71040) at UserI.c:5260
#7  0x000000000041af85 in UIDiagnose (RBuffer=0xb1e12ba "13 0 ALL [NONE]\n",
    SBuffer=0x7fbff75c60 "diagnosing job priority information (partition: ALL)\n\nJob", ' ' <repeats 20 times>, "PRIORITY*   Cred(Class)    FS(Group)   Res(Proc)\n", ' ' <repeats 13 times>, "Weights   --------       1(    1)     1(  100)     1(   10)\n"..., FLAGS=5, Auth=0x7fbff753f0 "root",
    SBufSize=0x3fef1f0) at UserI.c:6194
#8  0x000000000040cbe5 in UIProcessCommand (S=0x2020202020202020) at OUserI.c:443
#9  0x2e30202020293030 in ?? ()

........... (many many similar lines) ...........

#38009 0x3732363734310a29 in ?? ()
Cannot access memory at address 0x7fc0000000
(gdb)
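
A note on reading the frames from #8 onward: the values shown as 
addresses (S=0x2020202020202020, 0x2e30202020293030, 
0x3732363734310a29) are all made of printable ASCII bytes. One 
possible reading, not confirmed anywhere in this report, is that the 
stack was overwritten with formatted text, which would fit a sprintf() 
writing past the end of a buffer. The helper below is a minimal sketch 
for checking this by hand; decode_le is a hypothetical name, not part 
of Maui, and it assumes a little-endian machine such as x86-64.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

/* Hypothetical helper (not part of Maui): decode a 64-bit value as the
 * 8 bytes it occupies in memory on a little-endian machine, lowest
 * byte first. Non-printable bytes are replaced with '.'.
 * out must have room for 8 characters plus the terminating NUL. */
void decode_le(uint64_t v, char out[9])
{
    assert(out != 0);
    for (int i = 0; i < 8; i++) {
        unsigned char c = (unsigned char)(v >> (8 * i));
        out[i] = isprint(c) ? (char)c : '.';
    }
    out[8] = '\0';
}
```

Calling decode_le(0x2e30202020293030ULL, buf) leaves "00)   0." in 
buf, and 0x2020202020202020 decodes to eight spaces. Both look like 
pieces of the padded priority table in Buffer rather than real return 
addresses.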


IMPORTANT: I have changed the maximum number of jobs:
include/msched.h: line 443
#define MMAX_JOB          32768

(I've also seen this with 16384 jobs.) Without this 
change I don't see this problem.

Best regards,

Simon Robbins.

