[torqueusers] RE: maui won't start

Efstathiadis, Efstratios stratos at bnl.gov
Wed Mar 8 12:26:40 MST 2006


I ran Maui under gdb, and it looks like the segmentation fault
comes from torque/src/lib/Libifl/PBSD_status.c,
line 113: strcpy(extend,"timeout"):


-bash:stratos:~/maui-3.2.6p14/src/moab>gdb /home/stratos/maui/sbin/maui 
GNU gdb 6.4
Copyright 2005 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc-ibm-aix5.2.0.0"...
(gdb) r
Starting program: /home/stratos/maui/sbin/maui 

Program received signal SIGSEGV, Segmentation fault.
0x10120e7c in PBSD_status (c=1, function=19, id=0x10183e90 "", attrib=0x0, 
    extend=0x10174ff4 "exec_queue_only") at ./../Libifl/PBSD_status.c:113
113     ./../Libifl/PBSD_status.c: A file or directory in the path name does not exist..
        in ./../Libifl/PBSD_status.c
(gdb) where
#0  0x10120e7c in PBSD_status (c=1, function=19, id=0x10183e90 "", attrib=0x0, 
    extend=0x10174ff4 "exec_queue_only") at ./../Libifl/PBSD_status.c:113
#1  0x10120b08 in pbs_statjob (c=1, id=0x0, attrib=0x0, 
    extend=0x10174ff4 "exec_queue_only") at ./../Libifl/pbsD_statjob.c:110
#2  0x100a5530 in MPBSWorkloadQuery (R=0x20cd6630, JCount=0x2ff11288, SC=0x0)
    at MPBSI.c:698
#3  0x10016de4 in __MUTFunc (V=0x2ff111e8) at MUtil.c:4717
#4  0x10016cfc in MUThread (F=@0x20008dc4: 0x100a53a4 <MPBSWorkloadQuery>, 
    TimeOut=9, RC=0x2ff11284, ACount=3, Lock=0x0) at MUtil.c:4690
#5  0x10099750 in MRMWorkloadQuery (WCount=0x2ff112f4, SC=0x0) at MRM.c:595
#6  0x10098ea8 in MRMGetInfo () at MRM.c:364
#7  0x1011900c in MSchedProcessJobs (OldDay=0x2ff197b8 "", GlobalSQ=0x2ff197f8, 
    GlobalHQ=0x2ff1d7f8) at MSched.c:6870
#8  0x10001348 in main (ArgC=1, ArgV=0x2ff228a8) at Server.c:189
(gdb) q
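
One possible reading of the backtrace, offered as a guess rather than a confirmed diagnosis: extend arrives in PBSD_status pointing at "exec_queue_only", and if that is a string literal in the caller (I have not checked MPBSI.c:698, so this is an assumption), the strcpy() on line 113 writes into read-only storage, which is undefined behavior in C and commonly shows up as SIGSEGV. A minimal sketch of a caller-side workaround under that assumption is below. The names overwrite_extend and query_with_writable_extend are hypothetical stand-ins, not Torque or Maui functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a library routine that overwrites its
 * `extend` argument in place, the way the strcpy() reported at
 * PBSD_status.c:113 appears to. Not the real Torque code. */
static void overwrite_extend(char *extend)
{
    /* Needs writable storage of at least sizeof("timeout") bytes. */
    strcpy(extend, "timeout");
}

/* Caller-side workaround (assumption: the caller currently passes a
 * string literal). Copy the literal into a writable buffer and pass
 * the buffer instead, so the in-place write is legal. */
static const char *query_with_writable_extend(char *buf, size_t buflen)
{
    assert(buf != NULL && buflen >= sizeof("exec_queue_only"));
    strcpy(buf, "exec_queue_only");  /* writable copy of the literal */
    overwrite_extend(buf);           /* safe: buf is not read-only */
    return buf;
}
```

With a local array such as char buf[32], calling query_with_writable_extend(buf, sizeof buf) completes without a fault, whereas handing overwrite_extend() the literal itself is undefined and may crash as in the trace above. Whether the proper fix belongs in Maui or in Torque is a separate question that this sketch does not settle.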




-----Original Message-----
From: Efstathiadis, Efstratios
Sent: Wed 3/8/2006 12:48 PM
To: torqueusers at supercluster.org
Subject: maui won't start
 

Hi, 
I just installed Maui on an AIX 5.2 box (already running Torque 1.2.0p6),
but Maui won't start. I have appended the maui.log file (LOGLEVEL 3).
Any ideas why Maui won't start?

Thanks.

----- maui.log ----

03/08 12:44:32 INFO:     starting Maui version  ##################
03/08 12:44:32 INFO:     new LOGLEVEL value (3)
03/08 12:44:32 MCfgProcessLine(NODEALLOCATIONPOLICY,,MINRESOURCE)
03/08 12:44:32 MCfgSetVal(NODEALLOCATIONPOLICY,IVal,DVal,SVal,SArray,P)
03/08 12:44:32 MUGetIndex(MINRESOURCE,ValList,2)
03/08 12:44:32 MCfgProcessLine(QUEUETIMEWEIGHT,,1 )
03/08 12:44:32 MCfgSetVal(QUEUETIMEWEIGHT,IVal,DVal,SVal,SArray,P)
03/08 12:44:32 MCfgProcessLine(RESERVATIONPOLICY,,CURRENTHIGHEST)
03/08 12:44:32 MCfgSetVal(RESERVATIONPOLICY,IVal,DVal,SVal,SArray,P)
03/08 12:44:32 MUGetIndex(CURRENTHIGHEST,ValList,0)
03/08 12:44:32 MCfgProcessLine(RMPOLLINTERVAL,,00:00:30)
03/08 12:44:32 MCfgSetVal(RMPOLLINTERVAL,IVal,DVal,SVal,SArray,P)
03/08 12:44:32 MUTimeFromString(00:00:30)
03/08 12:44:32 MCfgProcessLine(SERVERHOST,,qcdochostb.qcdoc.bnl.gov)
03/08 12:44:32 MCfgSetVal(SERVERHOST,IVal,DVal,SVal,SArray,P)
03/08 12:44:32 INFO:     starting scheduler on 'qcdochostb.qcdoc.bnl.gov'
03/08 12:44:32 MCfgProcessLine(SERVERMODE,,TEST)
03/08 12:44:32 MCfgSetVal(SERVERMODE,IVal,DVal,SVal,SArray,P)
03/08 12:44:32 MUGetIndex(TEST,ValList,1)
03/08 12:44:32 MCfgProcessLine(SERVERPORT,,42559)
03/08 12:44:32 MCfgSetVal(SERVERPORT,IVal,DVal,SVal,SArray,P)
03/08 12:44:32 MUGetIndex(TYPE,ValList,0)
03/08 12:44:32 MUGetIndex(PBS,ValList,0)
03/08 12:44:32 MAMSetDefaults(bank)
03/08 12:44:32 MAMSetDefaults(bank)
03/08 12:44:32 MUGetIndex(TYPE,ValList,0)
03/08 12:44:32 MUGetIndex(NONE,ValList,0)
03/08 12:44:32 MAMSetDefaults(bank)
03/08 12:44:32 ServerProcessArgs(1,ArgV,0)
03/08 12:44:32 MUGetOpt(1,ArgV,a:Ab:B:c:C:dD:f:hH:i:j:l:L:m:n:N:p:P:r:s:v?-:,OptArg)
03/08 12:44:32 ServerDemonize()
03/08 12:44:32 ServerAuthenticate()
03/08 12:44:32 INFO:     executing scheduler from '/usr/local/maui/' under UID 310 GID 310
03/08 12:44:32 SDRGetSystemConfig()
03/08 12:44:32 MSysStartServer()
03/08 12:44:32 starting  version Maui (PID: 1695902) on Wed Mar  8 12:44:32
03/08 12:44:32 MSysMemCheck()
03/08 12:44:32 MNode[5120]               0.02
03/08 12:44:32 MJob[4096]                0.02
03/08 12:44:32 MJobTraceBuffer[4096]     0.00
03/08 12:44:32 MUser[1792]               0.01
03/08 12:44:32 MGroup[1792]              2.06
03/08 12:44:32 MAcct[1792]               2.06
03/08 12:44:32 MRes[1024]                0.00
03/08 12:44:32 SRes[ 128]                2.37
03/08 12:44:32 MStatInitialize(P)
03/08 12:44:32 MStatProfInitialize(P)
03/08 12:44:32 MStatOpenFile(1141839872)
03/08 12:44:32 MSUListen(S)
03/08 12:44:32 INFO:     opened service socket on port 42559
03/08 12:44:32 MSUListen(S)
03/08 12:44:32 INFO:     opened service socket on port 42560
03/08 12:44:32 MFSInitialize()
03/08 12:44:32 MCPLoad(/usr/local/maui/maui.ck,ResOnly)
03/08 12:44:32 MRMInitialize()
03/08 12:44:32 MPBSInitialize(QCDOCHOSTB,SC)
03/08 12:44:33 MSUListen(S)
03/08 12:44:33 INFO:     opened service socket on port 15004
03/08 12:44:33 __MPBSSystemQuery(QCDOCHOSTB,RCount,SC)
03/08 12:44:33 INFO:     connected to PBS server :0 on sd 1
03/08 12:44:33 INFO:     XRMInitialize not supported
03/08 12:44:33 MAMInitialize(NULL)
03/08 12:44:33 MStatInitializeActiveSysUsage()
03/08 12:44:33 MStatClearUsage([NONE],Active)
03/08 12:44:33 ServerUpdate()
03/08 12:44:33 MSysUpdateTime()
03/08 12:44:33 INFO:     starting new day: Wed Mar  8 12:44:33
03/08 12:44:33 MStatOpenFile(1141839873)
03/08 12:44:33 INFO:     starting iteration 0
03/08 12:44:33 MRMGetInfo()
03/08 12:44:33 MClusterClearUsage()
03/08 12:44:33 MRMClusterQuery()
03/08 12:44:33 MPBSClusterQuery(QCDOCHOSTB,RCount,SC)
03/08 12:44:33 __MPBSGetNodeState(Name,State,PNode)
03/08 12:44:33 INFO:     PBS node qcdochostb.qcdoc.bnl.gov set to state Idle (free)
03/08 12:44:33 MPBSNodeLoad(qcdochostb.qcdoc.bnl.gov,qcdochostb.qcdoc.bnl.gov,Idle,QCDOCHOSTB)
03/08 12:44:33 INFO:     node qcdochostb.qcdoc.bnl.gov has joblist '0/1644.qcdochostb.qcdoc.bnl.gov, 0/1621.qcdochostb.qcdoc.bnl.gov, 0/1631.qcdochostb.qcdoc.bnl.gov'
03/08 12:44:33 INFO:     cannot locate PBS job '1644.qcdochostb.qcdoc.bnl.gov' (running on node qcdochostb.qcdoc.bnl.gov)
03/08 12:44:33 INFO:     cannot locate PBS job '1621.qcdochostb.qcdoc.bnl.gov' (running on node qcdochostb.qcdoc.bnl.gov)
03/08 12:44:33 INFO:     cannot locate PBS job '1631.qcdochostb.qcdoc.bnl.gov' (running on node qcdochostb.qcdoc.bnl.gov)
03/08 12:44:33 MUGetIndex(STATACTIVETIME,ValList,0)
03/08 12:44:33 MUGetIndex(STATTOTALTIME,ValList,0)
03/08 12:44:33 MUGetIndex(STATUPTIME,ValList,0)
03/08 12:44:33 MNodeUpdateResExpression(qcdochostb.qcdoc.bnl.gov)
03/08 12:44:33 INFO:     node slot not set on node 'qcdochostb.qcdoc.bnl.gov'
[000] qcdochostb.qcdoc.bnl.gov: (P:8,S:8192,M:8192,D:1) [Idle][DEFAULT][ aix5]<0.000000> C:[NONE][DEFAULT] [NONE] [NONE]
03/08 12:44:33 MPBSLoadQueueInfo(QCDOCHOSTB,NULL,SC)
03/08 12:44:33 INFO:     queue 'rack20-21' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'rack20-21' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'rack20-21' adding class to all nodes
03/08 12:44:33 INFO:     queue 'rack18' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'rack18' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'rack18' adding class to all nodes
03/08 12:44:33 INFO:     queue 'rack19' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'rack19' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'rack19' adding class to all nodes
03/08 12:44:33 INFO:     queue 'rack22-23' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'rack22-23' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'rack22-23' adding class to all node
03/08 12:44:33 INFO:     queue 'rack24-27' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'rack24-27' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'rack24-27' adding class to all nodes
03/08 12:44:33 INFO:     queue 'acc7/slot0' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'acc7/slot0' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'acc7/slot0' adding class to all nodes
03/08 12:44:33 INFO:     queue 'acc7/slot1' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'acc7/slot1' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'acc7/slot1' adding class to all nodes
03/08 12:44:33 INFO:     queue 'acc7/slot2' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'acc7/slot2' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'acc7/slot2' adding class to all nodes
03/08 12:44:33 INFO:     queue 'acc7/slot3' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'acc7/slot3' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'acc7/slot3' adding class to all nodes
03/08 12:44:33 INFO:     queue 'rack16/14mb' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'rack16/14mb' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'rack16/14mb' adding class to all nodes
03/08 12:44:33 INFO:     queue 'short' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'short' adding class to all nodes
03/08 12:44:33 INFO:     queue 'medium' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'medium' adding class to all nodes
03/08 12:44:33 INFO:     queue 'r16/c0/s0' maxrunning set to 1
03/08 12:44:33 INFO:     queue 'r16/c0/s0' started state set to True
03/08 12:44:33 INFO:     class to node not mapping enabled for queue 'r16/c0/s0' adding class to all nodes
03/08 12:44:33 INFO:     1 PBS resources detected on RM QCDOCHOSTB
03/08 12:44:33 INFO:     resources detected: 1
03/08 12:44:33 MRMWorkloadQuery()
03/08 12:44:33 MPBSWorkloadQuery(QCDOCHOSTB,JCount,SC)
03/08 12:44:33 MSysShutdown(11)
03/08 12:44:33 INFO:     received signal 11.  shutting down server
03/08 12:44:33 MCPCreate(/usr/local/maui/maui.ck)
03/08 12:44:33 MCPStoreQueue(ckfp,Buffer)
03/08 12:44:33 MCPStoreResList(CP,RL)
03/08 12:44:33 MCPStoreSRList(CP,SRL)
03/08 12:44:33 MCPStoreCluster(CP,NL)
03/08 12:44:33 MCPStoreUserList(CP,UL)
03/08 12:44:33 MCPStoreGroupList(CP,GL)
03/08 12:44:33 MCPStoreAcctList(CP,AL)
03/08 12:44:33 MCPWriteGridStats(ckfp)
03/08 12:44:33 MCPWriteSystemStats(ckfp)
03/08 12:44:33 MSUDisconnect(S)





