[Mauiusers] Maui/SLURM-wiki and consumable resources other than processors

Balle, Susanne susanne.balle at hp.com
Thu Jan 20 12:19:10 MST 2005


Dave,

The only mentions of memory that I can see are RMEM, which is correctly
set to 2000 MB, and CMEMORY, which is correctly set to 2981 for nodes
xc14n[13-15] and 3813 for node xc14n16. I have enclosed a section of the
maui.log covering job 43, which is the job I launched below.

I launched my first SLURM job with (under lsfadmin):
srun -n 4 -N 4 --mem=2000 sleep 120

The output of sinfo shows:
[root@xc14n16 etc]# sinfo -lNe
NODELIST     NODES PARTITION       STATE CPUS MEMORY TMP_DISK WEIGHT FEATURES REASON
xc14n[13-15]     3      lsf*   allocated    2   2981        1      1   (null) none
xc14n16          1      lsf*   allocated    4   3813        1      1   (null) none
[root@xc14n16 etc]#

Is SLURM passing the wrong parameters to Maui? How do I make SLURM give
Maui the DMEM job attribute or the AMEMORY node attribute?

It looks to me like the info is there but just not labelled right?
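
One way to check which memory attributes actually arrive is to split the ARG= payloads from the log into records and look for the attribute names. The following is a minimal sketch, assuming the record layout is `<count>#<id>:<KEY>=<VALUE>;...` as it appears in the maui.log excerpt; the helper name `parse_wiki_arg` is hypothetical and is not part of Maui or SLURM.

```python
# Sketch only: the record layout is inferred from the ARG= fields in the
# maui.log excerpt, not from a wiki protocol specification.

def parse_wiki_arg(arg):
    """Parse '<count>#<id>:<K>=<V>;...#<id>:...' into {id: {K: V}}."""
    count, _, body = arg.partition("#")
    records = {}
    for chunk in body.split("#"):
        if not chunk:
            continue
        # Split only on the first ':' because values such as HOSTLIST
        # contain further colons.
        record_id, _, attrs = chunk.partition(":")
        fields = {}
        for pair in attrs.split(";"):
            if pair:
                key, _, value = pair.partition("=")
                fields[key] = value
        records[record_id] = fields
    if len(records) != int(count):
        raise ValueError("record count does not match the ARG prefix")
    return records

# Payloads copied from the GETNODES and GETJOBS responses in the log.
NODE_ARG = (
    "4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;"
    "#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;"
    "#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;"
    "#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;"
)
JOB_ARG = (
    "1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;"
    "QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;"
    "GNAME=lsfadmin;HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;"
    "PARTITIONMASK=lsf;NODES=4;RMEM=2000;RDISK=1;"
)

nodes = parse_wiki_arg(NODE_ARG)
jobs = parse_wiki_arg(JOB_ARG)

# Report the memory-related attributes; a missing attribute prints as None.
for name, attrs in nodes.items():
    print(name, "CMEMORY:", attrs.get("CMEMORY"), "AMEMORY:", attrs.get("AMEMORY"))
print("job 43", "RMEM:", jobs["43"].get("RMEM"), "DMEM:", jobs["43"].get("DMEM"))
```

With the payloads from the log, CMEMORY and RMEM are present while AMEMORY and DMEM come back as None, which matches what I see by eye.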

Thanks for any help,

Regards

Susanne
~
01/20 19:07:39 MResAdjust(NULL,0,0)
01/20 19:07:39 MJobSetAttr(,PAL,Value,1,2)
01/20 19:07:39 INFO:     job flags for job : 0
01/20 19:07:39 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:07:39 MStatInitializeActiveSysUsage()
01/20 19:07:39 MStatClearUsage([NONE],Active)
01/20 19:07:39 ServerUpdate()
01/20 19:07:39 MSysUpdateTime()
01/20 19:07:39 INFO:     starting iteration 89
01/20 19:07:39 MSchedProcessJobs()
01/20 19:07:39 MRMGetInfo()
01/20 19:07:39 MClusterClearUsage()
01/20 19:07:39 MRMClusterQuery()
01/20 19:07:39 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:07:39 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:07:39 MSUConnect(S)
01/20 19:07:39 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:07:39 INFO:     non-blocking mode established
01/20 19:07:39 MSUSelectWrite(8,9000000)
01/20 19:07:39 INFO:     successful connect to TCP server (sd: 8)
01/20 19:07:39 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:07:39 INFO:     header created '00000022
'
01/20 19:07:39 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:07:39 MSUSendPacket(8,Message,31,9000000)
01/20 19:07:39 MSUSelectWrite(8,9000000)
01/20 19:07:39 INFO:     packet sent (31 bytes of 31)
01/20 19:07:39 INFO:     command sent to server
01/20 19:07:39 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:07:39 MSURecvData(S,9000000,0)
01/20 19:07:39 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:07:39 MSUSelectRead(8,9000000)
01/20 19:07:39 INFO:     9 of 9 bytes read from sd 8
01/20 19:07:39 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:07:39 MSUSelectRead(8,9000000)
01/20 19:07:39 INFO:     269 of 269 bytes read from sd 8
01/20 19:07:39 INFO:     received message 'CK=d6c8c1fcebc5159b TS=1106266059 AUTH=slurm DT=SC=0 ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:07:39 MSUDisconnect(S)
01/20 19:07:39 INFO:     received node list through WIKI RM
01/20 19:07:39 INFO:     loading 4 node(s)
01/20 19:07:39 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:07:39 MNodeFind(xc14n13,N)
01/20 19:07:39 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:07:39 MWikiNodeUpdate(AList,xc14n13)
01/20 19:07:39 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:07:39 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:07:39 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:07:39 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:07:39 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:07:39 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:07:39 MNodeFind(xc14n14,N)
01/20 19:07:39 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:07:39 MWikiNodeUpdate(AList,xc14n14)
01/20 19:07:39 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:07:39 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:07:39 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:07:39 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:07:39 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:07:39 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:07:39 MNodeFind(xc14n15,N)
01/20 19:07:39 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:07:39 MWikiNodeUpdate(AList,xc14n15)
01/20 19:07:39 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:07:39 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:07:39 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:07:39 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:07:39 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:07:39 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:07:39 MNodeFind(xc14n16,N)
01/20 19:07:39 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:07:39 MWikiNodeUpdate(AList,xc14n16)
01/20 19:07:39 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:07:39 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:07:39 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:07:39 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:07:39 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:07:39 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:07:39 WARNING:  no resources detected
01/20 19:07:39 MRMWorkloadQuery()
01/20 19:07:39 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:07:39 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:07:39 MSUConnect(S)
01/20 19:07:39 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:07:39 INFO:     non-blocking mode established
01/20 19:07:39 MSUSelectWrite(8,9000000)
01/20 19:07:39 INFO:     successful connect to TCP server (sd: 8)
01/20 19:07:39 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:07:39 INFO:     header created '00000021
'
01/20 19:07:39 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:07:39 MSUSendPacket(8,Message,30,9000000)
01/20 19:07:39 MSUSelectWrite(8,9000000)
01/20 19:07:39 INFO:     packet sent (30 bytes of 30)
01/20 19:07:39 INFO:     command sent to server
01/20 19:07:39 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:07:39 MSURecvData(S,9000000,0)
01/20 19:07:39 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:07:39 MSUSelectRead(8,9000000)
01/20 19:07:39 INFO:     9 of 9 bytes read from sd 8
01/20 19:07:39 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:07:39 MSUSelectRead(8,9000000)
01/20 19:07:39 INFO:     274 of 274 bytes read from sd 8
01/20 19:07:39 INFO:     received message 'CK=a532729fa05e9418 TS=1106266059 AUTH=slurm DT=SC=0 ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;RMEM=2000;RDISK=1;' from wiki server
01/20 19:07:39 MSUDisconnect(S)
01/20 19:07:39 INFO:     received job list through WIKI RM
01/20 19:07:39 INFO:     loading 1 job(s)
01/20 19:07:39 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:07:39 WARNING:  job '43' detected with unexpected state '20'
01/20 19:07:39 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:07:39 INFO:     jobs detected: 1
01/20 19:07:39 MStatClearUsage(node,Active)
01/20 19:07:39 MClusterUpdateNodeState()
01/20 19:07:39 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:07:39 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:07:39 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:07:39 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:07:39 MParUpdate(ALL)
01/20 19:07:39 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:07:39 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:07:39 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:07:39 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:07:39 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:07:39 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:07:39 INFO:     jobs in queue
01/20 19:07:39 MResAdjustDRes(NULL,FALSE)
01/20 19:07:39 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:07:39 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:07:39 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE)
01/20 19:07:39 INFO:     idle job queue is empty on iteration 89
01/20 19:07:39 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:07:39 INFO:     idle job queue is empty on iteration 89
01/20 19:07:39 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:07:39 INFO:     idle job queue is empty on iteration 89
01/20 19:07:39 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not support function 'cyclefinalize')
01/20 19:07:39 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:07:39 INFO:     idle job queue is empty on iteration 89
01/20 19:07:39 MSchedUpdateStats()
01/20 19:07:39 INFO:     iteration:   89   scheduling time:  0.001
seconds
01/20 19:07:39 MResUpdateStats()
01/20 19:07:39 INFO:     current util[89]:  0/4 (0.00%)  PH: 1.35%
active jobs: 0 of 0 (completed: 3)
01/20 19:07:39 MQueueCheckStatus()
01/20 19:07:39 MNodeCheckStatus()
01/20 19:07:39 INFO:     checking node 'xc14n13'
01/20 19:07:39 INFO:     checking node 'xc14n14'
01/20 19:07:39 INFO:     checking node 'xc14n15'
01/20 19:07:39 INFO:     checking node 'xc14n16'
01/20 19:07:39 MSysCheck()
01/20 19:07:39 MLimitEnforceAll(ALL)
01/20 19:07:39 MUClearChild(PID)
01/20 19:07:39 MParUpdate(ALL)
01/20 19:07:39 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:07:39 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:07:39 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:07:39 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:07:39 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:07:39 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:07:39 MResCheckStatus(NULL)
01/20 19:07:39 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:07:50 ServerProcessRequests()
01/20 19:07:50 MLogRoll(NULL,0,1)
01/20 19:07:50 INFO:     not rolling logs (8964 < 10000000)
01/20 19:07:50 MResAdjust(NULL,0,0)
01/20 19:07:50 MJobSetAttr(,PAL,Value,1,2)
01/20 19:07:50 INFO:     job flags for job : 0
01/20 19:07:50 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:07:50 MStatInitializeActiveSysUsage()
01/20 19:07:50 MStatClearUsage([NONE],Active)
01/20 19:07:50 ServerUpdate()
01/20 19:07:50 MSysUpdateTime()
01/20 19:07:50 INFO:     starting iteration 90
01/20 19:07:50 MSchedProcessJobs()
01/20 19:07:50 MRMGetInfo()
01/20 19:07:50 MClusterClearUsage()
01/20 19:07:50 MRMClusterQuery()
01/20 19:07:50 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:07:50 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:07:50 MSUConnect(S)
01/20 19:07:50 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:07:50 INFO:     non-blocking mode established
01/20 19:07:50 MSUSelectWrite(8,9000000)
01/20 19:07:50 INFO:     successful connect to TCP server (sd: 8)
01/20 19:07:50 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:07:50 INFO:     header created '00000022
'
01/20 19:07:50 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:07:50 MSUSendPacket(8,Message,31,9000000)
01/20 19:07:50 MSUSelectWrite(8,9000000)
01/20 19:07:50 INFO:     packet sent (31 bytes of 31)
01/20 19:07:50 INFO:     command sent to server
01/20 19:07:50 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:07:50 MSURecvData(S,9000000,0)
01/20 19:07:50 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:07:50 MSUSelectRead(8,9000000)
01/20 19:07:50 INFO:     9 of 9 bytes read from sd 8
01/20 19:07:50 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:07:50 MSUSelectRead(8,9000000)
01/20 19:07:50 INFO:     269 of 269 bytes read from sd 8
01/20 19:07:50 INFO:     received message 'CK=452a9f34fb1a04f4 TS=1106266070 AUTH=slurm DT=SC=0 ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:07:50 MSUDisconnect(S)
01/20 19:07:50 INFO:     received node list through WIKI RM
01/20 19:07:50 INFO:     loading 4 node(s)
01/20 19:07:50 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:07:50 MNodeFind(xc14n13,N)
01/20 19:07:50 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:07:50 MWikiNodeUpdate(AList,xc14n13)
01/20 19:07:50 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:07:50 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:07:50 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:07:50 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:07:50 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:07:50 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:07:50 MNodeFind(xc14n14,N)
01/20 19:07:50 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:07:50 MWikiNodeUpdate(AList,xc14n14)
01/20 19:07:50 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:07:50 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:07:50 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:07:50 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:07:50 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:07:50 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:07:50 MNodeFind(xc14n15,N)
01/20 19:07:50 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:07:50 MWikiNodeUpdate(AList,xc14n15)
01/20 19:07:50 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:07:50 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:07:50 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:07:50 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:07:50 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:07:50 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:07:50 MNodeFind(xc14n16,N)
01/20 19:07:50 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:07:50 MWikiNodeUpdate(AList,xc14n16)
01/20 19:07:50 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:07:50 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:07:50 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:07:50 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:07:50 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:07:50 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:07:50 WARNING:  no resources detected
01/20 19:07:50 MRMWorkloadQuery()
01/20 19:07:50 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:07:50 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:07:50 MSUConnect(S)
01/20 19:07:50 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:07:50 INFO:     non-blocking mode established
01/20 19:07:50 MSUSelectWrite(8,9000000)
01/20 19:07:50 INFO:     successful connect to TCP server (sd: 8)
01/20 19:07:50 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:07:50 INFO:     header created '00000021
'
01/20 19:07:50 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:07:50 MSUSendPacket(8,Message,30,9000000)
01/20 19:07:50 MSUSelectWrite(8,9000000)
01/20 19:07:50 INFO:     packet sent (30 bytes of 30)
01/20 19:07:50 INFO:     command sent to server
01/20 19:07:50 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:07:50 MSURecvData(S,9000000,0)
01/20 19:07:50 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:07:50 MSUSelectRead(8,9000000)
01/20 19:07:50 INFO:     9 of 9 bytes read from sd 8
01/20 19:07:50 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:07:50 MSUSelectRead(8,9000000)
01/20 19:07:50 INFO:     274 of 274 bytes read from sd 8
01/20 19:07:50 INFO:     received message 'CK=1636b45a81701886 TS=1106266070 AUTH=slurm DT=SC=0 ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;RMEM=2000;RDISK=1;' from wiki server
01/20 19:07:50 MSUDisconnect(S)
01/20 19:07:50 INFO:     received job list through WIKI RM
01/20 19:07:50 INFO:     loading 1 job(s)
01/20 19:07:50 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:07:50 WARNING:  job '43' detected with unexpected state '20'
01/20 19:07:50 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:07:50 INFO:     jobs detected: 1
01/20 19:07:50 MStatClearUsage(node,Active)
01/20 19:07:50 MClusterUpdateNodeState()
01/20 19:07:50 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:07:50 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:07:50 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:07:50 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:07:50 MParUpdate(ALL)
01/20 19:07:50 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:07:50 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:07:50 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:07:50 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:07:50 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:07:50 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:07:50 INFO:     jobs in queue
01/20 19:07:50 MResAdjustDRes(NULL,FALSE)
01/20 19:07:50 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:07:50 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:07:50 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE)
01/20 19:07:50 INFO:     idle job queue is empty on iteration 90
01/20 19:07:50 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:07:50 INFO:     idle job queue is empty on iteration 90
01/20 19:07:50 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:07:50 INFO:     idle job queue is empty on iteration 90
01/20 19:07:50 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not support function 'cyclefinalize')
01/20 19:07:50 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:07:50 INFO:     idle job queue is empty on iteration 90
01/20 19:07:50 MSchedUpdateStats()
01/20 19:07:50 INFO:     iteration:   90   scheduling time:  0.002
seconds
01/20 19:07:50 MResUpdateStats()
01/20 19:07:50 INFO:     current util[90]:  0/4 (0.00%)  PH: 1.33%
active jobs: 0 of 0 (completed: 3)
01/20 19:07:50 MQueueCheckStatus()
01/20 19:07:50 MNodeCheckStatus()
01/20 19:07:50 INFO:     checking node 'xc14n13'
01/20 19:07:50 INFO:     checking node 'xc14n14'
01/20 19:07:50 INFO:     checking node 'xc14n15'
01/20 19:07:50 INFO:     checking node 'xc14n16'
01/20 19:07:50 MSysCheck()
01/20 19:07:50 MLimitEnforceAll(ALL)
01/20 19:07:50 MUClearChild(PID)
01/20 19:07:50 MParUpdate(ALL)
01/20 19:07:50 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:07:50 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:07:50 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:07:50 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:07:50 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:07:50 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:07:50 MResCheckStatus(NULL)
01/20 19:07:50 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:08:01 ServerProcessRequests()
01/20 19:08:01 MLogRoll(NULL,0,1)
01/20 19:08:01 INFO:     not rolling logs (17988 < 10000000)
01/20 19:08:01 MResAdjust(NULL,0,0)
01/20 19:08:01 MJobSetAttr(,PAL,Value,1,2)
01/20 19:08:01 INFO:     job flags for job : 0
01/20 19:08:01 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:08:01 MStatInitializeActiveSysUsage()
01/20 19:08:01 MStatClearUsage([NONE],Active)
01/20 19:08:01 ServerUpdate()
01/20 19:08:01 MSysUpdateTime()
01/20 19:08:01 INFO:     starting iteration 91
01/20 19:08:01 MSchedProcessJobs()
01/20 19:08:01 MRMGetInfo()
01/20 19:08:01 MClusterClearUsage()
01/20 19:08:01 MRMClusterQuery()
01/20 19:08:01 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:08:01 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:01 MSUConnect(S)
01/20 19:08:01 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:01 INFO:     non-blocking mode established
01/20 19:08:01 MSUSelectWrite(8,9000000)
01/20 19:08:01 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:01 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:01 INFO:     header created '00000022
'
01/20 19:08:01 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:08:01 MSUSendPacket(8,Message,31,9000000)
01/20 19:08:01 MSUSelectWrite(8,9000000)
01/20 19:08:01 INFO:     packet sent (31 bytes of 31)
01/20 19:08:01 INFO:     command sent to server
01/20 19:08:01 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:08:01 MSURecvData(S,9000000,0)
01/20 19:08:01 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:01 MSUSelectRead(8,9000000)
01/20 19:08:01 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:01 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:08:01 MSUSelectRead(8,9000000)
01/20 19:08:01 INFO:     269 of 269 bytes read from sd 8
01/20 19:08:01 INFO:     received message 'CK=554552cb6a32cf40 TS=1106266081 AUTH=slurm DT=SC=0 ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:08:01 MSUDisconnect(S)
01/20 19:08:01 INFO:     received node list through WIKI RM
01/20 19:08:01 INFO:     loading 4 node(s)
01/20 19:08:01 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:01 MNodeFind(xc14n13,N)
01/20 19:08:01 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:08:01 MWikiNodeUpdate(AList,xc14n13)
01/20 19:08:01 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:08:01 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:08:01 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:08:01 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:08:01 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:08:01 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:01 MNodeFind(xc14n14,N)
01/20 19:08:01 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:08:01 MWikiNodeUpdate(AList,xc14n14)
01/20 19:08:01 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:08:01 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:08:01 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:08:01 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:08:01 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:08:01 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:01 MNodeFind(xc14n15,N)
01/20 19:08:01 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:08:01 MWikiNodeUpdate(AList,xc14n15)
01/20 19:08:01 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:08:01 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:08:01 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:08:01 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:08:01 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:08:01 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:01 MNodeFind(xc14n16,N)
01/20 19:08:01 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:08:01 MWikiNodeUpdate(AList,xc14n16)
01/20 19:08:01 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:08:01 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:08:01 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:08:01 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:08:01 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:08:01 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:08:01 WARNING:  no resources detected
01/20 19:08:01 MRMWorkloadQuery()
01/20 19:08:01 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:08:01 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:01 MSUConnect(S)
01/20 19:08:01 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:01 INFO:     non-blocking mode established
01/20 19:08:01 MSUSelectWrite(8,9000000)
01/20 19:08:01 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:01 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:01 INFO:     header created '00000021
'
01/20 19:08:01 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:08:01 MSUSendPacket(8,Message,30,9000000)
01/20 19:08:01 MSUSelectWrite(8,9000000)
01/20 19:08:01 INFO:     packet sent (30 bytes of 30)
01/20 19:08:01 INFO:     command sent to server
01/20 19:08:01 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:08:01 MSURecvData(S,9000000,0)
01/20 19:08:01 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:01 MSUSelectRead(8,9000000)
01/20 19:08:01 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:01 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:08:01 MSUSelectRead(8,9000000)
01/20 19:08:01 INFO:     274 of 274 bytes read from sd 8
01/20 19:08:01 INFO:     received message 'CK=2f64dc25a821972b TS=1106266081 AUTH=slurm DT=SC=0 ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;RMEM=2000;RDISK=1;' from wiki server
01/20 19:08:01 MSUDisconnect(S)
01/20 19:08:01 INFO:     received job list through WIKI RM
01/20 19:08:01 INFO:     loading 1 job(s)
01/20 19:08:01 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:08:01 WARNING:  job '43' detected with unexpected state '20'
01/20 19:08:01 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:08:01 INFO:     jobs detected: 1
01/20 19:08:01 MStatClearUsage(node,Active)
01/20 19:08:01 MClusterUpdateNodeState()
01/20 19:08:01 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:08:01 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:08:01 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:08:01 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:08:01 MParUpdate(ALL)
01/20 19:08:01 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:01 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:01 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:01 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:01 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:01 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:01 INFO:     jobs in queue
01/20 19:08:01 MResAdjustDRes(NULL,FALSE)
01/20 19:08:01 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:08:01 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:08:01 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE)
01/20 19:08:01 INFO:     idle job queue is empty on iteration 91
01/20 19:08:01 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:01 INFO:     idle job queue is empty on iteration 91
01/20 19:08:01 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:01 INFO:     idle job queue is empty on iteration 91
01/20 19:08:01 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not support function 'cyclefinalize')
01/20 19:08:01 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:01 INFO:     idle job queue is empty on iteration 91
01/20 19:08:01 MSchedUpdateStats()
01/20 19:08:01 INFO:     iteration:   91   scheduling time:  0.002
seconds
01/20 19:08:01 MResUpdateStats()
01/20 19:08:01 INFO:     current util[91]:  0/4 (0.00%)  PH: 1.32%
active jobs: 0 of 0 (completed: 3)
01/20 19:08:01 MQueueCheckStatus()
01/20 19:08:01 MNodeCheckStatus()
01/20 19:08:01 INFO:     checking node 'xc14n13'
01/20 19:08:01 INFO:     checking node 'xc14n14'
01/20 19:08:01 INFO:     checking node 'xc14n15'
01/20 19:08:01 INFO:     checking node 'xc14n16'
01/20 19:08:01 MSysCheck()
01/20 19:08:01 MLimitEnforceAll(ALL)
01/20 19:08:01 MUClearChild(PID)
01/20 19:08:01 MParUpdate(ALL)
01/20 19:08:01 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:01 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:01 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:01 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:01 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:01 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:01 MResCheckStatus(NULL)
01/20 19:08:01 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:08:12 ServerProcessRequests()
01/20 19:08:12 MLogRoll(NULL,0,1)
01/20 19:08:12 INFO:     not rolling logs (27013 < 10000000)
01/20 19:08:12 MResAdjust(NULL,0,0)
01/20 19:08:12 MJobSetAttr(,PAL,Value,1,2)
01/20 19:08:12 INFO:     job flags for job : 0
01/20 19:08:12 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:08:12 MStatInitializeActiveSysUsage()
01/20 19:08:12 MStatClearUsage([NONE],Active)
01/20 19:08:12 ServerUpdate()
01/20 19:08:12 MSysUpdateTime()
01/20 19:08:12 INFO:     starting iteration 92
01/20 19:08:12 MSchedProcessJobs()
01/20 19:08:12 MRMGetInfo()
01/20 19:08:12 MClusterClearUsage()
01/20 19:08:12 MRMClusterQuery()
01/20 19:08:12 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:08:12 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:12 MSUConnect(S)
01/20 19:08:12 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:12 INFO:     non-blocking mode established
01/20 19:08:12 MSUSelectWrite(8,9000000)
01/20 19:08:12 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:12 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:12 INFO:     header created '00000022
'
01/20 19:08:12 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:08:12 MSUSendPacket(8,Message,31,9000000)
01/20 19:08:12 MSUSelectWrite(8,9000000)
01/20 19:08:12 INFO:     packet sent (31 bytes of 31)
01/20 19:08:12 INFO:     command sent to server
01/20 19:08:12 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:08:12 MSURecvData(S,9000000,0)
01/20 19:08:12 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:12 MSUSelectRead(8,9000000)
01/20 19:08:12 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:12 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:08:12 MSUSelectRead(8,9000000)
01/20 19:08:12 INFO:     269 of 269 bytes read from sd 8
01/20 19:08:12 INFO:     received message 'CK=757005971a811646 TS=1106266092 AUTH=slurm DT=SC=0 ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:08:12 MSUDisconnect(S)
01/20 19:08:12 INFO:     received node list through WIKI RM
01/20 19:08:12 INFO:     loading 4 node(s)
01/20 19:08:12 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:12 MNodeFind(xc14n13,N)
01/20 19:08:12 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:08:12 MWikiNodeUpdate(AList,xc14n13)
01/20 19:08:12 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:08:12 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:08:12 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:08:12 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:08:12 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:08:12 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:12 MNodeFind(xc14n14,N)
01/20 19:08:12 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:08:12 MWikiNodeUpdate(AList,xc14n14)
01/20 19:08:12 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:08:12 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:08:12 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:08:12 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:08:12 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:08:12 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:12 MNodeFind(xc14n15,N)
01/20 19:08:12 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:08:12 MWikiNodeUpdate(AList,xc14n15)
01/20 19:08:12 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:08:12 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:08:12 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:08:12 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:08:12 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:08:12 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:12 MNodeFind(xc14n16,N)
01/20 19:08:12 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:08:12 MWikiNodeUpdate(AList,xc14n16)
01/20 19:08:12 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:08:12 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:08:12 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:08:12 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:08:12 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:08:12 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:08:12 WARNING:  no resources detected
01/20 19:08:12 MRMWorkloadQuery()
01/20 19:08:12 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:08:12 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:12 MSUConnect(S)
01/20 19:08:12 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:12 INFO:     non-blocking mode established
01/20 19:08:12 MSUSelectWrite(8,9000000)
01/20 19:08:12 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:12 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:12 INFO:     header created '00000021
'
01/20 19:08:12 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:08:12 MSUSendPacket(8,Message,30,9000000)
01/20 19:08:12 MSUSelectWrite(8,9000000)
01/20 19:08:12 INFO:     packet sent (30 bytes of 30)
01/20 19:08:12 INFO:     command sent to server
01/20 19:08:12 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:08:12 MSURecvData(S,9000000,0)
01/20 19:08:12 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:12 MSUSelectRead(8,9000000)
01/20 19:08:12 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:12 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:08:12 MSUSelectRead(8,9000000)
01/20 19:08:12 INFO:     274 of 274 bytes read from sd 8
01/20 19:08:12 INFO:     received message 'CK=5b517a720f960746
TS=1106266092 AUTH=slurm DT=SC=0
ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;
QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;
HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;
RMEM=2000;RDISK=1;' from wiki server
01/20 19:08:12 MSUDisconnect(S)
01/20 19:08:12 INFO:     received job list through WIKI RM
01/20 19:08:12 INFO:     loading 1 job(s)
01/20 19:08:12 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:08:12 WARNING:  job '43' detected with unexpected state '20'
01/20 19:08:12 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:08:12 INFO:     jobs detected: 1
01/20 19:08:12 MStatClearUsage(node,Active)
01/20 19:08:12 MClusterUpdateNodeState()
01/20 19:08:12 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:08:12 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:08:12 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:08:12 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:08:12 MParUpdate(ALL)
01/20 19:08:12 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:12 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:12 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:12 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:12 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:12 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:12 INFO:     jobs in queue
01/20 19:08:12 MResAdjustDRes(NULL,FALSE)
01/20 19:08:12 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:08:12 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:08:12
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE
)
01/20 19:08:12 INFO:     idle job queue is empty on iteration 92
01/20 19:08:12
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:12 INFO:     idle job queue is empty on iteration 92
01/20 19:08:12
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:12 INFO:     idle job queue is empty on iteration 92
01/20 19:08:12 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not
support function 'cyclefinalize')
01/20 19:08:12
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:12 INFO:     idle job queue is empty on iteration 92
01/20 19:08:12 MSchedUpdateStats()
01/20 19:08:12 INFO:     iteration:   92   scheduling time:  0.001
seconds
01/20 19:08:12 MResUpdateStats()
01/20 19:08:12 INFO:     current util[92]:  0/4 (0.00%)  PH: 1.30%
active jobs: 0 of 0 (completed: 3)
01/20 19:08:12 MQueueCheckStatus()
01/20 19:08:12 MNodeCheckStatus()
01/20 19:08:12 INFO:     checking node 'xc14n13'
01/20 19:08:12 INFO:     checking node 'xc14n14'
01/20 19:08:12 INFO:     checking node 'xc14n15'
01/20 19:08:12 INFO:     checking node 'xc14n16'
01/20 19:08:12 MSysCheck()
01/20 19:08:12 MLimitEnforceAll(ALL)
01/20 19:08:12 MUClearChild(PID)
01/20 19:08:12 MParUpdate(ALL)
01/20 19:08:12 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:12 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:12 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:12 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:12 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:12 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:12 MResCheckStatus(NULL)
01/20 19:08:12 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:08:23 ServerProcessRequests()
01/20 19:08:23 MLogRoll(NULL,0,1)
01/20 19:08:23 INFO:     not rolling logs (36038 < 10000000)
01/20 19:08:23 MResAdjust(NULL,0,0)
01/20 19:08:23 MJobSetAttr(,PAL,Value,1,2)
01/20 19:08:23 INFO:     job flags for job : 0
01/20 19:08:23 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:08:23 MStatInitializeActiveSysUsage()
01/20 19:08:23 MStatClearUsage([NONE],Active)
01/20 19:08:23 ServerUpdate()
01/20 19:08:23 MSysUpdateTime()
01/20 19:08:23 INFO:     starting iteration 93
01/20 19:08:23 MSchedProcessJobs()
01/20 19:08:23 MRMGetInfo()
01/20 19:08:23 MClusterClearUsage()
01/20 19:08:23 MRMClusterQuery()
01/20 19:08:23 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:08:23 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:23 MSUConnect(S)
01/20 19:08:23 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:23 INFO:     non-blocking mode established
01/20 19:08:23 MSUSelectWrite(8,9000000)
01/20 19:08:23 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:23 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:23 INFO:     header created '00000022
'
01/20 19:08:23 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:08:23 MSUSendPacket(8,Message,31,9000000)
01/20 19:08:23 MSUSelectWrite(8,9000000)
01/20 19:08:23 INFO:     packet sent (31 bytes of 31)
01/20 19:08:23 INFO:     command sent to server
01/20 19:08:23 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:08:23 MSURecvData(S,9000000,0)
01/20 19:08:23 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:23 MSUSelectRead(8,9000000)
01/20 19:08:23 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:23 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:08:23 MSUSelectRead(8,9000000)
01/20 19:08:23 INFO:     269 of 269 bytes read from sd 8
01/20 19:08:23 INFO:     received message 'CK=cb03522c8a4d24aa
TS=1106266103 AUTH=slurm DT=SC=0
ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:08:23 MSUDisconnect(S)
01/20 19:08:23 INFO:     received node list through WIKI RM
01/20 19:08:23 INFO:     loading 4 node(s)
01/20 19:08:23 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:23 MNodeFind(xc14n13,N)
01/20 19:08:23 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:08:23 MWikiNodeUpdate(AList,xc14n13)
01/20 19:08:23 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:08:23 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:08:23 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:08:23 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:08:23 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:08:23 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:23 MNodeFind(xc14n14,N)
01/20 19:08:23 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:08:23 MWikiNodeUpdate(AList,xc14n14)
01/20 19:08:23 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:08:23 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:08:23 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:08:23 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:08:23 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:08:23 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:23 MNodeFind(xc14n15,N)
01/20 19:08:23 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:08:23 MWikiNodeUpdate(AList,xc14n15)
01/20 19:08:23 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:08:23 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:08:23 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:08:23 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:08:23 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:08:23 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:23 MNodeFind(xc14n16,N)
01/20 19:08:23 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:08:23 MWikiNodeUpdate(AList,xc14n16)
01/20 19:08:23 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:08:23 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:08:23 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:08:23 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:08:23 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:08:23 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:08:23 WARNING:  no resources detected
01/20 19:08:23 MRMWorkloadQuery()
01/20 19:08:23 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:08:23 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:23 MSUConnect(S)
01/20 19:08:23 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:23 INFO:     non-blocking mode established
01/20 19:08:23 MSUSelectWrite(8,9000000)
01/20 19:08:23 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:23 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:23 INFO:     header created '00000021
'
01/20 19:08:23 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:08:23 MSUSendPacket(8,Message,30,9000000)
01/20 19:08:23 MSUSelectWrite(8,9000000)
01/20 19:08:23 INFO:     packet sent (30 bytes of 30)
01/20 19:08:23 INFO:     command sent to server
01/20 19:08:23 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:08:23 MSURecvData(S,9000000,0)
01/20 19:08:23 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:23 MSUSelectRead(8,9000000)
01/20 19:08:23 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:23 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:08:23 MSUSelectRead(8,9000000)
01/20 19:08:23 INFO:     274 of 274 bytes read from sd 8
01/20 19:08:23 INFO:     received message 'CK=00f64ae455f79556
TS=1106266103 AUTH=slurm DT=SC=0
ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;
QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;
HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;
RMEM=2000;RDISK=1;' from wiki server
01/20 19:08:23 MSUDisconnect(S)
01/20 19:08:23 INFO:     received job list through WIKI RM
01/20 19:08:23 INFO:     loading 1 job(s)
01/20 19:08:23 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:08:23 WARNING:  job '43' detected with unexpected state '20'
01/20 19:08:23 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:08:23 INFO:     jobs detected: 1
01/20 19:08:23 MStatClearUsage(node,Active)
01/20 19:08:23 MClusterUpdateNodeState()
01/20 19:08:23 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:08:23 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:08:23 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:08:23 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:08:23 MParUpdate(ALL)
01/20 19:08:23 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:23 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:23 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:23 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:23 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:23 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:23 INFO:     jobs in queue
01/20 19:08:23 MResAdjustDRes(NULL,FALSE)
01/20 19:08:23 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:08:23 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:08:23
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE
)
01/20 19:08:23 INFO:     idle job queue is empty on iteration 93
01/20 19:08:23
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:23 INFO:     idle job queue is empty on iteration 93
01/20 19:08:23
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:23 INFO:     idle job queue is empty on iteration 93
01/20 19:08:23 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not
support function 'cyclefinalize')
01/20 19:08:23
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:23 INFO:     idle job queue is empty on iteration 93
01/20 19:08:23 MSchedUpdateStats()
01/20 19:08:23 INFO:     iteration:   93   scheduling time:  0.001
seconds
01/20 19:08:23 MResUpdateStats()
01/20 19:08:23 INFO:     current util[93]:  0/4 (0.00%)  PH: 1.29%
active jobs: 0 of 0 (completed: 3)
01/20 19:08:23 MQueueCheckStatus()
01/20 19:08:23 MNodeCheckStatus()
01/20 19:08:23 INFO:     checking node 'xc14n13'
01/20 19:08:23 INFO:     checking node 'xc14n14'
01/20 19:08:23 INFO:     checking node 'xc14n15'
01/20 19:08:23 INFO:     checking node 'xc14n16'
01/20 19:08:23 MSysCheck()
01/20 19:08:23 MLimitEnforceAll(ALL)
01/20 19:08:23 MUClearChild(PID)
01/20 19:08:23 MParUpdate(ALL)
01/20 19:08:23 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:23 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:23 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:23 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:23 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:23 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:23 MResCheckStatus(NULL)
01/20 19:08:23 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:08:34 ServerProcessRequests()
01/20 19:08:34 MLogRoll(NULL,0,1)
01/20 19:08:34 INFO:     not rolling logs (45063 < 10000000)
01/20 19:08:34 MResAdjust(NULL,0,0)
01/20 19:08:34 MJobSetAttr(,PAL,Value,1,2)
01/20 19:08:34 INFO:     job flags for job : 0
01/20 19:08:34 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:08:34 MStatInitializeActiveSysUsage()
01/20 19:08:34 MStatClearUsage([NONE],Active)
01/20 19:08:34 ServerUpdate()
01/20 19:08:34 MSysUpdateTime()
01/20 19:08:34 INFO:     starting iteration 94
01/20 19:08:34 MSchedProcessJobs()
01/20 19:08:34 MRMGetInfo()
01/20 19:08:34 MClusterClearUsage()
01/20 19:08:34 MRMClusterQuery()
01/20 19:08:34 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:08:34 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:34 MSUConnect(S)
01/20 19:08:34 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:34 INFO:     non-blocking mode established
01/20 19:08:34 MSUSelectWrite(8,9000000)
01/20 19:08:34 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:34 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:34 INFO:     header created '00000022
'
01/20 19:08:34 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:08:34 MSUSendPacket(8,Message,31,9000000)
01/20 19:08:34 MSUSelectWrite(8,9000000)
01/20 19:08:34 INFO:     packet sent (31 bytes of 31)
01/20 19:08:34 INFO:     command sent to server
01/20 19:08:34 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:08:34 MSURecvData(S,9000000,0)
01/20 19:08:34 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:34 MSUSelectRead(8,9000000)
01/20 19:08:34 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:34 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:08:34 MSUSelectRead(8,9000000)
01/20 19:08:34 INFO:     269 of 269 bytes read from sd 8
01/20 19:08:34 INFO:     received message 'CK=aac158fa4d6babdc
TS=1106266114 AUTH=slurm DT=SC=0
ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:08:34 MSUDisconnect(S)
01/20 19:08:34 INFO:     received node list through WIKI RM
01/20 19:08:34 INFO:     loading 4 node(s)
01/20 19:08:34 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:34 MNodeFind(xc14n13,N)
01/20 19:08:34 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:08:34 MWikiNodeUpdate(AList,xc14n13)
01/20 19:08:34 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:08:34 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:08:34 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:08:34 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:08:34 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:08:34 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:34 MNodeFind(xc14n14,N)
01/20 19:08:34 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:08:34 MWikiNodeUpdate(AList,xc14n14)
01/20 19:08:34 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:08:34 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:08:34 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:08:34 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:08:34 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:08:34 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:34 MNodeFind(xc14n15,N)
01/20 19:08:34 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:08:34 MWikiNodeUpdate(AList,xc14n15)
01/20 19:08:34 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:08:34 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:08:34 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:08:34 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:08:34 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:08:34 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:34 MNodeFind(xc14n16,N)
01/20 19:08:34 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:08:34 MWikiNodeUpdate(AList,xc14n16)
01/20 19:08:34 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:08:34 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:08:34 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:08:34 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:08:34 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:08:34 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:08:34 WARNING:  no resources detected
01/20 19:08:34 MRMWorkloadQuery()
01/20 19:08:34 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:08:34 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:34 MSUConnect(S)
01/20 19:08:34 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:34 INFO:     non-blocking mode established
01/20 19:08:34 MSUSelectWrite(8,9000000)
01/20 19:08:34 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:34 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:34 INFO:     header created '00000021
'
01/20 19:08:34 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:08:34 MSUSendPacket(8,Message,30,9000000)
01/20 19:08:34 MSUSelectWrite(8,9000000)
01/20 19:08:34 INFO:     packet sent (30 bytes of 30)
01/20 19:08:34 INFO:     command sent to server
01/20 19:08:34 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:08:34 MSURecvData(S,9000000,0)
01/20 19:08:34 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:34 MSUSelectRead(8,9000000)
01/20 19:08:34 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:34 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:08:34 MSUSelectRead(8,9000000)
01/20 19:08:34 INFO:     274 of 274 bytes read from sd 8
01/20 19:08:34 INFO:     received message 'CK=c931aac794b0c605
TS=1106266114 AUTH=slurm DT=SC=0
ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;
QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;
HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;
RMEM=2000;RDISK=1;' from wiki server
01/20 19:08:34 MSUDisconnect(S)
01/20 19:08:34 INFO:     received job list through WIKI RM
01/20 19:08:34 INFO:     loading 1 job(s)
01/20 19:08:34 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:08:34 WARNING:  job '43' detected with unexpected state '20'
01/20 19:08:34 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:08:34 INFO:     jobs detected: 1
01/20 19:08:34 MStatClearUsage(node,Active)
01/20 19:08:34 MClusterUpdateNodeState()
01/20 19:08:34 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:08:34 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:08:34 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:08:34 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:08:34 MParUpdate(ALL)
01/20 19:08:34 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:34 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:34 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:34 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:34 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:34 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:34 INFO:     jobs in queue
01/20 19:08:34 MResAdjustDRes(NULL,FALSE)
01/20 19:08:34 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:08:34 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:08:34
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE
)
01/20 19:08:34 INFO:     idle job queue is empty on iteration 94
01/20 19:08:34
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:34 INFO:     idle job queue is empty on iteration 94
01/20 19:08:34
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:34 INFO:     idle job queue is empty on iteration 94
01/20 19:08:34 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not
support function 'cyclefinalize')
01/20 19:08:34
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:34 INFO:     idle job queue is empty on iteration 94
01/20 19:08:34 MSchedUpdateStats()
01/20 19:08:34 INFO:     iteration:   94   scheduling time:  0.001
seconds
01/20 19:08:34 MResUpdateStats()
01/20 19:08:34 INFO:     current util[94]:  0/4 (0.00%)  PH: 1.28%
active jobs: 0 of 0 (completed: 3)
01/20 19:08:34 MQueueCheckStatus()
01/20 19:08:34 MNodeCheckStatus()
01/20 19:08:34 INFO:     checking node 'xc14n13'
01/20 19:08:34 INFO:     checking node 'xc14n14'
01/20 19:08:34 INFO:     checking node 'xc14n15'
01/20 19:08:34 INFO:     checking node 'xc14n16'
01/20 19:08:34 MSysCheck()
01/20 19:08:34 MLimitEnforceAll(ALL)
01/20 19:08:34 MUClearChild(PID)
01/20 19:08:34 MParUpdate(ALL)
01/20 19:08:34 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:34 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:34 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:34 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:34 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:34 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:34 MResCheckStatus(NULL)
01/20 19:08:34 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:08:45 ServerProcessRequests()
01/20 19:08:45 MLogRoll(NULL,0,1)
01/20 19:08:45 INFO:     not rolling logs (54088 < 10000000)
01/20 19:08:45 MResAdjust(NULL,0,0)
01/20 19:08:45 MJobSetAttr(,PAL,Value,1,2)
01/20 19:08:45 INFO:     job flags for job : 0
01/20 19:08:45 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:08:45 MStatInitializeActiveSysUsage()
01/20 19:08:45 MStatClearUsage([NONE],Active)
01/20 19:08:45 ServerUpdate()
01/20 19:08:45 MSysUpdateTime()
01/20 19:08:45 INFO:     starting iteration 95
01/20 19:08:45 MSchedProcessJobs()
01/20 19:08:45 MRMGetInfo()
01/20 19:08:45 MClusterClearUsage()
01/20 19:08:45 MRMClusterQuery()
01/20 19:08:45 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:08:45 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:45 MSUConnect(S)
01/20 19:08:45 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:45 INFO:     non-blocking mode established
01/20 19:08:45 MSUSelectWrite(8,9000000)
01/20 19:08:45 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:45 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:45 INFO:     header created '00000022
'
01/20 19:08:45 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:08:45 MSUSendPacket(8,Message,31,9000000)
01/20 19:08:45 MSUSelectWrite(8,9000000)
01/20 19:08:45 INFO:     packet sent (31 bytes of 31)
01/20 19:08:45 INFO:     command sent to server
01/20 19:08:45 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:08:45 MSURecvData(S,9000000,0)
01/20 19:08:45 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:45 MSUSelectRead(8,9000000)
01/20 19:08:45 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:45 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:08:45 MSUSelectRead(8,9000000)
01/20 19:08:45 INFO:     269 of 269 bytes read from sd 8
01/20 19:08:45 INFO:     received message 'CK=23b9967e78a18320
TS=1106266125 AUTH=slurm DT=SC=0
ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;
#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:08:45 MSUDisconnect(S)
01/20 19:08:45 INFO:     received node list through WIKI RM
01/20 19:08:45 INFO:     loading 4 node(s)
01/20 19:08:45 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:45 MNodeFind(xc14n13,N)
01/20 19:08:45 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:08:45 MWikiNodeUpdate(AList,xc14n13)
01/20 19:08:45 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:08:45 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:08:45 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:08:45 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:08:45 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:08:45 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:45 MNodeFind(xc14n14,N)
01/20 19:08:45 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:08:45 MWikiNodeUpdate(AList,xc14n14)
01/20 19:08:45 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:08:45 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:08:45 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:08:45 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:08:45 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:08:45 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:45 MNodeFind(xc14n15,N)
01/20 19:08:45 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:08:45 MWikiNodeUpdate(AList,xc14n15)
01/20 19:08:45 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:08:45 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:08:45 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:08:45 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:08:45 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:08:45 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:45 MNodeFind(xc14n16,N)
01/20 19:08:45 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:08:45 MWikiNodeUpdate(AList,xc14n16)
01/20 19:08:45 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:08:45 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:08:45 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:08:45 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:08:45 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:08:45 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:08:45 WARNING:  no resources detected
01/20 19:08:45 MRMWorkloadQuery()
01/20 19:08:45 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:08:45 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:45 MSUConnect(S)
01/20 19:08:45 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:45 INFO:     non-blocking mode established
01/20 19:08:45 MSUSelectWrite(8,9000000)
01/20 19:08:45 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:45 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:45 INFO:     header created '00000021
'
01/20 19:08:45 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:08:45 MSUSendPacket(8,Message,30,9000000)
01/20 19:08:45 MSUSelectWrite(8,9000000)
01/20 19:08:45 INFO:     packet sent (30 bytes of 30)
01/20 19:08:45 INFO:     command sent to server
01/20 19:08:45 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:08:45 MSURecvData(S,9000000,0)
01/20 19:08:45 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:45 MSUSelectRead(8,9000000)
01/20 19:08:45 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:45 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:08:45 MSUSelectRead(8,9000000)
01/20 19:08:45 INFO:     274 of 274 bytes read from sd 8
01/20 19:08:45 INFO:     received message 'CK=ad96686eb24032ca
TS=1106266125 AUTH=slurm DT=SC=0
ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;
QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;
HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;
RMEM=2000;RDISK=1;' from wiki server
01/20 19:08:45 MSUDisconnect(S)
01/20 19:08:45 INFO:     received job list through WIKI RM
01/20 19:08:45 INFO:     loading 1 job(s)
01/20 19:08:45 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:08:45 WARNING:  job '43' detected with unexpected state '20'
01/20 19:08:45 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:08:45 INFO:     jobs detected: 1
01/20 19:08:45 MStatClearUsage(node,Active)
01/20 19:08:45 MClusterUpdateNodeState()
01/20 19:08:45 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:08:45 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:08:45 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:08:45 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:08:45 MParUpdate(ALL)
01/20 19:08:45 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:45 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:45 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:45 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:45 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:45 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:45 INFO:     jobs in queue
01/20 19:08:45 MResAdjustDRes(NULL,FALSE)
01/20 19:08:45 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:08:45 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:08:45
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE
)
01/20 19:08:45 INFO:     idle job queue is empty on iteration 95
01/20 19:08:45
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:45 INFO:     idle job queue is empty on iteration 95
01/20 19:08:45
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:45 INFO:     idle job queue is empty on iteration 95
01/20 19:08:45 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not
support function 'cyclefinalize')
01/20 19:08:45
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:45 INFO:     idle job queue is empty on iteration 95
01/20 19:08:45 MSchedUpdateStats()
01/20 19:08:45 INFO:     iteration:   95   scheduling time:  0.002
seconds
01/20 19:08:45 MResUpdateStats()
01/20 19:08:45 INFO:     current util[95]:  0/4 (0.00%)  PH: 1.26%
active jobs: 0 of 0 (completed: 3)
01/20 19:08:45 MQueueCheckStatus()
01/20 19:08:45 MNodeCheckStatus()
01/20 19:08:45 INFO:     checking node 'xc14n13'
01/20 19:08:45 INFO:     checking node 'xc14n14'
01/20 19:08:45 INFO:     checking node 'xc14n15'
01/20 19:08:45 INFO:     checking node 'xc14n16'
01/20 19:08:45 MSysCheck()
01/20 19:08:45 MLimitEnforceAll(ALL)
01/20 19:08:45 MUClearChild(PID)
01/20 19:08:45 MParUpdate(ALL)
01/20 19:08:45 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:45 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:45 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:45 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:45 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:45 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:45 MResCheckStatus(NULL)
01/20 19:08:45 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:08:56 ServerProcessRequests()
01/20 19:08:56 MLogRoll(NULL,0,1)
01/20 19:08:56 INFO:     not rolling logs (63113 < 10000000)
01/20 19:08:56 MResAdjust(NULL,0,0)
01/20 19:08:56 MJobSetAttr(,PAL,Value,1,2)
01/20 19:08:56 INFO:     job flags for job : 0
01/20 19:08:56 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:08:56 MStatInitializeActiveSysUsage()
01/20 19:08:56 MStatClearUsage([NONE],Active)
01/20 19:08:56 ServerUpdate()
01/20 19:08:56 MSysUpdateTime()
01/20 19:08:56 INFO:     starting iteration 96
01/20 19:08:56 MSchedProcessJobs()
01/20 19:08:56 MRMGetInfo()
01/20 19:08:56 MClusterClearUsage()
01/20 19:08:56 MRMClusterQuery()
01/20 19:08:56 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:08:56 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:56 MSUConnect(S)
01/20 19:08:56 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:56 INFO:     non-blocking mode established
01/20 19:08:56 MSUSelectWrite(8,9000000)
01/20 19:08:56 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:56 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:56 INFO:     header created '00000022
'
01/20 19:08:56 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:08:56 MSUSendPacket(8,Message,31,9000000)
01/20 19:08:56 MSUSelectWrite(8,9000000)
01/20 19:08:56 INFO:     packet sent (31 bytes of 31)
01/20 19:08:56 INFO:     command sent to server
01/20 19:08:56 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:08:56 MSURecvData(S,9000000,0)
01/20 19:08:56 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:56 MSUSelectRead(8,9000000)
01/20 19:08:56 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:56 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:08:56 MSUSelectRead(8,9000000)
01/20 19:08:56 INFO:     269 of 269 bytes read from sd 8
01/20 19:08:56 INFO:     received message 'CK=80720aea09f257a7 TS=1106266136 AUTH=slurm DT=SC=0 ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:08:56 MSUDisconnect(S)
01/20 19:08:56 INFO:     received node list through WIKI RM
01/20 19:08:56 INFO:     loading 4 node(s)
01/20 19:08:56 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:56 MNodeFind(xc14n13,N)
01/20 19:08:56 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:08:56 MWikiNodeUpdate(AList,xc14n13)
01/20 19:08:56 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:08:56 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:08:56 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:08:56 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:08:56 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:08:56 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:56 MNodeFind(xc14n14,N)
01/20 19:08:56 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:08:56 MWikiNodeUpdate(AList,xc14n14)
01/20 19:08:56 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:08:56 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:08:56 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:08:56 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:08:56 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:08:56 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:56 MNodeFind(xc14n15,N)
01/20 19:08:56 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:08:56 MWikiNodeUpdate(AList,xc14n15)
01/20 19:08:56 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:08:56 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:08:56 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:08:56 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:08:56 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:08:56 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:08:56 MNodeFind(xc14n16,N)
01/20 19:08:56 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:08:56 MWikiNodeUpdate(AList,xc14n16)
01/20 19:08:56 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:08:56 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:08:56 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:08:56 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:08:56 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:08:56 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:08:56 WARNING:  no resources detected
01/20 19:08:56 MRMWorkloadQuery()
01/20 19:08:56 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:08:56 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:08:56 MSUConnect(S)
01/20 19:08:56 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:08:56 INFO:     non-blocking mode established
01/20 19:08:56 MSUSelectWrite(8,9000000)
01/20 19:08:56 INFO:     successful connect to TCP server (sd: 8)
01/20 19:08:56 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:08:56 INFO:     header created '00000021
'
01/20 19:08:56 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:08:56 MSUSendPacket(8,Message,30,9000000)
01/20 19:08:56 MSUSelectWrite(8,9000000)
01/20 19:08:56 INFO:     packet sent (30 bytes of 30)
01/20 19:08:56 INFO:     command sent to server
01/20 19:08:56 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:08:56 MSURecvData(S,9000000,0)
01/20 19:08:56 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:08:56 MSUSelectRead(8,9000000)
01/20 19:08:56 INFO:     9 of 9 bytes read from sd 8
01/20 19:08:56 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:08:56 MSUSelectRead(8,9000000)
01/20 19:08:56 INFO:     274 of 274 bytes read from sd 8
01/20 19:08:56 INFO:     received message 'CK=4adcbac412129259 TS=1106266136 AUTH=slurm DT=SC=0 ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;RMEM=2000;RDISK=1;' from wiki server
01/20 19:08:56 MSUDisconnect(S)
01/20 19:08:56 INFO:     received job list through WIKI RM
01/20 19:08:56 INFO:     loading 1 job(s)
01/20 19:08:56 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:08:56 WARNING:  job '43' detected with unexpected state '20'
01/20 19:08:56 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:08:56 INFO:     jobs detected: 1
01/20 19:08:56 MStatClearUsage(node,Active)
01/20 19:08:56 MClusterUpdateNodeState()
01/20 19:08:56 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:08:56 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:08:56 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:08:56 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:08:56 MParUpdate(ALL)
01/20 19:08:56 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:56 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:56 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:56 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:56 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:56 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:56 INFO:     jobs in queue
01/20 19:08:56 MResAdjustDRes(NULL,FALSE)
01/20 19:08:56 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:08:56 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:08:56
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE
)
01/20 19:08:56 INFO:     idle job queue is empty on iteration 96
01/20 19:08:56
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:56 INFO:     idle job queue is empty on iteration 96
01/20 19:08:56
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:56 INFO:     idle job queue is empty on iteration 96
01/20 19:08:56 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not
support function 'cyclefinalize')
01/20 19:08:56
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:08:56 INFO:     idle job queue is empty on iteration 96
01/20 19:08:56 MSchedUpdateStats()
01/20 19:08:56 INFO:     iteration:   96   scheduling time:  0.002
seconds
01/20 19:08:56 MResUpdateStats()
01/20 19:08:56 INFO:     current util[96]:  0/4 (0.00%)  PH: 1.25%
active jobs: 0 of 0 (completed: 3)
01/20 19:08:56 MQueueCheckStatus()
01/20 19:08:56 MNodeCheckStatus()
01/20 19:08:56 INFO:     checking node 'xc14n13'
01/20 19:08:56 INFO:     checking node 'xc14n14'
01/20 19:08:56 INFO:     checking node 'xc14n15'
01/20 19:08:56 INFO:     checking node 'xc14n16'
01/20 19:08:56 MSysCheck()
01/20 19:08:56 MLimitEnforceAll(ALL)
01/20 19:08:56 MUClearChild(PID)
01/20 19:08:56 MParUpdate(ALL)
01/20 19:08:56 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:56 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:08:56 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:08:56 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:08:56 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:08:56 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:08:56 MResCheckStatus(NULL)
01/20 19:08:56 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:09:07 ServerProcessRequests()
01/20 19:09:07 MLogRoll(NULL,0,1)
01/20 19:09:07 INFO:     not rolling logs (72138 < 10000000)
01/20 19:09:07 MResAdjust(NULL,0,0)
01/20 19:09:07 MJobSetAttr(,PAL,Value,1,2)
01/20 19:09:07 INFO:     job flags for job : 0
01/20 19:09:07 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:09:07 MStatInitializeActiveSysUsage()
01/20 19:09:07 MStatClearUsage([NONE],Active)
01/20 19:09:07 ServerUpdate()
01/20 19:09:07 MSysUpdateTime()
01/20 19:09:07 INFO:     starting iteration 97
01/20 19:09:07 MSchedProcessJobs()
01/20 19:09:07 MRMGetInfo()
01/20 19:09:07 MClusterClearUsage()
01/20 19:09:07 MRMClusterQuery()
01/20 19:09:07 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:09:07 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:09:07 MSUConnect(S)
01/20 19:09:07 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:09:07 INFO:     non-blocking mode established
01/20 19:09:07 MSUSelectWrite(8,9000000)
01/20 19:09:07 INFO:     successful connect to TCP server (sd: 8)
01/20 19:09:07 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:09:07 INFO:     header created '00000022
'
01/20 19:09:07 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:09:07 MSUSendPacket(8,Message,31,9000000)
01/20 19:09:07 MSUSelectWrite(8,9000000)
01/20 19:09:07 INFO:     packet sent (31 bytes of 31)
01/20 19:09:07 INFO:     command sent to server
01/20 19:09:07 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:09:07 MSURecvData(S,9000000,0)
01/20 19:09:07 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:09:07 MSUSelectRead(8,9000000)
01/20 19:09:07 INFO:     9 of 9 bytes read from sd 8
01/20 19:09:07 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:09:07 MSUSelectRead(8,9000000)
01/20 19:09:07 INFO:     269 of 269 bytes read from sd 8
01/20 19:09:07 INFO:     received message 'CK=7ac9749170bbf4c5 TS=1106266147 AUTH=slurm DT=SC=0 ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:09:07 MSUDisconnect(S)
01/20 19:09:07 INFO:     received node list through WIKI RM
01/20 19:09:07 INFO:     loading 4 node(s)
01/20 19:09:07 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:07 MNodeFind(xc14n13,N)
01/20 19:09:07 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:09:07 MWikiNodeUpdate(AList,xc14n13)
01/20 19:09:07 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:09:07 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:09:07 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:09:07 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:09:07 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:09:07 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:07 MNodeFind(xc14n14,N)
01/20 19:09:07 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:09:07 MWikiNodeUpdate(AList,xc14n14)
01/20 19:09:07 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:09:07 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:09:07 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:09:07 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:09:07 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:09:07 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:07 MNodeFind(xc14n15,N)
01/20 19:09:07 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:09:07 MWikiNodeUpdate(AList,xc14n15)
01/20 19:09:07 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:09:07 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:09:07 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:09:07 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:09:07 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:09:07 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:07 MNodeFind(xc14n16,N)
01/20 19:09:07 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:09:07 MWikiNodeUpdate(AList,xc14n16)
01/20 19:09:07 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:09:07 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:09:07 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:09:07 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:09:07 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:09:07 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:09:07 WARNING:  no resources detected
01/20 19:09:07 MRMWorkloadQuery()
01/20 19:09:07 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:09:07 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:09:07 MSUConnect(S)
01/20 19:09:07 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:09:07 INFO:     non-blocking mode established
01/20 19:09:07 MSUSelectWrite(8,9000000)
01/20 19:09:07 INFO:     successful connect to TCP server (sd: 8)
01/20 19:09:07 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:09:07 INFO:     header created '00000021
'
01/20 19:09:07 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:09:07 MSUSendPacket(8,Message,30,9000000)
01/20 19:09:07 MSUSelectWrite(8,9000000)
01/20 19:09:07 INFO:     packet sent (30 bytes of 30)
01/20 19:09:07 INFO:     command sent to server
01/20 19:09:07 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:09:07 MSURecvData(S,9000000,0)
01/20 19:09:07 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:09:07 MSUSelectRead(8,9000000)
01/20 19:09:07 INFO:     9 of 9 bytes read from sd 8
01/20 19:09:07 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:09:07 MSUSelectRead(8,9000000)
01/20 19:09:07 INFO:     274 of 274 bytes read from sd 8
01/20 19:09:07 INFO:     received message 'CK=5354ed835ec69bdc TS=1106266147 AUTH=slurm DT=SC=0 ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;RMEM=2000;RDISK=1;' from wiki server
01/20 19:09:07 MSUDisconnect(S)
01/20 19:09:07 INFO:     received job list through WIKI RM
01/20 19:09:07 INFO:     loading 1 job(s)
01/20 19:09:07 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:09:07 WARNING:  job '43' detected with unexpected state '20'
01/20 19:09:07 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:09:07 INFO:     jobs detected: 1
01/20 19:09:07 MStatClearUsage(node,Active)
01/20 19:09:07 MClusterUpdateNodeState()
01/20 19:09:07 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:09:07 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:09:07 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:09:07 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:09:07 MParUpdate(ALL)
01/20 19:09:07 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:07 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:09:07 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:09:07 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:09:07 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:09:07 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:07 INFO:     jobs in queue
01/20 19:09:07 MResAdjustDRes(NULL,FALSE)
01/20 19:09:07 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:09:07 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:09:07
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE
)
01/20 19:09:07 INFO:     idle job queue is empty on iteration 97
01/20 19:09:07
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:07 INFO:     idle job queue is empty on iteration 97
01/20 19:09:07
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:07 INFO:     idle job queue is empty on iteration 97
01/20 19:09:07 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not
support function 'cyclefinalize')
01/20 19:09:07
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:07 INFO:     idle job queue is empty on iteration 97
01/20 19:09:07 MSchedUpdateStats()
01/20 19:09:07 INFO:     iteration:   97   scheduling time:  0.002
seconds
01/20 19:09:07 MResUpdateStats()
01/20 19:09:07 INFO:     current util[97]:  0/4 (0.00%)  PH: 1.24%
active jobs: 0 of 0 (completed: 3)
01/20 19:09:07 MQueueCheckStatus()
01/20 19:09:07 MNodeCheckStatus()
01/20 19:09:07 INFO:     checking node 'xc14n13'
01/20 19:09:07 INFO:     checking node 'xc14n14'
01/20 19:09:07 INFO:     checking node 'xc14n15'
01/20 19:09:07 INFO:     checking node 'xc14n16'
01/20 19:09:07 MSysCheck()
01/20 19:09:07 MLimitEnforceAll(ALL)
01/20 19:09:07 MUClearChild(PID)
01/20 19:09:07 MParUpdate(ALL)
01/20 19:09:07 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:07 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:09:07 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:09:07 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:09:07 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:09:07 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:07 MResCheckStatus(NULL)
01/20 19:09:07 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:09:18 ServerProcessRequests()
01/20 19:09:18 MLogRoll(NULL,0,1)
01/20 19:09:18 INFO:     not rolling logs (81163 < 10000000)
01/20 19:09:18 MResAdjust(NULL,0,0)
01/20 19:09:18 MJobSetAttr(,PAL,Value,1,2)
01/20 19:09:18 INFO:     job flags for job : 0
01/20 19:09:18 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:09:18 MStatInitializeActiveSysUsage()
01/20 19:09:18 MStatClearUsage([NONE],Active)
01/20 19:09:18 ServerUpdate()
01/20 19:09:18 MSysUpdateTime()
01/20 19:09:18 INFO:     starting iteration 98
01/20 19:09:18 MSchedProcessJobs()
01/20 19:09:18 MRMGetInfo()
01/20 19:09:18 MClusterClearUsage()
01/20 19:09:18 MRMClusterQuery()
01/20 19:09:18 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:09:18 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:09:18 MSUConnect(S)
01/20 19:09:18 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:09:18 INFO:     non-blocking mode established
01/20 19:09:18 MSUSelectWrite(8,9000000)
01/20 19:09:18 INFO:     successful connect to TCP server (sd: 8)
01/20 19:09:18 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:09:18 INFO:     header created '00000022
'
01/20 19:09:18 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:09:18 MSUSendPacket(8,Message,31,9000000)
01/20 19:09:18 MSUSelectWrite(8,9000000)
01/20 19:09:18 INFO:     packet sent (31 bytes of 31)
01/20 19:09:18 INFO:     command sent to server
01/20 19:09:18 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:09:18 MSURecvData(S,9000000,0)
01/20 19:09:18 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:09:18 MSUSelectRead(8,9000000)
01/20 19:09:18 INFO:     9 of 9 bytes read from sd 8
01/20 19:09:18 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:09:18 MSUSelectRead(8,9000000)
01/20 19:09:18 INFO:     269 of 269 bytes read from sd 8
01/20 19:09:18 INFO:     received message 'CK=cff291d95cac87b1 TS=1106266158 AUTH=slurm DT=SC=0 ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:09:18 MSUDisconnect(S)
01/20 19:09:18 INFO:     received node list through WIKI RM
01/20 19:09:18 INFO:     loading 4 node(s)
01/20 19:09:18 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:18 MNodeFind(xc14n13,N)
01/20 19:09:18 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:09:18 MWikiNodeUpdate(AList,xc14n13)
01/20 19:09:18 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:09:18 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:09:18 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:09:18 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:09:18 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:09:18 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:18 MNodeFind(xc14n14,N)
01/20 19:09:18 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:09:18 MWikiNodeUpdate(AList,xc14n14)
01/20 19:09:18 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:09:18 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:09:18 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:09:18 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:09:18 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:09:18 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:18 MNodeFind(xc14n15,N)
01/20 19:09:18 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:09:18 MWikiNodeUpdate(AList,xc14n15)
01/20 19:09:18 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:09:18 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:09:18 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:09:18 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:09:18 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:09:18 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:18 MNodeFind(xc14n16,N)
01/20 19:09:18 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:09:18 MWikiNodeUpdate(AList,xc14n16)
01/20 19:09:18 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:09:18 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:09:18 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:09:18 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:09:18 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:09:18 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:09:18 WARNING:  no resources detected
01/20 19:09:18 MRMWorkloadQuery()
01/20 19:09:18 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:09:18 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:09:18 MSUConnect(S)
01/20 19:09:18 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:09:18 INFO:     non-blocking mode established
01/20 19:09:18 MSUSelectWrite(8,9000000)
01/20 19:09:18 INFO:     successful connect to TCP server (sd: 8)
01/20 19:09:18 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:09:18 INFO:     header created '00000021
'
01/20 19:09:18 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:09:18 MSUSendPacket(8,Message,30,9000000)
01/20 19:09:18 MSUSelectWrite(8,9000000)
01/20 19:09:18 INFO:     packet sent (30 bytes of 30)
01/20 19:09:18 INFO:     command sent to server
01/20 19:09:18 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:09:18 MSURecvData(S,9000000,0)
01/20 19:09:18 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:09:18 MSUSelectRead(8,9000000)
01/20 19:09:18 INFO:     9 of 9 bytes read from sd 8
01/20 19:09:18 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:09:18 MSUSelectRead(8,9000000)
01/20 19:09:18 INFO:     274 of 274 bytes read from sd 8
01/20 19:09:18 INFO:     received message 'CK=444627ed44fb8bbb TS=1106266158 AUTH=slurm DT=SC=0 ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;RMEM=2000;RDISK=1;' from wiki server
01/20 19:09:18 MSUDisconnect(S)
01/20 19:09:18 INFO:     received job list through WIKI RM
01/20 19:09:18 INFO:     loading 1 job(s)
01/20 19:09:18 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:09:18 WARNING:  job '43' detected with unexpected state '20'
01/20 19:09:18 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:09:18 INFO:     jobs detected: 1
01/20 19:09:18 MStatClearUsage(node,Active)
01/20 19:09:18 MClusterUpdateNodeState()
01/20 19:09:18 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:09:18 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:09:18 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:09:18 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:09:18 MParUpdate(ALL)
01/20 19:09:18 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:18 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:09:18 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:09:18 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:09:18 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:09:18 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:18 INFO:     jobs in queue
01/20 19:09:18 MResAdjustDRes(NULL,FALSE)
01/20 19:09:18 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:09:18 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:09:18
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE
)
01/20 19:09:18 INFO:     idle job queue is empty on iteration 98
01/20 19:09:18
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:18 INFO:     idle job queue is empty on iteration 98
01/20 19:09:18
MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:18 INFO:     idle job queue is empty on iteration 98
01/20 19:09:18 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not
support function 'cyclefinalize')
01/20 19:09:18
MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:18 INFO:     idle job queue is empty on iteration 98
01/20 19:09:18 MSchedUpdateStats()
01/20 19:09:18 INFO:     iteration:   98   scheduling time:  0.001
seconds
01/20 19:09:18 MResUpdateStats()
01/20 19:09:18 INFO:     current util[98]:  0/4 (0.00%)  PH: 1.22%
active jobs: 0 of 0 (completed: 3)
01/20 19:09:18 MQueueCheckStatus()
01/20 19:09:18 MNodeCheckStatus()
01/20 19:09:18 INFO:     checking node 'xc14n13'
01/20 19:09:18 INFO:     checking node 'xc14n14'
01/20 19:09:18 INFO:     checking node 'xc14n15'
01/20 19:09:18 INFO:     checking node 'xc14n16'
01/20 19:09:18 MSysCheck()
01/20 19:09:18 MLimitEnforceAll(ALL)
01/20 19:09:18 MUClearChild(PID)
01/20 19:09:18 MParUpdate(ALL)
01/20 19:09:18 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:18 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:09:18 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:09:18 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:09:18 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:09:18 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:18 MResCheckStatus(NULL)
01/20 19:09:18 INFO:     scheduling complete.  sleeping 10 seconds
01/20 19:09:29 ServerProcessRequests()
01/20 19:09:29 MLogRoll(NULL,0,1)
01/20 19:09:29 INFO:     not rolling logs (90188 < 10000000)
01/20 19:09:29 MResAdjust(NULL,0,0)
01/20 19:09:29 MJobSetAttr(,PAL,Value,1,2)
01/20 19:09:29 INFO:     job flags for job : 0
01/20 19:09:29 MJobSetAttr(,GAttr,Value,1,5)
01/20 19:09:29 MStatInitializeActiveSysUsage()
01/20 19:09:29 MStatClearUsage([NONE],Active)
01/20 19:09:29 ServerUpdate()
01/20 19:09:29 MSysUpdateTime()
01/20 19:09:29 INFO:     starting iteration 99
01/20 19:09:29 MSchedProcessJobs()
01/20 19:09:29 MRMGetInfo()
01/20 19:09:29 MClusterClearUsage()
01/20 19:09:29 MRMClusterQuery()
01/20 19:09:29 MWikiClusterLoadInfo(XC14N16,RCount,EMsg,SC)
01/20 19:09:29 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETNODES
ARG=0:ALL,Data,DataSize,SC)
01/20 19:09:29 MSUConnect(S)
01/20 19:09:29 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:09:29 INFO:     non-blocking mode established
01/20 19:09:29 MSUSelectWrite(8,9000000)
01/20 19:09:29 INFO:     successful connect to TCP server (sd: 8)
01/20 19:09:29 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:09:29 INFO:     header created '00000022
'
01/20 19:09:29 INFO:     sending short packet '00000022
CMD=GETNODES ARG=0:ALL'
01/20 19:09:29 MSUSendPacket(8,Message,31,9000000)
01/20 19:09:29 MSUSelectWrite(8,9000000)
01/20 19:09:29 INFO:     packet sent (31 bytes of 31)
01/20 19:09:29 INFO:     command sent to server
01/20 19:09:29 INFO:     message sent: 'CMD=GETNODES ARG=0:ALL'
01/20 19:09:29 MSURecvData(S,9000000,0)
01/20 19:09:29 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:09:29 MSUSelectRead(8,9000000)
01/20 19:09:29 INFO:     9 of 9 bytes read from sd 8
01/20 19:09:29 MSURecvPacket(8,Buffer,269,NULL,9000000)
01/20 19:09:29 MSUSelectRead(8,9000000)
01/20 19:09:29 INFO:     269 of 269 bytes read from sd 8
01/20 19:09:29 INFO:     received message 'CK=359b7cffc1689e2a TS=1106266169 AUTH=slurm DT=SC=0 ARG=4#xc14n13:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n14:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n15:STATE=Idle;CMEMORY=2981;CDISK=12283;CPROC=2;#xc14n16:STATE=Idle;CMEMORY=3813;CDISK=7867;CPROC=4;' from wiki server
01/20 19:09:29 MSUDisconnect(S)
01/20 19:09:29 INFO:     received node list through WIKI RM
01/20 19:09:29 INFO:     loading 4 node(s)
01/20 19:09:29 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:29 MNodeFind(xc14n13,N)
01/20 19:09:29 MRMNodePreUpdate(xc14n13,Idle,XC14N16)
01/20 19:09:29 MWikiNodeUpdate(AList,xc14n13)
01/20 19:09:29 MWikiNodeUpdateAttr(STATE=Idle,xc14n13)
01/20 19:09:29 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n13)
01/20 19:09:29 MWikiNodeUpdateAttr(CDISK=12283,xc14n13)
01/20 19:09:29 MWikiNodeUpdateAttr(CPROC=2,xc14n13)
01/20 19:09:29 MRMNodePostUpdate(xc14n13,Idle)
01/20 19:09:29 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:29 MNodeFind(xc14n14,N)
01/20 19:09:29 MRMNodePreUpdate(xc14n14,Idle,XC14N16)
01/20 19:09:29 MWikiNodeUpdate(AList,xc14n14)
01/20 19:09:29 MWikiNodeUpdateAttr(STATE=Idle,xc14n14)
01/20 19:09:29 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n14)
01/20 19:09:29 MWikiNodeUpdateAttr(CDISK=12283,xc14n14)
01/20 19:09:29 MWikiNodeUpdateAttr(CPROC=2,xc14n14)
01/20 19:09:29 MRMNodePostUpdate(xc14n14,Idle)
01/20 19:09:29 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:29 MNodeFind(xc14n15,N)
01/20 19:09:29 MRMNodePreUpdate(xc14n15,Idle,XC14N16)
01/20 19:09:29 MWikiNodeUpdate(AList,xc14n15)
01/20 19:09:29 MWikiNodeUpdateAttr(STATE=Idle,xc14n15)
01/20 19:09:29 MWikiNodeUpdateAttr(CMEMORY=2981,xc14n15)
01/20 19:09:29 MWikiNodeUpdateAttr(CDISK=12283,xc14n15)
01/20 19:09:29 MWikiNodeUpdateAttr(CPROC=2,xc14n15)
01/20 19:09:29 MRMNodePostUpdate(xc14n15,Idle)
01/20 19:09:29 MWikiGetAttr(node,Name,Status,Attr,Start)
01/20 19:09:29 MNodeFind(xc14n16,N)
01/20 19:09:29 MRMNodePreUpdate(xc14n16,Idle,XC14N16)
01/20 19:09:29 MWikiNodeUpdate(AList,xc14n16)
01/20 19:09:29 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:09:29 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:09:29 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:09:29 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
01/20 19:09:29 MRMNodePostUpdate(xc14n16,Idle)
01/20 19:09:29 INFO:     0 WIKI resources detected on RM XC14N16
01/20 19:09:29 WARNING:  no resources detected
01/20 19:09:29 MRMWorkloadQuery()
01/20 19:09:29 MWikiWorkloadQuery(XC14N16,JCount,SC)
01/20 19:09:29 MWikiDoCommand(XC14N16,7321,9000000,NONE,CMD=GETJOBS
ARG=0:ALL,Data,DataSize,SC)
01/20 19:09:29 MSUConnect(S)
01/20 19:09:29 INFO:     trying to connect to 172.20.0.16 (Port: 7321)
01/20 19:09:29 INFO:     non-blocking mode established
01/20 19:09:29 MSUSelectWrite(8,9000000)
01/20 19:09:29 INFO:     successful connect to TCP server (sd: 8)
01/20 19:09:29 MSUSendData(S,9000000,FALSE,FALSE)
01/20 19:09:29 INFO:     header created '00000021
'
01/20 19:09:29 INFO:     sending short packet '00000021
CMD=GETJOBS ARG=0:ALL'
01/20 19:09:29 MSUSendPacket(8,Message,30,9000000)
01/20 19:09:29 MSUSelectWrite(8,9000000)
01/20 19:09:29 INFO:     packet sent (30 bytes of 30)
01/20 19:09:29 INFO:     command sent to server
01/20 19:09:29 INFO:     message sent: 'CMD=GETJOBS ARG=0:ALL'
01/20 19:09:29 MSURecvData(S,9000000,0)
01/20 19:09:29 MSURecvPacket(8,Buffer,9,NULL,9000000)
01/20 19:09:29 MSUSelectRead(8,9000000)
01/20 19:09:29 INFO:     9 of 9 bytes read from sd 8
01/20 19:09:29 MSURecvPacket(8,Buffer,274,NULL,9000000)
01/20 19:09:29 MSUSelectRead(8,9000000)
01/20 19:09:29 INFO:     274 of 274 bytes read from sd 8
01/20 19:09:29 INFO:     received message 'CK=8d6ccb4f46081583
TS=1106266169 AUTH=slurm DT=SC=0
ARG=1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;
QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;GNAME=lsfadmin;
HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;PARTITIONMASK=lsf;NODES=4;
RMEM=2000;RDISK=1;' from wiki server
01/20 19:09:29 MSUDisconnect(S)
01/20 19:09:29 INFO:     received job list through WIKI RM
01/20 19:09:29 INFO:     loading 1 job(s)
01/20 19:09:29 MWikiGetAttr(job,Name,Status,Attr,Start)
01/20 19:09:29 WARNING:  job '43' detected with unexpected state '20'
01/20 19:09:29 INFO:     1 WIKI jobs detected on RM XC14N16
01/20 19:09:29 INFO:     jobs detected: 1
01/20 19:09:29 MStatClearUsage(node,Active)
01/20 19:09:29 MClusterUpdateNodeState()
01/20 19:09:29 INFO:     node 'xc14n13' C/A/D procs:  2/2/0
01/20 19:09:29 INFO:     node 'xc14n14' C/A/D procs:  2/2/0
01/20 19:09:29 INFO:     node 'xc14n15' C/A/D procs:  2/2/0
01/20 19:09:29 INFO:     node 'xc14n16' C/A/D procs:  4/4/0
01/20 19:09:29 MParUpdate(ALL)
01/20 19:09:29 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:29 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:09:29 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:09:29 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:09:29 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:09:29 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:29 INFO:     jobs in queue
01/20 19:09:29 MResAdjustDRes(NULL,FALSE)
01/20 19:09:29 MQueueSelectAllJobs(Q,HARD,ALL,JIList,DP,Msg)
01/20 19:09:29 MQueueSelectAllJobs(Q,SOFT,ALL,JIList,DP,Msg)
01/20 19:09:29 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,FALSE)
01/20 19:09:29 INFO:     idle job queue is empty on iteration 99
01/20 19:09:29 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:29 INFO:     idle job queue is empty on iteration 99
01/20 19:09:29 MQueueSelectJobs(SrcQ,DstQ,HARD,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:29 INFO:     idle job queue is empty on iteration 99
01/20 19:09:29 INFO:     cannot finalize RM cycle (RM 'XC14N16' does not
support function 'cyclefinalize')
01/20 19:09:29 MQueueSelectJobs(SrcQ,DstQ,SOFT,5120,4096,2140000000,EVERY,FReason,TRUE)
01/20 19:09:29 INFO:     idle job queue is empty on iteration 99
01/20 19:09:29 MSchedUpdateStats()
01/20 19:09:29 INFO:     iteration:   99   scheduling time:  0.001
seconds
01/20 19:09:29 MResUpdateStats()
01/20 19:09:29 INFO:     current util[99]:  0/4 (0.00%)  PH: 1.21%
active jobs: 0 of 0 (completed: 3)
01/20 19:09:29 MQueueCheckStatus()
01/20 19:09:29 MNodeCheckStatus()
01/20 19:09:29 INFO:     checking node 'xc14n13'
01/20 19:09:29 INFO:     checking node 'xc14n14'
01/20 19:09:29 INFO:     checking node 'xc14n15'
01/20 19:09:29 INFO:     checking node 'xc14n16'
01/20 19:09:29 MSysCheck()
01/20 19:09:29 MLimitEnforceAll(ALL)
01/20 19:09:29 MUClearChild(PID)
01/20 19:09:29 MParUpdate(ALL)
01/20 19:09:29 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:29 INFO:     MNode[xc14n13] added to MPar[lsf] (2:2)
01/20 19:09:29 INFO:     MNode[xc14n14] added to MPar[lsf] (2:2)
01/20 19:09:29 INFO:     MNode[xc14n15] added to MPar[lsf] (2:2)
01/20 19:09:29 INFO:     MNode[xc14n16] added to MPar[lsf] (4:4)
01/20 19:09:29 INFO:     P[ALL]:  Total 4:10  Up 4:10  Idle 4:10  Active
0:0
01/20 19:09:29 MResCheckStatus(NULL)
01/20 19:09:29 INFO:     scheduling complete.  sleeping 10 seconds


-----Original Message-----
From: Dave Jackson [mailto:jacksond at clusterresources.com] 
Sent: Tuesday, January 18, 2005 5:43 PM
To: Balle, Susanne
Cc: mauiusers at supercluster.org
Subject: Re: [Mauiusers] Maui/SLURM-wiki and consumable resources
otherthan processors


Susanne,

>  What does SLURM need to provide Maui for this to work?

  SLURM needs to provide either the per-job memory requirement or
per-node memory utilization information.  Maui should be able to manage
memory oversubscription if either of these pieces of information is
available.  These are specified via the 'DMEM' job attribute and the
'AMEMORY' node attribute.  NOTE:  if DMEM is specified, Maui can prevent
oversubscription from occurring.  If only AMEMORY is specified, Maui can
only keep it from getting worse once it has occurred.
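
As a rough illustration of how one might check whether a job record
carries DMEM, here is a minimal Python sketch. It assumes the record
layout seen in the GETJOBS reply in the maui.log excerpt from this
thread ('count#jobid:KEY=VALUE;...'); the function name parse_wiki_jobs
is hypothetical, and treating '#' as the separator between multiple job
records is an assumption, not something confirmed by the log.

```python
# Sketch only: the record layout is inferred from the GETJOBS reply in the
# maui.log excerpt in this thread, not from a WIKI interface specification.

def parse_wiki_jobs(arg):
    """Parse 'N#id:K=V;K=V;...' text into {job_id: {key: value}}."""
    _count, _, body = arg.partition("#")
    jobs = {}
    # Assumption: additional job records, if any, are separated by '#'.
    for record in body.split("#"):
        if not record.strip():
            continue
        # Split on the first ':' only; HOSTLIST values contain ':' too.
        job_id, _, attrs = record.partition(":")
        fields = {}
        for pair in attrs.split(";"):
            if "=" in pair:
                key, _, value = pair.partition("=")
                fields[key.strip()] = value.strip()
        jobs[job_id.strip()] = fields
    return jobs


# The ARG payload for job 43, copied from the log excerpt.
sample = ("1#43:UPDATETIME=1106265905;STATE=Complete;WCLIMIT=0;TASKS=4;"
          "QUEUETIME=1106265902;STARTTIME=1106265905;UNAME=lsfadmin;"
          "GNAME=lsfadmin;HOSTLIST=xc14n13:xc14n14:xc14n15:xc14n16;"
          "PARTITIONMASK=lsf;NODES=4;RMEM=2000;RDISK=1;")

jobs = parse_wiki_jobs(sample)
for job_id, fields in jobs.items():
    has_dmem = "DMEM" in fields
    print(job_id, "RMEM =", fields.get("RMEM"), "DMEM present:", has_dmem)
```

For the job 43 record above, this reports RMEM=2000 and no DMEM field,
which matches what the log shows being sent.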

  A level 7 log should indicate exactly what information is being sent
from SLURM to Maui.  The 'per job' dedicated memory may need to be
specified within the SLURM job at submission time.  
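
One possible way to see which node attributes actually arrive is to
scan the log for MWikiNodeUpdateAttr lines. The Python sketch below
assumes those lines look like the ones in the maui.log excerpt in this
thread; the helper name node_attrs_from_log is hypothetical, and the
regular expression is a guess based only on that excerpt.

```python
import re
from collections import defaultdict

# Matches e.g. 'MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)'.
# Assumption: every attribute line has the shape (KEY=VALUE,nodename).
ATTR_RE = re.compile(r"MWikiNodeUpdateAttr\((\w+)=([^,]*),([^)]+)\)")


def node_attrs_from_log(lines):
    """Collect {node: {attribute: value}} from maui.log lines."""
    nodes = defaultdict(dict)
    for line in lines:
        match = ATTR_RE.search(line)
        if match:
            key, value, node = match.groups()
            nodes[node][key] = value
    return dict(nodes)


# Lines copied from the log excerpt for node xc14n16.
log_excerpt = """\
01/20 19:09:29 MWikiNodeUpdateAttr(STATE=Idle,xc14n16)
01/20 19:09:29 MWikiNodeUpdateAttr(CMEMORY=3813,xc14n16)
01/20 19:09:29 MWikiNodeUpdateAttr(CDISK=7867,xc14n16)
01/20 19:09:29 MWikiNodeUpdateAttr(CPROC=4,xc14n16)
""".splitlines()

attrs = node_attrs_from_log(log_excerpt)
# Nodes for which no AMEMORY attribute was reported.
missing = [node for node, a in attrs.items() if "AMEMORY" not in a]
print("nodes without AMEMORY:", missing)
```

To scan a whole file, the same function can be given an open file
object instead of the list of lines.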

  Please let us know what you find.

Dave
 
On Fri, 2005-01-14 at 16:52 -0500, Balle, Susanne wrote:
> Hi
> 
> I am trying to use the "consumable resources" feature in Maui.
> 
> I ran a test to see whether Maui registers the amount of memory used
> when running a job with srun (SLURM), as it does with processors, and
> it doesn't.
> 
> I am trying to use the "consumable resource" feature to allow jobs
> to be scheduled more efficiently. I tested this with processors and it
> works as expected: no nodes were overallocated. In the case of
> memory, Maui overallocates my nodes.
> 
> As you can see, the job run by "test" is using 36.3% of memory (%MEM).
> Something is wrong with these numbers as well, but the basic idea
> is that the program uses a non-negligible amount of memory.
> This usage is not recorded in the output from "diagnose -n".
> 
> From the output of "diagnose -n" we can see that we are using one
> processor on xc14n16, but the amount of memory used is not updated.
> 
> This point is further highlighted by the output from "checknode
> xc14n16" enclosed below. Only processors are tracked.
> 
> Is this a bug, or a limitation in the Maui/SLURM-wiki integration?
> 
> What does SLURM need to provide Maui for this to work?
> 
> Thanks for any help,
> 
> Regards,
> 
> Susanne
> 
> -----------------------------
> 
> Output from top:
> ----------------
> Mem:  3905352k av, 2813060k used, 1092292k free,       0k shrd,  188356k buff
>       2239404k active,             165076k inactive
> Swap: 6291288k av,       0k used, 6291288k free                  325164k cached
> 
>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
>  6655 test      25   0 1385M 1.4G   252 R    24.9 36.3   2:44   1 matmut2
>     1 root      15   0   528  528   452 S     0.0  0.0   0:44   3 init
>     2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0
>     3 root      RT   0     0    0     0 SW    0.0  0.0   0:00   1 migration/1
>     4 root      RT   0     0    0     0 SW    0.0  0.0   0:00   2 migration/2
>     5 root      RT   0     0    0     0 SW    0.0  0.0   0:00   3 migration/3
> 
> [root at xc14n16 etc]# diagnose -n
> -------------------------------
> diagnosing node table (5120 slots)
> Name                    State  Procs     Memory         Disk
> [snip]
> xc14n13                  Idle   2:2     2981:2981    12283:12283      
> ]                         [NONE]                         [NONE]
> xc14n14                  Idle   2:2     2981:2981    12283:12283      
> ]                         [NONE]                         [NONE]
> xc14n15                  Idle   2:2     2981:2981    12283:12283      
> ]                         [NONE]                         [NONE]
> xc14n16               Running   3:4     3813:3813     7867:7867       
> ]                         [NONE]                         [NONE]
> -----                     ---   9:10   12756:12756   44716:44716      
> Total Nodes: 4  (Active: 1  Idle: 3  Down: 0)
> 
> [root at xc14n16 etc]# checknode xc14n16
> 
> checking node xc14n16
> 
> State:   Running  (in current state for 00:00:00)
> Configured Resources: PROCS: 4  MEM: 3813M  DISK: 7867M
> Utilized   Resources: [NONE]
> Dedicated  Resources: PROCS: 1
> Opsys:        [NONE]  Arch:      [NONE]
> Speed:      1.00  Load:       0.000
> Features:   [NONE]
> Attributes: [Batch]
> Classes:    [NONE]
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: maui.job43.log
Type: application/octet-stream
Size: 99137 bytes
Desc: maui.job43.log
Url : http://www.supercluster.org/pipermail/mauiusers/attachments/20050120/3f7570b8/maui.job43-0001.obj

