[Mauiusers] Incorrect showbf output

Vicker, Darby (JSC-EG311) darby.vicker-1 at nasa.gov
Wed Mar 9 15:17:20 MST 2011


Hello,

The "showbf" output is wrong, but only for certain users.  For example, it reports 384 procs available for dvicker but no procs available for lmarek (full transcripts below).  We are running torque 2.3.6 and maui 3.2.6p21 with a fairly basic maui configuration (also below).  The only reservations on the system are the "debug" standing reservation and the individual reservations for each running job.  The lmarek result is the incorrect one: if lmarek submits another job, it starts immediately on the idle procs.  Any idea what may be going on here?

Thanks,
Darby



ADMIN1                root
ADMIN3                ALL

RMCFG[base] TYPE=PBS

AMCFG[bank]  TYPE=NONE

RMPOLLINTERVAL        00:00:30

SERVERPORT            42559
SERVERMODE            NORMAL

LOGFILE               /var/log/maui.log
LOGFILEMAXSIZE        500000000
LOGLEVEL              3

QUEUETIMEWEIGHT       1 

BACKFILLPOLICY        FIRSTFIT
RESERVATIONPOLICY     CURRENTHIGHEST

NODEALLOCATIONPOLICY  MINRESOURCE
NODEACCESSPOLICY      SINGLEUSER
ENABLEMULTIREQJOBS    TRUE
JOBNODEMATCHPOLICY    EXACTNODE

SRCFG[debug] STARTTIME=8:00:00 ENDTIME=17:00:00
SRCFG[debug] DAYS=Mon,Tue,Wed,Thu,Fri
SRCFG[debug] TASKCOUNT=8
SRCFG[debug] HOSTLIST=r2i3n8,r2i3n9,r2i3n10,r2i3n11,r2i3n12,r2i3n13,r2i3n14,r2i3n15
SRCFG[debug] CLASSLIST=debug

USERCFG[DEFAULT] MAXJOB=2,20 MAXPROC=512,1344

[dvicker at service0 ~]% showbf --loglevel=10 -u dvicker
INFO:     LOGLEVEL set to 10
MUGetOpt(3,ArgV,C:D:F:hP:V?-:Aa:c:d:f:g:m:M:n:p:q:r:Su:vV,OptArg)
INFO:     flag 'u' detected
INFO:     arg 'dvicker' found for flag 'u'
INFO:     processing flag 'u'
MUGetOpt(1,ArgV,C:D:F:hP:V?-:Aa:c:d:f:g:m:M:n:p:q:r:Su:vV,OptArg)
INFO:     flags loaded
INFO:     1 command line args remaining:  'showbf'
MSUConnect(S,FALSE,EMsg)
INFO:     trying to connect to 10.148.0.3 (Port: 42559)
INFO:     non-blocking mode established
MSUSelectWrite(3,30000000)
INFO:     successful connect to TCP server (sd: 3)
MUUIDToName(3179)
MCSendRequest(S)
MSUSendData(S,30000000,TRUE,FALSE)
MUUIDToName(3179)
MSecGetChecksum2(Buf1,30,Buf2,86,Checksum,[NONE],CSKey)
INFO:     header created '00000136
CK=c011d341eca82d34 TS=1299708550 AUTH=dvicker DT='
INFO:     sending short packet '00000136
CK=c011d341eca82d34 TS=1299708550 AUTH=dvicker DT=CMD=showbf AUTH=dvicker ARG=dvicker eg3 ALL ALL 0 0 0 0 0 NC 0 0 [NONE] [NONE] [NONE]
'
MSUSendPacket(3,Buf,145,30000000,SC)
INFO:     sending packet '00000136
CK=c011d341eca82d34 TS=1299708550 AUTH=dvicker DT=CMD=showbf AUTH=dvicker ARG=dvicker eg3 ALL ALL 0 0 0 0 0 NC 0 0 [NONE] [NONE] [NONE]
'
MSUSelectWrite(3,30000000)
INFO:     packet sent (145 bytes of 145)
INFO:     message sent to server
INFO:     message sent: 'CMD=showbf AUTH=dvicker ARG=dvicker eg3 ALL ALL 0 0 0 0 0 NC 0 0 [NONE] [NONE] [NONE]
'
MSURecvData(S,30000000,TRUE,SC,EMsg)
MSURecvPacket(3,BufP,9,NULL,30000000,SC)
MSUSelectRead(3,30000000)
INFO:     9 of 9 bytes read from sd 3
INFO:     message '00000200
' read
MSURecvPacket(3,BufP,200,NULL,30000000,SC)
MSUSelectRead(3,30000000)
INFO:     200 of 200 bytes read from sd 3
INFO:     message 'CK=9c3a97e2b92f76a4 TS=1299708550 AUTH=root CLIENT=[NONE] DT=SC=1        ARG=backfill window (user: 'dvicker' group: 'eg3' partition: ALL) Wed Mar  9 16:09:03

384 procs available with no timelimit


' read
INFO:     message received
INFO:     received message 'CK=9c3a97e2b92f76a4 TS=1299708550 AUTH=root CLIENT=[NONE] DT=SC=1        ARG=backfill window (user: 'dvicker' group: 'eg3' partition: ALL) Wed Mar  9 16:09:03

384 procs available with no timelimit


' from server
MCShowBackfillWindow(Buffer)
backfill window (user: 'dvicker' group: 'eg3' partition: ALL) Wed Mar  9 16:09:03

384 procs available with no timelimit



MSUDisconnect(S)
[dvicker at service0 ~]% 

[dvicker at service0 ~]% showbf --loglevel=10 -u lmarek
INFO:     LOGLEVEL set to 10
MUGetOpt(3,ArgV,C:D:F:hP:V?-:Aa:c:d:f:g:m:M:n:p:q:r:Su:vV,OptArg)
INFO:     flag 'u' detected
INFO:     arg 'lmarek' found for flag 'u'
INFO:     processing flag 'u'
MUGetOpt(1,ArgV,C:D:F:hP:V?-:Aa:c:d:f:g:m:M:n:p:q:r:Su:vV,OptArg)
INFO:     flags loaded
INFO:     1 command line args remaining:  'showbf'
MSUConnect(S,FALSE,EMsg)
INFO:     trying to connect to 10.148.0.3 (Port: 42559)
INFO:     non-blocking mode established
MSUSelectWrite(3,30000000)
INFO:     successful connect to TCP server (sd: 3)
MUUIDToName(3179)
MCSendRequest(S)
MSUSendData(S,30000000,TRUE,FALSE)
MUUIDToName(3179)
MSecGetChecksum2(Buf1,30,Buf2,85,Checksum,[NONE],CSKey)
INFO:     header created '00000135
CK=92b78730bef0a88c TS=1299708641 AUTH=dvicker DT='
INFO:     sending short packet '00000135
CK=92b78730bef0a88c TS=1299708641 AUTH=dvicker DT=CMD=showbf AUTH=dvicker ARG=lmarek eg3 ALL ALL 0 0 0 0 0 NC 0 0 [NONE] [NONE] [NONE]
'
MSUSendPacket(3,Buf,144,30000000,SC)
INFO:     sending packet '00000135
CK=92b78730bef0a88c TS=1299708641 AUTH=dvicker DT=CMD=showbf AUTH=dvicker ARG=lmarek eg3 ALL ALL 0 0 0 0 0 NC 0 0 [NONE] [NONE] [NONE]
'
MSUSelectWrite(3,30000000)
INFO:     packet sent (144 bytes of 144)
INFO:     message sent to server
INFO:     message sent: 'CMD=showbf AUTH=dvicker ARG=lmarek eg3 ALL ALL 0 0 0 0 0 NC 0 0 [NONE] [NONE] [NONE]
'
MSURecvData(S,30000000,TRUE,SC,EMsg)
MSURecvPacket(3,BufP,9,NULL,30000000,SC)
MSUSelectRead(3,30000000)
INFO:     9 of 9 bytes read from sd 3
INFO:     message '00000180
' read
MSURecvPacket(3,BufP,180,NULL,30000000,SC)
MSUSelectRead(3,30000000)
INFO:     180 of 180 bytes read from sd 3
INFO:     message 'CK=e799c937926ff378 TS=1299708641 AUTH=root CLIENT=[NONE] DT=SC=1        ARG=backfill window (user: 'lmarek' group: 'eg3' partition: ALL) Wed Mar  9 16:10:36

no procs available


' read
INFO:     message received
INFO:     received message 'CK=e799c937926ff378 TS=1299708641 AUTH=root CLIENT=[NONE] DT=SC=1        ARG=backfill window (user: 'lmarek' group: 'eg3' partition: ALL) Wed Mar  9 16:10:36

no procs available


' from server
MCShowBackfillWindow(Buffer)
backfill window (user: 'lmarek' group: 'eg3' partition: ALL) Wed Mar  9 16:10:36

no procs available



MSUDisconnect(S)
[dvicker at service0 ~]% 
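
The two replies above differ only in their last line, so the per-user comparison is easy to script.  Here is a minimal sketch in Python, assuming showbf prints either "N procs available ..." or "no procs available" as in these two transcripts.  The function name procs_available is hypothetical (it is not part of maui), and other output forms that showbf may produce are not handled.

```python
import re


def procs_available(showbf_output: str) -> int:
    """Return the proc count reported by showbf text.

    Assumption: only the two forms seen in the transcripts are handled,
    'N procs available ...' and 'no procs available'.
    """
    if re.search(r"^no procs available", showbf_output, re.MULTILINE):
        return 0
    match = re.search(r"^(\d+) procs? available", showbf_output, re.MULTILINE)
    if match is None:
        raise ValueError("unrecognized showbf output")
    return int(match.group(1))


# Reply bodies copied from the two transcripts.
DVICKER = """backfill window (user: 'dvicker' group: 'eg3' partition: ALL) Wed Mar  9 16:09:03

384 procs available with no timelimit
"""

LMAREK = """backfill window (user: 'lmarek' group: 'eg3' partition: ALL) Wed Mar  9 16:10:36

no procs available
"""

for user, text in (("dvicker", DVICKER), ("lmarek", LMAREK)):
    print(user, procs_available(text))
```

Feeding it the output of "showbf -u <user>" for every user would list which accounts see an empty backfill window while procs are idle.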


