[torqueusers] Consumable resources + maui and torque
scoggins
JScoggins at lbl.gov
Tue Nov 13 14:01:21 MST 2007
I have a client who wishes to use consumable resources. I am running
torque 2.1.8 and maui 3.2.6p19.
I have set up the following:
# cat maui.cfg
SERVERHOST xxx
ADMIN1 root
RMCFG[base] TYPE=PBS
AMCFG[bank] TYPE=NONE
RMPOLLINTERVAL 00:00:30
SERVERPORT 42559
SERVERMODE NORMAL
LOGFILE maui.log
LOGFILEMAXSIZE 10000000
LOGLEVEL 3
QUEUETIMEWEIGHT 1
FSPOLICY DEDICATEDPES
FSINTERVAL 24:00:00
FSDEPTH 11
FSDECAY 0.80
FSWEIGHT 1000
USERWEIGHT 1
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
DEFERTIME 0
NODEALLOCATIONPOLICY MINRESOURCE
NODEACCESSPOLICY DEDICATED
NODELOADPOLICY ADJUSTSTATE
ENABLEMULTINODEJOBS TRUE
ENABLEMULTIREQJOBS TRUE
USERCFG[DEFAULT] FSTARGET=25.0
NODECFG[node0002] GRES=bigio:2
NODECFG[node0003] GRES=bigio:2
NODECFG[node0004] GRES=bigio:2
NODECFG[node0005] GRES=bigio:2
NODECFG[node0006] GRES=bigio:2
NODECFG[node0007] GRES=bigio:2
NODECFG[node0008] GRES=bigio:2
NODECFG[node0009] GRES=bigio:2
NODECFG[node0010] GRES=bigio:2
NODECFG[node0011] GRES=bigio:2
NODECFG[node0012] GRES=bigio:2
PBS_MOM on each node:
# cat mom_priv/config
$pbsserver xxx
$pbsclient xxx
$restricted xxx
$logevent 255
$usecp *:/home /home
$usecp *:/nfs /nfs
$ideal_load 3.75
$max_load 4.00
bigio 2
pbsnodes -a even shows the gres in the node's status list:
node0011
state = free
np = 8
properties = shared
ntype = cluster
status = opsys=linux,uname=Linux node0011 2.6.22.9-114.caos #1 SMP Sat Sep 29 08:01:31 EDT 2007 x86_64,sessions=? 0,nsessions=? 0,nusers=0,idletime=421668,totmem=47719416kb,availmem=47502496kb,physmem=16465000kb,ncpus=8,loadave=0.00,gres=bigio:2,netload=552480303,state=free,jobs=? 0,rectime=1194987659
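For anyone who wants to check the same thing on other nodes, here is a rough sketch of pulling the gres entry out of a status string. The helper name gres_of is made up for illustration, and the sample string is abbreviated from the node0011 output above; it only splits on commas, so it assumes no field value contains one.

```shell
# gres_of STATUS: print the value of the gres= field from a
# comma-separated pbsnodes status string (hypothetical helper).
gres_of() {
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^gres=//p'
}

# Sample abbreviated from the node0011 status shown above.
status='opsys=linux,ncpus=8,loadave=0.00,gres=bigio:2,netload=552480303,state=free'
gres_of "$status"
```

On the sample this prints the bigio:2 entry, which matches what the MOM config advertises.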
I reconfigured torque and installed the addparam script from the
website:
http://www.clusterresources.com/products/maui/docs/13.3.1pbsrmextensions.shtml
But when I run the following several times:
qsub -W x=GRES:bigio <script>
to see whether I can limit the number of jobs per node to 2, as
configured above, I see 3 to 4 jobs running on the same node.
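To make the test repeatable, the repeated submission can be wrapped roughly as below. This is only a sketch: submit_n is a made-up helper name and job.sh is a placeholder for the real job script; the qsub options are exactly the ones quoted above.

```shell
# submit_n COUNT SCRIPT: submit SCRIPT COUNT times, each job
# requesting the bigio generic resource (hypothetical helper).
submit_n() {
    count=$1
    script=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        qsub -W x=GRES:bigio "$script"
        i=$((i + 1))
    done
}

# Example (job.sh is a placeholder name):
#   submit_n 4 job.sh
```

With bigio set to 2 per node, I would expect at most two of these jobs to land on any one node.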
What am I doing wrong?
Thanks
Jackie