[torqueusers] how to use queue mgr setup for memory usage?
Thomas H Dr Pierce
TPierce at rohmhaas.com
Tue Apr 29 12:46:16 MDT 2008
Dear Torqueusers,
How are memory requirements set up on Torque 2.1.10?
I just added a mem=5000mb resource request to my PBS script. I have a few
nodes in the cluster with 8 GB of physical memory. These nodes are members
of the "medium" queue. I expected this job to be allocated to one of those
nodes, but it is not. Instead I get a NoResources problem.
From the Maui command checkjob:
...
checking job 392
State: Idle EState: Deferred
Creds: user:nabdws group:users class:medium qos:DEFAULT
WallTime: 00:00:00 of 99:23:59:59
SubmitTime: Tue Apr 29 14:14:46
(Time Queued Total: 00:24:36 Eligible: 00:00:00)
Total Tasks: 2
Req[0] TaskCount: 2 Partition: ALL
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [d1850]
Dedicated Resources Per Task: PROCS: 1 MEM: 2500M SWAP: 16G
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 0
PartitionMask: [ALL]
Flags: RESTARTABLE
job is deferred. Reason: NoResources (cannot create reservation for job
'392' (intital reservation attempt)
)
Holds: Defer (hold reason: NoResources)
PE: 16.97 StartPriority: 24
cannot select job 392 for partition DEFAULT (job hold active)
...
Job script:
---------------------------
## ***** Parallel run *****
##
#PBS -q medium
#PBS -m ea
#PBS -l nodes=1:ppn=2
#PBS -l mem=5000mb
.....................................................................
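As a quick check on the numbers above, the per-task figure reported by checkjob (MEM: 2500M with Total Tasks: 2) looks consistent with the mem=5000mb request being split evenly across the two tasks from nodes=1:ppn=2. The sketch below only reproduces that arithmetic; that the scheduler actually derives the value this way is an assumption, and the variable names are illustrative.

```shell
# Values copied from the job script and the checkjob output above.
mem_mb=5000   # from "#PBS -l mem=5000mb"
tasks=2       # from "Total Tasks: 2" (nodes=1:ppn=2)

# Assumption: the job-wide mem request is divided evenly per task.
per_task_mb=$((mem_mb / tasks))
echo "per-task memory: ${per_task_mb}M"
```

The result can be compared directly with the "Dedicated Resources Per Task" line in the checkjob output.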
qmgr -c "p s" gives
create queue medium
set queue medium queue_type = Execution
set queue medium Priority = 40
set queue medium max_running = 10
set queue medium acl_host_enable = False
set queue medium acl_hosts = node49
set queue medium acl_hosts += node48
set queue medium acl_hosts += node43
set queue medium acl_hosts += node42
set queue medium acl_hosts += node41
set queue medium acl_hosts += node50
set queue medium resources_max.vmem = 32gb
set queue medium resources_default.nodes = 1
set queue medium resources_available.nodect = 40
set queue medium enabled = True
set queue medium started = True
#
set server pbs_version = 2.1.10
----------------------------------------------------
pbsnodes -a gives
--------------------------------
[...]
node42
state = free
np = 2
properties = d1850
ntype = cluster
status = opsys=linux,uname=Linux node42 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64,sessions=4321 5627,nsessions=2,nusers=2,idletime=724411,totmem=4826752kb,availmem=1962804kb,physmem=8169096kb,ncpus=4,loadave=0.08,netload=4294967294,state=free,jobs=? 0,rectime=1209493220
node43
state = free
np = 2
properties = d1850
ntype = cluster
status = opsys=linux,uname=Linux node43 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64,sessions=4319,nsessions=1,nusers=1,idletime=724050,totmem=6006400kb,availmem=5873948kb,physmem=8169096kb,ncpus=4,loadave=0.00,netload=33973156,state=free,jobs=? 0,rectime=1209493221
----------------------------------------------------------------------------
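To make the pbsnodes figures easier to compare with the 5000mb request, the sketch below converts the physmem and availmem values for node42 and node43 from kb to MB. It assumes 1 MB = 1024 kb and rounds down. The helper name to_mb is illustrative, and nothing here claims to show which of these values the scheduler actually checks.

```shell
# Figures copied from the pbsnodes output above (kb, as reported there).
request_mb=5000   # from "#PBS -l mem=5000mb"

to_mb() {
    # Integer conversion from kb to MB, assuming 1 MB = 1024 kb (rounds down).
    echo $(( $1 / 1024 ))
}

# Each entry: node name, physmem in kb, availmem in kb.
for entry in "node42 8169096 1962804" "node43 8169096 5873948"; do
    set -- $entry
    echo "$1: physmem=$(to_mb "$2")MB availmem=$(to_mb "$3")MB request=${request_mb}MB"
done
```

The same helper could be applied to the totmem values if those turn out to be the relevant ones.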
Sincerely,
Tom Pierce