[torqueusers] how to use queue mgr setup for memory usage?

Thomas H Dr Pierce TPierce at rohmhaas.com
Tue Apr 29 12:46:16 MDT 2008


Dear Torqueusers,

How are memory requirements set up on Torque 2.1.10?

I just added a mem=5000mb resource request (#PBS -l mem=5000mb) to my PBS
script. A few nodes in the cluster have 8 GB of physical memory, and these
nodes are members of the "medium" queue. I expected the job to be allocated
to one of those nodes, but it is not. Instead the job is deferred with a
NoResources reason.

From the Maui command checkjob:
...

checking job 392

State: Idle  EState: Deferred
Creds:  user:nabdws  group:users  class:medium  qos:DEFAULT
WallTime: 00:00:00 of 99:23:59:59
SubmitTime: Tue Apr 29 14:14:46
  (Time Queued  Total: 00:24:36  Eligible: 00:00:00)

Total Tasks: 2

Req[0]  TaskCount: 2  Partition: ALL
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [d1850]
Dedicated Resources Per Task: PROCS: 1  MEM: 2500M  SWAP: 16G

IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 0
PartitionMask: [ALL]
Flags:       RESTARTABLE

job is deferred.  Reason:  NoResources  (cannot create reservation for job 
'392' (intital reservation attempt)
)
Holds:    Defer  (hold reason:  NoResources)
PE:  16.97  StartPriority:  24
cannot select job 392 for partition DEFAULT (job hold active)

...
Job script:
---------------------------
## ***** Parallel run *****
##
#PBS -q medium
#PBS -m ea
#PBS -l nodes=1:ppn=2
#PBS -l mem=5000mb
.....................................................................

qmgr -c "p s" gives

create queue medium
set queue medium queue_type = Execution
set queue medium Priority = 40
set queue medium max_running = 10
set queue medium acl_host_enable = False
set queue medium acl_hosts = node49
set queue medium acl_hosts += node48
set queue medium acl_hosts += node43
set queue medium acl_hosts += node42
set queue medium acl_hosts += node41
set queue medium acl_hosts += node50
set queue medium resources_max.vmem = 32gb
set queue medium resources_default.nodes = 1
set queue medium resources_available.nodect = 40
set queue medium enabled = True
set queue medium started = True
#
set server pbs_version = 2.1.10
----------------------------------------------------
pbsnodes -a  gives 
--------------------------------
[...]
node42
     state = free
     np = 2
     properties = d1850
     ntype = cluster
     status = opsys=linux,uname=Linux node42 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64,sessions=4321 5627,nsessions=2,nusers=2,idletime=724411,totmem=4826752kb,availmem=1962804kb,physmem=8169096kb,ncpus=4,loadave=0.08,netload=4294967294,state=free,jobs=? 0,rectime=1209493220

node43
     state = free
     np = 2
     properties = d1850
     ntype = cluster
     status = opsys=linux,uname=Linux node43 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64,sessions=4319,nsessions=1,nusers=1,idletime=724050,totmem=6006400kb,availmem=5873948kb,physmem=8169096kb,ncpus=4,loadave=0.00,netload=33973156,state=free,jobs=? 0,rectime=1209493221
----------------------------------------------------------------------------
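
For a quick sanity check of the numbers above, here is a minimal Python
sketch that compares the job's total memory request (2 tasks x 2500M, as
reported by checkjob) with the physmem, totmem and availmem values that
pbsnodes reports for node42 and node43. The helper names (parse_status,
kb_to_mb, memory_report) are hypothetical and not part of Torque or Maui,
and the sketch only compares numbers. It does not reproduce how Maui
actually selects nodes.

```python
# Illustrative sketch only: the helper names below are hypothetical and
# are not part of Torque or Maui. It compares numbers; it does not
# reproduce Maui's node-selection logic.

# Memory fields copied from the pbsnodes -a output above.
NODE_STATUS = {
    "node42": "totmem=4826752kb,availmem=1962804kb,physmem=8169096kb",
    "node43": "totmem=6006400kb,availmem=5873948kb,physmem=8169096kb",
}

# checkjob reports 2 tasks with MEM: 2500M dedicated per task.
REQUEST_MB = 2 * 2500


def parse_status(status):
    """Split a comma-separated key=value status string into a dict."""
    fields = {}
    for item in status.split(","):
        key, sep, value = item.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def kb_to_mb(value):
    """Convert a value such as '8169096kb' to whole MB (1 MB = 1024 kB)."""
    if not value.endswith("kb"):
        raise ValueError("expected a value ending in 'kb': %r" % value)
    return int(value[:-2]) // 1024


def memory_report(status, request_mb):
    """Return {field: (size_in_mb, size_in_mb >= request_mb)}."""
    fields = parse_status(status)
    report = {}
    for key in ("physmem", "totmem", "availmem"):
        mb = kb_to_mb(fields[key])
        report[key] = (mb, mb >= request_mb)
    return report


if __name__ == "__main__":
    for node, status in sorted(NODE_STATUS.items()):
        for key, (mb, ok) in memory_report(status, REQUEST_MB).items():
            verdict = "ok" if ok else "below request"
            print("%s %-8s %5d MB  %s" % (node, key, mb, verdict))
```

With the values above, both nodes report enough physmem for a 5000 MB
request, but node42 reports totmem and availmem below 5000 MB, while
node43 is above the request on all three fields. Whether Maui consults
any of these fields when it evaluates the request is not established
here.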


------
Sincerely,

   Tom Pierce
 