[Mauiusers] node allocation problem with NODEALLOCATIONPOLICY MINRESOURCE

Lech Nieroda lnieroda at gmail.com
Tue Jan 27 04:24:41 MST 2009


Hello,

On Thu, Jan 22, 2009 at 7:45 PM, Craig West <cwest at astro.umass.edu> wrote:
> Are you able to run more than one job on a single node at all?
Yes, that works.

> Are you also specifying RAM or other restrictions in the qsub, or default
> settings for the queue in torque?
Yes, the requested resources of the jobs (as printed by qstat -f) are
as follows:
*6 cpu job:
   queue = default
   Resource_List.neednodes = 1:ppn=6
   Resource_List.nodect = 1
   Resource_List.nodes = 1:ppn=6
   Resource_List.vmem = 11000mb
   Resource_List.walltime = 96:00:00

*1 cpu job:
   queue = default
   Resource_List.cput = 672:00:00
   Resource_List.neednodes = 1:ppn=1
   Resource_List.nodect = 1
   Resource_List.nodes = 1:ppn=1
   Resource_List.pcput = 672:00:00
   Resource_List.vmem = 7168mb
   Resource_List.walltime = 672:00:00

A checknode on a node running a 6 cpu job shows:
  State:   Running  (in current state for 00:00:00)
  Configured Resources: PROCS: 8  MEM: 31G  SWAP: 25G  DISK: 1M
  Utilized   Resources: [NONE]
  Dedicated  Resources: PROCS: 6  SWAP: 10G
  Opsys:         linux  Arch:      [NONE]
  Speed:      1.00  Load:       5.080
  Location:   Partition: DEFAULT  Frame/Slot:  1/1
  Network:    [DEFAULT]
  Features:   [NONE]
  Attributes: [Batch]
  Classes:    [default 2:8][small 8:8]

  Total Time:   INFINITY  Up:   INFINITY (99.71%)  Active:   INFINITY (53.70%)

  Reservations:
    Job '38882'(x6)  -5:10:00 -> 3:18:50:00 (4:00:00:00)
  JobList:  38882

A checknode on a node running two of the 1 cpu jobs shows:
  State:   Running  (in current state for 00:00:00)
  Configured Resources: PROCS: 8  MEM: 31G  SWAP: 23G  DISK: 1M
  Utilized   Resources: [NONE]
  Dedicated  Resources: PROCS: 2  SWAP: 14G
  Opsys:         linux  Arch:      [NONE]
  Speed:      1.00  Load:       2.000
  Location:   Partition: DEFAULT  Frame/Slot:  1/1
  Network:    [DEFAULT]
  Features:   [NONE]
  Attributes: [Batch]
  Classes:    [default 6:8][small 8:8]

  Total Time:   INFINITY  Up:   INFINITY (99.45%)  Active:   INFINITY (55.08%)

  Reservations:
    Job '39090'(x1)  -5:19:39:16 -> 22:04:20:44 (28:00:00:00)
    Job '39091'(x1)  -5:19:39:15 -> 22:04:20:45 (28:00:00:00)
  JobList:  39090,39091
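
As a cross-check on the two checknode outputs above: the Dedicated SWAP
figures look like the summed vmem requests of the running jobs, shown in
whole gigabytes. The sketch below only redoes that arithmetic. That Maui
accounts vmem against SWAP and that checknode rounds down are assumptions
inferred from these numbers, not confirmed behaviour, and the helper names
are made up for illustration.

```python
MB_PER_GB = 1024


def dedicated_swap_gb(vmem_requests_mb):
    """Sum per-job vmem requests (MB) and express them in whole GB.

    Rounding down is an assumption chosen to match the checknode display.
    """
    return sum(vmem_requests_mb) // MB_PER_GB


def jobs_that_fit(free_procs, free_swap_mb, job_procs, job_vmem_mb):
    """How many identical jobs fit if both procs and swap are limiting.

    Assumes (hypothetically) that the scheduler enforces both limits.
    """
    return min(free_procs // job_procs, free_swap_mb // job_vmem_mb)


# Values taken from the qstat -f and checknode output in this message.
six_cpu_node = dedicated_swap_gb([11000])       # one 6 cpu job
one_cpu_node = dedicated_swap_gb([7168, 7168])  # two 1 cpu jobs

# An empty node with 8 procs and 23G of configured swap.
max_one_cpu_jobs = jobs_that_fit(8, 23 * MB_PER_GB, 1, 7168)

print(six_cpu_node, one_cpu_node, max_one_cpu_jobs)
```

If those assumptions hold, swap rather than processors would cap such a
node at three of the 1 cpu jobs.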

> What version of Maui and Torque are you running?
Torque 2.1.8 and Maui 3.2.6p19.

> A copy of your maui.cfg might help.
An excerpt from our maui.cfg:

RMPOLLINTERVAL          00:00:30
SERVERMODE              NORMAL
RMCFG[base]             TYPE=PBS
LOGFILE               maui.log
LOGFILEMAXSIZE        10000000
LOGLEVEL              3
QUEUETIMEWEIGHT       1
FSPOLICY              DEDICATEDPS
FSDEPTH               7
FSINTERVAL            86400
FSDECAY               0.80
BACKFILLPOLICY        FIRSTFIT
RESERVATIONPOLICY     CURRENTHIGHEST
NODEALLOCATIONPOLICY  MINRESOURCE
USERCFG[DEFAULT]      FSTARGET=20.0+
FSWEIGHT              10
FSUSERWEIGHT          100
ENFORCERESOURCELIMITS ON
RESOURCELIMITPOLICY[0] MEM:ALWAYS:CANCEL
SRCFG[small] TASKCOUNT=1 RESOURCES=PROCS:4,MEM:16384
SRCFG[small] HOSTLIST=cluster1.local
SRCFG[small] PERIOD=INFINITY
SRCFG[small] TIMELIMIT=1:00:00
SRCFG[small] CLASSLIST=small

The general idea behind this config is to have two queues: a default one
covering 32 nodes, and one for small jobs (maximum walltime of one hour)
that run on one dedicated host.
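
For reference, the Maui documentation describes MINRESOURCE as choosing the
nodes with the fewest configured resources that still meet the job's
requirements. The following is a rough sketch of that rule, not Maui's
actual code: the node names, the Node class, and the tie-breaking order
(procs first, then swap) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str            # hypothetical node name
    cfg_procs: int       # configured processors
    cfg_swap_mb: int     # configured swap in MB
    used_procs: int = 0
    used_swap_mb: int = 0

    def fits(self, procs, vmem_mb):
        """True if the node has enough free procs and swap for the job."""
        return (self.cfg_procs - self.used_procs >= procs
                and self.cfg_swap_mb - self.used_swap_mb >= vmem_mb)


def min_resource_pick(nodes, procs, vmem_mb):
    """Pick the feasible node with the fewest configured resources.

    The comparison key is an assumption; it looks only at configured
    resources, never at how busy a node currently is.
    """
    feasible = [n for n in nodes if n.fits(procs, vmem_mb)]
    if not feasible:
        return None
    return min(feasible, key=lambda n: (n.cfg_procs, n.cfg_swap_mb))


# Two nodes loosely modelled on the checknode output in this message.
nodes = [
    Node("node_a", 8, 25 * 1024, used_procs=6, used_swap_mb=11000),
    Node("node_b", 8, 23 * 1024, used_procs=2, used_swap_mb=2 * 7168),
]

pick_small = min_resource_pick(nodes, procs=1, vmem_mb=7168)
pick_large = min_resource_pick(nodes, procs=6, vmem_mb=11000)
print(pick_small.name if pick_small else None, pick_large)
```

In this sketch, nodes with identical configured resources tie, so the rule
by itself expresses no preference for nodes that are already partly in use.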

Greetings,
Lech

