[Mauiusers] maui and file limits

Valery Mitsyn vvm at mammoth.jinr.ru
Thu Apr 27 08:39:49 MDT 2006


On Wed, 26 Apr 2006, Bogdan Lobodzinski wrote:

>
> Hello,
>
> I am having trouble with the integration of maui (3.2.6p14) and
> torque (2.0.0p7).
> The problem appears when I use resources_max.file in the queue
> definition. Jobs submitted to this queue are never started by the maui
> scheduler. Everything is fine when the line with resources_max.file is
> removed from the queue setup.

  There should be a line in <pbs_home>/mom_priv/config
for the mount point of interest:
size[fs=<mount_point>]
  On our farm, where this line is set, checknode reports:
------------------------------
lxcsesrv:~ # checknode nodexxx


checking node nodexxx

State:   Running  (in current state for 00:00:00)
Configured Resources: PROCS: 2  MEM: 2008M  SWAP: 3531M  DISK: 19G
Utilized   Resources: DISK: 70M
Dedicated  Resources: PROCS: 2
-------------------------------
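
A rough way to check whether a node's configured DISK covers a job's
per-task DISK request is to compare the two figures printed by checknode
and checkjob -v. The sketch below is only an illustration: the function
names disk_mb and disk_fits are hypothetical, and it assumes that the M/G/T
suffixes are binary multiples, which is not confirmed anywhere in this thread.

```python
import re

# Assumed multipliers (in MB) for the size suffixes seen in Maui output.
_UNITS_MB = {"M": 1, "G": 1024, "T": 1024 * 1024}


def disk_mb(line):
    """Return the DISK value found on one output line, in MB, or None."""
    match = re.search(r"DISK:\s*(\d+)([MGT])", line)
    if match is None:
        return None
    return int(match.group(1)) * _UNITS_MB[match.group(2)]


def disk_fits(checknode_line, checkjob_line):
    """True if the node's configured DISK covers the job's per-task DISK."""
    configured = disk_mb(checknode_line)
    required = disk_mb(checkjob_line)
    if configured is None or required is None:
        raise ValueError("no DISK field found on one of the lines")
    return configured >= required


# Lines copied from the outputs in this thread.
job_line = "Dedicated Resources Per Task: PROCS: 1  MEM: 256M  SWAP: 350M  DISK: 1950M"
bad_node = "Configured Resources: PROCS: 4  MEM: 1007M  SWAP: 1734M  DISK: 1M"
good_node = "Configured Resources: PROCS: 2  MEM: 2008M  SWAP: 3531M  DISK: 19G"

print(disk_fits(bad_node, job_line))   # False: 1M < 1950M
print(disk_fits(good_node, job_line))  # True: 19G >= 1950M
```

With DISK: 1M on the node, a 1950M request cannot fit; with the 19G
reported on our farm it would.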

>
> My configuration is:
>
> queue definition
> -------------------------
> % qmgr -c 'l q SM5'
> Queue SM5
>        queue_type = Execution
>        Priority = 6
>        total_jobs = 3
>        state_count = Transit:0 Queued:3 Held:0 Waiting:0 Running:0 Exiting:0
>        max_running = 200
>        from_route_only = True
>        resources_max.file = 1950mb <-----------!!!!!!
>        resources_max.nodect = 1
>        resources_max.pcput = 24:00:00
>        resources_max.pmem = 256mb
>        resources_max.pvmem = 350mb
>        resources_min.nodect = 1
>        resources_default.neednodes = 1:medium5
>        resources_default.nice = 15
>        resources_default.nodect = 1
>        resources_default.nodes = 1:medium5
>        resources_assigned.nodect = 0
>        max_user_run = 170
>        enabled = True
>        started = True
>
> ----------
>
> maui checkjob status:
> ----------
> % checkjob -v 1001
>
>
> checking job 1001 (RM job '1001.h1farm03.desy.de')
>
> State: Idle
> Creds:  user:bogdan  group:h1  class:SM5  qos:DEFAULT
> WallTime: 00:00:00 of 00:00:00
> SubmitTime: Wed Apr 26 19:52:19
>  (Time Queued  Total: 00:00:22  Eligible: 00:00:22)
>
> Total Tasks: 1
>
> Req[0]  TaskCount: 1  Partition: ALL
> Network: [NONE]  Memory >= 0  Disk >= 1950M  Swap >= 0
> Opsys: [NONE]  Arch: [NONE]  Features: [medium5][1]
> Exec:  ''  ExecSize: 0  ImageSize: 0
> Dedicated Resources Per Task: PROCS: 1  MEM: 256M  SWAP: 350M  DISK: 1950M
> NodeAccess: SHARED
> NodeCount: 0
>
>
> IWD: [NONE]  Executable:  [NONE]
> Bypass: 0  StartCount: 0
> PartitionMask: [ALL]
> Flags:       RESTARTABLE
>
> PE:  15600.00  StartPriority:  0
> job cannot run in partition DEFAULT (idle procs do not meet requirements :
> 0 of 1 procs found)
> idle procs:  12  feasible procs:   0
>
> Rejection Reasons: [Features     :    4]
>
> Detailed Node Availability Information:
>
> h1bombeiros.desy.de      rejected : Features
> h1farm150.desy.de        rejected : Features
> h1farm152.desy.de        rejected : Features
> h1farm157.desy.de        rejected : Features
>
> --------
>
> maui checknode shows:
> --------
> % checknode h1farm150.desy.de
>
>
> checking node h1farm150.desy.de
>
> State:      Idle  (in current state for 00:12:27)
> Configured Resources: PROCS: 4  MEM: 1007M  SWAP: 1734M  DISK: 1M
> Utilized   Resources: [NONE]
> Dedicated  Resources: [NONE]
> Opsys:         linux  Arch:        farm
> Speed:      1.00  Load:       0.000
> Network:    [DEFAULT]
> Features:
> [short5][xshort5][medium5][long5][bigmedium5][biglong5][oo5][mc5]
> Attributes: [Batch]
> Classes:    [SM5 4:4][bigmemM5 4:4][qoo 4:4][oo 4:4][SX5 4:4][mc5 4:4][BL5
> 4:4][SL5 4:4][BM5 4:4][SC5 4:4][mc2 0:4][qmc2 0:4][qmc5 4:4]
>
> Total Time: 5:42:45  Up: 5:42:45 (100.00%)  Active: 00:52:52 (15.42%)
>
> Reservations:
> NOTE:  no reservations on node
>
> ------------
>
> As I see it, the problem is that maui does not correctly recognize the
> configured DISK resource; more exactly, maui reports a wrong value
> for the DISK resource.
>
> checkjob shows:
> % checkjob -v 1001
> ...
> Dedicated Resources Per Task: PROCS: 1  MEM: 256M  SWAP: 350M  DISK: 1950M
>                                                               ^^^^^^^^^^^
> ...
> In this case, DISK: 1950M comes from the queue setup (resources_max.file).
>
> While checknode returns:
> % checknode h1farm150.desy.de
> ...
> Configured Resources: PROCS: 4  MEM: 1007M  SWAP: 1734M  DISK: 1M
>                                                         ^^^^^^^^
> ...
> I have no idea which parameter determines this value, DISK: 1M.
>
> So jobs cannot be started because the DISK requirement (1950M) is higher
> than the configured DISK resource (1M).
>
> I would be grateful for any information about maui patches that
> remove this behavior, or hints on how to avoid this conflict with
> file limits.
>
> Best Regards,
>
> Bogdan

-- 
Best regards,
  Valery Mitsyn

