[Mauiusers] Error in HARD MAXJOB LIMIT

Francesco Del Citto del.citto at ing.uniroma2.it
Wed Feb 15 03:31:26 MST 2006


I've just applied the patch you suggested (
http://www.clusterresources.com/pipermail/mauiusers/2006-February/002009.html
)
and everything seems to work fine now!
Thank you very much!
Francesco
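
For anyone reading this thread later, here is a minimal sketch of the arithmetic behind the checkjob message quoted below. It is not Maui's actual code. The function name, the usage_factor parameter, and the comparison (usage + requested > limit) are assumptions made for illustration, as is reading "R" as the job's request and "U" as the usage already charged to the user.

```python
def violates_hard_limit(active_jobs, requested, limit, usage_factor=1):
    """Return (violates, usage) for a MAXJOB-style hard limit check.

    usage_factor=2 models the double counting reported for patch 14;
    the real scheduler logic may differ (assumption).
    """
    usage = active_jobs * usage_factor       # usage charged to the user
    return usage + requested > limit, usage  # hypothetical comparison


# Two running serial jobs, one more job requested, MAXJOB=3.
correct = violates_hard_limit(active_jobs=2, requested=1, limit=3)
doubled = violates_hard_limit(active_jobs=2, requested=1, limit=3, usage_factor=2)
print("correct accounting:", correct)
print("doubled accounting:", doubled)
```

Under these assumptions, correct accounting gives a usage of 2 and the third job fits within MAXJOB=3, while doubled accounting gives a usage of 4, which matches the "U: 4" in the checkjob output and blocks the job.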

etienne gondet wrote:

>
> Apparently there is a BUG in fairness with patch 14 that
> doubles the real usage of procs or jobs,
>
> USERCFG[DEFAULT]      MAXJOB=3
>
> so with 2 serial jobs you probably violate this hard limit.
>
> There is a patch (linked below),
> or just go back to p13.
>
> http://www.clusterresources.com/pipermail/mauiusers/2006-February/002009.html
>
>
>    Etienne Gondet.
>
> Francesco Del Citto wrote:
>
>> Hi!
>> I have a problem with "HARD MAXJOB LIMIT", using maui 3.2.6p14 and
>> torque 2.0.0p7
>> In maui.cfg I have:
>>
>> USERCFG[DEFAULT]      MAXPROC=8,11
>> USERCFG[DEFAULT]      MAXJOB=3
>> USERCFG[DEFAULT]      FSTARGET=25.0
>> USERCFG[DEFAULT]      PRIORITY=1000
>> GROUPCFG[DEFAULT]     MAXPROC=8,11
>> GROUPCFG[DEFAULT]     MAXJOB=6
>> GROUPCFG[DEFAULT]     FSTARGET=25.0
>> USERCFG[DEFAULT]      PRIORITY=1000
>>
>> GROUPCFG[kiva]        MAXJOB=8
>> GROUPCFG[kiva]        MAXPROC=11
>> GROUPCFG[kiva]        PRIORITY=5000
>>
>> and I'm trying to run 3 jobs: 2 serial jobs and 1 parallel job
>> (2 processors), submitting them in this order (first the serial
>> ones, then the parallel one).
>> The cluster is completely free at the moment. This is what I get with
>> showq and checkjob:
>>
>> ------------------------------------------------------------------------------
>>
>> [francesco at epsilon run]$ showq
>> ACTIVE JOBS--------------------
>> JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME
>>
>> 2046               francesco    Running     1 99:23:35:36  Tue Feb 14 09:23:42
>> 2047               francesco    Running     1 99:23:51:22  Tue Feb 14 09:39:28
>>
>>     2 Active Jobs       2 of    9 Processors Active (22.22%)
>>
>> IDLE JOBS----------------------
>> JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME
>>
>>
>> 0 Idle Jobs
>>
>> BLOCKED JOBS----------------
>> JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME
>>
>> 2048               francesco       Idle     2 99:23:59:59  Tue Feb 14 09:40:21
>>
>> Total Jobs: 3   Active Jobs: 2   Idle Jobs: 0   Blocked Jobs: 1
>> ------------------------------------------------------------------------------
>>
>>
>> ------------------------------------------------------------------------------
>>
>> [francesco at epsilon run]$ checkjob 2048
>>
>>
>> checking job 2048
>>
>> State: Idle
>> Creds:  user:francesco  group:kiva  class:batch  qos:DEFAULT
>> WallTime: 00:00:00 of 99:23:59:59
>> SubmitTime: Tue Feb 14 09:40:21
>>  (Time Queued  Total: 00:08:39  Eligible: 00:00:00)
>>
>> Total Tasks: 2
>>
>> Req[0]  TaskCount: 2  Partition: ALL
>> Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
>> Opsys: [NONE]  Arch: [NONE]  Features: [new]
>> NodeCount: 2
>>
>>
>> IWD: [NONE]  Executable:  [NONE]
>> Bypass: 0  StartCount: 0
>> PartitionMask: [ALL]
>> Flags:       RESTARTABLE
>>
>> PE:  2.00  StartPriority:  6725
>> cannot select job 2048 for partition DEFAULT (job 2048 violates active HARD
>> MAXJOB limit of 3 for user francesco  (R: 1, U: 4))
>> ------------------------------------------------------------------------------
>>
>>
>> Is it a bug or a misconfiguration?
>>
>> Francesco



