[Mauiusers] Error in HARD MAXJOB LIMIT
Francesco Del Citto
del.citto at ing.uniroma2.it
Wed Feb 15 03:31:26 MST 2006
I've just applied the patch you suggested
(http://www.clusterresources.com/pipermail/mauiusers/2006-February/002009.html)
and everything seems to work fine now!
Thank you very much!
Francesco
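
A minimal sketch of the arithmetic behind the checkjob message quoted below. It assumes that "R" is the number of jobs requested and "U" the usage Maui has counted, and that the limit test is a simple "usage plus request exceeds limit" comparison. The function name and the usage_multiplier parameter are hypothetical and are not taken from the Maui source.

```python
def violates_maxjob(active_jobs, requested_jobs, limit, usage_multiplier=1):
    """Return True if starting requested_jobs more jobs would exceed limit.

    usage_multiplier=2 mimics the double counting reported for 3.2.6p14.
    It is a hypothetical knob for illustration, not a Maui setting.
    """
    counted_usage = active_jobs * usage_multiplier
    return counted_usage + requested_jobs > limit


# Values from this thread: MAXJOB=3, two running jobs, one job waiting.
print(violates_maxjob(2, 1, 3))                      # usage counted once
print(violates_maxjob(2, 1, 3, usage_multiplier=2))  # usage doubled, U = 4
```

With usage counted once, 2 + 1 does not exceed the limit of 3, so the third job could start. With usage doubled, the two running jobs already count as 4, which matches the "U: 4" in the message and blocks job 2048.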
etienne gondet wrote:
>
> Apparently there is a BUG in fairness with patch 14
> doubling the real usage of procs or jobs,
>
> USERCFG[DEFAULT] MAXJOB=3
>
> so with 2 serial jobs you probably violate this hard limit.
>
> There is a patch (see the link below),
> or you can just go back to p13.
>
> http://www.clusterresources.com/pipermail/mauiusers/2006-February/002009.html
>
>
> Etienne Gondet.
>
> Francesco Del Citto wrote:
>
>> Hi!
>> I have a problem with "HARD MAXJOB LIMIT", using maui 3.2.6p14 and
>> torque 2.0.0p7.
>> In maui.cfg I have:
>>
>> USERCFG[DEFAULT] MAXPROC=8,11
>> USERCFG[DEFAULT] MAXJOB=3
>> USERCFG[DEFAULT] FSTARGET=25.0
>> USERCFG[DEFAULT] PRIORITY=1000
>> GROUPCFG[DEFAULT] MAXPROC=8,11
>> GROUPCFG[DEFAULT] MAXJOB=6
>> GROUPCFG[DEFAULT] FSTARGET=25.0
>> USERCFG[DEFAULT] PRIORITY=1000
>>
>> GROUPCFG[kiva] MAXJOB=8
>> GROUPCFG[kiva] MAXPROC=11
>> GROUPCFG[kiva] PRIORITY=5000
>>
>> and I'm trying to run 3 jobs: 2 serial jobs and 1 parallel job (2 processors),
>> submitting them in this order (first the serial jobs, then the parallel one).
>> The cluster is completely free, now. This is what I get with showq and
>> checkjob:
>>
>> ------------------------------------------------------------------------------
>>
>> [francesco at epsilon run]$ showq
>> ACTIVE JOBS--------------------
>> JOBNAME   USERNAME    STATE    PROC  REMAINING    STARTTIME
>>
>> 2046      francesco   Running  1     99:23:35:36  Tue Feb 14 09:23:42
>> 2047      francesco   Running  1     99:23:51:22  Tue Feb 14 09:39:28
>>
>> 2 Active Jobs 2 of 9 Processors Active (22.22%)
>>
>> IDLE JOBS----------------------
>> JOBNAME   USERNAME    STATE    PROC  WCLIMIT      QUEUETIME
>>
>>
>> 0 Idle Jobs
>>
>> BLOCKED JOBS----------------
>> JOBNAME   USERNAME    STATE    PROC  WCLIMIT      QUEUETIME
>>
>> 2048      francesco   Idle     2     99:23:59:59  Tue Feb 14 09:40:21
>>
>> Total Jobs: 3 Active Jobs: 2 Idle Jobs: 0 Blocked Jobs: 1
>> ------------------------------------------------------------------------------
>>
>>
>> ------------------------------------------------------------------------------
>>
>> [francesco at epsilon run]$ checkjob 2048
>>
>>
>> checking job 2048
>>
>> State: Idle
>> Creds: user:francesco group:kiva class:batch qos:DEFAULT
>> WallTime: 00:00:00 of 99:23:59:59
>> SubmitTime: Tue Feb 14 09:40:21
>> (Time Queued Total: 00:08:39 Eligible: 00:00:00)
>>
>> Total Tasks: 2
>>
>> Req[0] TaskCount: 2 Partition: ALL
>> Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
>> Opsys: [NONE] Arch: [NONE] Features: [new]
>> NodeCount: 2
>>
>>
>> IWD: [NONE] Executable: [NONE]
>> Bypass: 0 StartCount: 0
>> PartitionMask: [ALL]
>> Flags: RESTARTABLE
>>
>> PE: 2.00 StartPriority: 6725
>> cannot select job 2048 for partition DEFAULT (job 2048 violates active HARD
>> MAXJOB limit of 3 for user francesco (R: 1, U: 4))
>> ------------------------------------------------------------------------------
>>
>>
>> Is it a bug or a misconfiguration?
>>
>> Francesco
>> _______________________________________________
>> mauiusers mailing list
>> mauiusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/mauiusers