[Mauiusers] NODEALLOCATIONPOLICY = CPULOAD (Jana Uhlirova)

Zerony Zhao bw.linux at gmail.com
Thu Jun 25 08:18:02 MDT 2009


Hi Maui Users,
I am new to Maui and have the same problem. My current settings are:
NODEACCESSPOLICY        SINGLEJOB
NODEALLOCATIONPOLICY  MINRESOURCE

My cluster consists of multi-core nodes. Some jobs use multiple cores, some do
not. The policy I want is: allocate a new job to idle nodes first; if there
are no idle nodes, allocate it to the available partially free nodes; and
only as a last resort leave it in the queue. With pbs_sched this is easy: I
set server node_pack = False, and everything works fine. With Maui and the
NODEACCESSPOLICY SINGLEJOB setting, every node runs ONLY one job. How should
I adjust the Maui policy to achieve this?
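
To make the ordering I have in mind concrete, here is a rough Python sketch.
It is not how Maui or pbs_sched work internally; the Node class and the
pick_nodes function are hypothetical names made up for illustration, and I
assume that idle nodes are simply preferred over partially free ones.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    total_cpus: int
    free_cpus: int


def pick_nodes(nodes, num_nodes, ppn):
    """Return the node names for a job, or None if it has to stay queued."""
    # Only nodes with at least `ppn` free cores can host one task group.
    candidates = [n for n in nodes if n.free_cpus >= ppn]
    # Idle nodes (all cores free) sort first, partially free nodes after.
    # The sort is stable, so the original order is kept within each group.
    candidates.sort(key=lambda n: n.free_cpus != n.total_cpus)
    if len(candidates) < num_nodes:
        return None  # not enough suitable nodes: the job waits in the queue
    return [n.name for n in candidates[:num_nodes]]


# Hypothetical 8-core nodes: two idle, two partially used.
example_nodes = [
    Node("a", 8, 3),
    Node("b", 8, 8),
    Node("c", 8, 2),
    Node("d", 8, 8),
]
print(pick_nodes(example_nodes, num_nodes=3, ppn=3))
```

With these made-up nodes the two idle ones are taken first and one partially
free node fills the remaining slot; a request for four nodes with ppn=3 would
stay queued.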
Thanks in advance,

ZZ


> Hello,
>
> we'd like to submit jobs on a load balanced basis. We set
> NODEALLOCATIONPOLICY to CPULOAD.
> When we submit a job and there are enough fully free nodes, then
> everything is ok, the job is running. But if there aren't enough fully
> free nodes - only partially free nodes are available, the job doesn't
> run. It is in the idle state.
>
> For example:
> A job requests -l nodes=4:ppn=3
> Available nodes (every node has 8 CPUs):
> Node       Free CPUs
> r1i1n4     4
> r1i1n5     3
> r1i1n7     6
> r1i1n8     2
> r1i1n9     2
> r1i1n12    3
> r1i1n13    3
> r1i1n15    3
>
> The requested resources are available, but the job doesn't run.
>
> The output of checkjob 8970:
>
> State: Idle
> Creds: user:black group:users class:batch qos:DEFAULT
> WallTime: 00:00:00 of 41:16:00:00
> SubmitTime: Thu Jun 11 15:01:53
> (Time Queued Total: 00:16:26 Eligible: 00:16:26)
>
> StartDate: 00:00:01 Thu Jun 11 15:18:20
> Total Tasks: 12
>
> Req[0] TaskCount: 12 Partition: ALL
> Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
> Opsys: [NONE] Arch: [NONE] Features: [batch]
>
> IWD: [NONE] Executable: [NONE]
> Bypass: 0 StartCount: 0
> PartitionMask: [ALL]
> Flags: RESTARTABLE
>
> Reservation '8970' (00:00:01 -> 41:16:00:01 Duration: 41:16:00:00)
> PE: 12.00 StartPriority: 16
> cannot select job 8970 for partition DEFAULT (startdate in '00:00:01')
>
> We changed NODEALLOCATIONPOLICY from CPULOAD to MINRESOURCE and the job
> started immediately. But we'd like to use load balancing.
>
> Any suggestion?
>
> Thank you.
>
> Best regards
>
> Jana Uhlirova
>
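
As a sanity check on the quoted example, the following small Python sketch
counts how many of the listed nodes could host a 3-core task group. The node
names and free-CPU counts are copied from the message above; the function name
nodes_that_fit is hypothetical, and the check says nothing about how Maui's
CPULOAD policy actually selects nodes.

```python
# Free CPUs per node, as listed in the quoted message.
free_cpus = {
    "r1i1n4": 4,
    "r1i1n5": 3,
    "r1i1n7": 6,
    "r1i1n8": 2,
    "r1i1n9": 2,
    "r1i1n12": 3,
    "r1i1n13": 3,
    "r1i1n15": 3,
}


def nodes_that_fit(free, ppn):
    """Names of the nodes with at least `ppn` free CPUs."""
    return [name for name, cpus in free.items() if cpus >= ppn]


# The job asked for -l nodes=4:ppn=3.
requested_nodes, ppn = 4, 3
fitting = nodes_that_fit(free_cpus, ppn)
print(len(fitting), "nodes fit:", fitting)
print("request can be satisfied:", len(fitting) >= requested_nodes)
```

By this count six nodes have at least 3 free CPUs, so a request for four such
nodes looks satisfiable on paper, which matches the statement that the
requested resources are available.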