[Mauiusers] Patch for NODEACCESSPOLICY SINGLEJOB and MAXPS for SMP machines

Garrick Staples garrick at usc.edu
Mon Feb 14 19:42:50 MST 2005


On Mon, Feb 14, 2005 at 05:45:06PM +0100, Bas van der Vlies alleged:
> At our site we run one job per node and have a MAXPS setting of 600
> hours and a maximum walltime of 120 hours. Our nodes have 2 processors.
> When a user submits a job, e.g.:
>   1) qsub -I -lnodes=60:ppn=1 -lwalltime=10:00:00 ( will run )
>   2) qsub -I -lnodes=60:ppn=2 -lwalltime=10:00:00 ( will not run MAXPS
>                                                     violation)
> 
> Now when job 1 runs it allocates the whole node and maui sees that it
> occupies 4 tasks ( 2 nodes and each node two cpu's = 4 tasks). So the
> used time will be calculated as 60 * 2 * 10 = 1200 hours, which is far
> more than allowed!
> 
> The next example will only run one job instead of 2:
>   qsub -I -lnodes=30:ppn=1 -lwalltime=10:00:00 ( will run )
>   qsub -I -lnodes=30:ppn=1 -lwalltime=10:00:00 ( will not  run MAXPS
>                                                  violation )
> 
> I have a patch that checks if NODEACCESSPOLICY SINGLEJOB is set. If so,
> then it ignores the cpu's per node.
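
To make the arithmetic in the quoted message concrete, here is a minimal
sketch in Python. It is not Maui's code: the function names, the
singlejob_patch flag, and the reading of the patch as "count nodes instead
of CPUs" are assumptions taken from the description above.

```python
HOUR = 3600
MAXPS = 600 * HOUR      # 600 processor-hours, as stated in the post
CPUS_PER_NODE = 2       # dual-processor nodes, as stated in the post


def requested_ps(nodes, ppn, walltime_h):
    """Processor-seconds a job asks for at submit time."""
    return nodes * ppn * walltime_h * HOUR


def charged_ps(nodes, walltime_h, singlejob_patch=False):
    """Processor-seconds charged once the job owns whole nodes.

    Without the patch every CPU of a dedicated node is counted.
    With the (assumed) patch behaviour only the nodes are counted.
    """
    procs = nodes if singlejob_patch else nodes * CPUS_PER_NODE
    return procs * walltime_h * HOUR


def fits(running_ps, new_ps, limit=MAXPS):
    """True if the new job stays within the processor-second limit."""
    return running_ps + new_ps <= limit


# Second example from the post: two jobs of 30 nodes, ppn=1, 10 hours each.
first_unpatched = charged_ps(30, 10)
first_patched = charged_ps(30, 10, singlejob_patch=True)
second = requested_ps(30, 1, 10)

print(fits(first_unpatched, second))  # second job blocked
print(fits(first_patched, second))    # second job allowed
```

Under these assumptions the first 30-node job is charged 600 hours without
the patch, which leaves no room for the second job; with the patch each job
is charged 300 hours and both fit.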

I understand what you are doing (and the patch looks fine to me), and I could
even see myself using it, but I'm not sure this is the right thing to do.  If
nothing else, could this behaviour be a configuration option?

What we really need is a policy on "node seconds".  It's what you are actually
trying to control.  It would be simple in the SINGLEJOB world, and might only
be valid there.  I can also imagine assigning fractional node seconds to jobs
on a shared node, but that would be complicated.

I've always worked around this with routing queues in pbs.  First route
to a queue with small nodes and walltime, if that fails, route to a queue with
medium nodes and walltime, etc.  If the job doesn't fit through any of the
queues, then you reject the job.
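
The routing logic described above can be sketched as follows. This is only
an illustration in Python, not PBS configuration: the queue names and the
node and walltime limits are hypothetical, and a real setup would express
them as routing-queue destinations and per-queue resource limits.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    max_nodes: int
    max_walltime_h: int


# Hypothetical limits, ordered from smallest to largest.
QUEUES = [
    Queue("small", 8, 12),
    Queue("medium", 30, 48),
    Queue("large", 60, 120),
]


def route(nodes, walltime_h, queues=QUEUES):
    """Return the first queue the job fits through, or None to reject it."""
    for q in queues:
        if nodes <= q.max_nodes and walltime_h <= q.max_walltime_h:
            return q.name
    return None
```

A job is tried against each queue in order, so route(4, 10) lands in the
first queue, while a job that exceeds every queue's limits gets None and is
rejected.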

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California