[torquedev] ulimit/setrlimit doesn't enforce RLIMIT_DATA on Linux

"Mgr. Šimon Tóth" SimonT at mail.muni.cz
Tue Oct 19 09:46:26 MDT 2010


>>> People will need to run external schedulers with Torque, so it should
>>> leave open the possibility of adapting its semantics to the scheduler's
>>> (and vice versa, in an ideal world).
>>
>> External schedulers should just request job runs without any resource
>> semantics at all. Simply request a run with exec host specified. From
>> what I have been told, most do.
> 
> Nope, the scheduler should consider resource usage to get the "best fit" and
> "best utilization", whatever that means for the local administrator.
> It just can't really rely on the batch server for this -- it is not
> the business of the batch server to decide where the job should be
> executed, it is the work of the scheduler (even if it lives within the
> batch server).  What the batch server can do is check that the job constraints
> allow the particular job to run on the set of nodes that were chosen
> by the scheduler.  But this should also be configurable, because
> often the batch server is "too smart" and prevents the scheduler from
> doing the right thing.
> 
> Think of the batch server as just a dumb job transport that can additionally
> set the limits on the target resources, but it shouldn't really decide
> whether the job is eligible to run on a particular node -- that is the
> scheduler's job.  Or, at least, such behaviour should be configurable.
> 
> Of course, the reality is a bit more complicated, since there are
> routing queues that effectively restrict (in some configurations) the
> set of resources on which the job will be able to run.  So, really,
> people tend to split the scheduler's duties between the scheduler
> and the batch system.

I think you misunderstood. Once again: if the scheduler requests resources
in the run request, then those resources should be checked. If the
scheduler doesn't request any resource semantics and specifies only the
target nodes in the run request, then no resources should be checked
(and this is how it works now).
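
To make the rule concrete, here is a minimal sketch of the decision described above. It is not Torque code: RunRequest, server_accepts, the mem_mb resource name and the layout of the free-resource table are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RunRequest:
    """Hypothetical run request from a scheduler (not Torque's real structure)."""
    job_id: str
    exec_hosts: list
    # An empty dict means the scheduler asked for no resource semantics.
    resources: dict = field(default_factory=dict)


def server_accepts(request, free):
    """Return True if the server should run the job as requested.

    `free` maps a host name to {resource name: available amount}.
    """
    if not request.resources:
        # Only target nodes were given: run without checking any resources.
        return True
    # The scheduler requested resources, so check them on every target host.
    return all(
        free.get(host, {}).get(name, 0) >= amount
        for host in request.exec_hosts
        for name, amount in request.resources.items()
    )
```

With this sketch, a request that names only node1 is accepted without any check, while a request that also asks for more mem_mb than node1 has free is refused.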

-- 
Mgr. Šimon Tóth
