[torquedev] [Bug 86] Implement transparent resource limits

bugzilla-daemon at supercluster.org bugzilla-daemon at supercluster.org
Thu Oct 7 10:38:19 MDT 2010


--- Comment #12 from Eygene Ryabinkin <rea+maui at grid.kiae.ru> 2010-10-07 10:38:19 MDT ---
Once again, I am asking you to stop spamming -- you don't understand the point
of the patch.
If you have something to explain to me -- do it privately, don't make others
read this stupid conversation when both you and I are just repeating our

You seem to have a formed opinion on how to schedule jobs and how to allocate
resources.  What you don't understand is that not all use cases can be described
by your view of things -- there are many different situations and many
different scheduling and resource usage policies, especially in mixed
environments where multi-organizational jobs have to be processed.

Besides, you don't seem to be familiar with the patch you're trying to criticize.
So, what's the point?  You think that the stuff I am trying to implement is
doable without this patch?  OK, prove it, don't say "You should ask that in
Maui mailing list, not here".  Or, if you can't prove it, don't throw assertions
like "Not a very good idea to fix a Maui configuration problem with a patch for
Torque", especially when you're saying that you're not an expert in Maui/Moab,

(In reply to comment #10)
> What you seem to totally ignore is the correct way to approach this. You can
> configure the server to add resource limitation to jobs (which will then be
> enforced using ulimit) and YOU need to configure YOUR scheduler to understand
> this correctly. In your case it means "ignore vmem resource".

The scheduler shouldn't ignore the 'vmem' resource if the user asked for it --
that would be plain wrong.  And your "correct" approach won't solve this
problem: I need the scheduler to consider 'vmem' and other attributes when they
were explicitly requested, but I also need to limit the total 'vmem'
consumption in any case.  That isn't doable with resources_max/resources_default,
because there is no way to tell who set the requirement -- the user or
'resources_default': to the scheduler it will be just a requirement.  And
I can't turn it off, as explained at the beginning of this paragraph.
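
To illustrate the provenance problem described above, here is a minimal sketch.
It is not Torque code: apply_defaults() and the values are hypothetical, and the
only assumption is that a default fills in a resource the user did not request.

```python
def apply_defaults(requested, defaults):
    """Fill in every resource the user did not request with the default value."""
    merged = dict(defaults)      # start from the server/queue defaults
    merged.update(requested)     # explicit user requests override them
    return merged


# Hypothetical values, for illustration only.
defaults = {"vmem": "4gb"}

job_a = apply_defaults({"vmem": "4gb"}, defaults)  # user explicitly asked for vmem
job_b = apply_defaults({}, defaults)               # user asked for nothing

# Both jobs now carry the same requirement; who set 'vmem' is no longer visible.
indistinguishable = (job_a == job_b)
print(indistinguishable)
```

Once the default is merged in, whatever consumes the job's resource list sees
the same 'vmem' entry in both cases and has to treat it as a requirement.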

> Once again this is a scheduler configuration problem. Your scheduler is not
> ignoring what you want him to ignore but instead of configuring the scheduler
> you wrote a patch for Torque.

Once again, you just don't understand the nature of the patch.

> I am talking about central configuration on the server!
> Ulimit is already supported in Torque.

You seem to be unaware of the difference between 'vmem' and 'pvmem'.
There is no sane way to limit 'vmem' using ulimit.  Please, go and read the
sources -- mom_set_limits(), mom_over_limit() and mom_do_poll() within the
arch-dependent mom_mach.c should be very enlightening.  Or, at least, glance
over http://www.clusterresources.com/torquedocs/2.1jobsubmission.shtml
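
A minimal sketch of the distinction, assuming (as the job submission
documentation describes) that 'pvmem' bounds any single process while 'vmem'
bounds the sum over all processes of the job.  The function names and numbers
are hypothetical; this is not the code from mom_mach.c.

```python
GB = 1024 ** 3


def over_per_process_limit(proc_sizes, limit):
    """pvmem-style check: does any single process exceed the limit?
    A per-process cap of this kind is what a ulimit can express."""
    return any(size > limit for size in proc_sizes)


def over_job_limit(proc_sizes, limit):
    """vmem-style check: does the job as a whole exceed the limit?
    This needs the sizes of all processes summed up, e.g. by periodic polling."""
    return sum(proc_sizes) > limit


# Hypothetical job: three processes of 1.5 GB virtual memory each.
procs = [3 * GB // 2] * 3

per_process_exceeded = over_per_process_limit(procs, 2 * GB)
job_exceeded = over_job_limit(procs, 4 * GB)
print(per_process_exceeded, job_exceeded)
```

Every process stays under the 2 GB per-process cap, yet the job total of 4.5 GB
is over the 4 GB job-wide limit, so a per-process cap alone does not bound it.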

And this "ulimit implementation" has side effects on scheduling, as has been
explained a number of times.

> If you would implement cgroup support into Torque instead of this patch it
> would be awesome, because cgroups give you much more control over the limits.

Please, stop telling me what I should do, especially in such a way.  Maybe I
will implement cgroup support if I need it, but I certainly won't do it
if someone tells me "Hey, you!  Instead of doing your patches, implement this
brilliant idea".  Communication is a skill and you seem to be lacking in that

