[torqueusers] setting Resources_min while queue has running jobs
dbeer at adaptivecomputing.com
Wed Nov 13 11:41:16 MST 2013
On Tue, Nov 12, 2013 at 8:16 PM, Andrew Mather <mathera at gmail.com> wrote:
> Hi All
> I'd like to modify one of our queues by using resources_min to enforce
> a minimum requirement for a specific single queue on our cluster. I'd
> like to use this parameter to force all jobs sent to this queue to
> 'ask for' 2 CPUs.
Setting resources_min doesn't handle this by itself: resources_min and
resources_max are used to filter jobs among queues, while resources_default
is used to apply defaults to requests that don't specify a value.
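For illustration, this is roughly how that filtering is set up with a routing queue. The queue names ("small", "big", "route") and the use of the ncpus resource are invented for the sketch, not from your site:

```shell
# Hypothetical layout: jobs submitted to the routing queue land in
# whichever execution queue their request fits, based on min/max.
qmgr -c "set queue small resources_max.ncpus = 1"
qmgr -c "set queue big resources_min.ncpus = 2"
qmgr -c "set queue route route_destinations = small"
qmgr -c "set queue route route_destinations += big"
```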
> The thing I am not sure about is what will happen to jobs already
> queued and more particularly, currently running, if they've requested
> only 1. Will the running ones be killed off for violating the minimum
> requirements and will the queued ones simply be held forever ?
The running jobs will not be killed. I don't believe it will affect jobs
that are already queued either, as these limits and defaults are applied
at the time the job is queued.
> Is it safe to do this while these jobs exist, or should I stop the
> queue and allow those jobs to drain before making this type of change
> ? There are currently a thousand or so jobs queued or running via this
> queue, some of which are hundreds of hours into their 1500-hour
> walltime runs, so I don't want to kill them off !
Obviously what you have described is the safest option, but I think it is
unnecessary, since the change only applies to jobs submitted after it is made.
> Also, once this change is made, would a specific request for 1 CPU in
> the submission script override this value ?
If you only use resources_min, then yes. You need to use a combination of
resources_min and resources_default.
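As a sketch of that combination (the queue name "batch" and the ncpus resource are assumptions; depending on your scheduler you may need a different resource name such as procct):

```shell
# Reject explicit requests below 2 CPUs...
qmgr -c "set queue batch resources_min.ncpus = 2"
# ...and fill in 2 CPUs for jobs that don't request a count at all.
qmgr -c "set queue batch resources_default.ncpus = 2"
```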
> The reason for the change is that this particular queue is currently
> sending a large number of small, CPU-intensive jobs onto our nodes,
> which currently have hyperthreading enabled, causing the machines to
> bog down and performance to drop right off. This is likely
> to be a long-term state of affairs due to the nature of some of the
> current projects using the cluster.
> In general, we get sufficient benefit from the hyperthreading that
> we'd prefer to leave it on cluster-wide if we could.
> Since all the problem jobs are coming down one particular queue, I
> figured that if we could tweak the levers of this queue, we wouldn't
> need to mess with the rest, which on the whole is working fine.
> Thanks for any help you can provide and see you in Denver next week !
If you want to force all jobs to request at least 2 cpus, perhaps the
easiest way to accomplish this is to 1) instruct all users to do so and 2)
create a submit filter that outright rejects jobs requesting fewer than 2
cpus. You can also do it using resources_min and resources_default, but
you need to remember to set the min in all queues, or undersized jobs
will simply be routed to whichever queue you forgot.
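A submit filter is just an executable that qsub pipes the job script through (enabled via the SUBMITFILTER entry in torque.cfg); a nonzero exit status rejects the submission. Below is a minimal sh sketch; the directive parsing, the MIN_PPN policy value, and the trailing demonstration lines are illustrative assumptions, not site code:

```shell
#!/bin/sh
# Sketch of a qsub submit filter. qsub pipes the job script through
# this program; output becomes the script actually submitted, and a
# nonzero exit rejects the job.

MIN_PPN=2   # assumed site policy: every job must request at least 2 CPUs

# Read a job script on stdin, echo it through unchanged, and return
# nonzero if any "#PBS -l ...ppn=N" directive requests fewer than MIN_PPN.
check_script() {
    bad=0
    while IFS= read -r line; do
        printf '%s\n' "$line"
        case $line in
            "#PBS -l "*ppn=*)
                ppn=${line##*ppn=}      # strip everything up to "ppn="
                ppn=${ppn%%[!0-9]*}     # keep the leading digits only
                [ -n "$ppn" ] && [ "$ppn" -lt "$MIN_PPN" ] && bad=1
                ;;
        esac
    done
    return $bad
}

# Demonstration: a one-CPU request is rejected, a two-CPU request passes.
printf '#PBS -l nodes=1:ppn=1\nhostname\n' | check_script >/dev/null \
    && small_job=accepted || small_job=rejected
printf '#PBS -l nodes=1:ppn=2\nhostname\n' | check_script >/dev/null \
    && big_job=accepted || big_job=rejected
echo "1-cpu job: $small_job, 2-cpu job: $big_job"
# prints: 1-cpu job: rejected, 2-cpu job: accepted
```

Installed as a real filter you would drop the demonstration lines, add an error message before the nonzero exit, and point torque.cfg at the script (the install path is a site choice).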
David Beer | Senior Software Engineer