[Mauiusers] Maui doesnt update resources reqs

Michael Barnes Michael.Barnes at jlab.org
Mon May 19 12:59:09 MDT 2008


On Sat, May 17, 2008 at 09:29:57AM +0300, Wickliffe, Blake W wrote:
> We are seeing a strange issue with Maui interacting with Torque. It
> has to do with qalter'ing the resource requirements of a job.
>
> The scenario is like this: We have a very large pool of CPUs that we
> can run jobs on, and this pool is heterogeneous. The older, slower,
> less reliable processors we keep in reserve for when we get a large
> backlog of jobs. So, as an example, a user submits a job and asks for
> 100 "fast" processors. Doing a checkjob on his queued job, you'd see
> something like:
>
> Opsys: [NONE] Arch: [NONE] Features: [fast]
>
> If the administrator sees a large backlog, he can opt (at the users'
> request) to manually move some jobs to the "slow" nodes. He does this
> in the normal way with a qalter command. However, what you see from
> checkjob is something like this:
>
> Opsys: [NONE] Arch: [NONE] Features: [fast][slow]
>
> So, somehow Maui is retaining the old resource requirements of the job
> and adding the new requirements. If you cycle Maui, you get:
>
> Opsys: [NONE] Arch: [NONE] Features: [slow]
>
> And everything works as one would expect. We'd like to avoid having
> to cycle Maui every time we need to do something like this. Is anyone
> else seeing this issue?
>
> Vital stats:
>
> Maui version 3.2.6p19 Torque version: 2.3.0


I've found that qalter does not reliably relay the changed information
to Maui. Sometimes a Maui restart works for some changes; sometimes it
doesn't.
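
To check whether a given job is in this state without reading checkjob
output by eye, something like the following could be used. It is only a
sketch: it assumes the "Features:" line looks exactly like the samples
quoted above, and parse_features / has_stale_features are hypothetical
helper names, not part of Maui or Torque.

```python
import re


def parse_features(checkjob_text):
    """Return the feature names found in the 'Features:' field.

    Assumes the field is a run of bracketed names, e.g. [fast][slow],
    as in the checkjob samples quoted in this thread.
    """
    for line in checkjob_text.splitlines():
        match = re.search(r"Features:\s*((?:\[[^\]]*\])+)", line)
        if match:
            return re.findall(r"\[([^\]]*)\]", match.group(1))
    return []


def has_stale_features(checkjob_text, expected):
    """True if Maui reports a different feature set than the one expected."""
    return sorted(parse_features(checkjob_text)) != sorted(expected)
```

Fed the text that checkjob prints after the qalter, has_stale_features
with an expected list of ["slow"] would flag the [fast][slow] case and
stay quiet once Maui shows only [slow].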

Since this seems to be a manual intervention anyway, you could just set
these older, slower nodes offline in Torque, and/or use reservations to
get the desired result.
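
For the offline approach, a small wrapper along these lines might do.
It is a sketch under assumptions: the node names are made up, and
offline_cmd / run are hypothetical helpers. The only real pieces are
Torque's pbsnodes -o (mark nodes offline) and pbsnodes -c (clear the
offline state). Reservations are not covered here.

```python
import subprocess


def offline_cmd(nodes, clear=False):
    """Build the pbsnodes command that marks nodes offline (-o)
    or clears the offline state again (-c)."""
    flag = "-c" if clear else "-o"
    return ["pbsnodes", flag, *nodes]


def run(cmd, dry_run=True):
    """Print the command in dry-run mode; otherwise execute it
    and return its exit status."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode


# Hypothetical node names for the slow pool.
SLOW_NODES = ["old01", "old02"]

# Keep the slow pool out of service by default ...
run(offline_cmd(SLOW_NODES))
# ... and bring it back when the backlog grows.
run(offline_cmd(SLOW_NODES, clear=True))
```

The dry-run default just prints the commands, so they can be reviewed
before anything is changed on the server.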

There are a number of subtle miscommunications between Maui and Torque,
and this is one that I've experienced for quite some time. I've either
done restarts or had to find other ways around these issues.

-mb

-- 
+-----------------------------------------------
| Michael Barnes
|
| Thomas Jefferson National Accelerator Facility
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------

