[torqueusers] Re: [Mauiusers] Question on suspending a job

Mikko Huhtala mhuhtala at abo.fi
Sat Apr 23 01:31:48 MDT 2005


David Jackson writes:
 > Fedele,
 > 
 >   We are releasing patch 13 of Maui next week.  We would very much like
 > to get whatever fixes are needed into this release.  Please contact us
 > at help at supercluster.org on Monday and we will do what we can to isolate
 > the problem.
 > 
 > Thanks,
 > Dave
 > 
 > On Thu, 2005-04-21 at 10:05 +0000, Fedele Stabile wrote:
 > > I'm running torque-1.2.0p2 and maui-3.2.6p11, and I have patched
 > > torque with the Benward Platz req_signal.c, as suggested, to release
 > > resources when a job is suspended.
 > > BUT I can't suspend my job directly with the qsig -s suspend command:
 > > first I need to run mjobctl -s, then qsig -s suspend.
 > > That's OK; I need to ask both the scheduler (maui) and the resource
 > > manager (torque) to suspend the job.
 > > But when I try to configure maui to use preemption, it doesn't work
 > > because the job is suspended by maui, not by torque.
 > > Can I have help?
 > > Thank you
 > > Fedele
 > > 

I may have a similar problem. On our cluster, 'mjobctl -s' works as
advertised without any Torque commands, but automatic preemption does
not. I have configured three QoS levels: 'hi', 'med' and 'low'. 'hi'
and 'med' jobs are PREEMPTORs and 'low' jobs are PREEMPTEEs. The Unix
groups are then assigned to different QoS levels. However, Maui never
suspends any jobs automatically, even when all running jobs are 'low'
and 'hi' jobs are waiting in the queue. The relevant settings in
maui.cfg are:


PREEMPTPOLICY         SUSPEND

QOSCFG[hi]  PRIORITY=1000 FLAGS=PREEMPTOR
QOSCFG[med] PRIORITY=0 FLAGS=PREEMPTOR
QOSCFG[low] PRIORITY=-100 FLAGS=PREEMPTEE

GROUPCFG[molmol]       QLIST=hi:med:low QDEF=hi
GROUPCFG[devel]        QLIST=hi:med:low QDEF=hi
GROUPCFG[xtal]         QLIST=hi:med:low QDEF=hi
GROUPCFG[student]      QLIST=med:low QDEF=med
GROUPCFG[hasbeen]      QLIST=low QDEF=low
GROUPCFG[guest]        QLIST=low QDEF=low

GROUPWEIGHT    1
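
To make the intended mapping explicit, here is a minimal Python sketch
that reads the settings above and reports which role each group's
default QoS carries. It is not part of Maui or Torque, and the names
parse_cfg and default_roles are made up for this illustration. It only
parses the text as written (including the FLAGS key exactly as it
appears above) and says nothing about how Maui itself interprets it.

```python
import re

# Settings copied verbatim from the maui.cfg excerpt above.
MAUI_CFG = """\
PREEMPTPOLICY         SUSPEND

QOSCFG[hi]  PRIORITY=1000 FLAGS=PREEMPTOR
QOSCFG[med] PRIORITY=0 FLAGS=PREEMPTOR
QOSCFG[low] PRIORITY=-100 FLAGS=PREEMPTEE

GROUPCFG[molmol]       QLIST=hi:med:low QDEF=hi
GROUPCFG[devel]        QLIST=hi:med:low QDEF=hi
GROUPCFG[xtal]         QLIST=hi:med:low QDEF=hi
GROUPCFG[student]      QLIST=med:low QDEF=med
GROUPCFG[hasbeen]      QLIST=low QDEF=low
GROUPCFG[guest]        QLIST=low QDEF=low

GROUPWEIGHT    1
"""

# Matches lines such as "QOSCFG[hi]  PRIORITY=1000 FLAGS=PREEMPTOR".
ENTRY = re.compile(r"^(QOSCFG|GROUPCFG)\[(\w+)\]\s+(.*)$")


def parse_cfg(text):
    """Return ({qos: attrs}, {group: attrs}) from QOSCFG/GROUPCFG lines."""
    tables = {"QOSCFG": {}, "GROUPCFG": {}}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            kind, name, rest = match.groups()
            # Each attribute is a KEY=VALUE token separated by whitespace.
            tables[kind][name] = dict(p.split("=", 1) for p in rest.split())
    return tables["QOSCFG"], tables["GROUPCFG"]


def default_roles(text):
    """Map each group to the FLAGS value of its default (QDEF) QoS."""
    qos, groups = parse_cfg(text)
    return {group: qos[attrs["QDEF"]]["FLAGS"] for group, attrs in groups.items()}


if __name__ == "__main__":
    for group, role in default_roles(MAUI_CFG).items():
        print(f"{group:10s} {role}")
```

With the settings as posted, molmol, devel, xtal and student should come
out as PREEMPTOR by default, and hasbeen and guest as PREEMPTEE, which
is the behaviour I expected from Maui.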


This may be a Torque configuration problem, but I haven't been able to
find anything that looks suspect.

This is Torque 1.2.0p2 and Maui 3.2.6p11 running on Fedora Core 3.

Mikko


