[Mauiusers] maui breaking standing reservations after restart

Garrick Staples garrick at usc.edu
Thu Jul 7 19:50:15 MDT 2005


Two bugs in current maui-3.2.6p14-snap.1120169322...


The first bug is that maui segfaults on startup after I set a
reservation.  I had to remove maui.ck to get maui to run.  I didn't save the
backtrace, but it was segfaulting in MResCreate()->MResInitialize()->calloc().

I recall someone reporting this a while ago, but I can't find the
email in the mauiusers archive.

The offending setres command is:
setres -c priya -d +8:00:00:00 -s +22:00:00 -f priya -n priya ALL

The offending line in maui.ck is:
RESERVATION              priya.0 1120769489 <res ACL="RES=%=priya=;CLASS=%=priya+;" HostExp="ALL" MaxTasks="0" Name="priya.0" NodeCount="520" Resources="PROCS=[ALL]" StatCAPS="0.00" StatCIPS="0.00" StatTAPS="0.00" StatTIPS="0.00" Type="2" endtime="1121539686" flags="32" starttime="1120848486"></res>
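As a quick sanity check on the checkpoint line above, here is a minimal Python sketch that pulls the attributes out of the <res> element and compares them with the setres arguments. The field layout (keyword, reservation name, a timestamp, then XML) is an assumption read off this one sample line, and parse_res_checkpoint is a hypothetical helper, not part of maui.

```python
import xml.etree.ElementTree as ET

# Checkpoint line copied verbatim from maui.ck above.
CK_LINE = (
    'RESERVATION              priya.0 1120769489 '
    '<res ACL="RES=%=priya=;CLASS=%=priya+;" HostExp="ALL" MaxTasks="0" '
    'Name="priya.0" NodeCount="520" Resources="PROCS=[ALL]" '
    'StatCAPS="0.00" StatCIPS="0.00" StatTAPS="0.00" StatTIPS="0.00" '
    'Type="2" endtime="1121539686" flags="32" starttime="1120848486"></res>'
)

def parse_res_checkpoint(line):
    """Split a RESERVATION line into (name, timestamp, attrs).

    Assumes the layout seen in the sample: three whitespace-separated
    fields followed by a single <res .../> XML element.
    """
    head, xml_part = line.split("<", 1)
    _keyword, name, stamp = head.split()
    attrs = dict(ET.fromstring("<" + xml_part).attrib)
    return name, int(stamp), attrs

name, stamp, attrs = parse_res_checkpoint(CK_LINE)
duration = int(attrs["endtime"]) - int(attrs["starttime"])
print(name, duration, duration // 86400)  # priya.0 691200 8
```

The stored duration is 691200 seconds, i.e. 8 days, which matches the "-d +8:00:00:00" argument, so the checkpoint file itself looks sane for this reservation.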




The second bug is that restarting maui is dropping reservations on my test
machine.  It is correctly saving the res in maui.ck before exiting, but fails
to create the res on startup.  Interesting startup logs:

07/07 18:18:32 MCPLoadSched(CP,Line,S)
07/07 18:18:32 INFO:     loading RESERVATION checkpoint data 'RESERVATION priya.0 1120785441 <res ACL="RES=%=priya=;CLASS=%=batch+;" HostExp="ALL" MaxTasks="0" Name="priya.0" NodeCount="5" Resources="PROCS=[ALL]" StatCAPS="0.00" StatCIPS="0.00" StatTAPS="0.00" StatTIPS="0.00" Type="2" endtime="1120802400" flags="32" starttime="1120798800"></res>'
07/07 18:18:32 MResLoadCP(RS,RESERVATION              priya.0 1120785441 <res ACL="RES=%=priya=;CLASS=%=batch+;" HostExp="ALL" MaxTasks="0" Name="priya.0" NodeCount="5" Resources="PROCS=[ALL]" StatCAPS="0.00" StatCIPS="0.00" StatTAPS="0.00" StatTIPS="0.00" Type="2" endtime="1120802400" flags="32" starttime="1120798800"></res>)
07/07 18:18:32 MResFind(priya.0,Res)
07/07 18:18:32 INFO:     cannot locate reservation 'priya.0'
07/07 18:18:32 MUGetIndex(ACL,ValList,0)
07/07 18:18:32 MUGetIndex(RES=%=priya=,ValList,0)
07/07 18:18:32 MACLLoadConfig(ACL,=%=priya=,1,RES)
07/07 18:18:32 MUCmpFromString(%=priya,Size)
07/07 18:18:32 INFO:     ACL[0] loaded with RES priya (Affinity: 3)
07/07 18:18:32 MUGetIndex(CLASS=%=batch+,ValList,0)
07/07 18:18:32 MACLLoadConfig(ACL,=%=batch+,1,CLASS)
07/07 18:18:32 MUCmpFromString(%=batch,Size)
07/07 18:18:32 INFO:     ACL[1] loaded with CLASS batch (Affinity: 2)
07/07 18:18:32 MUGetIndex(HostExp,ValList,0)
07/07 18:18:32 MUGetIndex(MaxTasks,ValList,0)
07/07 18:18:32 MUGetIndex(Name,ValList,0)
07/07 18:18:32 MUGetIndex(NodeCount,ValList,0)
07/07 18:18:32 MUGetIndex(Resources,ValList,0)
07/07 18:18:32 MUGetIndex(StatCAPS,ValList,0)
07/07 18:18:32 MUGetIndex(StatCIPS,ValList,0)
07/07 18:18:32 MUGetIndex(StatTAPS,ValList,0)
07/07 18:18:32 MUGetIndex(StatTIPS,ValList,0)
07/07 18:18:32 MUGetIndex(Type,ValList,0)
07/07 18:18:32 MUGetIndex(endtime,ValList,0)
07/07 18:18:32 MUGetIndex(flags,ValList,0)
07/07 18:18:32 MUGetIndex(starttime,ValList,0)
07/07 18:18:32 MResCreate(User,ACL,,32,NodeList,1120798800,-2053366096,0,0,priya.0,ResP,'ALL',DRes)

Notice the correct starttime (1120798800) and the bogus endtime (-2053366096).
The interesting stuff all happens within Mres.c:MResLoadCP(), so the bug is
in either MXMLFromString() or MResFromXML().
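One arithmetic observation about the bogus value, offered as a hint rather than a diagnosis: it is exactly what results from adding the checkpointed starttime and endtime and wrapping the sum to a signed 32-bit integer. The sketch below assumes 32-bit two's-complement wraparound (signed overflow is formally undefined in C, but this is what typically happens in practice); to_int32 is a hypothetical helper for illustration.

```python
def to_int32(value):
    """Wrap an integer to signed 32-bit two's-complement range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value

# Values taken from the checkpoint data and the MResCreate() log line.
starttime = 1120798800
endtime = 1120802400
logged_endtime = -2053366096

wrapped = to_int32(starttime + endtime)
print(wrapped, wrapped == logged_endtime)  # -2053366096 True
```

This would be consistent with the absolute endtime being treated as a duration and added to starttime somewhere on the load path, but it does not by itself show which function is responsible.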



-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California