[Mauiusers] Re: [torqueusers] maui + torque job start rate
Ling C. Ho
ling at fnal.gov
Thu Apr 9 11:35:54 MDT 2009
Argh, I recreated your patch by hand and didn't notice you had changed "MasterHost" to "HostList"
in the pbs_asyrunjob call. This all makes sense now, and it works beautifully on my test setup.
Thank you all!
...
ling
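
The change discussed in this thread can be sketched as a toy model. This is not Maui source code: the function `job_start_requests` and its argument names are hypothetical, and the host list is kept as a plain Python tuple because the thread does not show the exact string format Maui passes. The only facts taken from the thread are that the synchronous path wraps pbs_runjob() (called with the master host) in two MPBSJobModify() calls, while the patched path makes a single pbs_asyrunjob() call that receives the host list.

```python
def job_start_requests(job_id, host_list, asynchronous):
    """Return the ordered requests sent to the server for one job start.

    Toy model only; names and argument shapes are illustrative assumptions.
    """
    hosts = tuple(host_list)
    if asynchronous:
        # Patched path: a single run request that carries the whole host list.
        return [("pbs_asyrunjob", job_id, hosts)]
    # Unpatched path: set neednodes, run on the master host, reset neednodes.
    master_host = hosts[0]  # assumption: the first allocated host is the master
    return [
        ("MPBSJobModify", job_id, ("neednodes", hosts)),
        ("pbs_runjob", job_id, master_host),
        ("MPBSJobModify", job_id, ("neednodes", "1")),
    ]


sync_calls = job_start_requests("374900", ["node088.gengar.gent.vsc"], False)
async_calls = job_start_requests("374900", ["node088.gengar.gent.vsc"], True)
print(len(sync_calls), "requests per job start (synchronous)")
print(len(async_calls), "request per job start (asynchronous)")
```

In this model the patched path replaces three server requests per job with one, which matches the description of the faster method as the one that eliminates the MPBSJobModify calls.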
Tom Rudwick wrote:
> For the async call the hostlist is passed in. I guess if someone is making
> the changes configurable, they would have to choose one method or the other
> for the synchronous call style, either compatible with the current sequence,
> or the faster method that eliminates the MPBSJobModify calls.
>
> Josh Butikofer wrote:
>> Actually, I just checked out the Maui source code and it looks like
>> you will need to keep at least one of the neednodes calls (the one
>> before the call to pbs_runjob()), as Maui is not passing a host list
>> into pbs_runjob(). If Maui does pass in the hostlist to pbs_runjob(),
>> the neednodes calls are probably not needed.
>>
>> Josh Butikofer
>> Cluster Resources, Inc.
>> #############################
>>
>>
>> Josh Butikofer wrote:
>>> Tom is right that the "neednodes" modification is no longer needed
>>> for newer versions of TORQUE. You should, in fact, be able to remove
>>> any MPBSJobModify() code that changes "neednodes". I don't have the
>>> Maui code in front of me, but if both MPBSJobModify() calls deal with
>>> neednodes, you should be able to safely remove both of them if using
>>> a newer version of TORQUE.
>>>
>>> Josh Butikofer
>>> Cluster Resources, Inc.
>>> #############################
>>>
>>>
>>> Ling C. Ho wrote:
>>>> Yes, I meant the second MPBSJobModify, not MPBSJobStart. So if I
>>>> need maui to still assign the nodes for me (using
>>>> NODEALLOCATIONPOLICY), could I still use both MPBSJobModify()'s, and
>>>> just change pbs_runjob() to pbs_asyrunjob()?
>>>>
>>>> Thanks for your quick reply.
>>>>
>>>> ...
>>>> ling
>>>>
>>>>
>>>>
>>>> Tom Rudwick wrote:
>>>>
>>>>> If you mean the second MPBSJobModify, my understanding is that that
>>>>> call was supposed to work around an old bug in PBS.
>>>>>
>>>>> Tom
>>>>>
>>>>>
>>>>> Ling C. Ho wrote:
>>>>>> Hi Tom,
>>>>>>
>>>>>> In your patch, you have commented out both MPBSJobModify calls
>>>>>> before and after pbs_asyrunjob(). I can understand the first
>>>>>> MPBSJobStart(), which sets the node where the job should run. What
>>>>>> is the purpose of the second MPBSJobStart, as it sets neednodes
>>>>>> to 1?
>>>>>>
>>>>>> Thanks,
>>>>>> ...
>>>>>> ling
>>>>>>
>>>>>> Tom Rudwick wrote:
>>>>>>
>>>>>>> If you search the maui list archives for my asynchronous job start
>>>>>>> patch, you can increase that speed greatly.
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>>
>>>>>>> Stijn De Weirdt wrote:
>>>>>>>> hi all,
>>>>>>>>
>>>>>>>> (this is a crosspost to both maui and torque users list)
>>>>>>>>
>>>>>>>> we are having issues with the job start rate using maui+torque.
>>>>>>>> starting a job takes on average 2 seconds, which is slow for what our
>>>>>>>> users are dumping in our queues.
>>>>>>>>
>>>>>>>> by a job start i mean the following cycle:
>>>>>>>> 04/01 10:01:08 MRMJobStart(374900,Msg,SC)
>>>>>>>> 04/01 10:01:08 MPBSJobStart(374900,gengar,Msg,SC)
>>>>>>>> 04/01 10:01:08 MPBSJobModify(374900,Resource_List,Resource,node088.gengar.gent.vsc)
>>>>>>>> 04/01 10:01:10 MPBSJobModify(374900,Resource_List,Resource,1)
>>>>>>>> 04/01 10:01:10 INFO: job '374900' successfully started
>>>>>>>> 04/01 10:01:10 INFO: command sent to server
>>>>>>>> 04/01 10:01:10 INFO: response received from server
>>>>>>>>
>>>>>>>> i've already tried to follow the "large cluster" tuning tips to see if
>>>>>>>> it helps, but no real result. (the only tip that might solve the
>>>>>>>> problem is the asyncstart option from moab ;). (we have a 200 node,
>>>>>>>> 8 core/node cluster (i actually don't think this is "large"))
>>>>>>>>
>>>>>>>> anyway, before i dig in the code looking for options, i'm wondering
>>>>>>>> what other people are seeing as minimal start time, so i know if it
>>>>>>>> is possible at all.
>>>>>>>>
>>>>>>>> many thanks,
>>>>>>>>
>>>>>>>> stijn
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> torqueusers mailing list
>>>>>>> torqueusers at supercluster.org
>>>>>>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> mauiusers mailing list
>>>> mauiusers at supercluster.org
>>>> http://www.supercluster.org/mailman/listinfo/mauiusers
>
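
The log excerpt quoted in the original question can be checked with a short script to see where the 2 seconds go. This is a minimal sketch: the helper `step_gaps` is hypothetical, the log lines are copied from the thread, and it assumes all lines fall on the same day and that each line is written when the named function is entered.

```python
LOG = """\
04/01 10:01:08 MRMJobStart(374900,Msg,SC)
04/01 10:01:08 MPBSJobStart(374900,gengar,Msg,SC)
04/01 10:01:08 MPBSJobModify(374900,Resource_List,Resource,node088.gengar.gent.vsc)
04/01 10:01:10 MPBSJobModify(374900,Resource_List,Resource,1)
04/01 10:01:10 INFO: job '374900' successfully started
04/01 10:01:10 INFO: command sent to server
04/01 10:01:10 INFO: response received from server
"""


def step_gaps(log_text):
    """Return (seconds_since_previous_line, message) for each log line."""
    gaps = []
    previous = None
    for line in log_text.splitlines():
        _date, clock, message = line.split(" ", 2)
        hours, minutes, seconds = (int(part) for part in clock.split(":"))
        stamp = hours * 3600 + minutes * 60 + seconds  # same-day assumption
        gaps.append((0 if previous is None else stamp - previous, message))
        previous = stamp
    return gaps


gaps = step_gaps(LOG)
for gap, message in gaps:
    print(f"+{gap}s  {message}")
```

With one-second timestamps, the only nonzero gap is the one between the two MPBSJobModify lines, which is the window that contains the first modify request and the pbs_runjob() call.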