[torqueusers] Strange problem in Torque
Peter Wyckoff
wyckoff at yahoo-inc.com
Sat Aug 4 17:26:49 MDT 2007
Hi Garrick,
We are running version 2.1.8. Does this version have the okclients problem?
Thanks, pete
On 8/4/07 8:52 AM, "Garrick Staples" <garrick at usc.edu> wrote:
> On Fri, Aug 03, 2007 at 06:24:53PM -0700, Peter Wyckoff alleged:
>>
>> Last week we were seeing about a third of our cluster's pbs_moms not
>> trusting the rest of the cluster - i.e., always saying unauthorized request
>> when a sister mom tried to involve them in a job. See included snippet from
>> the logs.
>>
>> We restarted the pbs_server and all the pbs_moms and the problem was
>> resolved.
>>
>> The only thing I can think of is that earlier in the day, we had too many
>> queues for the default maui (had 18 and the #define says the max is 12) and
>> maui was having problems. We deleted the queues and restarted maui which
>> fixed the maui problem. But, could maui have caused torque to get in a bad
>> state?
>>
>> Any help would be appreciated.
>>
>> Thanks, pete
>>
>>
>> Jul 24 20:12:17 kry1776 pbs_mom: im_request, bad connect from
>> 72.30.63.43:1023 - unauthorized (okclients:
>> 72.30.62.70,72.30.62.72,72.30.62.71,72.30.62.73,72.30.62.75,72.30.62.77,
>> 72.30.62.79,72.30.62.78,72.30.62.81,72.
>
> Just restarting pbs_server is probably enough for this error, and no, maui
> probably doesn't have anything to do with it.
>
> What version of torque? Earlier versions had some bugs with the okclients
> list.
>
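For anyone hitting the same message, here is a minimal Python sketch for checking whether the rejected address really is missing from the okclients list that pbs_mom printed. The regular expression is modeled only on the single log line quoted above, so other Torque versions may format the message differently, and check_okclients is a hypothetical helper name, not part of Torque. The quoted line is cut off, so a "not listed" answer only means something when run against the full, unwrapped log line.

```python
import re

# Assumption: pattern derived from the one log line quoted in this thread.
BAD_CONNECT = re.compile(
    r"bad connect from\s+(?P<ip>\d+\.\d+\.\d+\.\d+):(?P<port>\d+)"
    r".*?okclients:\s*(?P<clients>[\d.,\s]*)"
)

# Matches only complete dotted-quad addresses; fragments such as "72." are skipped.
IPV4 = re.compile(r"\d+\.\d+\.\d+\.\d+")


def check_okclients(log_line):
    """Return (source_ip, listed, clients) for a 'bad connect' line, else None.

    log_line should be the whole message on one line; rejoin any lines that
    syslog or the mail client wrapped before calling this.
    """
    match = BAD_CONNECT.search(log_line)
    if match is None:
        return None
    # Drop whitespace so an address split by line wrapping is rejoined.
    raw = "".join(match.group("clients").split())
    clients = IPV4.findall(raw)
    source_ip = match.group("ip")
    return source_ip, source_ip in clients, clients


if __name__ == "__main__":
    sample = (
        "Jul 24 20:12:17 kry1776 pbs_mom: im_request, bad connect from "
        "72.30.63.43:1023 - unauthorized (okclients: "
        "72.30.62.70,72.30.62.72,72."
    )
    print(check_okclients(sample))
```

If the sending mom's address is absent from the complete list, that fits a stale okclients list on the receiving mom, which is the situation a pbs_server restart is suggested for above.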