[torqueusers] Strange problem in Torque

Garrick Staples garrick at usc.edu
Sat Aug 4 09:52:50 MDT 2007


On Fri, Aug 03, 2007 at 06:24:53PM -0700, Peter Wyckoff alleged:
> 
> Last week about a third of our cluster's pbs_moms were not trusting the
> rest of the cluster, i.e., they always reported an unauthorized request
> when a sister mom tried to involve them in a job. See the included snippet
> from the logs.
> 
> We restarted the pbs_server and all the pbs_moms and the problem was
> resolved. 
> 
> The only thing I can think of is that earlier in the day we had too many
> queues for the default maui (we had 18, and the #define says the max is
> 12), and maui was having problems. We deleted the queues and restarted
> maui, which fixed the maui problem. But could maui have caused torque to
> get into a bad state?
> 
> Any help would be appreciated.
> 
> Thanks, pete
> 
> 
> Jul 24 20:12:17 kry1776 pbs_mom: im_request, bad connect from 72.30.63.43:1023 - unauthorized (okclients: 72.30.62.70,72.30.62.72,72.30.62.71,72.30.62.73,72.30.62.75,72.30.62.77,72.30.62.79,72.30.62.78,72.30.62.81,72.

Just restarting pbs_server is probably enough for this error, and no, maui probably doesn't have anything to do with it.

Which version of torque are you running? Earlier versions had some bugs with the okclients list.
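
For anyone triaging the same message, here is a minimal sketch in Python that checks whether the rejected sister mom actually appears in the okclients list printed on the log line. The regular expression is modeled only on the snippet quoted above, so other torque versions may format the message differently. The names check_bad_connect and is_full_ipv4 are hypothetical helpers, not part of torque. Syslog can cut the list short, so incomplete entries are flagged rather than guessed at.

```python
import re

# Pattern modeled only on the quoted log snippet (assumption: other
# torque versions may word or lay out this message differently).
BAD_CONNECT = re.compile(
    r"bad connect from\s+(?P<src>(?:[0-9]{1,3}\.){3}[0-9]{1,3}):(?P<port>[0-9]+)"
    r".*?\(okclients:\s*(?P<clients>[^)]*)"
)
FULL_IPV4 = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")


def is_full_ipv4(text):
    """Return True if text looks like a complete dotted-quad address."""
    return FULL_IPV4.fullmatch(text) is not None


def check_bad_connect(line):
    """Hypothetical helper: parse one pbs_mom 'bad connect' log line.

    Returns None if the line does not match the assumed format.
    """
    m = BAD_CONNECT.search(line)
    if m is None:
        return None
    entries = [e.strip() for e in m.group("clients").split(",") if e.strip()]
    complete = [e for e in entries if is_full_ipv4(e)]
    return {
        "source": m.group("src"),
        # Only complete addresses count; a cut-off entry proves nothing.
        "listed": m.group("src") in complete,
        "okclients": complete,
        # True when at least one entry was cut off by the logger.
        "truncated": len(complete) != len(entries),
    }


if __name__ == "__main__":
    # The log line from the report above, truncated exactly as posted.
    sample = (
        "Jul 24 20:12:17 kry1776 pbs_mom: im_request, bad connect from "
        "72.30.63.43:1023 - unauthorized (okclients: "
        "72.30.62.70,72.30.62.72,72.30.62.71,72.30.62.73,72.30.62.75,"
        "72.30.62.77,72.30.62.79,72.30.62.78,72.30.62.81,72."
    )
    print(check_bad_connect(sample))
```

If "truncated" is true and "listed" is false, the log line alone cannot show that the sender is missing from the mom's list, because the address may sit in the part that was cut off.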
