[torqueusers] problems w/ mixed case domain names

Michael Hanulec hanulec at hanulec.com
Wed Nov 17 00:29:49 MST 2004


Unfortunately this snapshot doesn't seem to solve the problem; it makes
it worse.  My configuration, which worked after I modified /etc/hosts on
the master node, is now broken.  'qstat', 'pbsnodes', 'qterm', and 'qmgr'
all fail (the qstat and pbsnodes failures are new):

[root@falcon00 server_logs]# qterm -t quick
pbs_iff: cannot connect to host
No Permission.
qterm: could not connect to server  (15007)
[root@falcon00 server_logs]# qstat
pbs_iff: cannot connect to host
No Permission.
qstat: cannot connect to server falcon00 (errno=15007)
[root@falcon00 server_logs]# qmgr
pbs_iff: cannot connect to host
No Permission.
qmgr: cannot connect to server
[root@falcon00 server_logs]# ps -auwx|grep pbs
root     21394  0.0  0.0  9476 1376 ?        S    01:25   0:00 /usr/local/pbs/sbin/pbs_server
root     21457  0.0  0.0 36960  700 pts/3    S    01:28   0:00 grep pbs
[root@falcon00 server_logs]#
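
Since the previously working setup depended on an /etc/hosts edit, it may
help to list which hosts entries still contain uppercase letters.  The
following is only a sketch in Python: the helper name mixed_case_hosts is
hypothetical, the sample address is made up, and it parses hosts-file text
without touching TORQUE at all.

```python
def mixed_case_hosts(hosts_text):
    """Return (ip, name) pairs whose hostname contains uppercase letters."""
    found = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # A name that changes when lowercased has at least one capital.
            if name != name.lower():
                found.append((ip, name))
    return found


if __name__ == "__main__":
    # Illustrative entries only; the addresses are assumptions.
    sample = "127.0.0.1 localhost\n192.168.0.1 falcon00.Force falcon00\n"
    for ip, name in mixed_case_hosts(sample):
        print(f"{ip}: {name} (lowercase form: {name.lower()})")
```

To check a real file, pass it the contents of /etc/hosts, for example
mixed_case_hosts(open("/etc/hosts").read()).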


I've verified that a compute node can start its pbs_mom daemon and say
HELLO, but this compute node also cannot execute qterm or pbsnodes.

What level of debugging output would be helpful in getting this resolved?
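
As a reference point for the debugging, here is a minimal sketch of the
kind of case-insensitive host comparison under discussion.  It is written
in Python rather than TORQUE's C, the function names normalize_host and
same_host are hypothetical, and it is not TORQUE's actual authentication
code; it only illustrates that 'falcon00.Force' and 'falcon00.force'
should be treated as the same host, since DNS names compare without
regard to case.

```python
def normalize_host(name):
    """Lowercase a hostname and drop one trailing dot before comparing."""
    name = name.strip()
    if name.endswith("."):
        # "host.domain." and "host.domain" name the same host.
        name = name[:-1]
    return name.lower()


def same_host(a, b):
    """Return True when two hostnames differ at most in letter case."""
    return normalize_host(a) == normalize_host(b)


if __name__ == "__main__":
    # Name from /etc/hosts versus the name the server logged at startup.
    print(same_host("falcon00.Force", "falcon00.force"))
```

If the server and the client commands both normalized names this way
before comparing, the mixed-case /etc/hosts entry should not matter.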

Thanks again!

--
hanulec at hanulec.com		cell: 858.518.2647 && 516.410.4478
https://secure.hanulec.com	      EFnet irc && aol im: hanulec

On Tue, 16 Nov 2004, Dave Jackson wrote:

> Mike,
>
>  We have modified authentication based host evaluation to be case
> insensitive in the latest TORQUE snapshot.  Please give it a try and let
> us know if it solves your problems.
>
> Thanks,
> Dave
> Cluster Resources, Inc
>
> On Mon, 2004-11-15 at 20:15, Michael Hanulec wrote:
>> Hi Everybody...
>>
>> I'm currently attempting to run torque-1.1.0p5-snap.1099755743 on an RHEL
>> 3/AMD64 based system.  I might have found a bug... or maybe this is a known
>> issue.  My server name is 'falcon00.Force' but when the pbs_server starts
>> the logs say 'falcon00.force':
>>
>> <begin pbs server log file>
>> 11/15/2004 20:28:45;0002;PBS_Server;Svr;Log;Log opened
>> 11/15/2004 20:28:45;0006;PBS_Server;Svr;PBS_Server;Server falcon00.force started, initialization type = 4
>> 11/15/2004 20:28:45;0002;PBS_Server;Svr;Act;Account file /var/spool/pbs/server_priv/accounting/20041115 opened
>> 11/15/2004 20:28:45;0040;PBS_Server;Req;setup_nodes;setup_nodes()
>> 11/15/2004 20:28:45;0004;PBS_Server;Svr;falcon00.force;No Node description file found in setup_nodes
>> 11/15/2004 20:28:45;0002;PBS_Server;Svr;PBS_Server;Expected 0, recovered 0 queues
>> 11/15/2004 20:28:45;0002;PBS_Server;Svr;PBS_Server;Expected 0, recovered 0 jobs
>> 11/15/2004 20:28:45;0006;PBS_Server;Svr;PBS_Server;Using ports Server:15001  Scheduler:15004  MOM:15002
>> 11/15/2004 20:28:45;0002;PBS_Server;Svr;PBS_Server;Server Ready, pid = 8075
>> </end pbs server log file>
>>
>>
>> My attempts to use 'qmgr' or 'qterm' fail though ... note the domain is
>> now listed as Force:
>>
>> <begin more pbs server log file>
>> 11/15/2004 20:28:57;0100;PBS_Server;Req;;Type authenticateuser request received from root@falcon00.Force, sock=10
>> 11/15/2004 20:29:02;0100;PBS_Server;Req;;Type manager request received from root@falcon00.Force, sock=9
>> 11/15/2004 20:29:02;0080;PBS_Server;Req;req_reject;Reject reply code=15007(Unauthorized Request ), aux=0, type=9, from root@falcon00.Force
>> </end more pbs server log file>
>>
>>
>> After changing my /etc/hosts entry to falcon00.force I am able to use both
>> qterm and qmgr:
>>
>> <final pbs server log file>
>> 11/15/2004 20:49:21;0100;PBS_Server;Req;;Type authenticateuser request received from root@falcon00.force, sock=16
>> 11/15/2004 20:49:21;0100;PBS_Server;Req;;Type shutdown request received from root@falcon00.force,sock=15
>> 11/15/2004 20:49:21;0086;PBS_Server;Svr;PBS_Server;Shutdown request from root@falcon00.force
>> 11/15/2004 20:49:21;0086;PBS_Server;Svr;PBS_Server;Starting to shutdown the server, type is Quick
>> 11/15/2004 20:49:21;0002;PBS_Server;Svr;PBS_Server;Server shutdown completed
>> 11/15/2004 20:49:21;0002;PBS_Server;Svr;Log;Log closed
>> </final pbs server log file>
>>
>>
>> Tonight I'm also going to try out 1.0.1p6 to see if this error exists
>> there too.  Maybe I'll even dig up the code causing the problem.
>>
>> -Mike
>>
>> --
>> hanulec at hanulec.com		cell: 858.518.2647 && 516.410.4478
>> https://secure.hanulec.com	      EFnet irc && aol im: hanulec
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://supercluster.org/mailman/listinfo/torqueusers
>
>
>

