[torqueusers] unable to contact node, Connection refused
garrick at usc.edu
Fri Nov 4 11:55:00 MST 2005
On Fri, Nov 04, 2005 at 08:50:08AM -0500, Alejandro Hurtado Turiño alleged:
> I've installed torque-1.1.0p6 on a cluster, but the jobs don't run
> unless forced with qrun. I'm not planning on installing Maui,
> just using the default FIFO scheduler (pbs_sched).
1.1.0p6 is really old. There have been countless improvements since then.
> The pbs server log says at startup:
> 10/31/2005 13:54:28;0006;PBS_Server;Svr;PBS_Server;Using ports Server:
> 15001 Scheduler:15004 MOM:15002
> 10/31/2005 13:54:28;0002;PBS_Server;Svr;PBS_Server;Server Ready, pid =
> 10/31/2005 13:54:28;0004;PBS_Server;Svr;WARNING;!!! unable to contact node
> grid1 !!!
> 10/31/2005 13:54:28;0001;PBS_Server;Svr;PBS_Server;Connection refused
> (111) in contact_sched, Could not contact Scheduler - port 15004
Is pbs_mom running on grid1? Is pbs_sched running on the server?
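One quick way to answer both questions is to see whether anything accepts
TCP connections on the ports listed in the "Using ports" log line. What
follows is only a minimal sketch in Python, not part of TORQUE: the helper
names port_open and check_daemons are made up for illustration, and it
assumes the daemons listen on TCP at those ports. It tells you nothing about
non-TCP traffic, so treat a failure as a hint rather than proof.

```python
import socket

# Ports copied from the "Using ports" line in the server log quoted above.
PBS_PORTS = {"pbs_server": 15001, "pbs_mom": 15002, "pbs_sched": 15004}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (errno 111 on Linux), timeouts,
        # and hostname lookup failures.
        return False


def check_daemons(host, ports=PBS_PORTS):
    """Map each daemon name (hypothetical labels) to its TCP reachability."""
    return {name: port_open(host, port) for name, port in ports.items()}
```

Calling check_daemons("grid1") on the server returns a dict of daemon names
to True/False. A False for pbs_sched would match the "Could not contact
Scheduler - port 15004" message in the log.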
> grid1 is the pbs server with pbs_mom installed.
> There is no firewall.
> My mom-priv/config:
> $clienthost grid1
> $logevent 255
> $restricted grid1
> $usecp *:/data /data
Is grid1 a node or server? The information above is confusing.
The server logs indicate that it is a node. The MOM config suggests that
grid1 is the server.
And you don't need the $restricted line; that just weakens security.
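To make the suggestion concrete, here is a small Python sketch that reads the
config quoted above and drops the $restricted entry. The function names
parse_config and without_directive are hypothetical, and the parsing rules
(one directive per line, whitespace between name and value, "#" starting a
comment) are assumptions about the file format, not taken from the TORQUE
source.

```python
# The mom-priv/config contents as quoted in this thread.
SAMPLE = """\
$clienthost grid1
$logevent 255
$restricted grid1
$usecp *:/data /data
"""


def parse_config(text):
    """Return (directive, value) pairs, skipping blanks and '#' comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split name from the rest of the line
        value = parts[1] if len(parts) > 1 else ""
        entries.append((parts[0], value))
    return entries


def without_directive(entries, name):
    """Return the entries with every occurrence of one directive removed."""
    return [(n, v) for n, v in entries if n != name]


if __name__ == "__main__":
    for name, value in without_directive(parse_config(SAMPLE), "$restricted"):
        print(name, value)
```

Run as a script, this prints the three remaining lines ($clienthost,
$logevent, $usecp), which is what the trimmed config would contain.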
> Searching the web, I see the problem is common, but nobody answers.
> Could anybody help me, please?
These kinds of things are just config errors that are hard to diagnose
over email. Eventually the admin figures it out and doesn't tell anyone.
Garrick Staples, Linux/HPCC Administrator
University of Southern California