[torqueusers] No contact with server at hostaddr problem (followup)

Carbo, Timothy J. TIMOTHY.J.CARBO at saic.com
Tue Jul 10 13:16:30 MDT 2007


Sorry I wasn't clear.

My setup is:

Node1 (cree):  running pbs_server, pbs_mom and maui

server_priv/nodes:

cree np=8
Huron np=8

mom_priv/config:

$pbsserver cree

Node2 (huron):  running pbs_mom only

mom_priv/config:

$pbsserver cree

When I submit the following on cree:

echo "sleep 30" | qsub

the job appears to be scheduled on huron and runs OK, but then I start
seeing "No contact with server at hostaddr port 15001" error messages
repeated in the mom_logs file on huron, and it appears that pbs_server
is never notified that the job ran to completion.
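
Since the error names port 15001 on the server, one quick check is
whether huron can open a TCP connection to that port on cree at all.
The following is only a minimal sketch of such a check: the helper name
port_is_open is made up for illustration, and it tests plain TCP
reachability, not anything specific to the TORQUE protocol.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. It checks that something
    accepts connections on the port, not that it is pbs_server.
    """
    try:
        # create_connection resolves the name and connects; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Run on huron, port_is_open("cree", 15001) returning False would agree
with netstat showing nothing listening on that port; True would suggest
the problem lies somewhere other than basic connectivity.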

Hope this clears things up a little.


-----Original Message-----
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Garrick
Sent: Tuesday, July 10, 2007 12:28 PM
To: torqueusers at supercluster.org
Subject: Re: [torqueusers] No contact with server at hostaddr problem

On Mon, Jul 09, 2007 at 09:30:09AM -0600, Carbo, Timothy J. alleged:
> Hello all.
> I was tracking the following email chain and was wondering if there is
> any resolution to the problem below.  I just installed TORQUE 2.1.8
> Maui 3.2.6-p19 on a two node system (both x86-64 bit Xeon quad core
> systems running Red Hat AS 4 update 4) and am having the same exact
> problem when I try to submit a job on my client node (jobs run fine on
> the server node).  Oddly, the remote node is trying to connect to port
> 15001 on the server node but netstat -a indicates there is nothing
> listening at that port.  I am pretty new to Torque so am I missing
> something?

It is a little hard to figure out your setup here with "client",
"server", and "remote" nodes.

If both hosts are to handle compute jobs, then you want pbs_mom running
on both hosts and both hostnames in server_priv/nodes.
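
As a rough illustration of the second point, the sketch below checks
that every expected hostname appears in the text of server_priv/nodes.
The function name missing_nodes is hypothetical. The sketch assumes one
node per line with the hostname as the first field, skips blank lines
and lines starting with "#", and compares names case-insensitively;
whether TORQUE itself treats hostname case that way is not something
this sketch verifies.

```python
def missing_nodes(nodes_text, expected_hosts):
    """Return the expected hostnames that are not listed in nodes_text.

    Hypothetical helper for illustration only. Assumes one node per line,
    with the hostname as the first whitespace-separated field.
    """
    listed = set()
    for line in nodes_text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' lines carry no node entry.
        if not line or line.startswith("#"):
            continue
        # Assumption: hostname case is not significant for this check.
        listed.add(line.split()[0].lower())
    return [host for host in expected_hosts if host.lower() not in listed]
```

For example, missing_nodes("cree np=8\n", ["cree", "huron"]) reports
huron as missing, while a file listing both hosts yields an empty list.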

Garrick Staples, GNU/Linux HPCC SysAdmin
University of Southern California

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
