[torqueusers] my node is down

Giuseppe Grieco giuseppe.grieco at gmail.com
Wed Oct 3 06:32:59 MDT 2012


Hi all,

I installed Torque 4.1.0. The installation completed without errors,
but I cannot launch any job. After I start pbs_server and then
pbs_mom, the command pbsnodes -a reports the following:

applied_spectroscopy
     state = down
     np = 6
     ntype = cluster
     mom_service_port = 15002
     mom_manager_port = 15003
     gpus = 0

The machine where I installed Torque is not a cluster. It is a single
machine with 6 CPUs, and the files server_priv/nodes and
mom_priv/config are as follows:

server_priv/nodes

applied_spectroscopy np=6


mom_priv/config

$pbsserver applied_spectroscopy
$logevent 255

In the log file server_log/20121003 I find the following error messages:

10/03/2012 14:23:22;0006;PBS_Server.17957;Svr;PBS_Server;Server
applied_spectroscopy started, initialization type = 1
10/03/2012 14:23:22;0002;PBS_Server.17957;Svr;get_default_threads;Defaulting
min_threads to 25 threads
10/03/2012 14:23:22;0002;PBS_Server.17957;Svr;Act;Account file
/var/spool/torque/server_priv/accounting/20121003 opened
10/03/2012 14:23:22;0040;PBS_Server.17957;Req;setup_nodes;setup_nodes()
10/03/2012 14:23:22;0086;PBS_Server.17957;Svr;PBS_Server;Recovered queue batch
10/03/2012 14:23:22;0002;PBS_Server.17957;Svr;PBS_Server;Expected 1,
recovered 1 queues
10/03/2012 14:23:22;0080;PBS_Server.17957;Svr;PBS_Server;2 total files
read from disk
10/03/2012 14:23:22;0002;PBS_Server.17957;Svr;PBS_Server;handle_job_recovery:3
10/03/2012 14:23:22;0006;PBS_Server.17957;Svr;PBS_Server;Using ports
Server:15001  Scheduler:15004  MOM:15002 (server:
'applied_spectroscopy')
10/03/2012 14:23:22;0002;PBS_Server.17957;Svr;PBS_Server;Server Ready,
pid = 17957, loglevel=0
10/03/2012 14:23:22;0001;PBS_Server.17960;Svr;PBS_Server;LOG_ERROR::Connection
refused (111) in tcp_connect_sockaddr, Failed when trying to open tcp
connection - connect() failed [rc = 15096] [addr = 127.0.1.1:15003]
10/03/2012 14:23:22;0001;PBS_Server.17960;Svr;PBS_Server;LOG_ERROR::send_hierarchy,
Could not send mom hierarchy to host applied_spectroscopy:15003
10/03/2012 14:23:37;0002;PBS_Server.17961;Svr;PBS_Server;Torque Server
Version = 4.1.2, loglevel = 0
10/03/2012 14:23:42;0001;PBS_Server.17960;Svr;PBS_Server;LOG_ERROR::Connection
refused (111) in tcp_connect_sockaddr, Failed when trying to open tcp
connection - connect() failed [rc = 15096] [addr = 127.0.1.1:15003]
10/03/2012 14:23:42;0001;PBS_Server.17960;Svr;PBS_Server;LOG_ERROR::send_hierarchy,
Could not send mom hierarchy to host applied_spectroscopy:15003
10/03/2012 14:24:02;0001;PBS_Server.17960;Svr;PBS_Server;LOG_ERROR::Connection
refused (111) in tcp_connect_sockaddr, Failed when trying to open tcp
connection - connect() failed [rc = 15096] [addr = 127.0.1.1:15003]
10/03/2012 14:24:02;0001;PBS_Server.17960;Svr;PBS_Server;LOG_ERROR::send_hierarchy,
Could not send mom hierarchy to host applied_spectroscopy:15003
10/03/2012 14:24:22;0001;PBS_Server.17960;Svr;PBS_Server;LOG_ERROR::Connection
refused (111) in tcp_connect_sockaddr, Failed when trying to open tcp
connection - connect() failed [rc = 15096] [addr = 127.0.1.1:15003]
10/03/2012 14:24:22;0001;PBS_Server.17960;Svr;PBS_Server;LOG_ERROR::send_hierarchy,
Could not send mom hierarchy to host applied_spectroscopy:15003
10/03/2012 14:28:37;0002;PBS_Server.17961;Svr;PBS_Server;Torque Server
Version = 4.1.2, loglevel = 0
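
The errors above show the server being refused on 127.0.1.1:15003, which
is the mom_manager_port listed by pbsnodes. As a rough way to reproduce
that check outside Torque, here is a minimal Python sketch. The function
name check_mom_port and the 2-second timeout are my own choices for
illustration, not anything provided by Torque; it only resolves the host
name and attempts a plain TCP connection.

```python
import socket


def check_mom_port(host, port=15003, timeout=2.0):
    """Resolve `host` and try a TCP connection to `port`.

    Returns (address, reachable). `address` is None if the name
    does not resolve. This is a hypothetical helper, not a Torque API.
    """
    try:
        # Same kind of lookup that produced 127.0.1.1 in the server log.
        addr = socket.gethostbyname(host)
    except socket.gaierror:
        return None, False
    try:
        # A refused connection raises ConnectionRefusedError (an OSError).
        with socket.create_connection((addr, port), timeout=timeout):
            return addr, True
    except OSError:
        return addr, False


# Example call (host name taken from server_priv/nodes above):
#   addr, ok = check_mom_port("applied_spectroscopy")
#   print(addr, "reachable" if ok else "not reachable")
```

If this reports the port as not reachable while pbs_mom is supposed to be
running, that would match the "Connection refused" entries in the log.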


It seems something is wrong in my configuration, but I cannot work out
what. Can anyone help me?

Thanks in advance,

Giuseppe

-- 
Dr. Giuseppe Grieco
Post Doc
School of Engineering
University of Basilicata
Tel. 00390971205158

