[torqueusers] NUMA question on build from trunk.
Mike Coyne
Mike.Coyne at PACCAR.com
Mon Sep 27 14:41:56 MDT 2010
I am building the NUMA-enabled version from the svn trunk, and I am
wondering if there is something else I need to set. There seems to be a
problem revolving around a gethostbyaddr call in node_manager.c. When I
tried enabling -e on the pbs_server command line, the server dumped
core. This machine also has a secondary private network attached.
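One way to check whether name resolution plays a part on a host with two networks is to compare forward and reverse lookups for the node name. The sketch below is only a diagnostic idea, not what node_manager.c does; the function name reverse_lookup_report and its injectable resolver arguments are made up for illustration.

```python
import socket


def reverse_lookup_report(hostname,
                          forward=socket.gethostbyname_ex,
                          reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name, matches) for every address of hostname.

    'forward' and 'reverse' default to the real resolver calls; they are
    parameters only so the function can be exercised without a network.
    """
    _, _, addresses = forward(hostname)
    report = []
    for ip in addresses:
        try:
            name, aliases, _ = reverse(ip)
        except OSError:
            # socket.herror and socket.gaierror are both OSError subclasses
            report.append((ip, None, False))
            continue
        known = {name.lower()} | {alias.lower() for alias in aliases}
        report.append((ip, name, hostname.lower() in known))
    return report
```

Running this on the server host with the node name from server_priv/nodes would list each address that name resolves to and whether the reverse lookup maps back to the same name. An address on the private network that fails to map back would be one thing to look at.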
From my server_logs:
09/27/2010 14:09:23;0008;PBS_Server;Job;dispatch_request;dispatching request StatusJob on sd=10
09/27/2010 14:09:23;0008;PBS_Server;Job;reply_send;Reply sent for request type StatusJob on socket 10
09/27/2010 14:09:23;0040;PBS_Server;Req;do_rpp;rpp request received on stream 0
09/27/2010 14:09:23;0040;PBS_Server;Req;do_rpp;inter-server request received
09/27/2010 14:09:23;0004;PBS_Server;Svr;is_request;message received from stream 0 (version 1)
09/27/2010 14:09:23;0004;PBS_Server;Svr;is_request;message received from stream <my_ip_address>.116:1016: mom_port 16302 - rm_port 16303
09/27/2010 14:09:23;0040;PBS_Server;Req;is_request;bad attempt to connect from <my_ip_address>.116:1016 (address not trusted - check entry in server_priv/nodes)
09/27/2010 14:09:23;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::is_request, bad attempt to connect from <my_ip_address>.116:1016 (address not trusted - check entry in server_priv/nodes)
09/27/2010 14:09:24;0040;PBS_Server;Req;do_rpp;rpp request received on stream 0
09/27/2010 14:09:24;0040;PBS_Server;Req;do_rpp;inter-server request received
09/27/2010 14:09:24;0004;PBS_Server;Svr;is_request;message received from stream 0 (version 1)
09/27/2010 14:09:24;0004;PBS_Server;Svr;is_request;message received from stream <my_ip_address>.116:1016: mom_port 16302 - rm_port 16303
09/27/2010 14:09:24;0040;PBS_Server;Req;is_request;bad attempt to connect from <my_ip_address>.116:1016 (address not trusted - check entry in server_priv/nodes)
09/27/2010 14:09:24;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::is_request, bad attempt to connect from <my_ip_address>.116:1016 (address not trusted - check entry in server_priv/nodes)
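The entries above can be condensed into two facts: which ports the MOM reported, and which addresses the server rejected. Below is a rough sketch that pulls those out of a server log, assuming each entry sits on a single line as it does in the log file on disk (not wrapped by a mail client) and that the message wording matches the entries quoted here. The function name summarize and both regular expressions are illustrative only.

```python
import re

# Matches "message received from stream ADDR:PORT: mom_port N - rm_port M"
STREAM_RE = re.compile(
    r"message received from stream (\S+):(\d+): mom_port (\d+) - rm_port (\d+)")
# Matches "bad attempt to connect from ADDR:PORT (address not trusted ..."
REJECT_RE = re.compile(
    r"bad attempt to connect from (\S+):(\d+) \(address not trusted")


def summarize(lines):
    """Return (reported_ports, rejected_addresses) found in the log lines.

    reported_ports is a set of (address, mom_port, rm_port) tuples;
    rejected_addresses is a set of address strings.
    """
    reported_ports = set()
    rejected = set()
    for line in lines:
        # Entries look like: date time;code;server;class;name;message
        fields = line.rstrip("\n").split(";", 5)
        if len(fields) != 6:
            continue
        message = fields[5]
        match = STREAM_RE.search(message)
        if match:
            reported_ports.add(
                (match.group(1), int(match.group(3)), int(match.group(4))))
        match = REJECT_RE.search(message)
        if match:
            rejected.add(match.group(1))
    return reported_ports, rejected
```

For the log above this would report mom_port 16302 and rm_port 16303 for the one address, and that same address as rejected.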
# pbsnodes -a
styx.<mydomainname>-0
     state = down
     np = 2
     ntype = cluster
     mom_service_port = 15002
     mom_manager_port = 15003

styx.<mydomainname>-1
     state = down
     np = 2
     ntype = cluster
     mom_service_port = 15002
     mom_manager_port = 15003
This shows my MOM ports as 15002 and 15003, but my MOM and server were started as:
pbs_mom -S 16301 -M 16302 -R 16303
pbs_server -S styx.<mydomainname>:72559 -p 16301 -M 16302 -R 16303
I set the nodes file as:
styx.<mydomainname> np=4 num_numa_nodes=2
and set my mom.layout as:
cpus=0-1 mem=0
cpus=2-3 mem=1
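As a sanity check on the two files, the number of mom.layout lines should presumably equal num_numa_nodes, and the CPUs listed should add up to np. Here is a small sketch of that check. It assumes each layout line has the cpus=... mem=... form shown above, with values given as ranges or comma-separated lists; the function names are made up for illustration. It only rules out an inconsistency between the two files and says nothing about the address trust error.

```python
def expand(spec):
    """Expand a spec such as '0-1' or '0,2-3' into a list of ints."""
    values = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            values.extend(range(int(low), int(high) + 1))
        else:
            values.append(int(part))
    return values


def parse_layout(text):
    """Return one (cpus, mems) pair per non-blank layout line."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = dict(item.split("=", 1) for item in line.split())
        nodes.append((expand(fields["cpus"]), expand(fields["mem"])))
    return nodes


def layout_matches_nodes_entry(layout_text, np, num_numa_nodes):
    """True if the layout has num_numa_nodes lines covering np CPUs in total."""
    nodes = parse_layout(layout_text)
    total_cpus = sum(len(cpus) for cpus, _ in nodes)
    return len(nodes) == num_numa_nodes and total_cpus == np
```

With the two layout lines above and np=4 num_numa_nodes=2, the check passes, which also agrees with pbsnodes reporting np = 2 for each of the two NUMA nodes.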