[torqueusers] NUMA question on build from trunk.

Mike Coyne Mike.Coyne at PACCAR.com
Mon Sep 27 14:41:56 MDT 2010


I am building the NUMA version from the svn trunk and am wondering
whether there is something else I need to set. There seems to be a
problem around a gethostbyaddr call in node_manager.c. When I tried
adding -e to the pbs_server command line, the server dumped core.
This machine also has a secondary private network attached.

 

 

From my server_logs:

 

 

09/27/2010 14:09:23;0008;PBS_Server;Job;dispatch_request;dispatching request StatusJob on sd=10
09/27/2010 14:09:23;0008;PBS_Server;Job;reply_send;Reply sent for request type StatusJob on socket 10
09/27/2010 14:09:23;0040;PBS_Server;Req;do_rpp;rpp request received on stream 0
09/27/2010 14:09:23;0040;PBS_Server;Req;do_rpp;inter-server request received
09/27/2010 14:09:23;0004;PBS_Server;Svr;is_request;message received from stream 0 (version 1)
09/27/2010 14:09:23;0004;PBS_Server;Svr;is_request;message received from stream <my_ip_address>.116:1016: mom_port 16302  - rm_port 16303
09/27/2010 14:09:23;0040;PBS_Server;Req;is_request;bad attempt to connect from <my_ip_address>.116:1016 (address not trusted - check entry in server_priv/nodes)
09/27/2010 14:09:23;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::is_request, bad attempt to connect from <my_ip_address>.116:1016 (address not trusted - check entry in server_priv/nodes)
09/27/2010 14:09:24;0040;PBS_Server;Req;do_rpp;rpp request received on stream 0
09/27/2010 14:09:24;0040;PBS_Server;Req;do_rpp;inter-server request received
09/27/2010 14:09:24;0004;PBS_Server;Svr;is_request;message received from stream 0 (version 1)
09/27/2010 14:09:24;0004;PBS_Server;Svr;is_request;message received from stream <my_ip_address>.116:1016: mom_port 16302  - rm_port 16303
09/27/2010 14:09:24;0040;PBS_Server;Req;is_request;bad attempt to connect from <my_ip_address>.116:1016 (address not trusted - check entry in server_priv/nodes)
09/27/2010 14:09:24;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::is_request, bad attempt to connect from <my_ip_address>.116:1016 (address not trusted - check entry in server_priv/nodes)

 

 

 

 

# pbsnodes -a
styx.<mydomainname>-0
     state = down
     np = 2
     ntype = cluster
     mom_service_port = 15002
     mom_manager_port = 15003

styx.<mydomainname>-1
     state = down
     np = 2
     ntype = cluster
     mom_service_port = 15002
     mom_manager_port = 15003

 

 

pbsnodes shows my MOM ports as 15002 and 15003, but my MOM and server
were started as:

pbs_mom -S 16301 -M 16302 -R 16303
pbs_server -S styx.<mydomainname>:72559 -p 16301 -M 16302 -R 16303

 

I set the nodes file to:

styx.<mydomainname> np=4 num_numa_nodes=2

and set my mom.layout to:

cpus=0-1  mem=0
cpus=2-3  mem=1
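
The nodes entry and the mom.layout lines above can be cross-checked with a short sketch like the following. It assumes one NUMA node per non-empty layout line with cpus= and mem= keys, exactly as in the two lines shown; treating lines that start with # as comments is an assumption. The helper names are hypothetical, and the rule that np should equal the total CPU count in mom.layout is likewise an assumption, not something taken from the TORQUE documentation.

```python
def parse_cpu_range(spec):
    """Expand a spec such as '0-1' or '0,2-3' into a list of CPU indices."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def parse_layout(text):
    """Return one dict per NUMA node line found in mom.layout text."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # assumed comment syntax
            continue
        fields = dict(token.split("=", 1) for token in line.split())
        nodes.append({"cpus": parse_cpu_range(fields["cpus"]),
                      "mem": fields["mem"]})
    return nodes


def layout_matches(np, num_numa_nodes, layout_text):
    """Check the layout against np and num_numa_nodes from the nodes file."""
    nodes = parse_layout(layout_text)
    total_cpus = sum(len(node["cpus"]) for node in nodes)
    return len(nodes) == num_numa_nodes and total_cpus == np
```

With the values from this post (np=4, num_numa_nodes=2, and the two layout lines), layout_matches returns True, which fits the two entries with np = 2 each that pbsnodes reports.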

 
