[torqueusers] Problem with TORQUE set-up on macbook pro snow leopard 10.6.4
Joshua Bernstein
jbernstein at penguincomputing.com
Tue Jul 6 13:36:44 MDT 2010
Hello Yves,
You're missing a "nodes" file, which tells pbs_server the hostnames of the
compute nodes it should contact. Have a look at section 1.2.2:
http://www.clusterresources.com/products/torque/docs/1.2basicconfig.shtml
How many compute nodes do you have? Basically, all you need to do is
put the hostname of each machine where a pbs_mom is running in this file.
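
As a rough sketch of what that file could look like for a single-machine
setup: the path below is the one named in your server log, the hostname is
the one your server reported, and the "np=4" slot count is only an
assumption, so adjust it to your machine. The write_nodes_file helper is
hypothetical (it is not part of TORQUE), and the example writes into a
scratch directory so it is safe to try first.

```shell
#!/bin/sh
# Sketch: write a one-line TORQUE nodes file for a single-machine setup.
# write_nodes_file is a hypothetical helper, not something TORQUE ships.
write_nodes_file() {
    nodes_file=$1   # real location per the log: /var/spool/pbs/server_priv/nodes
    host=$2         # hostname of the machine where pbs_mom runs
    np=$3           # number of job slots to advertise (assumed value)
    printf '%s np=%s\n' "$host" "$np" > "$nodes_file"
}

# Dry run in a scratch directory so nothing under /var/spool is touched.
demo_dir=$(mktemp -d)
write_nodes_file "$demo_dir/nodes" "ypeysson.local" 4
cat "$demo_dir/nodes"
```

For the real installation you would write the same line to
/var/spool/pbs/server_priv/nodes as root, then stop and restart pbs_server
so that it rereads the file.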
-Joshua Bernstein
Software Development Manager
Penguin Computing
Yves PEYSSON wrote:
> Hi, I am trying to install TORQUE 2.4.8 on my MacBook Pro (Snow Leopard 10.6.4, Core i7) with the following command:
>
> ./configure --disable-gcc-warnings --with-server-home=/var/spool/pbs
>
> then
>
> make
>
>
> then sudo make install
>
> then following the basic documentation, I defined the global path variable
>
> $TORQUEHOME=/var/spool/pbs
>
>
> and ultimately
>
> I run
>
> sudo pbs_server -t create
>
> The cores become highly active, but the process never finishes! So I killed it and looked at the file in server_logs:
>
> 07/03/2010 12:34:18;0002;PBS_Server;Svr;Log;Log opened
> 07/03/2010 12:34:18;0006;PBS_Server;Svr;PBS_Server;Server ypeysson.local started, initialization type = 4
> 07/03/2010 12:34:18;0002;PBS_Server;Svr;Act;Account file /var/spool/pbs/server_priv/accounting/20100703 opened
> 07/03/2010 12:34:18;0040;PBS_Server;Req;setup_nodes;setup_nodes()
> 07/03/2010 12:34:18;0004;PBS_Server;Svr;ypeysson.local;cannot open node description file '/var/spool/pbs/server_priv/nodes' in setup_nodes()
> 07/03/2010 12:34:18;0002;PBS_Server;Svr;PBS_Server;Expected 0, recovered 0 queues
> 07/03/2010 12:34:18;0002;PBS_Server;Svr;PBS_Server;Expected 0, recovered 0 jobs
> 07/03/2010 12:34:18;0006;PBS_Server;Svr;PBS_Server;Using ports Server:15001 Scheduler:15004 MOM:15002 (server: 'ypeysson.local')
> 07/03/2010 12:34:18;0002;PBS_Server;Svr;daemonize_server;INFO: parent is exiting
> 07/03/2010 12:34:18;0002;PBS_Server;Svr;daemonize_server;INFO: parent is exiting
> 07/03/2010 12:34:18;0002;PBS_Server;Svr;daemonize_server;INFO: child process in background
> 07/03/2010 12:34:18;0002;PBS_Server;Svr;PBS_Server;Server Ready, pid = 48431, loglevel=0
> 07/03/2010 12:34:18;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::PBS_Server, wait_request failed
> 07/03/2010 12:34:18;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::PBS_Server, wait_request failed
> 07/03/2010 12:34:18;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::PBS_Server, wait_request failed
> 07/03/2010 12:34:18;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::PBS_Server, wait_request failed
> 07/03/2010 12:34:18;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::PBS_Server, wait_request failed
> 07/03/2010 12:34:18;0001;PBS_Server;Svr;PBS_Server;LOG_ERROR::PBS_Server, wait_request failed
> ....
>
>
> Does anybody have an idea where I made a mistake? What should I do? Many thanks in advance!
>
> Yves Peysson