[torqueusers] PBS on 1 node 64 cores

Neelima Chavali gneelima at vt.edu
Mon Dec 30 15:27:07 MST 2013


Update:

In the file torque.setup I replaced each instance of

*pbs_server, qmgr and qterm*

with its full path:

*/usr/local/sbin/pbs_server, /usr/local/bin/qmgr and /usr/local/bin/qterm*

This completed the setup, and I now have a serverdb file in $TORQUE/server_priv/ .
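
For anyone repeating this, the substitution can be sketched with sed. This is only an illustration on a scratch copy: the three lines written to the file below are a hypothetical stand-in, not the real contents of torque.setup, and the patterns assume the command name sits at the start of a line.

```shell
# Work on a scratch copy; the contents below are a hypothetical stand-in
# for the relevant lines of torque.setup (the real script differs).
workdir=$(mktemp -d)
cat > "$workdir/torque.setup" <<'EOF'
pbs_server -t create
qmgr -c "set server scheduling=true"
qterm
EOF

# Rewrite bare command names at the start of a line to their full paths.
# -i.bak edits in place and keeps the original as torque.setup.bak.
sed -i.bak \
    -e 's|^pbs_server|/usr/local/sbin/pbs_server|' \
    -e 's|^qmgr|/usr/local/bin/qmgr|' \
    -e 's|^qterm|/usr/local/bin/qterm|' \
    "$workdir/torque.setup"

cat "$workdir/torque.setup"
```

The patterns are anchored with ^ so that a path that has already been rewritten is not touched again on a second run.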

qstat -Q -f gives me this:

    $ qstat -Q -f
    Queue: batch
        queue_type = Execution
        total_jobs = 0
        state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Complete:0
        resources_default.nodes = 1
        resources_default.walltime = 01:00:00
        mtime = 1388437622
        enabled = True
        started = True

So there is a default queue called batch.

But

*pbsnodes -a*

gives me this error:

    pbsnodes: Server has no node list MSG=node list is empty - check 'server_priv/nodes' file

But the file server_priv/nodes contains this:

    #  nodename   number of processes   properties
    # ========================================
    node000  np=63  shared

Again, I have 1 node with 64 procs.
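
As a sanity check on the nodes file itself, the sketch below counts the entries and sums their np= values. It runs on a temporary copy of the contents shown above rather than the real server_priv/nodes, and it assumes that lines beginning with # are treated as comments; the variable names are only illustrative.

```shell
# Temporary copy of the nodes file contents quoted above.
nodes_file=$(mktemp)
cat > "$nodes_file" <<'EOF'
#  nodename   number of processes   properties
# ========================================
node000  np=63  shared
EOF

# Count node entries: lines that are neither blank nor comments
# (assumption: a leading '#' marks a comment line).
node_count=$(awk '!/^[ \t]*(#|$)/ { n++ } END { print n + 0 }' "$nodes_file")

# Sum the np= values declared across all node entries.
total_np=$(awk '
    /^[ \t]*(#|$)/ { next }            # skip comments and blank lines
    {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^np=[0-9]+$/) total += substr($i, 4)
    }
    END { print total + 0 }
' "$nodes_file")

echo "entries: $node_count, total np: $total_np"
```

A count of zero here would be consistent with the "node list is empty" message. A non-zero count, as in this case, suggests looking elsewhere, for example at whether this is the file the running pbs_server actually read.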

qsub won't work because of this error:

    $ qsub sub.pbs
    qsub: submit error (Job exceeds queue resource limits MSG=cannot locate feasible nodes (nodes file is empty or all systems are busy))

Also, pbs_server is active and so is pbs_mom:

    $ ps aux|grep pbs
    nchavali  5455  0.0  0.0 103244   860 pts/6    S+   17:28   0:00 grep pbs
    root     29303  0.0  0.0 793492 29328 ?        Sl   16:07   0:02 /usr/local/sbin/pbs_server
    root     48265  0.3  0.0  96364 49316 ?        SLsl 16:01   0:16 /usr/local/sbin/pbs_mom

Any thoughts on this?

Regards,
Neelima


On Mon, Dec 30, 2013 at 3:17 PM, Neelima Chavali <gneelima at vt.edu> wrote:

> Hello All,
>
> I am trying to install PBS on a 1 node 64 core AMD machine. I tried
> following instructions given here
> http://www.clusterresources.com/torquedocs21/1.1installation.shtml
>
> I am stuck in step 1.2.1.2 ./torque.setup
>
> I get this error
>
> $ sudo ./torque.setup  user
> initializing TORQUE (admin: xxx at xxx.edu)
> ./torque.setup: line 45: pbs_server: command not found
> ERROR: pbs_server failed to start, check syslog and server logs for more
> information
>
> $ which pbs_server
> /usr/local/sbin/pbs_server
>
> $ ls -l /usr/local/sbin/pbs_server
> -rwxr-xr-x. 1 root root 3097146 Dec 28 23:20 /usr/local/sbin/pbs_server
>
> I don't understand how a program installed as root cannot be seen by root.
> Regardless can someone offer suggestions on how to get around this issue?
>
> Regards,
> N
>