[torqueusers] Torque Jobs Stay Queued

Gus Correa gus at ldeo.columbia.edu
Tue Oct 9 13:53:09 MDT 2012


Is this missing, perhaps?

qmgr -c 'set server scheduling = True'

It may help if you send the output of

qmgr -c 'print server'
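
If it helps, here is a rough sketch of how that output could be checked for the scheduling attribute. It assumes the usual "set server <name> = <value>" line format that 'print server' emits, and that an attribute which was never set is simply absent from the output. The function name scheduling_state is made up for illustration; it is not part of Torque.

```shell
#!/bin/sh
# Sketch only: classify the server "scheduling" attribute from the text
# that `qmgr -c 'print server'` writes to stdout (read here from stdin).
# scheduling_state is a hypothetical helper name, not a Torque command.
scheduling_state() {
    # Pull the value out of a "set server scheduling = <value>" line;
    # if several such lines appear, keep the last one.
    value=$(sed -n 's/^set server scheduling = \(.*\)$/\1/p' | tail -n 1)
    case "$value" in
        True)  echo "enabled" ;;
        False) echo "disabled" ;;
        *)     echo "unset" ;;   # assumption: no matching line means never set
    esac
}

# Usage on a live server (needs Torque's qmgr on the PATH):
#   qmgr -c 'print server' | scheduling_state
```

If this reports "disabled" or "unset", the 'set server scheduling = True' command above is the first thing to try.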

Gus Correa

On 10/09/2012 10:19 AM, Ablen wrote:
> Hello friends,
>
> I am working to install Torque on a FC16 Linux Cluster.  So far I have added
> only what I need for it to run on the master node - and I think I have done
> everything correctly.  When I submit a job, however, it shows up as being in the
> queued state - and won't run.  I think there must be a minor step I've missed.
> Below are the steps I've taken to set up torque as well as the sample job I am
> trying.  Could someone please let me know what I may still need to do in order
> for this job to run?  All comments appreciated.
>
> Many thanks.
> ablen
>
> 1 – Log into server as root
> 2 – Edit the /etc/hosts file and change the first line so it looks like this:
>
> 127.0.0.1 mysrv  localhost.localdomain localhost
>
> 3 – yum install openssl-devel
> 3 - yum install libxml2-devel
> 4 - yum -y install 'torque*'
> 5 - pbs_server -t create
> 6 - systemctl start pbs_{mom,server,sched}.service
> 7 - systemctl enable pbs_{mom,server,sched}.service
> 8 -  /usr/local/sbin/trqauthd start
> 9 – pbs_server
> 10 – vi /var/spool/torque/server_name and also vi /etc/torque/server_name
>
> change server name to mysrv if needed
>
> 11 – vi /var/spool/torque/mom_priv/config and vi /etc/torque/mom/config
> add/modify this line:
>
> $pbsserver mysrv
>
> 12 – vi /var/spool/torque/server_priv/nodes   (create this file) and add all
> nodes in the cluster like this (np for number of processors – VERIFY THAT THESE
> ARE 4 processors ea).
>
> mysrv np=4
> node2 np=4
> node3 np=4
>
> 13 - vi /etc/sysconfig/network (and make sure that HOSTNAME is set as follows):
>
> HOSTNAME=mysrv
>
> 14 - Append these lines to the /etc/profile file (for bash)
> PBS_DEFAULT=mysrv
> export PBS_DEFAULT
>
> Append these lines to the /etc/bashrc file (also for bash)
> PBS_DEFAULT=mysrv
> export PBS_DEFAULT
>
> 15 – execute all of the following commands:
>
> qmgr -c "set server operators += root@mysrv"
> qmgr -c "set server managers += root@mysrv"
> qmgr -c 'create queue batch'
> qmgr -c 'set queue batch queue_type = execution'
> qmgr -c 'set queue batch started = true'
> qmgr -c 'set queue batch enabled = true'
> qmgr -c 'set queue batch resources_default.walltime = 480:00:00'
> qmgr -c 'set queue batch resources_default.nodes = 1'
> qmgr -c 'set queue batch max_running = 1000'
> qmgr -c 'set server default_queue = batch'
>
> 16 – Log into a non-root account and run these commands as a preliminary test:
>
> qmgr -c "list server"
> qmgr -c "list queue batch"
>
> 17 – Submit  a test job from the nonroot account, then view it using qstat:
>
> echo "sleep 30" | qsub
> qstat
>
> Results look like this:
>
> [mine@mysrv ~]$ qstat
> Job id                    Name             User            Time Use S Queue
> ------------------------- ---------------- --------------- -------- - -----
> 0.mysrv                    STDIN            mine                   0 Q batch
>
> 1.mysrv                    STDIN            mine                   0 Q batch
>
> [mine@mysrv ~]$
>


