[torqueusers] torque installation

Kevin Van Workum vanw at sabalcore.com
Fri Feb 21 12:04:02 MST 2014


You say you installed 2.4.10, but you are running 'torque.setup'
from /home/ezhil/torque-4.0.0. Something is inconsistent there.
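
A quick way to check both points (a rough sketch, assuming the default
/usr/local install prefix and a bash shell; the 2.4.10 source-tree path
below is just an example):

    # torque.setup needs pbs_server, qmgr and qterm on the PATH; the
    # daemons install into <prefix>/sbin and the client commands
    # (qmgr, qterm, qsub, ...) into <prefix>/bin
    export PATH=/usr/local/bin:/usr/local/sbin:$PATH
    type pbs_server qmgr qterm     # all three should now resolve

    # then run the setup script from the source tree that matches the
    # version you actually built and installed
    cd /home/ezhil/torque-2.4.10   # hypothetical path, not torque-4.0.0
    ./torque.setup ezhil

If qmgr and qterm still don't resolve, the client commands were probably
never installed on linux-02 (re-run 'make install' from the matching
source tree).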


On Fri, Feb 21, 2014 at 11:43 AM, RB. Ezhilalan (Principal Physicist, CUH) <
RB.Ezhilalan at hse.ie> wrote:

> Dear Users,
>
> I installed torque 2.4.10 on a P4 PC (linux-01) running SuSE 10.1
> Linux, and it has been running successfully. However, I hit a problem
> with the hard disk on that PC due to lack of space, so I decided to
> install torque 2.4.10 on another PC (linux-02, with a larger hard
> disk). I could not install torque successfully; I am getting error
> messages as shown below (some of the installation output is included):
> ________________________________________________
> linux-02:/home/ezhil/torque-4.0.0 # ldconfig
> linux-02:/home/ezhil/torque-4.0.0 # ./torque.setup ezhil
> initializing TORQUE (admin: ezhil at linux-02.physics)
> ./torque.setup: line 31: pbs_server: command not found
> ./torque.setup: line 37: qmgr: command not found
> ERROR: cannot set TORQUE admins
> ./torque.setup: line 41: qterm: command not found
> linux-02:/home/ezhil/torque-4.0.0 # qterm
> bash: qterm: command not found
> linux-02:/home/ezhil/torque-4.0.0 # cd /usr/local/sbin/
> linux-02:/usr/local/sbin # dir
> total 3463
> -rwxr-xr-x 1 root root   33141 2014-02-21 15:35 momctl
> -rwxr-xr-x 1 root root   17052 2014-02-21 15:35 pbs_demux
> -rwxr-xr-x 1 root root 1243559 2014-02-21 15:35 pbs_mom
> -rwxr-xr-x 1 root root  259859 2014-02-21 15:35 pbs_sched
> -rwxr-xr-x 1 root root 1976214 2014-02-21 15:35 pbs_server
> lrwxrwxrwx 1 root root       7 2014-02-21 15:35 qnoded -> pbs_mom
> lrwxrwxrwx 1 root root       9 2014-02-21 15:35 qschedd -> pbs_sched
> lrwxrwxrwx 1 root root      10 2014-02-21 15:35 qserverd -> pbs_server
>
> _______________________________________________________________________
>
> I can't figure out what's going wrong. Any feedback would be much
> appreciated.
>
> Regards,
> Ezil
> Ezhilalan Ramalingam M.Sc.,DABR.,
> Principal Physicist (Radiotherapy),
> Medical Physics Department,
> Cork University Hospital,
> Wilton, Cork
> Ireland
> Tel. 00353 21 4922533
> Fax.00353 21 4921300
> Email: rb.ezhilalan at hse.ie
> -----Original Message-----
> From: torqueusers-bounces at supercluster.org
> [mailto:torqueusers-bounces at supercluster.org] On Behalf Of
> torqueusers-request at supercluster.org
> Sent: 20 February 2014 19:28
> To: torqueusers at supercluster.org
> Subject: torqueusers Digest, Vol 115, Issue 15
>
> Send torqueusers mailing list submissions to
>         torqueusers at supercluster.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://www.supercluster.org/mailman/listinfo/torqueusers
> or, via email, send a message with subject or body 'help' to
>         torqueusers-request at supercluster.org
>
> You can reach the person managing the list at
>         torqueusers-owner at supercluster.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of torqueusers digest..."
>
>
> Today's Topics:
>
>    1. Re: qsub and mpiexec -f machinefile (Tiago Silva (Cefas))
>    2. Re: qsub and mpiexec -f machinefile (Michel Béland)
>    3. Re: qsub and mpiexec -f machinefile (Tiago Silva (Cefas))
>    4. Re: qsub and mpiexec -f machinefile (Gus Correa)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 20 Feb 2014 09:51:24 +0000
> From: "Tiago Silva (Cefas)" <tiago.silva at cefas.co.uk>
> Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
> To: Torque Users Mailing List <torqueusers at supercluster.org>
> Message-ID:
>         <10AEF77DEA6D7845AF91E549D3C7505B5AEB2E at w14mbx06.gb.so.ccs>
> Content-Type: text/plain; charset="windows-1252"
>
> Thanks, this seems promising. Before I try building with openmpi, if I
> parse PBS_NODEFILE to produce my own machinefile for mpiexec, for
> instance following my previous example:
>
> n100
> n100
> n101
> n101
> n101
> n101
>
> won't mpiexec start MPI processes with ranks 0-1 on n100 and with
> ranks 2-5 on n101? That's what I think it does when I don't use qsub.
>
> Tiago
>
> > -----Original Message-----
> > From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
> > bounces at supercluster.org] On Behalf Of Gus Correa
> > Sent: 19 February 2014 15:11
> > To: Torque Users Mailing List
> > Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
> >
> > Hi Tiago
> >
> > The Torque/PBS node file is available to your job script through the
> > environment variable $PBS_NODEFILE.
> > This file has one line listing the node name for each processor/core
> > that you requested.
> > Just do a "cat $PBS_NODEFILE" inside your job script to see how it
> > looks.
> > Inside your job script, and before the mpiexec command, you can run a
> > brief auxiliary script to create the machinefile you need from the
> > $PBS_NODEFILE.
> > You will need to create this auxiliary script, tailored to your
> > application.
> > Still, this method won't bind the MPI processes to the appropriate
> > hardware components (cores, sockets, etc.), in case that is also part
> > of your goal.
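> >
> > For example, a minimal sketch of such an auxiliary step inside the job
> > script (the 'sort' reordering is only a placeholder; the real ordering
> > logic is application-specific):
> >
> >   # build a machinefile from the nodes Torque actually assigned to
> >   # this job; replace 'sort' with whatever ordering your model needs
> >   MACHINEFILE=$PBS_O_WORKDIR/machinefile.$PBS_JOBID
> >   sort $PBS_NODEFILE > $MACHINEFILE
> >   mpiexec -f $MACHINEFILE -np $(wc -l < $MACHINEFILE) ./bin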
> >
> > Having said that, if you are using OpenMPI, it can be built with
> > Torque support (with the --with-tm=/torque/location configuration
> > option).
> > This would give you a range of options on how to assign different
> > cores, sockets, etc. to different MPI ranks/processes, directly in the
> > mpiexec command, or in the OpenMPI runtime configuration files.
> > This method wouldn't require creating the machinefile from the
> > PBS_NODEFILE.
> > This second approach has the advantage of allowing you to bind the
> > processes to cores, sockets, etc.
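> >
> > A minimal sketch of that route (the prefix paths are illustrative;
> > check your own Torque and OpenMPI install locations):
> >
> >   # build OpenMPI against the Torque TM interface
> >   ./configure --prefix=/opt/openmpi --with-tm=/usr/local
> >   make && make install
> >   # in the job script, no machinefile is needed: mpiexec learns the
> >   # allocated nodes/slots from Torque itself
> >   mpiexec -np 6 ./bin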
> >
> > I hope this helps,
> > Gus Correa
> >
> > On 02/19/2014 07:40 AM, Tiago Silva (Cefas) wrote:
> > > Hi,
> > >
> > > My MPI code is normally executed across a set of nodes with
> > > something like:
> > >
> > > mpiexec -f machinefile -np 6 ./bin
> > >
> > > where the machinefile has 6 entries with node names, for instance:
> > >
> > > n01
> > > n01
> > > n02
> > > n02
> > > n02
> > > n02
> > >
> > > Now the issue here is that this list has been optimised to balance
> > > the load between nodes and to reduce internode communication. So for
> > > instance model domain tiles 0 and 1 will run on n01 while tiles 2 to
> > > 5 will run on n02.
> > >
> > > Is there a way to integrate this into qsub, since I don't know which
> > > nodes will be assigned before submission? Or in other words, can I
> > > control how processes are grouped on a node?
> > >
> > > In my example I used 6 processes for simplicity but normally I
> > > parallelise across 4-16 nodes and >100 processes.
> > >
> > > Thanks,
> > >
> > > tiago
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > torqueusers mailing list
> > > torqueusers at supercluster.org
> > > http://www.supercluster.org/mailman/listinfo/torqueusers
> >
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
>
> ------------------------------
>
> Message: 2
> Date: Thu, 20 Feb 2014 08:09:50 -0500
> From: Michel Béland <michel.beland at calculquebec.ca>
> Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
> To: torqueusers at supercluster.org
> Message-ID: <5305FE9E.3080103 at calculquebec.ca>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Tiago Silva (Cefas) wrote:
>
> > Thanks, this seems promising. Before I try building with openmpi, if I
> > parse PBS_NODEFILE to produce my own machinefile for mpiexec, for
> > instance following my previous example:
> >
> > n100
> > n100
> > n101
> > n101
> > n101
> > n101
> >
> > won't mpiexec start MPI processes with ranks 0-1 on n100 and with
> > ranks 2-5 on n101? That's what I think it does when I don't use qsub.
>
> Yes, but you should not change the nodes inside the $PBS_NODEFILE. You
> can change the order but do not delete machines and add new ones,
> otherwise your MPI code will try to run on nodes belonging to other
> jobs.
>
> If you want to have exactly the nodes above, you can ask for
> -lnodes=n100:ppn=2+n101:ppn=4. If you only want two cores on the first
> node and four on the second but the specific nodes are irrelevant, you
> can ask for -lnodes=1:ppn=2+1:ppn=4 instead.
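>
> For example (the job script name here is illustrative):
>
>   qsub -l nodes=n100:ppn=2+n101:ppn=4 run_model.sh   # these two exact nodes
>   qsub -l nodes=1:ppn=2+1:ppn=4 run_model.sh         # any 2-slot + 4-slot pair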
>
> Michel Béland
> Calcul Québec
>
>
>
> ------------------------------
>
> Message: 3
> Date: Thu, 20 Feb 2014 13:32:48 +0000
> From: "Tiago Silva (Cefas)" <tiago.silva at cefas.co.uk>
> Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
> To: Torque Users Mailing List <torqueusers at supercluster.org>
> Message-ID:
>         <10AEF77DEA6D7845AF91E549D3C7505B5AEBCC at w14mbx06.gb.so.ccs>
> Content-Type: text/plain; charset="windows-1252"
>
> Sure, I will want to stick to the exact same nodes. In my case I don't
> need to worry about free slots on the nodes, as I request exclusive
> usage with -W x="NACCESSPOLICY:SINGLEJOB".
> I actually oversubscribe the cores, as some processes have very little
> to do; that is part of the performance optimisation I want to retain.
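>
> For reference, a submission along those lines would look something like
> this (the resource list and script name are only illustrative):
>
>   qsub -l nodes=n100:ppn=2+n101:ppn=4 -W x="NACCESSPOLICY:SINGLEJOB" run_model.sh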
>
> Thanks again
> tiago
>
> > -----Original Message-----
> > From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
> > bounces at supercluster.org] On Behalf Of Michel Béland
> > Sent: 20 February 2014 13:10
> > To: torqueusers at supercluster.org
> > Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
> >
> > Tiago Silva (Cefas) wrote:
> >
> > > Thanks, this seems promising. Before I try building with openmpi,
> > > if I
> > > parse PBS_NODEFILE to produce my own machinefile for mpiexec, for
> > > instance following my previous example:
> > >
> > > n100
> > > n100
> > > n101
> > > n101
> > > n101
> > > n101
> > >
> > > won't mpiexec start MPI processes with ranks 0-1 on n100 and with
> > > ranks 2-5 on n101? That's what I think it does when I don't use qsub.
> >
> > Yes, but you should not change the nodes inside the $PBS_NODEFILE. You
> > can change the order but do not delete machines and add new ones,
> > otherwise your MPI code will try to run on nodes belonging to other
> > jobs.
> >
> > If you want to have exactly the nodes above, you can ask for
> > -lnodes=n100:ppn=2+n101:ppn=4. If you only want two cores on the first
> > node and four on the second but the specific nodes are irrelevant, you
> > can ask for -lnodes=1:ppn=2+1:ppn=4 instead.
> >
> > Michel Béland
> > Calcul Québec
> >
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
>
> ------------------------------
>
> Message: 4
> Date: Thu, 20 Feb 2014 14:12:10 -0500
> From: Gus Correa <gus at ldeo.columbia.edu>
> Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
> To: Torque Users Mailing List <torqueusers at supercluster.org>
> Message-ID: <5306538A.5090300 at ldeo.columbia.edu>
> Content-Type: text/plain; charset=windows-1252; format=flowed
>
> Hi Tiago
>
> Which MPI and which mpiexec are you using?
> I am not familiar with all of them, but the behavior
> depends primarily on which one you are using.
> Most likely, by default you will get the sequential
> rank-to-node mapping that you mentioned.
> Have you tried it?
> What result did you get?
>
> You can insert the MPI function MPI_Get_processor_name
> early in your code, say, right after MPI_Init, MPI_Comm_size, and
> MPI_Comm_rank, and then print out the rank / processor-name pairs
> (the processor names will probably be your nodes' names).
>
> https://www.open-mpi.org/doc/v1.4/man3/MPI_Get_processor_name.3.php
> http://www.mcs.anl.gov/research/projects/mpi/www/www3/MPI_Get_processor_name.html
>
> With OpenMPI there are easier ways (through mpiexec) to report this
> information.
>
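> For a quick check from inside the job script, without touching the
> source (the --display-map and --report-bindings flags are
> OpenMPI-specific; other mpiexec implementations have their own
> equivalents):
>
>   # how many MPI processes land on each node
>   mpiexec -np 6 hostname | sort | uniq -c
>   # OpenMPI only: print the planned rank-to-node map and core bindings
>   mpirun --display-map --report-bindings -np 6 ./bin
>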
> However, there are ways to change the sequential
> rank-to-node mapping, if this is your goal,
> again, depending on which mpiexec you are using.
>
> Anyway, this is more of an MPI question than a Torque question.
>
> I hope this helps,
> Gus Correa
>
>
> On 02/20/2014 04:51 AM, Tiago Silva (Cefas) wrote:
> > Thanks, this seems promising. Before I try building with openmpi, if I
> > parse PBS_NODEFILE to produce my own machinefile for mpiexec, for
> > instance following my previous example:
> >
> > n100
> > n100
> > n101
> > n101
> > n101
> > n101
> >
> > won't mpiexec start mpi processes with ranks 0-1 onto n100 and with
> rank
> > 2-5 on n101? That what I think it does when I don't use qsub.
> >
> > Tiago
> >
>
>
>
> ------------------------------
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>
> End of torqueusers Digest, Vol 115, Issue 15
> ********************************************
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>


-- 
Kevin Van Workum, PhD
Sabalcore Computing Inc.
"Where Data Becomes Discovery"
http://www.sabalcore.com
877-492-8027 ext. 11
