[torqueusers] torque installation

RB. Ezhilalan (Principal Physicist, CUH) RB.Ezhilalan at hse.ie
Fri Feb 21 09:43:23 MST 2014


Dear Users,

I installed TORQUE 2.4.10 on a P4 PC (linux-01) with SUSE 10.1 Linux and
have been running it successfully. But I hit a problem with the hard
disk on that PC due to lack of space, so I decided to install TORQUE on
another PC (linux-02, which has a larger hard disk). However, I could
not complete the installation successfully; I am getting the error
messages below (some installation messages attached).
________________________________________________
linux-02:/home/ezhil/torque-4.0.0 # ldconfig
linux-02:/home/ezhil/torque-4.0.0 # ./torque.setup ezhil
initializing TORQUE (admin: ezhil at linux-02.physics)
./torque.setup: line 31: pbs_server: command not found
./torque.setup: line 37: qmgr: command not found
ERROR: cannot set TORQUE admins
./torque.setup: line 41: qterm: command not found
linux-02:/home/ezhil/torque-4.0.0 # qterm
bash: qterm: command not found
linux-02:/home/ezhil/torque-4.0.0 # cd /usr/local/sbin/
linux-02:/usr/local/sbin # dir
total 3463
-rwxr-xr-x 1 root root   33141 2014-02-21 15:35 momctl
-rwxr-xr-x 1 root root   17052 2014-02-21 15:35 pbs_demux
-rwxr-xr-x 1 root root 1243559 2014-02-21 15:35 pbs_mom
-rwxr-xr-x 1 root root  259859 2014-02-21 15:35 pbs_sched
-rwxr-xr-x 1 root root 1976214 2014-02-21 15:35 pbs_server
lrwxrwxrwx 1 root root       7 2014-02-21 15:35 qnoded -> pbs_mom
lrwxrwxrwx 1 root root       9 2014-02-21 15:35 qschedd -> pbs_sched
lrwxrwxrwx 1 root root      10 2014-02-21 15:35 qserverd -> pbs_server

_______________________________________________________________________

I can't figure out what's going wrong. Any feedback would be much
appreciated.
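
The daemons themselves are clearly there under /usr/local/sbin, so one
thing I still want to rule out is whether the client commands (qmgr,
qterm, qsub) were installed and whether the install directories are on
root's PATH when torque.setup runs. Roughly (assuming the default
--prefix=/usr/local, i.e. clients in /usr/local/bin and daemons in
/usr/local/sbin):

ls /usr/local/bin/qmgr /usr/local/bin/qterm /usr/local/bin/qsub    # were the clients installed?
echo $PATH                                          # are both directories listed?
export PATH=$PATH:/usr/local/bin:/usr/local/sbin    # if not, add them for this shell
./torque.setup ezhil                                # and retry the setup script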

Regards,
Ezil
Ezhilalan Ramalingam M.Sc.,DABR.,
Principal Physicist (Radiotherapy),
Medical Physics Department,
Cork University Hospital,
Wilton, Cork
Ireland
Tel. 00353 21 4922533
Fax.00353 21 4921300
Email: rb.ezhilalan at hse.ie 
-----Original Message-----
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of
torqueusers-request at supercluster.org
Sent: 20 February 2014 19:28
To: torqueusers at supercluster.org
Subject: torqueusers Digest, Vol 115, Issue 15



Today's Topics:

   1. Re: qsub and mpiexec -f machinefile (Tiago Silva (Cefas))
   2. Re: qsub and mpiexec -f machinefile (Michel Béland)
   3. Re: qsub and mpiexec -f machinefile (Tiago Silva (Cefas))
   4. Re: qsub and mpiexec -f machinefile (Gus Correa)


----------------------------------------------------------------------

Message: 1
Date: Thu, 20 Feb 2014 09:51:24 +0000
From: "Tiago Silva (Cefas)" <tiago.silva at cefas.co.uk>
Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID:
	<10AEF77DEA6D7845AF91E549D3C7505B5AEB2E at w14mbx06.gb.so.ccs>
Content-Type: text/plain; charset="windows-1252"

Thanks, this seems promising. Before I try building with openmpi, if I
parse PBS_NODEFILE to produce my own machinefile for mpiexec, for
instance following my previous example:

n100
n100
n101
n101
n101
n101

won't mpiexec start MPI processes with ranks 0-1 on n100 and with
ranks 2-5 on n101? That is what I think it does when I don't use qsub.
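
Concretely, what I had in mind inside the job script is roughly this
(untested sketch; the machinefile name is just an example):

# keep exactly the hosts Torque allocated, one line per requested core
MACHINEFILE=$PBS_O_WORKDIR/machinefile.$PBS_JOBID
sort $PBS_NODEFILE > $MACHINEFILE
mpiexec -f $MACHINEFILE -np $(wc -l < $MACHINEFILE) ./bin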

Tiago

> -----Original Message-----
> From: torqueusers-bounces at supercluster.org [mailto:torqueusers-
> bounces at supercluster.org] On Behalf Of Gus Correa
> Sent: 19 February 2014 15:11
> To: Torque Users Mailing List
> Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
> 
> Hi Tiago
> 
> The Torque/PBS node file is available to your job script through the
> environment variable $PBS_NODEFILE.
> This file has one line listing the node name for each processor/core
> that you requested.
> Just do a "cat $PBS_NODEFILE" inside your job script to see how it
> looks.
> Inside your job script, and before the mpiexec command, you can run a
> brief auxiliary script to create the machinefile you need from the
> $PBS_NODEFILE.
> You will need to create this auxiliary script, tailored to your
> application.
> Still, this method won't bind the MPI processes to the appropriate
> hardware components (cores, sockets, etc.), in case this is also part
> of your goal.
> 
> Having said that, if you are using OpenMPI, it can be built with Torque
> support (with the --with-tm=/torque/location configuration option).
> This would give you a range of options on how to assign different
> cores, sockets, etc, to different MPI ranks/processes, directly in the
> mpiexec command, or in the OpenMPI runtime configuration files.
> This method wouldn't require creating the machinefile from the
> PBS_NODEFILE.
> This second approach has the advantage of allowing you to bind the
> processes to cores, sockets, etc.
> 
> I hope this helps,
> Gus Correa
> 
> On 02/19/2014 07:40 AM, Tiago Silva (Cefas) wrote:
> > Hi,
> >
> > My MPI code is normally executed across a set of nodes with something
> > like:
> >
> > mpiexec -f machinefile -np 6 ./bin
> >
> > where the machinefile has 6 entries with node names, for instance:
> >
> > n01
> >
> > n01
> >
> > n02
> >
> > n02
> >
> > n02
> >
> > n02
> >
> > Now the issue here is that this list has been optimised to balance the
> > load between nodes and to reduce internode communication. So for
> > instance model domain tiles 0 and 1 will run on n01 while tiles 2 to 5
> > will run on n02.
> >
> > Is there a way to integrate this into qsub since I don't know which
> > nodes will be assigned before submission? Or in other words can I
> > control grouping processes in one node?
> >
> > In my example I used 6 processes for simplicity but normally I
> > parallelise across 4-16 nodes and >100 processes.
> >
> > Thanks,
> >
> > tiago

------------------------------

Message: 2
Date: Thu, 20 Feb 2014 08:09:50 -0500
From: Michel Béland <michel.beland at calculquebec.ca>
Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
To: torqueusers at supercluster.org
Message-ID: <5305FE9E.3080103 at calculquebec.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Tiago Silva (Cefas) wrote:

> Thanks, this seems promising. Before I try building with openmpi, if I
> parse PBS_NODEFILE to produce my own machinefile for mpiexec, for
> instance following my previous example:
>
> n100
> n100
> n101
> n101
> n101
> n101
>
> won't mpiexec start MPI processes with ranks 0-1 on n100 and with
> ranks 2-5 on n101? That is what I think it does when I don't use qsub.

Yes, but you should not change the nodes inside the $PBS_NODEFILE. You
can change the order, but do not delete machines or add new ones;
otherwise your MPI code will try to run on nodes belonging to other
jobs.

If you want to have exactly the nodes above, you can ask for 
-lnodes=n100:ppn=2+n101:ppn=4. If you only want two cores on the first 
node and four on the second but the specific nodes are irrelevant, you 
can ask for -lnodes=1:ppn=2+1:ppn=4 instead.
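
As qsub one-liners, those two requests would look something like this
(with job.sh standing in for your submission script):

qsub -l nodes=n100:ppn=2+n101:ppn=4 job.sh   # exactly these two hosts
qsub -l nodes=1:ppn=2+1:ppn=4 job.sh         # two cores on one node, four on another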

Michel Béland
Calcul Québec



------------------------------

Message: 3
Date: Thu, 20 Feb 2014 13:32:48 +0000
From: "Tiago Silva (Cefas)" <tiago.silva at cefas.co.uk>
Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID:
	<10AEF77DEA6D7845AF91E549D3C7505B5AEBCC at w14mbx06.gb.so.ccs>
Content-Type: text/plain; charset="windows-1252"

Sure, I will want to stick to the exact same nodes. In my case I don't
need to worry about free slots on the nodes, as I am requesting exclusive
usage with -W x="NACCESSPOLICY:SINGLEJOB".
I actually oversubscribe the cores, as some processes have very little to
do; that is part of the performance optimisation I want to retain.
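
So the whole request ends up looking roughly like this at the top of the
job script (a sketch with the hostnames from my earlier example; the -np
value is just illustrative):

#PBS -l nodes=n100:ppn=2+n101:ppn=4
#PBS -W x="NACCESSPOLICY:SINGLEJOB"

# deliberately oversubscribed: more ranks than the six requested cores
mpiexec -f machinefile -np 8 ./bin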

Thanks again
tiago


------------------------------

Message: 4
Date: Thu, 20 Feb 2014 14:12:10 -0500
From: Gus Correa <gus at ldeo.columbia.edu>
Subject: Re: [torqueusers] qsub and mpiexec -f machinefile
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID: <5306538A.5090300 at ldeo.columbia.edu>
Content-Type: text/plain; charset=windows-1252; format=flowed

Hi Tiago

Which MPI and which mpiexec are you using?
I am not familiar with all of them, but the behavior
depends primarily on which one you are using.
Most likely, by default you will get the sequential
rank-to-node mapping that you mentioned.
Have you tried it?
What result did you get?

You can call the MPI function MPI_Get_processor_name early in your
code, say, right after MPI_Init, MPI_Comm_size, and MPI_Comm_rank, and
then print out the rank / processor-name pairs (the processor names
will probably be your nodes' names).

https://www.open-mpi.org/doc/v1.4/man3/MPI_Get_processor_name.3.php
http://www.mcs.anl.gov/research/projects/mpi/www/www3/MPI_Get_processor_name.html

With OpenMPI there are easier ways (through mpiexec) to report this 
information.
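For example, recent Open MPI versions accept something like

mpiexec --display-map -np 6 ./bin

which prints the rank-to-node map before launching; the exact option
name varies between Open MPI versions and between MPI implementations,
so check your mpiexec man page.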

However, there are ways to change the sequential
rank-to-node mapping, if this is your goal,
again, depending on which mpiexec you are using.

Anyway, this is more of an MPI question than a Torque question.

I hope this helps,
Gus Correa


------------------------------



End of torqueusers Digest, Vol 115, Issue 15
********************************************
-------------- next part --------------
A non-text attachment was scrubbed...
Name: install screen dump
Type: application/octet-stream
Size: 62926 bytes
Desc: install screen dump
Url : http://www.supercluster.org/pipermail/torqueusers/attachments/20140221/edd4cde2/attachment-0001.obj 

