[torqueusers] The strange issue when submit job with pbs: libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes. and so on.
lloyd_brown at byu.edu
Mon Nov 14 09:57:51 MST 2011
Well, first of all, if you are using OpenMPI with Torque, then you
really should recompile it with TM API support, so that the pbs_mom
daemons on the remote nodes become the parent processes of the
corresponding user processes. To check whether your OpenMPI build
includes it, run something like "ompi_info | grep tm" and look for the
"ras" and "plm" components. Here's mine (ignore the ptmalloc line; it
only matches because "ptmalloc2" contains "tm"):
> $ ompi_info | grep tm
> MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.4.2)
> MCA ras: tm (MCA v2.0, API v2.0, Component v1.4.2)
> MCA plm: tm (MCA v2.0, API v2.0, Component v1.4.2)
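
If you want to script that check, here is a minimal sketch. It assumes
the ompi_info output has the same "MCA ras: tm" / "MCA plm: tm" layout
as mine above; the function name has_tm_support is just something made
up for this example, and the pattern may need adjusting for other
OpenMPI versions.

```shell
#!/bin/sh
# Hypothetical helper: reads ompi_info output on stdin and succeeds
# only if a "tm" ras or plm component is listed. The trailing space in
# the pattern keeps lines like "MCA memory: ptmalloc2" from matching.
has_tm_support() {
    grep -Eq 'MCA (ras|plm): tm '
}

# Only run the check if ompi_info is actually on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_tm_support; then
        echo "OpenMPI appears to be built with TM support"
    else
        echo "no tm components found; OpenMPI probably lacks TM support"
    fi
fi
```

Matching on the component type rather than a bare "tm" avoids the
false hit on the ptmalloc line.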
Even without that, the first process in the list will STILL be launched
as a child of pbs_mom; TM support is what lets the *other* processes be
launched as children of their local pbs_mom as well.
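
Since a child process inherits its parent's resource limits, one way to
see what a job actually gets is to print the locked-memory limit from
inside the job script. This is only a sketch: memlock_status is a
made-up helper name, and it assumes a shell whose "ulimit -l" reports
the limit in 1024-byte units (as bash does) or prints "unlimited".

```shell
#!/bin/sh
# Hypothetical helper: takes the value printed by `ulimit -l` and
# reports it in bytes, so it can be compared with the libibverbs
# warning (which quotes RLIMIT_MEMLOCK in bytes).
memlock_status() {
    case "$1" in
        unlimited)   echo "ok: unlimited" ;;
        ''|*[!0-9]*) echo "unknown: $1" ;;
        *)           echo "limited: $(( $1 * 1024 )) bytes" ;;
    esac
}

# Run this inside the job (or on each node) to see the effective limit.
memlock_status "$(ulimit -l)"
```

A reported value of 32 would correspond to the 32768 bytes in the
warning from the subject line.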
Having said all that, if you have something that launches remote
processes some other way, like some other MPI implementation or some
commercial package (these usually use ssh or rsh), then I'm not sure. Maybe
the SSH daemon config or startup script? Maybe somewhere in
On 11/14/11 8:07 AM, Stijn De Weirdt wrote:
> hi lloyd,
> we have the -n limit set like this as well, but if an application
> doesn't use torque to start the mpi processes on the other nodes (eg
> some mpi build without torque support and relying on eg ssh), how is
> this limit then set?