[torqueusers] Torque is not handling jobs to nodes
Brock Palen
brockp at umich.edu
Fri Sep 26 16:13:40 MDT 2008
Maybe your OpenMPI did not find the location of libtorque when it was
built.
Run the command:
ompi_info
And make sure you see something like:
MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.7)
MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.7)
If you do, then your OpenMPI has tm (PBS/Torque) support, so mpirun
can start processes on the nodes Torque allocated to the job.
If not, and you have libtorque in a non-standard location, rebuild
OpenMPI with:
./configure --with-tm=/path/to/torque
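
If you want to script that check, something like the following rough sketch should do. The function name has_tm_support is made up for illustration, and it assumes ompi_info prints its components in the "MCA <framework>: <component>" form shown above. It also accepts "plm", since the pls framework was replaced by plm in OpenMPI 1.3 and later.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenMPI or Torque): reads ompi_info
# output on stdin and succeeds if a tm component is listed for the
# ras, pls, or plm framework.
has_tm_support() {
    grep -Eq 'MCA (ras|pls|plm): tm '
}

# Only query the real installation if ompi_info is on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_tm_support; then
        echo "tm support found"
    else
        echo "tm support missing; rebuild with ./configure --with-tm=/path/to/torque"
    fi
fi
```

Run it with the same environment (PATH, modules) your jobs use, so it inspects the ompi_info that belongs to the mpirun you actually call.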
kthxbye
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985
On Sep 26, 2008, at 4:57 PM, Zhiliang Hu wrote:
> I have a situation:
>
>> qsub -l nodes=6:ppn=2 /path/to/mpi_program
>
> where "mpi_program" is:
>
> /path/to/mpirun -np 12 /path/to/my_program
>
> -- Everything went to run on head node (one time on the first
> compute node). Jobs can be done anyway.
>
> While the mpirun can run on its own by specifying a "-machinefile",
> it is pointed out by Glen, and also on this web site
> http://wiki.hpc.ufl.edu/index.php/Common_Problems that it's not a good
> idea to provide machinefile since it's "already handled by OpenMPI
> and Torque".
>
> But in my case, why are OpenMPI and Torque not handing the jobs
> to the nodes?
>
> Zhiliang