[torqueusers] Torque is not handling jobs to nodes

Brock Palen brockp at umich.edu
Fri Sep 26 16:13:40 MDT 2008


Maybe your OpenMPI did not find the location of libtorque when it was
built.

Run the command:

ompi_info

And make sure you see something like:

MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.7)
MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.7)
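
The check above could be scripted roughly as follows. This is only a sketch: the function name has_tm_support is made up for illustration, and the pattern simply looks for any MCA framework line that lists a "tm" component, on the assumption that your ompi_info prints lines shaped like the two shown above.

```shell
# Hypothetical helper: succeeds if the ompi_info output on stdin
# contains an MCA line naming the "tm" component (e.g. "MCA ras: tm (...)").
has_tm_support() {
    grep -Eq 'MCA [a-z]+: tm '
}

# Only query the real ompi_info when it is actually on the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_tm_support; then
        echo "tm component found"
    else
        echo "no tm component found"
    fi
fi
```

The function reads from stdin so it can also be tried on saved ompi_info output from another machine.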

If you do, then your OpenMPI has tm (PBS/Torque) support to start jobs.
If not, and you have libtorque in a non-standard location, rebuild
OpenMPI with:

./configure --with-tm=/path/to/torque
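
After a rebuild, one way to see what Torque actually handed to a job is to look at the node file from inside the job script. The sketch below assumes the job runs under Torque, which sets PBS_NODEFILE to a file with one line per allocated slot; the helper name count_hosts is made up for illustration.

```shell
# Hypothetical helper: count distinct hosts in a Torque node file
# (the file has one line per allocated slot, so hosts repeat).
count_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a job Torque sets PBS_NODEFILE; outside a job this is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "allocated hosts: $(count_hosts "$PBS_NODEFILE")"
    echo "allocated slots: $(wc -l < "$PBS_NODEFILE" | tr -d ' ')"
fi
```

For a request like nodes=6:ppn=2 you would expect 6 hosts and 12 slots. This only shows the allocation, not where mpirun started processes; running hostname under mpirun in the same job is a quick way to check that part.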

kthxbye

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985



On Sep 26, 2008, at 4:57 PM, Zhiliang Hu wrote:
> I have a situation:
>
>> qsub -l nodes=6:ppn=2 /path/to/mpi_program
>
> where "mpi_program" is:
>
> /path/to/mpirun -np 12 /path/to/my_program
>
> -- Everything ran on the head node (one time on the first
> compute node).  The jobs still complete.
>
> While the mpirun can run on its own by specifying a "-machinefile",
> it is pointed out by Glen, and also on this web site
> http://wiki.hpc.ufl.edu/index.php/Common_Problems that it's not a good
> idea to provide machinefile since it's "already handled by OpenMPI
> and Torque".
>
> But in my case, why are OpenMPI and Torque not handing the jobs
> to the nodes?
>
> Zhiliang
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>


