[torqueusers] OSC mpiexec with torque on Fedora6

Tony Schreiner schreian at bc.edu
Wed Jan 31 13:27:51 MST 2007


On Jan 30, 2007, at 1:34 PM, Tony Schreiner wrote:

> I want to upgrade a cluster from Fedora 4 to Fedora 6, but am
> stuck on the OSC mpiexec part.
>
> I have torque 2.1.6-1 from the Fedora repo installed.
>
> mpiexec compiles fine; I used
> ./configure --with-default-comm=mpich-p4
>
> My script, dompi, is basically
>
> /path/to/mpiexec ./app
>
> I submit the dompi script with
> qsub -l nodes=nodeX dompi
>
> On the node I upgraded (node5), the error log shows
> mpiexec: Error: get_hosts: pbs_connect: no error.
>
> This is because pbs_connect(0) in get_hosts.c returns -1 for
> me on this node; on success it should return a non-negative
> connection handle.
>
> It still works on the other ones though.
>
> Is this some sort of host resolution error? Everything seems fine to me.
>

To answer my own question: I got the vital clue from Pete
Wyckoff at OSC. The error pointed to a problem with the pbs_iff program,
which pbs_connect runs to authenticate with the server.

I had installed the torque, torque-mom and libtorque RPMs from
Fedora, but not torque-client, which is where pbs_iff is
found. Once I installed it, the problem was solved.
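
For anyone hitting the same error, here is a minimal sketch of a check for the missing helper on a compute node. The function name find_helper and the list of fallback directories are my own assumptions for illustration, not anything taken from Torque or mpiexec; adjust the directories to match where your packages install their binaries.

```shell
#!/bin/sh
# Sketch: report whether a helper program such as pbs_iff is present.
# find_helper is a hypothetical name; the sbin directories below are
# assumed locations, since sbin is often absent from a user's PATH.

find_helper() {
    # Print the full path of the named program, or return 1 if absent.
    name=$1
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
        return 0
    fi
    for dir in /usr/sbin /usr/local/sbin /sbin; do
        if [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

if found=$(find_helper pbs_iff); then
    echo "pbs_iff found at $found"
else
    echo "pbs_iff not found; on Fedora it ships in the torque-client package"
fi
```

Running this on each node should show quickly which ones lack the client package. In my case the fix was to install the torque-client RPM (for example with yum install torque-client) on the upgraded node.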

Tony Schreiner


