[torqueusers] OSC mpiexec with torque on Fedora6
schreian at bc.edu
Wed Jan 31 13:27:51 MST 2007
On Jan 30, 2007, at 1:34 PM, Tony Schreiner wrote:
> I want to upgrade a cluster from Fedora 4 to Fedora 6, but I am
> stuck on the OSC mpiexec part.
> I have torque 2.1.6-1 from the Fedora repo installed.
> mpiexec compiles fine; I used
> ./configure --with-default-comm=mpich-p4
> My script, dompi, is basically
> /path/to/mpiexec ./app
> I submit the dompi script with
> qsub -l nodes=nodeX dompi
> On the node I upgraded (node5), I get this in the error log:
> mpiexec: Error: get_hosts: pbs_connect: no error.
> This is because pbs_connect(0) in get_hosts.c returns -1 for
> me on this node; I guess it's supposed to return the number of
> available nodes.
> It still works on the other nodes, though.
> Some sort of host resolution error? Everything seems fine to me.
To answer my own question: I got the vital clue from Pete
Wyckoff at OSC. The error pointed to a problem with the pbs_iff program.
I had installed the torque, torque-mom and libtorque RPMs from
Fedora, but had not installed torque-client, which is where pbs_iff is
found. Once I installed it, the problem was solved.
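
For anyone who hits the same error, a quick check along these lines might
save some time. This is only a sketch: the function names are made up for
illustration, and it assumes pbs_iff is expected to be on the PATH of the
user running the job, which may not match every install.

```shell
# Hypothetical helper: succeed if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical check: report whether pbs_iff is available on this node.
# Assumption: pbs_iff is looked up on PATH; adjust if your install puts
# it somewhere else.
check_torque_client() {
    if have_cmd pbs_iff; then
        echo "pbs_iff found: $(command -v pbs_iff)"
    else
        echo "pbs_iff missing: install the torque-client package"
        return 1
    fi
}
```

Running check_torque_client on each compute node should show which ones
lack pbs_iff; on the Fedora packages described above, the fix was to
install torque-client.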