[torqueusers] Torque with OpenMPI

Jozef Káčer quickparser at gmail.com
Thu Feb 21 12:27:01 MST 2008


You may be right. I'm no programmer ;)
However, I have one more question for you. I have 11 working nodes, each
with 2 processors (actually 2 logical cores on a P4 with HT), so I have 22
processors in total, and Torque recognizes all of them.
When I try to submit a job to more than 11 nodes, it won't let me.

I can't tell you the exact message, as I don't have access to my cluster (not
even remotely) at the moment.
Is there a way to set it up so that a job can use all 22 processors? I'm sorry
I can't give you any further details now.
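
A note on the arithmetic: 11 nodes with 2 processors each give 22 slots.
Torque resource requests are normally written as nodes=N:ppn=M rather than as
a bare processor count, so one thing that might be worth trying (an
assumption, since the exact error message is not available here) is asking
for 11 nodes with 2 processors per node. A minimal Python sketch of that
calculation follows; the helper name torque_nodes_spec is hypothetical, and
it assumes every node offers the same number of processors:

```python
def torque_nodes_spec(total_procs, procs_per_node):
    """Return a 'nodes=N:ppn=M' string covering total_procs processors.

    Hypothetical helper: assumes identical nodes and a processor count
    that divides evenly across them.
    """
    if total_procs <= 0 or procs_per_node <= 0:
        raise ValueError("counts must be positive")
    nodes, remainder = divmod(total_procs, procs_per_node)
    if remainder:
        raise ValueError("total_procs must be a multiple of procs_per_node")
    return f"nodes={nodes}:ppn={procs_per_node}"


if __name__ == "__main__":
    # 22 processors spread over nodes that expose 2 processors each
    print(torque_nodes_spec(22, 2))
```

The resulting string would be passed to qsub with the -l option, for example
qsub -l nodes=11:ppn=2 job.sh, where job.sh stands in for the actual job
script.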

Anyway, thank you very much for that code. It works 100% now.

On Thu, Feb 21, 2008 at 7:55 PM, Craig West <cwest at astro.umass.edu> wrote:

> Jozef,
>
> There isn't actually a processor lost. I just guessed at how the code
> worked before I had seen the code itself. Looking at the code, you can
> see that the first processor sends messages to, and receives messages
> from, all the other processors. It doesn't send one to itself.
>
> >
> > It seems to me that one processor is still lost, but I have no bug
> > info with this.
> > However, when I run it using torque, the job seems to hang. 'showq'
> > shows that the job is running, but it never finishes.
> >
> > All my nodes are running now. qstat -f tells me that the job was
> > assigned to these hosts:
> >
> >     exec_host = f135-15/1+f135-15/0+f135-14/1+f135-14/0+f135-13/1+f135-13/0+f135-12/0
> >
> > I'm thankful for your time and effort.
>
>
>