[torqueusers] Multi-node jobs get one node

Glen Beane glen.beane at gmail.com
Fri Oct 3 21:11:49 MDT 2008


On Thu, Oct 2, 2008 at 11:03 AM, Glen Jungels <jungelsman at hotmail.com> wrote:

> I'm a new torque user.  In fact, this is for my directed project for my
> masters degree, so I hope someone can help.  My small cluster consists of 4
> physical nodes and 4 virtual machine nodes.  I can ssh from any machine to
> any machine w/o being prompted for a password.  My shared home folder is on
> a Lustre file system.  None of the Torque binaries or libraries are located
> there.  I am also using Maui.  When I submit jobs, even when I specify
> specific machines, only the first node in the list from exehosts processes.
> I thought it may be node specific, so I killed pbs_mom on every node except
> two, and no matter which node is first in the list, it is the only one that
> processes.  My executables are NPB 2.4.  I've tried both the CG benchmark as
> well as the EP benchmark.  What on earth is going on?
>

We'll need more information about what you are trying to do.  Are these MPI
programs?  Can you share one of your job scripts?
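
One piece of information that is easy to gather is what Torque actually
allocated to the job.  The following is only a minimal sketch, not a
diagnosis: it assumes it is run from inside a job script, where Torque sets
the PBS_NODEFILE environment variable to a file with one line per allocated
slot.  The function name slots_per_host is made up for illustration.

```python
import os
from collections import Counter


def slots_per_host(nodefile_path):
    """Return a Counter mapping each hostname in the node file to its slot count."""
    with open(nodefile_path) as f:
        # One hostname per line, repeated once per allocated slot; skip blanks.
        hosts = [line.strip() for line in f if line.strip()]
    return Counter(hosts)


if __name__ == "__main__":
    # Torque sets PBS_NODEFILE inside a running job; it is absent elsewhere.
    path = os.environ.get("PBS_NODEFILE")
    if path is None:
        print("PBS_NODEFILE is not set; run this from inside a Torque job")
    else:
        counts = slots_per_host(path)
        for host, slots in sorted(counts.items()):
            print(f"{host}: {slots} slot(s)")
        print(f"{len(counts)} distinct host(s) allocated")
```

If this reports several hosts while only the first one does any work, one
common cause (an assumption here, since the job script has not been posted)
is that the program is started directly rather than through an MPI launcher
that reads the node list.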