[torqueusers] Multi-node jobs get one node
Glen Beane
glen.beane at gmail.com
Fri Oct 3 21:11:49 MDT 2008
On Thu, Oct 2, 2008 at 11:03 AM, Glen Jungels <jungelsman at hotmail.com> wrote:
> I'm a new Torque user. In fact, this is for the directed project for my
> master's degree, so I hope someone can help. My small cluster consists of 4
> physical nodes and 4 virtual machine nodes. I can ssh from any machine to
> any other machine without being prompted for a password. My shared home
> folder is on a Lustre file system. None of the Torque binaries or libraries
> are located there. I am also using Maui. When I submit jobs, even when I
> request specific machines, only the first node in the exec_host list does
> any processing. I thought it might be node specific, so I killed pbs_mom on
> every node except two, and no matter which node is first in the list, it is
> the only one that does any processing. My executables are from NPB 2.4.
> I've tried both the CG benchmark and the EP benchmark. What on earth is
> going on?
>
We'll need more information about what you are trying to do. Are these MPI
programs? Can you share one of your job scripts?
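
In the meantime, one thing worth checking (this is an assumption, since the
job script has not been posted): Torque starts the job script only on the
first allocated node and lists every allocated node in the file named by
$PBS_NODEFILE, one line per processor slot. The other nodes do work only if
the script launches processes on them, for example through an MPI launcher
that reads that file. Below is a minimal Python sketch for inspecting the
allocation from inside a job; the function name slots_per_host is made up
for illustration.

```python
import os
from collections import Counter


def slots_per_host(nodefile_text):
    """Return a dict mapping each host name to its number of slots.

    The input is the text of a Torque node file, which holds one host
    name per line, repeated once per processor slot on that host.
    """
    hosts = [line.strip() for line in nodefile_text.splitlines() if line.strip()]
    return dict(Counter(hosts))


def main():
    # Torque sets PBS_NODEFILE in the environment of a running job.
    path = os.environ.get("PBS_NODEFILE")
    if path is None:
        print("PBS_NODEFILE is not set; probably not running inside a Torque job")
        return
    with open(path) as f:
        counts = slots_per_host(f.read())
    for host, n in counts.items():
        print(f"{host}: {n} slot(s)")


if __name__ == "__main__":
    main()
```

If this reports several hosts while only the first one is busy, the
allocation itself is fine and the launch step in the job script is the place
to look.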