[torqueusers] Multi-node jobs get one node
glen.beane at gmail.com
Fri Oct 3 21:11:49 MDT 2008
On Thu, Oct 2, 2008 at 11:03 AM, Glen Jungels <jungelsman at hotmail.com> wrote:
> I'm a new Torque user. In fact, this is for my directed project for my
> master's degree, so I hope someone can help. My small cluster consists of 4
> physical nodes and 4 virtual machine nodes. I can ssh from any machine to
> any machine w/o being prompted for a password. My shared home folder is on
> a Lustre file system. None of the Torque binaries or libraries are located
> there. I am also using Maui. When I submit jobs, even when I request
> specific machines, only the first node in the exec_host list does any work.
> I thought it might be node-specific, so I killed pbs_mom on every node except
> two, and no matter which node is first in the list, it is the only one that
> does any work. My executables are NPB 2.4. I've tried both the CG benchmark as
> well as the EP benchmark. What on earth is going on?
We'll need more information about what you are trying to do. Are these MPI
programs? Can you share one of your job scripts?
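
For comparison, here is a minimal sketch of what a two-node MPI job script
might look like. Torque runs the script itself only on the first node in
exec_host, so the other nodes do work only if the script starts processes on
them through an MPI launcher (or pbsdsh). The job name, the binary name
./cg.A.2, and the mpirun options are assumptions for illustration; they
depend on your MPI installation and how NPB was built.

```shell
#!/bin/sh
#PBS -N npb-cg
#PBS -l nodes=2:ppn=1
#PBS -j oe

# Print the number of distinct hosts listed in a Torque node file.
count_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Only act when running under Torque, which sets PBS_NODEFILE.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    cd "$PBS_O_WORKDIR" || exit 1
    echo "hosts allocated: $(count_hosts "$PBS_NODEFILE")"

    # One MPI process per line in the node file.
    nprocs=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')

    # Assumption: an MPI launcher named mpirun is on PATH and accepts
    # -np and -machinefile; ./cg.A.2 is a placeholder binary name.
    mpirun -np "$nprocs" -machinefile "$PBS_NODEFILE" ./cg.A.2
fi
```

If "hosts allocated" reports 2 but only one node shows load, the launcher is
probably not using the node file, or the binary is being started directly
without mpirun.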