[torqueusers] mpiexec and mixing timeshared/job-exclusive nodes
carheden at cira.colostate.edu
Wed Mar 8 17:40:14 MST 2006
I'm setting up Torque and mpiexec on a cluster to run an MPI job that doesn't treat all nodes as equals. More specifically, the master node allocates lots of memory but does very little processing, while the compute nodes need less memory but do lots of CPU work. With mpirun, things work fine because you run it on the master node, and that node always ends up first in the MPI node list. With mpiexec and Torque, however, node3 of my 3-node test cluster always ends up as MPI node 0. Furthermore, the job gets job-exclusive access to that node, preventing other jobs from running there.
Ideally, what I need is a PBS script that lets me request one specific node (which happens to be my PBS server as well) for time-shared access, plus X additional nodes for job-exclusive access. I then need mpiexec to pick up the time-shared node as the first MPI node and the rest as the additional MPI nodes.
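To illustrate only the node ordering described above, here is a rough Python sketch. It assumes the allocated hosts are available as a plain list of hostnames, one per line (as in the file Torque names in $PBS_NODEFILE). The helper names master_first and read_nodefile and the hostnames are hypothetical. It does not address time-shared versus job-exclusive allocation, and whether mpiexec would honor a reordered list is not established here.

```python
def master_first(hosts, master):
    """Return hosts with every entry for `master` moved to the front.

    The relative order of the remaining hosts is preserved.
    """
    front = [h for h in hosts if h == master]
    rest = [h for h in hosts if h != master]
    if not front:
        raise ValueError(f"{master!r} is not in the host list")
    return front + rest


def read_nodefile(path):
    """Read a nodefile: one hostname per line, blank lines ignored."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]


# Hypothetical 3-node allocation in which node3 is listed first.
ordered = master_first(["node3", "node2", "node1"], "node1")
```

With the hypothetical list above, the master would come first and the compute nodes would keep their original relative order.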
Does anyone know if this is possible and/or how to do it?