[torqueusers] No use time for parallel jobs
martin.schaffoener at e-technik.uni-magdeburg.de
Thu Feb 2 07:26:27 MST 2006
On Thursday 02 February 2006 14:11, Adrien Leygue wrote:
> All jobs that are requiring more than one processor have a "time use"
> of 0, even though some of them have been running for days!
How are the processes on the parallel jobs' nodes started? If they are started
with rsh (which mpirun also uses, for example), Torque cannot know anything
about the spawned processes or their resource usage. It accounts only for the
children of the PBS script (i.e. mpirun and its rsh child processes on the
node running the script), not for the processes on the remote nodes.
Try Pete Wyckoff's mpiexec, which spawns remote processes through Torque's TM
interface, so that Torque starts them itself and can account for them.
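
To make the difference concrete, here is a toy model in Python. It is only a
sketch and not how Torque is implemented: the Proc record, the job_root flag,
the host names, the PIDs and the CPU figures are all made up for illustration.
The one assumption is that a process counts toward the job only if it descends
from something the batch system itself started for that job.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Proc:
    """Hypothetical process record; not a Torque data structure."""
    host: str
    pid: int
    ppid: Optional[int]      # parent PID on the same host, None if unknown
    name: str
    cpu_seconds: int
    job_root: bool = False   # True if the batch system started it for the job


def accounted_cpu(procs: Iterable[Proc]) -> int:
    """Sum CPU time of processes that descend from a job root on their host."""
    procs = list(procs)
    by_key = {(p.host, p.pid): p for p in procs}

    def belongs_to_job(p: Optional[Proc]) -> bool:
        seen = set()
        # Walk up the parent chain on the same host until a job root is found.
        while p is not None and (p.host, p.pid) not in seen:
            if p.job_root:
                return True
            seen.add((p.host, p.pid))
            p = by_key.get((p.host, p.ppid)) if p.ppid is not None else None
        return False

    return sum(p.cpu_seconds for p in procs if belongs_to_job(p))


# Case 1: mpirun starts the remote worker through rsh. On node2 the worker
# is a child of rshd, so nothing links it to the job.
rsh_launch = [
    Proc("node1", 100, 1, "job_script", 1, job_root=True),
    Proc("node1", 101, 100, "mpirun", 2),
    Proc("node1", 102, 101, "rsh", 1),
    Proc("node2", 50, 1, "rshd", 0),
    Proc("node2", 51, 50, "worker", 86400),
]

# Case 2: the launcher asks the batch system to start the remote worker,
# so the worker is itself a job root on node2.
tm_launch = [
    Proc("node1", 100, 1, "job_script", 1, job_root=True),
    Proc("node1", 101, 100, "mpiexec", 2),
    Proc("node2", 60, 1, "worker", 86400, job_root=True),
]

print("rsh launch, accounted CPU seconds:", accounted_cpu(rsh_launch))
print("TM launch,  accounted CPU seconds:", accounted_cpu(tm_launch))
```

In the rsh case only the few seconds used on the first node are counted, no
matter how long the remote worker runs; in the TM case the worker's time is
included as well.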
Cognitive Systems Group, Institute of Electronics, Signal Processing and
Communication Technologies, Department of Electrical Engineering,
Otto-von-Guericke University Magdeburg
Phone: +49 391 6720063