[torqueusers] No use time for parallel jobs

Martin Schafföner martin.schaffoener at e-technik.uni-magdeburg.de
Thu Feb 2 07:26:27 MST 2006


On Thursday 02 February 2006 14:11, Adrien Leygue wrote:

> All jobs that are requiring more than one processor have a "time use"
> of 0, even though some of them have been running for days!

How are the processes on the parallel jobs' nodes started? If they are started 
with rsh (which mpirun also uses, for example), then Torque cannot know 
anything about the spawned processes or their resource usage. It accounts 
only for the children of the PBS script (i.e. any mpirun and rsh child 
processes on the node where the script runs), not for the processes on 
remote nodes.

Try Pete Wyckoff's mpiexec, which supports spawning remote processes through 
Torque's TM (task management) interface.
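
As a rough sketch of what that could look like, a job script might call 
mpiexec directly instead of an rsh-based mpirun. The resource request and 
the program name ./my_mpi_program below are placeholders I made up for 
illustration, and this assumes mpiexec is installed and in the PATH on the 
nodes:

```shell
#!/bin/sh
# Hypothetical resource request: 2 nodes with 2 processors each.
#PBS -l nodes=2:ppn=2
#PBS -l walltime=01:00:00

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# mpiexec asks Torque (via TM) to start the processes on the allocated
# nodes, so no machine file or rsh setup is passed here.
# ./my_mpi_program is a placeholder name for your own MPI binary.
mpiexec ./my_mpi_program
```

If the processes really are started through TM, "qstat -f <jobid>" should 
then show non-zero resources_used values while the job runs.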

Regards,
-- 
Martin Schafföner

Cognitive Systems Group, Institute of Electronics, Signal Processing and 
Communication Technologies, Department of Electrical Engineering, 
Otto-von-Guericke University Magdeburg
Phone: +49 391 6720063

