[torqueusers] Warning against using Torque 2.1.5?
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Tue Oct 24 00:56:16 MDT 2006
Since upgrading to Torque 2.1.5 we have seen a number of jobs
hang mysteriously without doing any work. These jobs are all single-node
Open MPI jobs using 4 CPUs. In /var/log/messages I see:
Oct 24 08:46:08 n057 pbs_mom: File exists (17) in open_std_out_err, Unable to open standard output/error
Oct 24 08:46:08 n057 pbs_mom: Inappropriate ioctl for device (25) in start_process, cannot open job stderr/stdout files
and the MOM log says:
10/24/2006 08:46:08;0001; pbs_mom;Job;6614.audhumbla.fysik.dtu.dk;task not started, 'orted', stdio setup failed (see syslog)
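For anyone who wants to check whether their own nodes are affected, here is a minimal Python sketch that scans log lines for the two signatures quoted above. It is not part of Torque: the function names, the regular expressions, and the sample lists are assumptions made for illustration, and the patterns are derived only from the messages shown in this post, so other Torque versions may word them differently.

```python
import re

# Signatures taken from the syslog and MOM log excerpts quoted in this post.
# All names below are illustrative; nothing here is a Torque API.
SYSLOG_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+pbs_mom:\s+"
    r".*(?:open_std_out_err|start_process)"
)
MOM_RE = re.compile(
    r"pbs_mom;Job;(?P<job>[^;]+);task not\s+started.*stdio setup failed"
)


def affected_hosts(syslog_lines):
    """Return the sorted host names that logged a pbs_mom stdio error."""
    hosts = set()
    for line in syslog_lines:
        match = SYSLOG_RE.match(line)
        if match:
            hosts.add(match.group("host"))
    return sorted(hosts)


def affected_jobs(mom_lines):
    """Return the sorted job IDs whose task failed during stdio setup."""
    jobs = set()
    for line in mom_lines:
        match = MOM_RE.search(line)
        if match:
            jobs.add(match.group("job"))
    return sorted(jobs)


# The first two syslog lines and the MOM line are the ones quoted above
# (unwrapped); the third syslog line is a made-up unrelated entry.
SYSLOG_SAMPLE = [
    "Oct 24 08:46:08 n057 pbs_mom: File exists (17) in open_std_out_err, "
    "Unable to open standard output/error",
    "Oct 24 08:46:08 n057 pbs_mom: Inappropriate ioctl for device (25) in "
    "start_process, cannot open job stderr/stdout files",
    "Oct 24 08:46:09 n058 kernel: unrelated message",
]
MOM_SAMPLE = [
    "10/24/2006 08:46:08;0001; pbs_mom;Job;6614.audhumbla.fysik.dtu.dk;"
    "task not started, 'orted', stdio setup failed (see syslog)",
]

if __name__ == "__main__":
    print(affected_hosts(SYSLOG_SAMPLE))
    print(affected_jobs(MOM_SAMPLE))
```

To run it against real logs, pass the lines of /var/log/messages to affected_hosts() and the lines of the MOM log file to affected_jobs(); the location of the MOM log depends on the local Torque installation.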
I guess that these problems may be related to Garrick's note in
http://www.clusterresources.com/pipermail/torquedev/2006-October/000356.html
Perhaps it is necessary to issue a general warning against using
Torque 2.1.5, and to postpone upgrading until 2.1.6 is available?
--
Ole Holm Nielsen
Department of Physics, Technical University of Denmark