[torqueusers] can't run jobs over 62 nodes 2ppn on OS X
Glen Beane
beaneg at umcs.maine.edu
Fri Sep 24 12:59:06 MDT 2004
I get "Invalid argument" errors in open_demux every time I try to start
a job using more than 62 nodes.
I noticed that sysconf(_SC_OPEN_MAX) always returns 255 on my Apples.
I hacked mpiexec and pbs_demux to use 1024 instead of the value returned
by sysconf(_SC_OPEN_MAX), but that didn't help.
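
One thing that may be worth checking: substituting a larger number for the
sysconf(_SC_OPEN_MAX) result inside a program does not change the
per-process descriptor limit that the kernel enforces (RLIMIT_NOFILE).
Below is a minimal sketch, in Python for brevity, that prints both values
and tries to raise the soft limit. The helper names open_file_limits and
raise_soft_limit are made up for this illustration and are not part of
TORQUE or mpiexec. Whether this limit has anything to do with the
open_demux failure is an assumption; the logs do not establish it.

```python
import os
import resource


def open_file_limits():
    """Return (sysconf value, soft limit, hard limit) for open descriptors."""
    sc_open_max = os.sysconf("SC_OPEN_MAX")
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return sc_open_max, soft, hard


def raise_soft_limit(target):
    """Try to raise the soft descriptor limit to target; return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the hard limit, unless the hard limit is unlimited.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Nothing to do if the soft limit is already unlimited or high enough.
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        pass  # the kernel may refuse the request; report whatever is in effect
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("sysconf / soft / hard:", open_file_limits())
    print("soft limit after raise attempt:", raise_soft_limit(1024))
```

If the soft limit stays low after the raise attempt, the limit would have
to be raised in the environment that starts the daemon (for example with
ulimit -n in the shell that launches pbs_mom) rather than inside the code.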

09/24/2004 14:21:17;0008; pbs_mom;Job;1251.bender.bender.clusters.umaine.edu;JOIN JOB as node 63
09/24/2004 14:21:19;0008; pbs_mom;Job;1251.bender.bender.clusters.umaine.edu;start_process: task started, tid 128, sid 831, cmd /bin/sh
09/24/2004 14:21:21;0001; pbs_mom;Svr;pbs_mom;Invalid argument (22) in open_demux, open_demux: connect 10.0.1.73:49512
09/24/2004 14:21:21;0001; pbs_mom;Svr;pbs_mom;Invalid argument (22) in search_env_and_open, failed connect to mpiexec process on MS
09/24/2004 14:21:21;0001; pbs_mom;Svr;pbs_mom;Invalid argument (22) in search_env_and_open, MPIEXEC_STDOUT_PORT=49512

Does anyone have any ideas?