[torqueusers] exec of shell '/usr/sbin/pbs_demux' failed
aaderhold at gmail.com
Tue Mar 6 09:18:41 MST 2007
I'm using the batch system in a distributed manner with pbsdsh, and
my jobs usually run for several days. It works fine, but from time to
time the following error message is generated:
PBS: exec of shell '/usr/sbin/pbs_demux' failed
/var/torque/mom_priv/jobs/24432.k10.b.SC: line 19: pbsdsh: command not found
when I try to submit a new job. Once this happens, all subsequent
jobs can fail the same way for several minutes or even hours.
Something seems to change in the state of the batch system nodes at times.
When I then start an interactive qsub session in this situation with
several nodes, e.g. like this
> qsub -l nodes=10 -I
the first line of the error message (PBS: exec of shell
'/usr/sbin/pbs_demux' failed) appears. I wonder if this might be due
to some port or pbs_mom status change.
Has anybody come across this problem? Thanks for any help!
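
One way to narrow this down might be a small check run on an affected node while the failure is occurring. The sketch below only tests whether the two programs named in the error messages are present and executable from that node's environment. The helper name check_tool is made up for illustration, the pbs_demux path is simply the one from the error message, and a POSIX shell is assumed; it does not query pbs_mom or any ports.

```shell
#!/bin/sh
# Hypothetical helper: report whether a program is present and executable.
# Accepts either an absolute path or a bare command name looked up in PATH.
check_tool() {
    case "$1" in
        /*) target="$1" ;;
        *)  target=$(command -v "$1" 2>/dev/null) ;;
    esac
    if [ -n "$target" ] && [ -x "$target" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
    fi
}

# Record which node and which search path the check ran with.
echo "host: $(uname -n)"
echo "PATH: $PATH"

check_tool /usr/sbin/pbs_demux   # path taken from the error message
check_tool pbsdsh                # looked up in PATH, as the job script does
```

Running the same lines at the top of a job script would show the environment the job itself sees, which may differ from a login shell on the same node.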