[torqueusers] torque-2.2.1 pbs_mom cannot write to its log
zorba at vt.edu
Fri Apr 24 10:19:27 MDT 2009
I'm having a problem that started within the last month on some of
our existing SGI Altix servers, which are all running
torque-2.2.1 pbs_mom. These pbs_mom's are talking to a torque-2.1.8
pbs_server, which is talking to a moab-5.2.1 scheduler. The problem
seems to have started after we added a second Torque server --
running torque-2.3.6 and serving the torque-2.3.6 pbs_mom's of a new
x86-64 cluster -- and configured our main (and only) Moab server to
additionally talk to that second pbs_server. Here is what I've found:
These daemon messages are appearing in the system logs:
pbs_mom: Broken pipe (32) in log_record, PBS cannot write to its log
Log files in /var/spool/torque/mom_logs start out fine but become
corrupted.
The log file does not show up in the output of "lsof -p `pidof pbs_mom`" or
even "lsof | grep mom_logs" (it does on the systems that don't have
log file corruption).
This seems to be initiated sometime after the start of new jobs.
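For anyone who wants to cross-check the lsof result on their own nodes, here is a minimal sketch that reads the descriptor table straight from /proc instead. It assumes Linux with /proc mounted and a readlink binary available; the function name open_fds_matching is made up for illustration and is not part of Torque.

```shell
#!/bin/sh
# Hypothetical helper (not part of Torque): list the open file descriptors
# of a process whose link target contains a given substring.
# Usage: open_fds_matching PID PATTERN
open_fds_matching() {
    pid=$1
    pattern=$2
    for fd in /proc/"$pid"/fd/*; do
        # Each entry in /proc/PID/fd is a symlink to the open file.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *"$pattern"*) printf '%s -> %s\n' "${fd##*/}" "$target" ;;
        esac
    done
}
```

Run as root on a node, something like open_fds_matching "$(pidof pbs_mom)" mom_logs should print one line per descriptor that points into mom_logs. No output would agree with what lsof reports on the affected systems.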
Anyone seen this type of behavior or have ideas?
Senior Systems Engineer
Systems Engineering & Administration