[torqueusers] torque-2.2.1 pbs_mom cannot write to its log

Bill Marmagas zorba at vt.edu
Fri Apr 24 10:19:27 MDT 2009

I'm having a problem that started within the last month on some of
our existing SGI Altix servers, which are all running
torque-2.2.1 pbs_mom.  These pbs_mom's are talking to a torque-2.1.8  
pbs_server, which is talking to a moab-5.2.1 scheduler.  The problem  
seems to have started after we added a second Torque server --  
running torque-2.3.6 and serving the torque-2.3.6 pbs_mom's of a new  
x86-64 cluster -- and configured our main (and only) Moab server to  
additionally talk to that second pbs_server.  Here is what I've found:

Log Messages

These daemon messages are appearing in the system logs:

pbs_mom: Broken pipe (32) in log_record, PBS cannot write to its log


Log files in /var/spool/torque/mom_logs start out fine but become corrupted.

Log file does not show up in output of "lsof -p `pidof pbs_mom`" or  
even "lsof | grep mom_logs" (it does on the systems that don't have  
log file corruption)
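
The lsof check above can also be scripted so it is easy to repeat across nodes. This is only a minimal sketch, assuming a Linux host with /proc mounted and enough privilege to read the daemon's fd directory; the function name has_open_log is made up for illustration and is not part of Torque.

```shell
# Hypothetical helper (not part of Torque): succeed if process $1 holds
# an open file somewhere under directory $2, printing the first match.
# Assumes Linux /proc; typically needs root to inspect a daemon.
has_open_log() {
    pid="$1"
    log_dir="$2"
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the file the descriptor refers to.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$log_dir"/*) echo "$target"; return 0 ;;
        esac
    done
    return 1
}
```

On an affected node one might run has_open_log "$(pidof pbs_mom)" /var/spool/torque/mom_logs (assuming pidof returns a single PID); a non-zero exit status would match the lsof result described above, where the daemon no longer has its log file open.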


This seems to be initiated sometime after the start of new jobs.

Anyone seen this type of behavior or have ideas?


Bill Marmagas
Senior Systems Engineer
Systems Engineering & Administration
Virginia Tech
