[torqueusers] torque-2.1.6 - pbs_mom cannot write to its log

Garrick Staples garrick at usc.edu
Wed Oct 17 13:55:02 MDT 2007


On Wed, Oct 17, 2007 at 11:49:01AM +0200, Alessandro Federico alleged:
> Hi all.
> 
> I'm running torque-2.1.6 on SLES10 x86_64 (2.6.16.27-0.9-smp).
> Sometimes I observe this strange behavior:
> 
> 1) before a node starts/joins the first job of the day
> the file descriptor of the log file is correct
> 
> --------------------------------------------
> # lsof -p `pidof pbs_mom` | grep mom_logs
> pbs_mom 7541 root    3w   REG     8,1  208319 126550 /opt/spool/torque/mom_logs/20071017
> --------------------------------------------
> 
> 2) after the node starts/joins the first job of the day
> the file descriptor of the log file becomes corrupted

The bad descriptor is probably a symptom of memory corruption somewhere else in pbs_mom.  Can you reproduce it with 2.1.9?
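
To pin down when the descriptor changes, it may help to poll it around the first job start. The following is only a rough sketch, assuming a Linux /proc filesystem and that fd 3 is the log file as in the lsof output above; fd_target and check_fd are hypothetical helper names, not part of TORQUE.

```shell
#!/bin/sh
# Sketch only: Linux-specific, relies on the /proc/<pid>/fd/<n> symlinks.

# Print the file that descriptor $2 of process $1 currently points to.
fd_target() {
    readlink "/proc/$1/fd/$2"
}

# Succeed if descriptor $2 of process $1 still points at path $3.
check_fd() {
    [ "$(fd_target "$1" "$2")" = "$3" ]
}
```

It could be run from cron or a loop, for example:
check_fd "$(pidof pbs_mom)" 3 /opt/spool/torque/mom_logs/20071017 || echo "log fd changed"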
