[torquedev] File not found with heavy PBS use

Luiz Angelo Daros de Luca luizluca at gmail.com
Thu Jul 16 11:14:27 MDT 2009


There is a monitor process that checks for running jobs that have exceeded
their walltime on unreachable nodes. The nodes are diskless and lose job
info on reboot (or crash :-) ).

07/15/2009 01:50:23;0008;PBS_Server;Job;2915531.servidor.pcarga.local;purging job without checking MOM
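
Roughly, the selection the monitor makes looks like the sketch below. This is
only an illustration under assumptions: the names Job, jobs_to_purge and
purge_commands are hypothetical, and how the job list and node reachability
are obtained is left out. The only real command it refers to is qdel -p.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Job:
    # Hypothetical record; field names are illustrative, not TORQUE's.
    job_id: str
    node: str
    started: datetime
    walltime: timedelta


def jobs_to_purge(jobs, reachable_nodes, now):
    """Return IDs of jobs past their walltime whose node is unreachable."""
    return [
        j.job_id
        for j in jobs
        if j.node not in reachable_nodes and now - j.started > j.walltime
    ]


def purge_commands(job_ids):
    # Build one "qdel -p <jobid>" argument list per job. qdel -p purges the
    # job on the server without contacting the MOM, which matches the
    # "purging job without checking MOM" log entry.
    return [["qdel", "-p", job_id] for job_id in job_ids]
```

Jobs on reachable nodes are never selected, so only jobs whose MOM cannot be
contacted would be purged this way.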
OK, I'll try it with the new version.

Thanks,

---
    Luiz Angelo Daros de Luca, Me.
           luizluca at gmail.com


2009/7/16 Garrick Staples <garrick at usc.edu>

> On Wed, Jul 15, 2009 at 12:14:46PM -0300, Luiz Angelo Daros de Luca
> alleged:
> > 01:50:23;0008;PBS_Server;Job;2915531.servidor.pcarga.local;purging job
> > without checking MOM
>
> Who or what is doing a qdel -p?  That is breaking things.
>
> You can certainly upgrade to the latest 2.1.x.  It is quite stable.
>
> --
> Garrick Staples, GNU/Linux HPCC SysAdmin
> University of Southern California
>
> The pro-disease movement: http://www.jennymccarthybodycount.com/
>
>
> _______________________________________________
> torquedev mailing list
> torquedev at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torquedev
>
>