[torquedev] File not found with heavy PBS use

Luiz Angelo Daros de Luca luizluca at gmail.com
Mon Jul 20 08:18:13 MDT 2009


I upgraded to version 2.1.11 and also enabled the mom_job_sync option.
However, the stale jobs are still being kept:

$ qstat 3599908; pbsnodes fobos
Job id              Name             User            Time Use S Queue
------------------- ---------------- --------------- -------- - -----
3599908.atlas       ...687-C2ED08_96 pcarga                 0 R small
fobos
     state = free
     np = 1
     properties = nodo,previsao,fobos,ajuste
     ntype = cluster
     jobs = 0/3599908.atlas.lsc.ufsc.br
     status = opsys=linux,uname=Linux fobos 2.6.24.7-server-2mnb #1 SMP Thu Oct 30 18:40:15 EDT 2008 i686,sessions=? 0,nsessions=? 0,nusers=0,idletime=10816,totmem=515632kb,availmem=495328kb,physmem=515632kb,ncpus=1,loadave=0.15,netload=24481101,state=free,jobs=? 0,rectime=1248099312

qmgr
(...)
        tcp_timeout = 6
        default_node = 1
        mom_job_sync = True
        pbs_version = 2.1.11


Even though the node is free, a job is still listed on it and the scheduler
refuses to assign a new job to it.
So my only option is to purge the job. :-(
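
To spot this condition on other nodes, here is a minimal Python sketch. It is an illustration only, not part of TORQUE. It parses text in the layout shown in the pbsnodes output above and flags nodes that are reported as free although every slot (np) already has a job assigned. The function name find_suspect_nodes and the "free but full" heuristic are my own assumptions.

```python
def find_suspect_nodes(pbsnodes_text):
    """Return names of nodes reported as free although every slot has a job.

    Hypothetical helper: assumes the 'name / indented key = value' layout
    seen in the pbsnodes output quoted in this message.
    """
    nodes = {}
    current = None
    for line in pbsnodes_text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new node record.
            current = line.strip()
            nodes[current] = {}
        elif current is not None and " = " in line:
            key, value = line.strip().split(" = ", 1)
            nodes[current][key] = value

    suspects = []
    for name, attrs in nodes.items():
        # The jobs attribute is a comma-separated list of slot/jobid entries.
        jobs = [j for j in attrs.get("jobs", "").split(",") if j.strip()]
        np = int(attrs.get("np", "1"))
        # Assumed heuristic: a free node should still have an empty slot.
        if attrs.get("state") == "free" and len(jobs) >= np:
            suspects.append(name)
    return suspects


SAMPLE = """fobos
     state = free
     np = 1
     properties = nodo,previsao,fobos,ajuste
     ntype = cluster
     jobs = 0/3599908.atlas.lsc.ufsc.br
"""

print(find_suspect_nodes(SAMPLE))
```

The text could come from saving the output of `pbsnodes -a` to a file. A flagged node is only a hint that the server and the MOM may disagree, not proof of a stale job.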

BTW, why is purge not recommended?

I hope the 2.1.11 upgrade will solve the crash (my original problem).

Cheers
---
    Luiz Angelo Daros de Luca, Me.
           luizluca at gmail.com


2009/7/18 Chris Samuel <csamuel at vpac.org>

>
> ----- "Luiz Angelo Daros de Luca" <luizluca at gmail.com> wrote:
>
> > Oh... Great tip! Thanks
>
> No worries!
>
> --
> Christopher Samuel - (03) 9925 4751 - Systems Manager
>  The Victorian Partnership for Advanced Computing
>  P.O. Box 201, Carlton South, VIC 3053, Australia
> VPAC is a not-for-profit Registered Research Agency
>