[torqueusers] Bug in Torque 1.2.0p6 ?

Garrick Staples garrick at usc.edu
Wed Jan 25 12:36:40 MST 2006


On Tue, Jan 24, 2006 at 06:13:04PM +0100, Jacques Foury alleged:
> Hi.
> 
> We're running a 6-node cluster of dual-Opteron machines.
> 
> One of our nodes is currently running 3 jobs instead of 2, and we get a 
> strange result from qstat:
> 
>                                                             Req'd  Req'd   Elap
> Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
> --------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
> 7447.ulmo.calcu bouchere infiniCa twi1D2      31413   1  -- 1000mb 2000: R 153:5
>   callas04/0
> 7993.ulmo.math. khodor   q1jourCa microf        --   --  -- 1500mb 02:00 Q   --
>   callas04/1
> 7994.ulmo.math. khodor   q1jourCa nsmgev      30742  --  -- 1500mb 02:00 R 00:52
>   callas04/1
> 
> Job 7993 is marked as QUEUED, but it has a processor reserved: the same 
> processor as 7994!
> 
> But it is actually RUNNING on the node:

This shouldn't happen with newer TORQUE releases.  Can you reproduce this
with a newer version?
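
To check whether the condition is still present after an upgrade, a small
script along these lines could flag any execution slot claimed by more than
one job.  This is only a rough sketch: it assumes the listing has the same
layout as the quoted qstat output (unwrapped, one job per line, the state in
the tenth column, and an indented NODE/SLOT line after each job).  The name
find_shared_slots and the SAMPLE text are made up for illustration and are
not part of TORQUE.

```python
import re
from collections import defaultdict

# Sample listing copied from the quoted qstat output, with wrapped lines rejoined.
SAMPLE = """\
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
7447.ulmo.calcu bouchere infiniCa twi1D2      31413   1  -- 1000mb 2000: R 153:5
  callas04/0
7993.ulmo.math. khodor   q1jourCa microf        --   --  -- 1500mb 02:00 Q   --
  callas04/1
7994.ulmo.math. khodor   q1jourCa nsmgev      30742  --  -- 1500mb 02:00 R 00:52
  callas04/1
"""

JOB_LINE = re.compile(r"^(\d+\.\S+)\s")   # e.g. "7993.ulmo.math. ..."
SLOT = re.compile(r"[\w.-]+/\d+")         # e.g. "callas04/1"


def find_shared_slots(listing):
    """Return {slot: [(job_id, state), ...]} for slots claimed by 2+ jobs."""
    slots = defaultdict(list)
    current = None  # (job_id, state) of the most recent job line
    for line in listing.splitlines():
        match = JOB_LINE.match(line)
        if match:
            fields = line.split()
            # Assumption: the job state is the tenth whitespace-separated column.
            state = fields[9] if len(fields) > 9 else "?"
            current = (match.group(1), state)
        elif current and line[:1].isspace():
            # Indented line following a job: the slots assigned to that job.
            for slot in SLOT.findall(line):
                slots[slot].append(current)
    return {slot: jobs for slot, jobs in slots.items() if len(jobs) > 1}


for slot, jobs in find_shared_slots(SAMPLE).items():
    print(slot, "is claimed by", jobs)
```

On the listing above this reports callas04/1 as claimed by both 7993 (Q) and
7994 (R), which is the situation described in the original message.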

-- 
Garrick Staples, Linux/HPCC Administrator
University of Southern California