[torqueusers] Exceeded job limits on nodes
Piotr Siwczak
psiwczak at man.poznan.pl
Tue Mar 28 06:32:34 MST 2006
Hi,
I am running TORQUE + Maui on an Opteron cluster. Something strange has been
happening recently on 3 of our nodes. Each of them reports something like this:
Node w38
state = free
np = 6
properties = lcgpro
ntype = cluster
jobs = 0/2408.fangorn.man.poznan.pl, 1/2409.fangorn.man.poznan.pl,
2/2410.fangorn.man.poznan.pl, 3/2626.fangorn.man.poznan.pl,
3/2625.fangorn.man.poznan.pl, 3/2624.fangorn.man.poznan.pl,
3/2623.fangorn.man.poznan.pl, 3/2622.fangorn.man.poznan.pl,
3/2621.fangorn.man.poznan.pl, 3/2620.fangorn.man.poznan.pl,
3/2619.fangorn.man.poznan.pl, 3/2618.fangorn.man.poznan.pl,
3/2617.fangorn.man.poznan.pl, 3/2616.fangorn.man.poznan.pl,
3/2615.fangorn.man.poznan.pl, 3/2614.fangorn.man.poznan.pl,
3/2613.fangorn.man.poznan.pl, 3/2612.fangorn.man.poznan.pl,
3/2611.fangorn.man.poznan.pl, 3/2610.fangorn.man.poznan.pl,
3/2609.fangorn.man.poznan.pl, 3/2608.fangorn.man.poznan.pl,
3/2607.fangorn.man.poznan.pl, 3/2606.fangorn.man.poznan.pl,
3/2605.fangorn.man.poznan.pl, 3/2604.fangorn.man.poznan.pl,
3/2603.fangorn.man.poznan.pl, 3/2602.fangorn.man.poznan.pl,
3/2601.fangorn.man.poznan.pl, 3/2600.fangorn.man.poznan.pl,
3/2599.fangorn.man.poznan.pl, 3/2598.fangorn.man.poznan.pl,
3/2597.fangorn.man.poznan.pl, 3/2596.fangorn.man.poznan.pl,
3/2595.fangorn.man.poznan.pl, 3/2594.fangorn.man.poznan.pl,
3/2593.fangorn.man.poznan.pl, 3/2592.fangorn.man.poznan.pl,
3/2591.fangorn.man.poznan.pl, 3/2590.fangorn.man.poznan.pl,
3/2589.fangorn.man.poznan.pl, 3/2588.fangorn.man.poznan.pl
As you can see from the excerpt above, the number of jobs far exceeds the
number of slots: the node lists 42 jobs against np = 6, and 39 of them share
slot 3. Furthermore, the node is still shown as "free". Does anyone have any
idea what is going on here?
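
For anyone who wants to check other nodes for the same symptom, here is a
minimal Python sketch that counts jobs per slot from the value of the "jobs"
attribute. It assumes the value is a comma-separated list of slot/jobid
entries, as in the excerpt above. The function name slot_usage and its return
shape are made up for illustration and are not part of TORQUE.

```python
from collections import Counter


def slot_usage(jobs_field, np):
    """Summarise a node's 'jobs' attribute value.

    jobs_field: text after 'jobs = ', e.g. '0/2408.host, 1/2409.host'
                (assumed format: comma-separated 'slot/jobid' entries)
    np:         number of slots configured on the node

    Returns (total_jobs, shared_slots, oversubscribed), where shared_slots
    maps each slot index holding more than one job to its job count.
    """
    per_slot = Counter()
    for entry in jobs_field.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and line wraps
        slot, _, _job_id = entry.partition("/")
        per_slot[int(slot)] += 1

    total = sum(per_slot.values())
    shared = {slot: n for slot, n in per_slot.items() if n > 1}
    return total, shared, total > np


if __name__ == "__main__":
    # Shortened sample in the same shape as the excerpt above.
    sample = (
        "0/2408.fangorn.man.poznan.pl, 1/2409.fangorn.man.poznan.pl, "
        "2/2410.fangorn.man.poznan.pl, 3/2626.fangorn.man.poznan.pl, "
        "3/2625.fangorn.man.poznan.pl"
    )
    print(slot_usage(sample, np=6))
```

A slot that appears in shared_slots, or a total above np, points to the same
situation as on w38.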
Piotr
--
Piotr Siwczak <psiwczak at man.poznan.pl>
System Administrator
Poznan Supercomputing and Networking Center
Supercomputing Department
(www.eu-egee.org <piotr.siwczak at cern.ch>)