[torqueusers] pbsnodes reporting incorrect totmem/availmem
Gianfranco Sciacca
gs at hep.ucl.ac.uk
Thu Jun 11 07:32:20 MDT 2009
On 11 Jun 2009, at 13:59, Ole Holm Nielsen wrote:
> Gianfranco Sciacca wrote:
>> we are having scheduling issues on our cluster, presumably due to
>> wrong values of totmem and availmem reported by pbsnodes. Affected
>> are nodes with 16GB+16GB swap. Other nodes with up to 4GB+4GB swap
>> seem to report more or less consistent values.
>> I paste below an example with farm00 being the Torque server and
>> farm25 the node being probed. I am not sure I can make any sense
>> of the numbers reported, except for Maui that seems to get it
>> right. But perhaps it's just me not understanding the output of
>> pbsnodes. I can't be positive about what values were reported some
>> weeks back, but scheduling made sense in that if jobs were
>> submitted to the idle farm, the 16+16GB nodes were surely
>> prioritised as execution nodes over the less equipped nodes.
>
> Some comments that may or may not be useful to you:
>
> 1. Your Torque version 2.1.8 is quite old, we use 2.1.11.
>
> 2. "pbsnodes -a" gives correct and consistent values for physical
> memory and available memory (=physical+swap) in our cluster.
> We have nodes with 8-24 GB RAM and 12-16 GB swap, and we run
> CentOS 4 and 5 nodes.
>
> 3. May I recommend my script "pestat" for giving a quick overview
> of the nodes' load and memory usage, based on parsing the output
> of "pbsnodes -a"? Download it from
> ftp://ftp.fysik.dtu.dk/pub/Torque/pestat.
> A sample output shows pmem (physical memory) and mem (physical
> memory + swap) and other interesting stuff for job and cluster
> monitoring:
Thanks!
[root@farm00 ~]# ./pestat
node    state  load   pmem   ncpu  mem   resi   usrs  tasks  jobids/users
farm00  free   2.00*  16046  8     3763  -3095  1/1   1*     1906410 gjc 1906638 NONE*
...
farm3   free   0.00   1010   4     3057  303    0/0   0
...
farm10  free   0.00   3931   4     8025  318    0/0   0
...
farm19  free   0.06   16033  8     3748  273    0/0   0
farm20  free   0.14   16033  8     3748  273    0/0   0
farm21  free   0.09   16033  8     3748  273    0/0   0
farm22  free   0.13   16033  8     3748  273    0/0   0
farm23  free   0.19   16033  8     3748  273    0/0   0
farm24  free   0.27   16033  8     3748  273    0/0   0
farm25  offl*  1.00   16046  8     3760  -3218  1/1   1      1906412 gjc
farm26  free   1.04*  16046  8     3760  359    1/1   0*     1906639 NONE*
farm27  free   1.08*  16046  8     3760  358    1/1   0*     1906640 NONE*
farm28  free   0.13   16046  8     3760  274    0/0   0
farm29  free   0.03   16046  8     3760  274    0/0   0
...
So this shows that idle nodes with 1GB RAM + 2GB swap and with 4GB RAM +
4GB swap report correctly, while nodes with 16+16GB don't. Even more
interestingly, nodes 19-24 are different machines from nodes 25-29. Even
though the amount of memory+swap installed is the same on all of them,
the two ranges report different "mem" values. Also, the only node
currently running a job shows the same amount of "mem" as the idle ones.
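The comparison above amounts to a simple sanity check: since "mem" is physical memory plus swap, it should never be smaller than "pmem". What follows is a minimal Python sketch of that check over pestat-style rows. The column positions are read off the sample output above, and the function names are made up for illustration; it is not part of pestat itself.

```python
# A few rows copied from the pestat output quoted in this message.
SAMPLE = """\
node state load pmem ncpu mem resi usrs tasks jobids/users
farm3 free 0.00 1010 4 3057 303 0/0 0
farm10 free 0.00 3931 4 8025 318 0/0 0
farm19 free 0.06 16033 8 3748 273 0/0 0
farm25 offl* 1.00 16046 8 3760 -3218 1/1 1 1906412 gjc
"""


def parse_pestat(text):
    """Return (node, pmem_mb, mem_mb) tuples from pestat-style text.

    Assumes whitespace-separated columns in the order shown in SAMPLE:
    pmem is the 4th column and mem the 6th.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, "..." separators and wrapped job-list lines:
        # a data row has at least 9 columns with numeric pmem and mem.
        if len(fields) < 9:
            continue
        if not (fields[3].isdigit() and fields[5].isdigit()):
            continue
        rows.append((fields[0], int(fields[3]), int(fields[5])))
    return rows


def inconsistent_nodes(rows):
    """Return nodes whose reported mem (RAM + swap) is below pmem (RAM)."""
    return [node for node, pmem, mem in rows if mem < pmem]


if __name__ == "__main__":
    for node in inconsistent_nodes(parse_pestat(SAMPLE)):
        print(node, "reports mem < pmem")
```

Feeding it the full pestat output instead of SAMPLE would list every node whose totals cannot be right, whatever the underlying cause turns out to be.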
The only further detail I can add, which may or may not be relevant, is
that *all* 16+16 nodes were, just a few days back, running jobs for a
user whose homedir went over quota. For some weird reason (or mechanism
unknown to me), all his jobs disappeared from the queue (not showing
in qstat), but the job files remained in /var/spool/pbs/mom_priv/jobs
across the nodes, *and* were shown as active tasks by pbsnodes. I
*think* (but I'm not 100% sure) this is when the scheduling funnies
started. I did purge the /var/spool/pbs/mom_priv/jobs directories by hand
and restarted the moms and server+maui (multiple times). I haven't tried
rebooting the head node, and I'd rather not, unless it becomes clear
that it can help.
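Before purging by hand, the leftover files can be listed without touching them. The Python sketch below is only an illustration: it assumes each file name in the jobs directory starts with the numeric job ID followed by a dot, and that the set of job IDs still known to the server has been collected separately (for example, copied from qstat). The function name is hypothetical, and nothing is deleted.

```python
import os


def stale_job_files(jobs_dir, known_job_ids):
    """List files in jobs_dir whose leading job number is not in known_job_ids.

    known_job_ids is a set of job-number strings, e.g. {"1906639"}.
    Assumption: file names begin with the numeric job ID and a dot.
    """
    stale = []
    for name in sorted(os.listdir(jobs_dir)):
        job_id = name.split(".", 1)[0]
        # Ignore anything that does not start with a job number.
        if job_id.isdigit() and job_id not in known_job_ids:
            stale.append(name)
    return stale


if __name__ == "__main__":
    # Path taken from the message; the job ID set is a placeholder.
    jobs_dir = "/var/spool/pbs/mom_priv/jobs"
    if os.path.isdir(jobs_dir):
        for name in stale_job_files(jobs_dir, {"1906639", "1906640"}):
            print("not known to the server:", name)
```

Keeping the check read-only means the list can be reviewed on each node before anything is removed and the mom is restarted.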
Suggestions?
Thanks,
Gianfranco
--
Dr. Gianfranco Sciacca Tel: +44 (0)20 7679 3044
Dept of Physics and Astronomy Internal: 33044
University College London D15 - Physics Building
London WC1E 6BT