[torqueusers] pbsnodes reporting incorrect totmem/availmem

Gianfranco Sciacca gs at hep.ucl.ac.uk
Thu Jun 11 07:32:20 MDT 2009


On 11 Jun 2009, at 13:59, Ole Holm Nielsen wrote:

> Gianfranco Sciacca wrote:
>> we are having scheduling issues on our cluster, presumably due to
>> wrong values of totmem and availmem reported by pbsnodes. Affected
>> are the nodes with 16GB RAM + 16GB swap; other nodes with up to
>> 4GB RAM + 4GB swap seem to report more or less consistent values.
>> I paste below an example, with farm00 being the Torque server and
>> farm25 the node being probed. I am not sure I can make any sense
>> of the numbers reported, except for Maui, which seems to get it
>> right. But perhaps it's just me not understanding the output of
>> pbsnodes. I can't be positive about what values were reported some
>> weeks back, but scheduling made sense in that if jobs were
>> submitted to the idle farm, the 16+16GB nodes were surely
>> prioritised as execution nodes over the less equipped nodes.
>
> Some comments that may or may not be useful to you:
>
> 1. Your Torque version 2.1.8 is quite old; we use 2.1.11.
>
> 2. "pbsnodes -a" gives correct and consistent values for physical
>  memory and available memory (=physical+swap) in our cluster.
>  We have nodes with 8-24 GB RAM and 12-16 GB swap, and we run
>  CentOS 4 and 5 nodes.
>
> 3. May I recommend my script "pestat" for giving a quick overview
>  of the nodes' load and memory usage, based on parsing the output
>  of "pbsnodes -a"?  Download ftp://ftp.fysik.dtu.dk/pub/Torque/pestat.
>  A sample output shows pmem (physical memory) and mem (physical
>  memory + swap) and other interesting stuff for job and cluster
>  monitoring:

Thanks!

[root@farm00 ~]# ./pestat
  node state  load    pmem ncpu   mem   resi usrs tasks  jobids/users
  farm00  free  2.00*  16046   8   3763  -3095  1/1    1*   1906410 gjc 1906638 NONE*
  ...
  farm3  free  0.00    1010   4   3057    303  0/0    0
  ...
  farm10  free  0.00    3931   4   8025    318  0/0    0
  ...
  farm19  free  0.06   16033   8   3748    273  0/0    0
  farm20  free  0.14   16033   8   3748    273  0/0    0
  farm21  free  0.09   16033   8   3748    273  0/0    0
  farm22  free  0.13   16033   8   3748    273  0/0    0
  farm23  free  0.19   16033   8   3748    273  0/0    0
  farm24  free  0.27   16033   8   3748    273  0/0    0
  farm25  offl* 1.00   16046   8   3760  -3218  1/1    1    1906412 gjc
  farm26  free  1.04*  16046   8   3760    359  1/1    0*   1906639 NONE*
  farm27  free  1.08*  16046   8   3760    358  1/1    0*   1906640 NONE*
  farm28  free  0.13   16046   8   3760    274  0/0    0
  farm29  free  0.03   16046   8   3760    274  0/0    0
  ...

So this shows that the idle nodes with 1GB RAM + 2GB swap and 4GB RAM +
4GB swap report correctly, while the 16GB + 16GB nodes do not. Even more
interestingly, nodes 19-24 are different machines from nodes 25-29: the
amount of memory+swap installed is the same on all of them, yet the two
ranges report different "mem" values. Also, the only node currently
running a job shows the same amount of "mem" as the idle ones.
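
For reference, in case someone wants to reproduce the comparison, the
numbers can be cross-checked roughly like this (just a sketch; farm25 is
one of our affected nodes, and the commands assume a stock Torque install):

  # on the server: pull the memory fields out of the status line pbs_mom reports
  pbsnodes farm25 | tr ',' '\n' | grep -E 'physmem|totmem|availmem'

  # on the node itself: what the kernel reports, in kB
  # (totmem should come out close to MemTotal + SwapTotal)
  grep -E 'MemTotal|SwapTotal' /proc/meminfo

  # mom-side diagnostics, run from the server
  momctl -d 3 -h farm25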

The only further detail I can add (maybe relevant, maybe not) is that
just a few days back *all* of the 16+16 nodes were running jobs for a
user whose home directory went over quota. For some weird reason (or a
mechanism unknown to me), all of his jobs disappeared from the queue
(they no longer showed in qstat), but the job files remained in
/var/spool/pbs/mom_priv/jobs across the nodes, *and* pbsnodes still
showed them as active tasks. I *think* (but I'm not 100% sure) that this
is when the scheduling oddities started. I did purge the
/var/spool/pbs/mom_priv/jobs directories by hand and restarted the moms
and the server+maui (multiple times). I haven't tried rebooting the head
node, but I'd rather not, unless it becomes clear that it can help.
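
For completeness, the clean-up I did was along these lines (a sketch
from memory; the exact service names depend on how Torque and Maui were
installed):

  # on each affected node
  service pbs_mom stop
  rm -f /var/spool/pbs/mom_priv/jobs/*    # stale job files for jobs no longer in qstat
  service pbs_mom start

  # on the head node
  qterm -t quick                          # stop pbs_server without killing running jobs
  service pbs_server start
  service maui restart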

Suggestions?

Thanks,
Gianfranco

-- 
Dr. Gianfranco Sciacca			Tel: +44 (0)20 7679 3044
Dept of Physics and Astronomy		Internal: 33044
University College London		D15 - Physics Building
London WC1E 6BT


