[torqueusers] remaining memory on a node

Troy Baer tbaer at utk.edu
Fri Aug 12 13:29:17 MDT 2011


On Fri, 2011-08-12 at 18:58 +0000, Andrus, Brian Contractor wrote:
> Ok. That is what is odd to me.
> 
> Pbsnodes is telling me:
> status = rectime=1313175218,varattr=,jobs=141859.hamming2.local 141328.hamming2.local,state=free,netload=481480719018,gres=,loadave=0.01,ncpus=16,physmem=24676992kb,availmem=44749640kb,totmem=45159856kb,idletime=617965,nusers=0,nsessions=? 15201,sessions=? 15201,uname=Linux compute-8-33.local 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010 x86_64,opsys=linux
> 
> so
> physmem=24676992kb
> availmem=44749640kb
> 
> Wow... I have way more available than is physically in the system!???
> 
> From the node itself:
> [bdandrus@compute-8-33 ~]$ cat /proc/meminfo | grep MemTotal
> MemTotal:     24676992 kB
> 
> So how does mom calculate this??

IIRC, availmem is how much *virtual* memory is available, i.e. available
physmem plus available swap.  I don't think pbs_mom currently has a way
to tell the server or scheduler just how much physical memory is
available, short of disabling swap on the compute nodes.

BTW, there's a bug in Moab (and perhaps Maui as well) where totmem is
interpreted as the amount of swap, when swap should be the difference
between totmem and physmem.  I reported this against Moab just shy of a
year ago, but it does not appear to be fixed as of Moab 6.0.4:
-----
# for a node with 24 GB of physmem and 2 GB of swap:
troy@kidlogin2:~$ qstat --version
version: 2.5.7

troy@kidlogin2:~$ showq --version
Version: moab client 6.0.4 (revision 2, changeset 7e677d9743ea82a45d5b8fa75c9af9883dd9418e)

troy@kidlogin2:~$ checknode kid001
node kid001.nics.utk.edu
[...]
Configured Resources: PROCS: 12  MEM: 23G  SWAP: 25G  DISK: 28G  GPUS: 3
[...]
-----
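
Spelled out as arithmetic, the two readings of totmem look like this.
This is only a sketch, not Moab's actual code: the function names are
hypothetical, and it assumes totmem on this node is roughly the 25G
that checknode prints as SWAP, with physmem the 23G it prints as MEM:

```python
def swap_expected(physmem_gb, totmem_gb):
    """Swap as the difference between total virtual and physical memory."""
    return totmem_gb - physmem_gb


def swap_as_reported(totmem_gb):
    """The apparent behaviour: totmem taken directly as the swap size."""
    return totmem_gb


if __name__ == "__main__":
    physmem_gb = 23  # MEM from the checknode output (assumed to be physmem)
    totmem_gb = 25   # assumed totmem, matching the SWAP figure shown
    print("expected swap:", swap_expected(physmem_gb, totmem_gb), "GB")
    print("reported swap:", swap_as_reported(totmem_gb), "GB")
```

The first figure matches the 2 GB of swap actually on the node; the
second matches what checknode shows.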

	--Troy
-- 
Troy Baer, HPC System Administrator
National Institute for Computational Sciences, University of Tennessee
http://www.nics.tennessee.edu/
Phone:  865-241-4233



