[torqueusers] 4GB resources_used.mem limit
Garrick Staples
garrick at usc.edu
Thu Jun 30 14:44:19 MDT 2005
On Wed, Jun 29, 2005 at 11:13:26AM +0200, Bernd Schubert alleged:
> Hello,
>
> we have a cluster running a combination of torque + maui. In principle it's
> running fine; we only have one pretty annoying problem: torque does not
> correctly report jobs using more than 4GB of memory. qstat always shows only
> 'actual_size - 4GB' for jobs with more than 4GB.
I'm not able to test this. But the first thing you need to do is figure out if
pbs_mom is reporting the wrong info, or if pbs_server is breaking it.
You can query this info directly from pbs_mom using momctl or a small util I
wrote a while ago called dumpmom (http://www-rcf.usc.edu/~garrick/dumpmom.c).
To use momctl, first get the session list, then get the memory usage of that
session. Here's an example with a node that has 2 sessions, one of which is
using about 118MB.
$ momctl -q sessions -h hpc0961
hpc0961: sessions = 'sessions=30631 30651'
$ momctl -q 'mem[session=30631]' -h hpc0961
hpc0961: mem[session=30631] = 'mem[session=30631]=120856kb'
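If you'd rather script the two queries, something like the following might work. It is only a rough sketch: the parsing assumes momctl output looks exactly like the two lines above, the helper names (query_mom, parse_sessions, parse_mem_kb, over_4gb) are made up for illustration, and it has not been run against a real pbs_mom.

```python
import re
import subprocess

FOUR_GB = 4 * 1024 ** 3  # 4GB in bytes


def query_mom(host, attr):
    """Run 'momctl -q <attr> -h <host>' and return its output as text."""
    result = subprocess.run(["momctl", "-q", attr, "-h", host],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def parse_sessions(line):
    """Pull session IDs out of a line like: host: sessions = 'sessions=1 2'."""
    m = re.search(r"sessions=([\d ]*)'", line)
    return [int(s) for s in m.group(1).split()] if m else []


def parse_mem_kb(line):
    """Pull the kb value out of a line ending in =<number>kb' (assumed format)."""
    m = re.search(r"=(\d+)kb'", line)
    return int(m.group(1)) if m else None


def over_4gb(mem_kb):
    """True if the reported memory is larger than 4GB."""
    return mem_kb * 1024 > FOUR_GB


def report(host):
    """Print the memory pbs_mom reports for every session on a host."""
    for sid in parse_sessions(query_mom(host, "sessions")):
        kb = parse_mem_kb(query_mom(host, "mem[session=%d]" % sid))
        print(sid, kb, "over 4GB" if kb is not None and over_4gb(kb) else "")
```

Calling report('hpc0961') on a node running one of the big jobs should show whether pbs_mom itself reports a value above 4GB.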
dumpmom is easier for this particular purpose; just run 'dumpmom hpc0961' and
it will print out lots of similar information.
If you can verify that pbs_mom is sending the correct info, then we can look
into pbs_server.
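
When comparing the pbs_mom number against what qstat shows, note that 'actual_size - 4GB' is what you would see if a byte count were truncated to 32 bits somewhere along the way. That is only a guess on my part and nothing in this thread confirms it. A small sketch of the arithmetic, with made-up function names, for checking whether two readings differ by exactly 4GB:

```python
WRAP = 2 ** 32  # 4GB in bytes; assumes a 32-bit byte counter (unverified guess)


def wrapped_kb(actual_kb):
    """What a 32-bit byte counter would show, in kb, for a real size in kb."""
    return (actual_kb * 1024 % WRAP) // 1024


def differs_by_4gb(mom_kb, qstat_kb):
    """True if the pbs_mom value is exactly 4GB larger than the qstat value."""
    return mom_kb - qstat_kb == WRAP // 1024
```

If the pbs_mom value is correct and the qstat value is exactly 4GB smaller, that would point at the server side; if pbs_mom already reports the smaller number, the problem is in pbs_mom.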
--
Garrick Staples, Linux/HPCC Administrator
University of Southern California