Fwd: [torqueusers] Confused about vmem

Diego M. Vadell dvadell at linuxclusters.com.ar
Fri Oct 3 12:04:43 MDT 2008


Anyone? Any hint will be appreciated.

TIA
 -- Diego


----------  Forwarded Message  ----------

Subject: [torqueusers] Confused about vmem
Date: Monday 29 September 2008
From: "Diego M. Vadell" <dvadell at linuxclusters.com.ar>
To: torqueusers at supercluster.org

Hi list,

   We are starting to restrict the amount of RAM with torque. We have 4-core 
nodes with 4Gb of RAM each.

   I ran some tests setting, via qmgr, a default of pvmem = 1Gb, and it 
seemed to be what we were looking for, until we considered a particular job.

   This job uses OpenMP. It is one process that needs a whole node: the 4Gb 
and the 4 cores. 

   If I ask for nodes=1:ppn=4,pvmem=1G, torque will get me a whole node, but 
it will kill this job's only process  when it uses more than 1G. If I ask for 
nodes=1:ppn=4,pvmem=4G, torque will look for a node with 16Gb of RAM, and I 
don't have one of those.
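
   To spell out the arithmetic: the sketch below assumes the per-node 
requirement is simply pvmem multiplied by ppn. That matches the behaviour 
described above but is an assumption, not something taken from the torque 
documentation. The function name is made up for illustration.

```python
def node_memory_needed_gb(ppn: int, pvmem_gb: float) -> float:
    """Memory a node must offer if pvmem is counted once per requested core.

    Assumption: the per-node requirement is ppn * pvmem, as observed above.
    """
    return ppn * pvmem_gb


# The two requests from the paragraph above, checked against a 4Gb node.
for pvmem_gb in (1, 4):
    needed = node_memory_needed_gb(ppn=4, pvmem_gb=pvmem_gb)
    verdict = "fits" if needed <= 4 else "does not fit"
    print(f"ppn=4,pvmem={pvmem_gb}G needs {needed}Gb per node: {verdict} in 4Gb")
```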

   So I found vmem. As I can't run lots of tests right now (I have to schedule 
downtime to do that safely), I ran a few small ones (with a program called 
memtest that just mallocs and fills memory):
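
   For readers without memtest at hand, a rough stand-in can be sketched in 
Python. This is not the memtest program used here, only an illustration of 
the same idea: allocate blocks, fill them, and count how far you get. The 
name fill_memory and the max_mib safety cap are invented for this sketch.

```python
MIB = 1024 * 1024


def fill_memory(max_mib=16):
    """Allocate and fill 1 MiB blocks until allocation fails or max_mib is hit.

    Returns the number of MiB successfully allocated. Pass max_mib=None to
    keep going until the allocator refuses, which is what a real test needs.
    """
    blocks = []  # hold references so nothing is freed while we count
    try:
        while max_mib is None or len(blocks) < max_mib:
            # A non-zero fill pattern makes sure the pages are really written.
            blocks.append(bytearray(b"\xa5") * MIB)
    except MemoryError:
        pass  # allocation failed: report how far we got
    return len(blocks)


# Capped demo run; an uncapped run should only be done under a memory limit.
print(f"allocation stopped after {fill_memory(max_mib=16)} MiB")
```

Calling fill_memory(max_mib=None) inside a limited session would be the 
rough equivalent of the runs below.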

1) Got an interactive session with vmem=1Gb and ran 1 memtest: 
qsub -I -l nodes=1:ppn=2 -l vmem=1000mb
Command line: examples/memtest 
Result: malloc failure after 993 MiB
Expected: This is Ok as vmem is the total amount of virtual memory the whole 
job can use.

2) Got an interactive session with vmem=1Gb and ran 2 memtest, in parallel: 
qsub -I -l nodes=1:ppn=2 -l vmem=1000mb
Command line: examples/memtest & examples/memtest   
Result: malloc failure after 993 MiB
           malloc failure after 993 MiB
Expected: So both can allocate almost 1Gb! I was expecting them to fail at 
500Mb each, so that together they would add up to the 1Gb maximum I asked 
torque for.

3) Got an interactive session with pvmem=1Gb and ran 2 memtest, in parallel: 
qsub -I -l nodes=1:ppn=2 -l pvmem=1000mb
Command line: examples/memtest & examples/memtest   
Result: malloc failure after 993 MiB
           malloc failure after 993 MiB
Expected: This is what I was expecting because both can allocate about 1Gb.

So vmem and pvmem seem to do the same thing. What am I missing here?
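
One way to narrow this down would be to look at the limit each process 
actually receives inside the session. The sketch below assumes the limits 
reach the processes as a per-process address-space rlimit (RLIMIT_AS); that 
is an assumption, not something confirmed in this thread. It uses only 
Python's standard resource module, and the function name is made up.

```python
import resource


def address_space_limit_mib():
    """Return the (soft, hard) RLIMIT_AS of this process in MiB.

    None stands for "unlimited".
    """
    def to_mib(value):
        return None if value == resource.RLIM_INFINITY else value // (1024 * 1024)

    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return to_mib(soft), to_mib(hard)


soft, hard = address_space_limit_mib()
print(f"RLIMIT_AS soft={soft} MiB, hard={hard} MiB")
```

Running it once under -l vmem=1000mb and once under -l pvmem=1000mb would 
show whether both requests end up as the same per-process limit.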

Thanks in advance,
 -- Diego.
_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers

