[torqueusers] mem and pmem

Laurence Dawson larry.dawson at vanderbilt.edu
Wed Aug 10 13:21:16 MDT 2005


We are running TORQUE 1.2.0p1 and Moab 4.2.0p3.

The pmem and mem resource requests are not behaving as we expect.

We are having difficulty specifying memory requirements in a few ways,
but we are concentrating on this one case. Here is the PBS script:

#!/bin/csh

#PBS -l nodes=64:opteron:myrinet
#PBS -l walltime=24:00:00
#PBS -l cput=1536:00:00
#PBS -j oe
#PBS -l pmem=400mb
#PBS -l mem=25600mb

cd $PBS_O_WORKDIR
mpiexec lmp_vampire <in_lammps

#########################################
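
As a sanity check on the request above: the job-wide mem value is exactly the per-process pmem value multiplied by the 64 requested tasks, so the two limits are consistent with each other. A minimal sketch of that arithmetic in Python follows. The variable names are only illustrative, and it assumes one process per requested node, which is what checkjob reports (TaskCount: 64, PROCS: 1).

```python
# Values copied from the #PBS directives in the script above.
tasks = 64        # nodes=64, assumed one process per node
pmem_mb = 400     # pmem=400mb, the per-process limit
mem_mb = 25600    # mem=25600mb, the job-wide limit

# Job-wide memory implied by the per-process limit.
expected_total_mb = tasks * pmem_mb
print(expected_total_mb)            # 25600
print(expected_total_mb == mem_mb)  # True
```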

When the job is first submitted, checkjob shows the expected requirements:

Req[0]  TaskCount: 64  Partition: ALL
Network: [NONE]  Memory >= 400M  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: opteron,myrinet
Dedicated Resources Per Task: PROCS: 1  MEM: 400M
NodeCount: 1

This should be no problem, but our cluster is busy, so the job is not
scheduled right away and stays idle.
Some time later (maybe 10 minutes, but long enough for the scheduler to
start looking at available resources), the checkjob output changes to
the following:

Req[0]  TaskCount: 64  Partition: ALL
Network: [NONE]  Memory >= 400M  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: opteron,myrinet
Dedicated Resources Per Task: PROCS: 1  MEM: 25G
NodeCount: 1


Now no machine can meet the 25G per-task requirement, and the job
goes into batchhold.
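
The 25G figure looks like the job-wide mem=25600mb request being reinterpreted as a per-task value. Here is a rough check in Python, assuming checkjob reports memory in binary units (1G = 1024M); the variable names are only illustrative.

```python
tasks = 64                 # TaskCount from checkjob
mem_mb = 25600             # job-wide mem request from the script
per_task_before_mb = 400   # "MEM: 400M" in the first checkjob output

# Assumption: checkjob uses binary units, so 1G = 1024M.
per_task_after_gb = mem_mb / 1024
print(per_task_after_gb)   # 25.0, matching "MEM: 25G"

# Aggregate memory implied by each interpretation, in GB.
total_before_gb = tasks * per_task_before_mb / 1024
total_after_gb = tasks * per_task_after_gb
print(total_before_gb)     # 25.0
print(total_after_gb)      # 1600.0
```

Under that assumption the job would need 1600G in aggregate instead of 25G, which would be consistent with no node qualifying.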

I don't know whether this is a TORQUE problem or a Moab problem, but it
looks like a bug. Any ideas?
