[torqueusers] mem and pmem
Laurence Dawson
larry.dawson at vanderbilt.edu
Wed Aug 10 13:21:16 MDT 2005
We are running torque 1.2.0p1 and moab 4.2.0p3, and pmem and mem are not
behaving correctly. We are having difficulty specifying the memory
requirement in a few ways, but we are concentrating on this case. Here is
the PBS script:
#!/bin/csh
#PBS -l nodes=64:opteron:myrinet
#PBS -l walltime=24:00:00
#PBS -l cput=1536:00:00
#PBS -j oe
#PBS -l pmem=400mb
#PBS -l mem=25600mb
cd $PBS_O_WORKDIR
mpiexec lmp_vampire <in_lammps
#########################################
When this is first submitted, checkjob shows it looking OK:
Req[0] TaskCount: 64 Partition: ALL
Network: [NONE] Memory >= 400M Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: opteron,myrinet
Dedicated Resources Per Task: PROCS: 1 MEM: 400M
NodeCount: 1
This should be no problem, but our cluster is busy, so the job is not
scheduled yet and stays idle.
Some time later (maybe 10 minutes, but long enough for the scheduler to
start looking at available resources), the checkjob output changes to
the following:
Req[0] TaskCount: 64 Partition: ALL
Network: [NONE] Memory >= 400M Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: opteron,myrinet
Dedicated Resources Per Task: PROCS: 1 MEM: 25G
NodeCount: 1
Now no machine can meet the 25G per-task requirement (the job-wide mem
value seems to have replaced the 400M per-task value), and the job goes
into batchhold.
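To make the numbers concrete, here is a small sketch of the arithmetic in plain sh. The variable names are made up for illustration, and it assumes mem is meant as the total for the whole job and pmem as the per-process limit:

```shell
#!/bin/sh
# Values copied from the job script and checkjob output above.
tasks=64        # TaskCount: 64
pmem_mb=400     # pmem=400mb, assumed to be the per-process limit
mem_mb=25600    # mem=25600mb, assumed to be the job-wide total

# Job-wide total implied by the per-process limit.
total_mb=$((tasks * pmem_mb))
echo "tasks * pmem = ${total_mb}mb"

# The mem request converted to gigabytes (1024 mb per G).
mem_gb=$((mem_mb / 1024))
echo "mem = ${mem_gb}G"

# Memory the job would need if that figure were applied to every task.
echo "${mem_gb}G per task * ${tasks} tasks = $((mem_gb * tasks))G"
```

Under those assumptions the two requests agree with each other (64 x 400mb = 25600mb), and 25600mb is exactly the 25G that checkjob later reports as the per-task value.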
I don't know whether this is a torque problem or a moab problem, but it
looks like a bug. Any ideas?