[torqueusers] exceeding memory limits?
Caird, Andrew J
acaird at umich.edu
Fri Jun 23 08:55:46 MDT 2006
Are you sure the Gaussian jobs are staying within their limits and that
Torque is actually wrong?
We've had some confusion here with Gaussian's "mw" (megawords) versus
Torque's "mb" (megabytes), where 1 mw = 8 mb.
If you can reproduce this, you might try running a Gaussian job via
Torque, requesting all of the memory on a machine, then log in and watch
the memory usage with tools other than Torque (free, top, simple things
like that).
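For reference, here is the arithmetic behind that mw/mb confusion, checked
against the numbers in the error message below. (This is just a sketch of
the unit conversion; whether the user actually set %mem in megawords is an
assumption to verify.)

```python
# Gaussian "mw" vs Torque "mb": 1 megaword = 8 megabytes (8-byte words).
MB = 1024 * 1024          # Torque's "mb" unit, in bytes
MW = 8 * MB               # one Gaussian megaword, in bytes

requested = 200 * MB      # the job asked Torque for 200mb
print(requested)          # 209715200 -- the "limit" in the PBS message

reported = 1207959552     # the "mem" Torque measured, in bytes
print(reported / MB)      # 1152.0 -- the job really used ~1.2gb
print(reported / MW)      # 144.0  -- consistent with a %mem in the 144 mw range
```

So a job that was meant to fit in 200 mb but had its memory specified in
megawords would blow past the Torque limit by roughly a factor of eight.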
> -----Original Message-----
> From: torqueusers-bounces at supercluster.org
> [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Steve Young
> Sent: Friday, June 23, 2006 10:44 AM
> To: torqueusers at supercluster.org
> Subject: [torqueusers] exceeding memory limits?
> Periodically, some of my users are reporting errors
> with their jobs.
> They keep getting:
> =>> PBS: job killed: mem 1207959552 exceeded limit 209715200
> Killed Terminated
> This job had requested 200mb of memory. This is for a
> Gaussian job. What I am trying to understand is what this
> means. I suspect Torque thinks the job actually needs 1.2gb
> of memory, which exceeds the limit of 200mb that this
> person requested? I'd like to find more information about how
> Torque allocates/manages memory on nodes. If anyone has more
> information about this, I would greatly appreciate it. Thanks,