[torqueusers] exceeding memory limits?

Caird, Andrew J acaird at umich.edu
Fri Jun 23 08:55:46 MDT 2006


Steve,

Are you sure the Gaussian jobs are staying within their requested
limits and that Torque is actually wrong?

We've had some confusion here with Gaussian's "mw" versus Torque's
"mb"; on a 64-bit machine a Gaussian word is 8 bytes, so 1mw = 8mb.
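
For example (the 25mw figure is made up, and the 8-byte word assumes a
64-bit build of Gaussian), an input with

    %Mem=25MW

is asking Gaussian for 25 megawords = 25 * 8 = 200mb, so the matching
Torque request needs to be at least

    #PBS -l mem=200mb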

If you can reproduce this, you might try running a Gaussian job via
Torque, requesting all of the memory on a machine, then logging in and
watching the memory usage with tools other than Torque (free, top,
simple things like that); something like the sketch below.
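
A rough sketch (the 4gb figure, the node layout, and the g03 driver
are placeholders; substitute whatever your nodes and Gaussian version
actually use):

    #!/bin/sh
    #PBS -l nodes=1:ppn=1
    #PBS -l mem=4gb            # placeholder: the node's physical memory
    cd $PBS_O_WORKDIR
    g03 < bigjob.com > bigjob.log

Then, while it runs, log in to the node and check independently of
Torque:

    watch -n 5 free -m         # whole-node usage in MB, refreshed every 5s
    top                        # per-process resident size (the RES column)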

--andy

> -----Original Message-----
> From: torqueusers-bounces at supercluster.org 
> [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Steve Young
> Sent: Friday, June 23, 2006 10:44 AM
> To: torqueusers at supercluster.org
> Subject: [torqueusers] exceeding memory limits?
> 
> Hello,
> 	Periodically, some of my users are reporting errors 
> with their jobs.
> They keep getting:
> 
> =>> PBS: job killed: mem 1207959552 exceeded limit 209715200 
> Killed Terminated
> 
> This job, a Gaussian job, had requested 200mb of memory. What I am 
> trying to understand is what this message means. I take it Torque 
> thinks the job is actually using about 1.2gb of memory, which exceeds 
> the 200mb limit that this person requested? I'd like to find more 
> information about how Torque allocates/manages memory on nodes. If 
> anyone has more information about this I would greatly appreciate it. 
> Thanks,
> 
> -Steve
> 
> 

