[torqueusers] Short of physical memory, crash?

Diego Bacchin diego.bacchin at bmr-genomics.it
Thu Dec 20 17:52:05 MST 2012


Hi,
In my experience the node will start using the swap partition. The jobs will still run if you have enough swap, but performance will be very slow.
For example, a simple ssh login can take 5 seconds.
In my opinion the best solution is to run the jobs in two sets; that way the node will not use swap.
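
To make the arithmetic concrete: 12 jobs at a 4.5 GB peak need about 54 GB, which is 6 GB more than the 48 GB of RAM on the node. Below is a rough sketch in Python of how one might work out the batch size. The function name plan_batches and the 2 GB reserved for the operating system (reserve_gb) are my own assumptions for illustration; they are not part of Torque.

```python
import math

def plan_batches(node_ram_gb, cores, job_peak_gb, n_jobs, reserve_gb=2.0):
    """Return (jobs_per_batch, n_batches) so that peak memory stays in RAM.

    reserve_gb is an assumed allowance for the OS and daemons, not a
    measured value; adjust it for your own nodes.
    """
    usable_gb = node_ram_gb - reserve_gb
    # How many jobs fit in memory at their peak usage.
    fit_by_mem = int(usable_gb // job_peak_gb)
    if fit_by_mem < 1:
        raise ValueError("a single job does not fit in usable RAM")
    # Never run more jobs at once than there are cores or jobs.
    per_batch = min(cores, fit_by_mem, n_jobs)
    return per_batch, math.ceil(n_jobs / per_batch)

# The case from the question: 48 GB node, 12 cores, 4.5 GB per job, 12 jobs.
per_batch, batches = plan_batches(48, 12, 4.5, 12)
print(per_batch, batches)  # prints: 10 2
```

With these assumed numbers at most 10 jobs fit at once, so the 12 jobs need two sets (6 + 6 is the simplest split).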


Bye



--
Diego Bacchin

On 21 Dec 2012, at 00:35, "Tian, Dong" <dong.tian at gmail.com> wrote:

> Dear Experts,
> 
> I have the following question as a cluster user. My job is to submit jobs to the cluster to do simulations. Forgive me if my question sounds simple. :-)
> 
> In one example, a compute node has 48 GB RAM and 12 cores/CPUs. If each job takes <4 GB RAM, there should be no issue running 12 jobs on one node.
> 
> Now the problem is that one job takes 4.5 GB of physical RAM at peak, say as reported by qstat -f. If 12 such jobs are submitted and run on one compute node, is there any risk of crashing the compute node? Let us assume the job program is written in a safe manner.
> 
> My understanding is that the compute node may crash from the shortage of memory, but I would like confirmation from you guys.
> 
> Appreciate your time!
> 
> Thanks,
> Dong

