[torqueusers] Ulimit Help

Dr. Stephan Raub raub at uni-duesseldorf.de
Mon Jun 21 05:24:10 MDT 2010


Hi all,

 

This is a solution we have used on all our clusters for about four years now. Even PBSPro has this issue: during the boot process, the pbs_mom process does not get the limits defined in the conf file.

 

BUT: WHAT is the reason for this? I would really like to UNDERSTAND this, so that we can learn something from it. Does anyone have an idea what the mechanism behind this phenomenon might be?

 

Stephan

 

--

---------------------------------------------------------

| | Dr. rer. nat. Stephan Raub

| | Dipl. Chem.

| | Lehrstuhl für IT-Management / ZIM

| | Heinrich-Heine-Universität Düsseldorf Universitätsstr. 1 /

| | 25.41.O2.25-2

| | 40225 Düsseldorf / Germany

| |

| | Tel: +49-211-811-3911

---------------------------------------------------------

 


Important Note: This e-mail may contain trade secrets or privileged, undisclosed or otherwise confidential information. If you have received this e-mail in error, you are hereby notified that any review, copying or distribution of it is strictly prohibited. Please inform us immediately and destroy the original transmittal. Thank you for your cooperation.

 

From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Chris Vaughan
Sent: Monday, 21 June 2010 12:45
To: rishi pathak
Cc: torqueusers; Torque Users Mailing List
Subject: Re: [torqueusers] Ulimit Help

 

Thanks Rishi,

 

We added "ulimit -u unlimited" to the pbs_mom start/stop scripts, and that appears to have resolved the issue.
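
A limit set in the start script carries over to pbs_mom and to the jobs it spawns because resource limits are inherited across fork and exec. The following is a minimal sketch of that inheritance, not part of TORQUE: the function name child_soft_nofile_limit is hypothetical, and it lowers the open-files limit (RLIMIT_NOFILE) only because lowering a soft limit needs no privileges, whereas raising a limit to unlimited normally requires root.

```python
import resource
import subprocess
import sys


def child_soft_nofile_limit(new_soft):
    """Start a child with a lowered soft RLIMIT_NOFILE and return the value the child sees."""

    def lower_limit():
        # Runs in the forked child before exec; the new limit survives the exec.
        _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))

    result = subprocess.run(
        [sys.executable, "-c",
         "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"],
        preexec_fn=lower_limit,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Calling child_soft_nofile_limit(64) should return 64 on a typical Linux node, which is the same mechanism that lets a ulimit call in an init script reach the daemon started afterwards.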

 

Best Regards,

  _____  

Hi Chris,
We faced the same problem with 2.3.6. It seemed that pbs_mom, when started at system boot, did not inherit the limits from /etc/security/limits.conf. We verified this by submitting an interactive job and listing the limits with ulimit -a. A subsequent restart of pbs_mom (without a reboot) comes up with the defined limits. We observed that an ssh login for a user shows the correct limits on the node in question. Next, we tried making pbs_mom start after sshd, but the results were the same.
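
The limits a running daemon actually holds can also be checked without submitting a job, by reading /proc/<pid>/limits. This is a minimal sketch, assuming a Linux node with procfs mounted; the function names parse_proc_limits and limits_for_pid are hypothetical helpers, not TORQUE tools.

```python
import re


def parse_proc_limits(text):
    """Parse the text of /proc/<pid>/limits into {limit name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are padded with runs of spaces; limit names contain single spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def limits_for_pid(pid):
    """Read and parse the limits of a running process (assumes Linux procfs)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_proc_limits(handle.read())
```

Comparing limits_for_pid(<pid of pbs_mom>)["Max locked memory"] right after boot and again after a manual restart would show whether the daemon picked up different limits in the two cases.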

On Fri, Jun 18, 2010 at 6:35 PM, Chris Vaughan <chris at adaptivecomputing.com> wrote:

Hi,

 

I have a customer who is seeing the following error because the ulimits for the job are not unlimited. Does anyone know how to resolve this? Do I need to have this set when pbs_server and pbs_mom start?

 

Thanks,

 

We've been having problems running Fluent software at our site. It works well when run interactively, but it gives errors (see attached error file) when run through a queue. As you'll see in the attached file, it complains about 'locked down' memory, which it tries to use for MPI; MPI uses locked memory for Infiniband communications. The /etc/security/limits.conf file on all nodes reads:

@ *            hard    memlock         unlimited
@ *            soft    memlock         unlimited
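
Whether these settings reach a batch job can be checked from inside the job itself. The sketch below assumes Python is available on the compute nodes and that the platform exposes RLIMIT_MEMLOCK (Linux does); format_limit and describe_memlock are hypothetical names used only for illustration.

```python
import resource


def format_limit(value):
    """Render a raw rlimit value the way ulimit does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def describe_memlock():
    """Return the soft and hard locked-memory limits of the current process, in bytes."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return {"soft": format_limit(soft), "hard": format_limit(hard)}
```

Printing describe_memlock() once from an ssh login and once from a job script gives a direct comparison of the limit each environment inherited.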

 


-- 

Chris Vaughan  |  Technical Consultant - EMEA

Adaptive Computing | Mobile: +44 7800 973062 | Office: +44 1483 243578

3000 Cathedral Hill | Guildford GU2 7YB  |  United Kingdom

 


 


_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers




-- 
Regards--
Rishi Pathak
National PARAM Supercomputing Facility
Center for Development of Advanced Computing(C-DAC)
Pune University Campus,Ganesh Khind Road
Pune-Maharastra




