[torqueusers] PBS_NODEFILE incomplete

Scott Hazelhurst Scott.Hazelhurst at wits.ac.za
Sat Dec 18 06:50:13 MST 2010


Dear all,

I would like to re-open this thread.

http://www.supercluster.org/pipermail/torqueusers/2010-October/011518.html

We have exactly the same problem, and I've also fiddled for many days trying
all sorts of configurations to sort the problem out. It's not surprising we
have the same problem, since we are running the same software (part of our
national grid infrastructure, running gLite 3.2 on SL5.4). The torque and
maui packages are installed automatically by the grid installation. I am
sure that installing later versions would fix the problem, but I'm afraid
that would break some of the grid software, which is highly fragile.

The basic symptom is that PBS_NODEFILE is wrong. If in my job file I ask for
a certain number of processors so that I can run an MPI job across our
cluster, only one node is placed in PBS_NODEFILE.
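
A minimal sketch of a job script that makes the symptom visible, assuming an
illustrative request of 4 nodes with 2 processes per node (these values, the
job name and the helper function summarize_nodefile are hypothetical, not
taken from the actual job):

```shell
#!/bin/sh
#PBS -l nodes=4:ppn=2
#PBS -N nodefile-check

# Hypothetical helper: print each distinct host in a node file,
# followed by the number of slots listed for it.
summarize_nodefile() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Torque sets PBS_NODEFILE only inside a running job, so do nothing elsewhere.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    summarize_nodefile "$PBS_NODEFILE"
fi
```

With a correct allocation for this request one would expect four host names
with two slots each; with the problem described here, only one host name
appears.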

If I do a checkjob on the job being run, it looks like the right number of
nodes is being allocated and it shows the names of the nodes which are
available. However, the job only runs on one of the nodes and all my MPI
jobs run on that node (far in excess of the actual number of

We are running torque 2.3.6-2cri.el5 and maui
3.2.6p21-snap.1234905291.5.el5.

In my maui.cfg I have

ENABLEMULTIREQJOBS TRUE
ENABLEMULTINODEJOBS TRUE


I have experimented with a wide range of queue configurations, none of which
worked.


What should I have in my maui.cfg?

What are the appropriate torque queue parameters?

I want to be able to specify that an MPI job runs on p nodes with no more
than q processes per node.

If anyone could send me configurations, I'd be very grateful.

Many thanks

Scott






