[torqueusers] specifying nodes for MPI jobs on small cluster

Andrew Dawson dawson at atm.ox.ac.uk
Thu Feb 7 09:38:21 MST 2013


The nodes file looks like this:

cirrus np=1
cirrus1 np=8
cirrus2 np=8
cirrus3 np=8
cirrus4 np=8
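
As a sanity check on the processor count, here is a minimal Python sketch that totals the np= values in the listing above. The function name count_virtual_processors is made up for illustration and is not part of torque or maui, and the default of one processor for an entry without np= is an assumption about how torque reads such a line.

```python
# Total the np= values in a torque nodes file.
# The text below is the nodes file from this message, pasted in verbatim.
NODES_FILE_TEXT = """\
cirrus np=1
cirrus1 np=8
cirrus2 np=8
cirrus3 np=8
cirrus4 np=8
"""


def count_virtual_processors(text):
    """Return (number_of_hosts, total_np) for nodes-file text.

    Hypothetical helper, written only for this check.
    """
    hosts = 0
    total = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        hosts += 1
        np = 1  # assumption: an entry without np= counts as one processor
        for field in fields[1:]:
            if field.startswith("np="):
                np = int(field[len("np="):])
        total += np
    return hosts, total


hosts, total = count_virtual_processors(NODES_FILE_TEXT)
print(f"{hosts} hosts, {total} virtual processors")
```

For this listing it reports 5 hosts and 33 virtual processors, which matches the figures in my original message.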

On 7 Feb 2013 16:25, "Ricardo Román Brenes" <roman.ricardo at gmail.com> wrote:

> hi!
>
> What does your node config file look like?
>
> On Thu, Feb 7, 2013 at 3:10 AM, Andrew Dawson <dawson at atm.ox.ac.uk> wrote:
>
>> Hi all,
>>
>> I'm configuring a recent torque/maui installation and I'm having trouble
>> submitting MPI jobs. I would like MPI jobs to specify the number of
>> processors they require and have those come from any available physical
>> machine; users shouldn't need to specify processors per node, etc.
>>
>> The torque manual says that the nodes option is mapped to virtual
>> processors, so for example:
>>
>>     #PBS -l nodes=8
>>
>> should request 8 virtual processors. The problem I'm having is that our
>> cluster currently has only 5 physical machines (nodes), and setting nodes
>> to anything greater than 5 gives the error:
>>
>>     qsub: Job exceeds queue resource limits MSG=cannot locate feasible
>>     nodes (nodes file is empty or all systems are busy)
>>
>> I'm confused by this. We have 33 virtual processors available across the
>> 5 nodes (four 8-core machines and one single-core machine), so my
>> interpretation of the manual is that I should be able to request 8 nodes,
>> since these should be understood as virtual processors. Am I doing
>> something wrong?
>>
>> I tried setting
>>
>>     #PBS -l procs=8
>>
>> but that doesn't seem to have any effect: MPI stops because only 1
>> worker is available (a single core is allocated to the job).
>>
>> Thanks,
>> Andrew
>>
>> p.s.
>>
>> The queue I'm submitting jobs to is defined as:
>>
>> create queue normal
>> set queue normal queue_type = Execution
>> set queue normal resources_min.cput = 12:00:00
>> set queue normal resources_default.cput = 24:00:00
>> set queue normal disallowed_types = interactive
>> set queue normal enabled = True
>> set queue normal started = True
>>
>> We are using torque version 2.5.12, with maui 3.3.1 for scheduling.
>>
>>
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers