[torqueusers] Allocating resources by CPUs
Martin Siegert
siegert at sfu.ca
Mon Mar 12 16:52:28 MDT 2007
On Tue, Mar 13, 2007 at 09:00:51AM +1100, Chris Samuel wrote:
> On Sat, 10 Mar 2007, Martins, Flavio wrote:
>
> > I tried setting the resources_available_nodect parameter to 16 as mentioned
> > in previous suggestions to this problem. This allows me to submit the job,
> > but then it just sits in q status waiting for resources to become
> > available.
>
> Hmm, it may be that pbs_sched doesn't support scheduling like this and you may
> need to use Maui instead for that functionality.
>
> Anyone have a better clue than I ?
Actually, this is broken and always has been broken. It does not
work with Maui/Moab either.
This is not a scheduler problem: before any scheduler can place such
a job, you must be able to state in your submission script that the
job requires x cpus anywhere on the cluster.
Example: user A wants to run a job using 20 cpus anywhere on the cluster.
User B wants to run a job using 20 cpus with exactly one cpu on each
node.
Torque cannot express this distinction. In my opinion this is the largest
deficiency of Torque - and has been for a long time.
(There are attempts to work around this problem in Moab by specifying
the JOBNODEMATCHPOLICY parameter, but that doesn't really work either:
the example above cannot be implemented.)
This really needs to be fixed ...
(In my opinion the ncpus resource should be extended to work on clusters
so that -l ncpus=20 gives you 20 processors anywhere on the cluster; then
-l nodes=x:ppn=y should be interpreted literally, i.e., exactly x nodes
with y processors each - no need for the JOBNODEMATCHPOLICY parameter
anymore.)
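
To make the proposal concrete, here is a minimal sketch in Python of the
two request semantics. It is only an illustration of the behaviour I would
like to see: the function names, the dict of free processors per node, and
the order in which nodes are filled are all assumptions, not anything
Torque, Maui or Moab actually implements.

```python
# Hypothetical model of the proposed semantics; not Torque/Maui/Moab code.
from typing import Dict, Optional


def allocate_ncpus(free: Dict[str, int], ncpus: int) -> Optional[Dict[str, int]]:
    """Proposed '-l ncpus=N': take N free processors from any nodes."""
    if sum(free.values()) < ncpus:
        return None  # not enough free processors cluster-wide
    allocation: Dict[str, int] = {}
    remaining = ncpus
    # Fill from the nodes with the most free processors first
    # (an arbitrary choice made for this sketch).
    for node, count in sorted(free.items(), key=lambda kv: -kv[1]):
        if remaining == 0:
            break
        take = min(count, remaining)
        if take > 0:
            allocation[node] = take
            remaining -= take
    return allocation


def allocate_nodes_ppn(
    free: Dict[str, int], nodes: int, ppn: int
) -> Optional[Dict[str, int]]:
    """Literal '-l nodes=x:ppn=y': exactly x nodes, y processors on each."""
    candidates = sorted(node for node, count in free.items() if count >= ppn)
    if len(candidates) < nodes:
        return None  # fewer than x nodes can supply y processors
    return {node: ppn for node in candidates[:nodes]}


# Example cluster: 20 nodes with 2 free processors each (made-up numbers).
cluster = {f"node{i:02d}": 2 for i in range(20)}
user_a = allocate_ncpus(cluster, 20)          # 20 cpus anywhere
user_b = allocate_nodes_ppn(cluster, 20, 1)   # exactly 1 cpu on each of 20 nodes
```

With these definitions user A's job may be packed onto as few nodes as
have free processors, while user B's job is spread over exactly 20 nodes,
and neither request needs a site-wide policy switch to tell them apart.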
Cheers,
Martin
--
Martin Siegert
Head, HPC at SFU
WestGrid Site Lead
Academic Computing Services phone: (604) 291-4691
Simon Fraser University fax: (604) 291-4242
Burnaby, British Columbia email: siegert at sfu.ca
Canada V5A 1S6