[Mauiusers] Having problems scheduling jobs with PPN specification on a cluster.
Jonathan K Shelley
Jonathan.Shelley at inl.gov
Fri Feb 5 16:36:56 MST 2010
I tried that and it worked: my jobs were scheduled. However, my node
file had only one line in it instead of 24. I then found the
JOBNODEMATCHPOLICY parameter and removed it from my maui.cfg, after which
I could submit using -l nodes=24, which gave me what I wanted. But when I
try -l nodes=4:ppn=6, the job won't schedule. From what I have read, this
should work under Torque's liberal scheduling policy.
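A quick way to check the node file is to count how many lines each host contributes. Torque writes one line per allocated processor slot to the file named by $PBS_NODEFILE, so a 24-processor job should yield 24 lines. The following is a minimal sketch; the helper name slots_per_host and the sample host layout are hypothetical, chosen only for illustration.

```python
from collections import Counter


def slots_per_host(nodefile_text):
    """Count how many processor slots (lines) each host contributes."""
    hosts = [line.strip() for line in nodefile_text.splitlines() if line.strip()]
    return Counter(hosts)


# Hypothetical node file contents: 24 slots packed onto two hosts.
# Inside a real job, read the file named by the PBS_NODEFILE variable instead.
sample = "\n".join(["eos01"] * 12 + ["eos03"] * 12)
counts = slots_per_host(sample)

print("total slots:", sum(counts.values()))
for host, n in sorted(counts.items()):
    print(host, n)
```

Inside a running job, the same function could be fed open(os.environ["PBS_NODEFILE"]).read() after importing os.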
When I run checkjob, it returns:
checking job 1098
State: Idle
Creds: user:jon group:jon class:all qos:DEFAULT
WallTime: 00:00:00 of 21:00:00:00
SubmitTime: Fri Feb 5 16:15:25
(Time Queued Total: 00:17:43 Eligible: 00:17:43)
Total Tasks: 60
Req[0] TaskCount: 60 Partition: ALL
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [NONE]
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 0
PartitionMask: [ALL]
Reservation '1098' (23:32:14 -> 21:23:32:14 Duration: 21:00:00:00)
PE: 60.00 StartPriority: 17
job can run in partition DEFAULT (108 procs available. 60 procs required)
Here is my configuration:
# Resource Manager Definition
RMCFG[torque] TYPE=PBS
# Allocation Manager Definition
AMCFG[bank] TYPE=NONE
# full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
# use the 'schedctl -l' command to display current configuration
RMPOLLINTERVAL 00:00:30
SERVERPORT 42559
SERVERMODE NORMAL
# Admin: http://supercluster.org/mauidocs/a.esecurity.html
LOGFILE maui.log
LOGFILEMAXSIZE 10000000
LOGLEVEL 3
# Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html
QUEUETIMEWEIGHT 1
# Backfill: http://supercluster.org/mauidocs/8.2backfill.html
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
# Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html
NODEACCESSPOLICY SHARED
NODEALLOCATIONPOLICY MINRESOURCE
ENABLEMULTIREQJOBS TRUE
Any ideas?
Thanks,
Jon Shelley
HPC Software Consultant
Idaho National Lab
Phone (208) 526-9834
Fax (208) 526-0122
From: <Gareth.Williams at csiro.au>
Sent: 01/28/2010 03:24 PM
To: <Jonathan.Shelley at inl.gov>
Subject: RE: [Mauiusers] Having problems scheduling jobs with PPN
specification on a cluster.
Hi Jon,
The parameter JOBNODEMATCHPOLICY matters.
It turns out that there is an alternative syntax but I don't know if maui
supports it (or just moab).
Could you do me a favour and try a job with qsub -l procs=24
(or any other number - maybe > 24) and see if it works.
There was a recent thread on torqueusers on this.
cheers,
Gareth
From: Jonathan K Shelley [mailto:Jonathan.Shelley at inl.gov]
Sent: Thursday, 28 January 2010 9:15 AM
To: mauiusers at supercluster.org
Subject: [Mauiusers] Having problems scheduling jobs with PPN
specification on a cluster.
Maui Version: 3.2.6p21
Torque Version: 2.3.6
I have a cluster with 4 compute nodes. 1 node has 16 cores and the other 3
each have 24 cores. One of the nodes is down for maintenance. So I have 64
cores available to run jobs.
When I submit this line
qsub -I -l nodes=4:ppn=6
I get the following error:
01/27 13:41:39 INFO: 64 feasible tasks found for job 384:0 in
partition DEFAULT (24 Needed)
01/27 13:41:39 INFO: inadequate feasible nodes found for job 384:0 in
partition DEFAULT (3 < 4)
01/27 13:41:39 ALERT: job 384 cannot run in any partition
01/27 13:41:39 ALERT: cannot create new reservation for job 384
(shape[1] 24)
01/27 13:41:39 ALERT: cannot create new reservation for job 384
01/27 13:41:39 MJobSetHold(384,16,1:00:00,NoResources,cannot create
reservation for job '384' (intital reservation attempt)
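The "3 < 4" message above is consistent with each requested node being matched to a distinct physical node. The sketch below is a toy model of that distinction, not Maui's actual algorithm; the function names are hypothetical, and it assumes the three nodes that are up (16, 24, and 24 cores) are completely free.

```python
def exact_node_feasible(free_cores, nodes, ppn):
    """Each requested node must map to a distinct host with >= ppn free cores."""
    usable = [c for c in free_cores if c >= ppn]
    return len(usable) >= nodes


def packed_feasible(free_cores, nodes, ppn):
    """Several ppn-sized chunks are allowed to share one host."""
    chunks = sum(c // ppn for c in free_cores)
    return chunks >= nodes


# Assumption: the three nodes that are up, all idle (16 + 24 + 24 = 64 cores).
free = [16, 24, 24]

print(exact_node_feasible(free, 4, 6))  # distinct hosts required: 3 < 4
print(packed_feasible(free, 4, 6))      # chunks may share a host
print(exact_node_feasible(free, 3, 8))  # the request that did start
```

Under this model, nodes=4:ppn=6 fails only when every chunk needs its own host, while nodes=3:ppn=8 fits either way.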
If I submit this line
qsub -I -l nodes=3:ppn=8
it works just fine
01/27 14:47:16 INFO: 64 feasible tasks found for job 385:0 in
partition DEFAULT (24 Needed)
01/27 14:47:16 INFO: tasks located for job 385: 40 of 24 required (0
feasible)
01/27 14:47:16 MJobStart(385)
01/27 14:47:16 MJobStart(385)
01/27 14:47:16 MJobDistributeTasks(385,eos,NodeList,TaskMap)
01/27 14:47:16 MAMAllocJReserve(385,RIndex,ErrMsg)
01/27 14:47:16 MRMJobStart(385,Msg,SC)
01/27 14:47:16 MPBSJobStart(385,eos,Msg,SC)
01/27 14:47:16
MPBSJobModify(385,Resource_List,Resource,eos03.inel.gov:ppn=8+eos01.inel.gov:ppn=8+eos:ppn=8)
01/27 14:47:16 MPBSJobModify(385,Resource_List,Resource,3:ppn=8)
01/27 14:47:16 INFO: job '385' successfully started
My maui.cfg file has the following variables set.
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
NODEACCESSPOLICY SHARED
NODEALLOCATIONPOLICY FIRSTFIT
ENABLEMULTIREQJOBS TRUE
Am I missing something? Is there a way for Maui to schedule the job,
since there is clearly enough space to run it if Maui places 2 of the
requests on one node?
Thanks,
Jon Shelley
HPC Software Consultant
Idaho National Lab