[Mauiusers] Having problems scheduling jobs with PPN specification on a cluster.

Jonathan K Shelley Jonathan.Shelley at inl.gov
Wed Jan 27 15:15:19 MST 2010


Maui Version: 3.2.6p21
Torque Version: 2.3.6

I have a cluster with 4 compute nodes: one node has 16 cores and the other
three each have 24 cores. One of the 24-core nodes is down for maintenance,
so I have 64 cores (16 + 24 + 24) available to run jobs.
When I submit this command:

qsub -I -l nodes=4:ppn=6

I get the following error:
01/27 13:41:39 INFO:     64 feasible tasks found for job 384:0 in partition DEFAULT (24 Needed)
01/27 13:41:39 INFO:     inadequate feasible nodes found for job 384:0 in partition DEFAULT (3 < 4)
01/27 13:41:39 ALERT:    job 384 cannot run in any partition
01/27 13:41:39 ALERT:    cannot create new reservation for job 384 (shape[1] 24)
01/27 13:41:39 ALERT:    cannot create new reservation for job 384
01/27 13:41:39 MJobSetHold(384,16,1:00:00,NoResources,cannot create reservation for job '384' (intital reservation attempt)

If I submit this command instead:

qsub -I -l nodes=3:ppn=8

it works just fine:
01/27 14:47:16 INFO:     64 feasible tasks found for job 385:0 in partition DEFAULT (24 Needed)
01/27 14:47:16 INFO:     tasks located for job 385:  40 of 24 required (0 feasible)
01/27 14:47:16 MJobStart(385)
01/27 14:47:16 MJobStart(385)
01/27 14:47:16 MJobDistributeTasks(385,eos,NodeList,TaskMap)
01/27 14:47:16 MAMAllocJReserve(385,RIndex,ErrMsg)
01/27 14:47:16 MRMJobStart(385,Msg,SC)
01/27 14:47:16 MPBSJobStart(385,eos,Msg,SC)
01/27 14:47:16 MPBSJobModify(385,Resource_List,Resource,eos03.inel.gov:ppn=8+eos01.inel.gov:ppn=8+eos:ppn=8)
01/27 14:47:16 MPBSJobModify(385,Resource_List,Resource,3:ppn=8)
01/27 14:47:16 INFO:     job '385' successfully started


My maui.cfg file has the following parameters set:
BACKFILLPOLICY        FIRSTFIT
RESERVATIONPOLICY     CURRENTHIGHEST

NODEACCESSPOLICY        SHARED
NODEALLOCATIONPOLICY    FIRSTFIT
ENABLEMULTIREQJOBS      TRUE


Am I missing something? Is there a way for Maui to schedule the first job?
There is clearly enough space to run it if Maui places two of the four
nodes=4:ppn=6 requests on one node.
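
To make the arithmetic concrete, here is a minimal Python sketch of the two
ways the request could be interpreted. This is only an illustration of the
counting, not Maui's actual logic; the function names are made up, and the
free core counts assume the three nodes that are up are otherwise idle.

```python
# Free cores on the nodes that are up (assumed idle): one 16-core, two 24-core.
free_cores = [16, 24, 24]


def fits_on_distinct_nodes(free, nodes, ppn):
    """True if each of the `nodes` requests can land on a different host."""
    hosts_with_room = sum(1 for cores in free if cores >= ppn)
    return hosts_with_room >= nodes


def fits_with_packing(free, nodes, ppn):
    """True if the requests fit when several may share one host."""
    chunks_available = sum(cores // ppn for cores in free)
    return chunks_available >= nodes


# nodes=4:ppn=6 -> only 3 hosts exist, so a one-request-per-host rule fails,
# which matches the "(3 < 4)" message in the log.
print(fits_on_distinct_nodes(free_cores, 4, 6))

# The same request fits easily if two requests may share a host.
print(fits_with_packing(free_cores, 4, 6))

# nodes=3:ppn=8 -> three hosts with at least 8 free cores, so it fits either way.
print(fits_on_distinct_nodes(free_cores, 3, 8))
```

The first check fails and the other two succeed, which is the behavior I would
like Maui to reproduce for the nodes=4:ppn=6 case.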

Thanks,

Jon Shelley
HPC Software Consultant
Idaho National Lab