[torqueusers] Maui not cancelling job that exceeds ncpus limit

Emir Imamagic eimamagi at srce.hr
Sun Aug 28 16:01:38 MDT 2005


Hi,

I'm using Maui 3.2.6p13 and torque_1.2.0p4 on a dual-processor cluster. I
would like to limit the number of processors per job to 1 using the Torque
configuration. First I tried nodect, but that didn't limit processors.
Then I tried ncpus and noticed that Maui reports PolicyViolation but
doesn't cancel the job. Command output is at the end.

Is there another way to set this limit, or am I missing something in the
configuration?
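In the meantime I'm considering rejecting such jobs at submit time with a
site submit filter, roughly like the sketch below (untested; the filter
path is site-specific and set in torque.cfg, and this only inspects #PBS
directives in the script body, so -l options given directly on the qsub
command line may not be caught in older Torque versions):

```shell
#!/bin/sh
# Untested sketch of a Torque submit filter: qsub pipes the job script
# through the filter, and a non-zero exit status rejects the submission
# outright, instead of leaving the job held with PolicyViolation.
filter_ppn() {
    while IFS= read -r line; do
        case "$line" in
            "#PBS "*ppn=*)
                # Extract the digits following ppn=
                ppn=$(printf '%s\n' "$line" | sed 's/.*ppn=\([0-9]*\).*/\1/')
                if [ "$ppn" -gt 1 ]; then
                    echo "rejected: ppn=$ppn exceeds the 1-cpu-per-job limit" >&2
                    return 1
                fi
                ;;
        esac
        printf '%s\n' "$line"   # pass the line through unchanged
    done
    return 0
}

# The real filter would simply run: filter_ppn
```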

Thanx in advance,
Emir Imamagic
University Computing Centre

$ less maui.cfg
...
# config relevant to node allocation
NODEALLOCATIONPOLICY    MINRESOURCE
NODEACCESSPOLICY        SHARED
JOBNODEMATCHPOLICY      EXACTNODE
JOBREJECTPOLICY         CANCEL

$ diagnose -c
Class/Queue Status

Name           Priority Flags        QDef              QOSList* PartitionList        Target Limits

interactive        5000 [NONE]       [NONE]             [NONE]  [NONE]               0.00  [NONE]
   MAXNODEPERJOB=1  MAXPROCPERJOB=1

$ qstat -fQ
Queue: interactive
     queue_type = Execution
     total_jobs = 1
     state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:1 Exiting:0
     resources_max.cput = 06:00:00
     resources_max.ncpus = 1
     resources_max.nodect = 1
     resources_max.walltime = 06:00:00
     resources_min.cput = 00:00:01
     resources_min.walltime = 00:00:01
     resources_default.cput = 06:00:00
     resources_default.ncpus = 1
     resources_default.nodect = 1
     resources_default.walltime = 06:00:00
     resources_assigned.ncpus = 1
     resources_assigned.nodect = 1
     enabled = True
     started = True

$ qsub -l nodes=1:ppn=2 -q interactive test.sh
2352.grozd.etfos.cro-grid.hr

$ qstat
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
2352.grozd       waittest         eimamagi                0 R interactive

$ checkjob 2352

checking job 2352

State: Running
Creds:  user:eimamagi  group:eimamagi  class:interactive  qos:DEFAULT
WallTime: 00:00:11 of 6:00:00
SubmitTime: Sun Aug 28 23:46:23
   (Time Queued  Total: 00:00:01  Eligible: 00:00:01)

StartTime: Sun Aug 28 23:46:24
Total Tasks: 2

Req[0]  TaskCount: 2  Partition: DEFAULT
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
Allocated Nodes:
[compute-0-2.local:2]


IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]
Flags:       RESTARTABLE

Reservation '2352' (-00:00:11 -> 5:59:49  Duration: 6:00:00)
Holds:    Batch  (hold reason:  PolicyViolation)
Messages:  procs too high (2 > 1)
PE:  2.00  StartPriority:  7000
