[Mauiusers] I'm obviously missing something with hard and soft limits

Mike Renfro renfro at tntech.edu
Thu Mar 30 15:40:39 MST 2006


I've got a total of 7 dual-CPU machines (14 processors) running
Maui/Torque jobs 24/7, and after implementing my first set of processor
limits, things are not working as I had planned. I've got
USERCFG[DEFAULT] MAXPROC=4,14 set in maui.cfg (soft limit 4, hard limit
14). The only jobs in the queue belong to user0000001 and user000002,
and 3 of the 14 CPUs are still available, yet user0000001's remaining
jobs are blocked. user0000001 has 8 of the 14 processors and user000002
has 3. Why can't user0000001 take over the remaining 3 if nothing else
is waiting? Did I just misread something?

Current showq output:

ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

626                user0000001    Running     1    10:40:59  Wed Mar 29 16:12:46
644                user0000001    Running     1    16:10:33  Wed Mar 29 21:42:20
645                user0000001    Running     1    16:14:18  Wed Mar 29 21:46:05
646                user0000001    Running     1    16:19:33  Wed Mar 29 21:51:20
642                user0000001    Running     1    20:06:24  Wed Mar 29 20:38:11
635                user000002     Running     1    21:01:56  Thu Mar 30 13:33:43
627                user0000001    Running     1    21:33:27  Thu Mar 30 03:05:14
628                user0000001    Running     1    23:51:51  Thu Mar 30 05:23:38
629                user0000001    Running     1    23:58:37  Thu Mar 30 05:30:24
648                user000002     Running     1  1:09:04:11  Thu Mar 30 13:35:58
649                user000002     Running     1  1:09:04:26  Thu Mar 30 13:36:13

    11 Active Jobs      11 of   14 Processors Active (78.57%)
                         6 of    7 Nodes Active      (85.71%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME

630                user0000001       Idle     1  1:11:00:00  Wed Mar 29 03:00:48
631                user0000001       Idle     1  1:11:00:00  Wed Mar 29 03:02:00
632                user0000001       Idle     1  1:11:00:00  Wed Mar 29 03:03:18
633                user0000001       Idle     1  1:11:00:00  Wed Mar 29 03:04:53

Total Jobs: 15   Active Jobs: 11   Idle Jobs: 0   Blocked Jobs: 4
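
The per-user processor counts can be double-checked against the showq
listing. The following is a minimal Python sketch, not a Maui tool: the
rows are the ACTIVE JOBS lines above reduced to four columns, and
procs_per_user is a hypothetical helper introduced only for illustration.

```python
from collections import Counter

# Active-job rows copied from the showq output above,
# reduced to: job id, user name, state, processor count.
ACTIVE = """\
626 user0000001 Running 1
644 user0000001 Running 1
645 user0000001 Running 1
646 user0000001 Running 1
642 user0000001 Running 1
635 user000002 Running 1
627 user0000001 Running 1
628 user0000001 Running 1
629 user0000001 Running 1
648 user000002 Running 1
649 user000002 Running 1
"""

TOTAL_PROCS = 14  # 7 dual-CPU nodes


def procs_per_user(rows: str) -> Counter:
    """Sum the processor column per user for whitespace-separated rows."""
    totals = Counter()
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _job, user, _state, procs = fields[:4]
        totals[user] += int(procs)
    return totals


usage = procs_per_user(ACTIVE)
free = TOTAL_PROCS - sum(usage.values())
print(dict(usage), "free:", free)
```

This should report 8 processors for user0000001, 3 for user000002, and
3 free, matching the figures quoted in the question.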

Current output from checkjob 630:

checking job 630

State: Idle
Creds:  user:user0000001  group:users  class:ch226  qos:DEFAULT
WallTime: 00:00:00 of 1:11:00:00
SubmitTime: Wed Mar 29 03:00:48
  (Time Queued  Total: 1:13:33:28  Eligible: 18:58:55)

Total Tasks: 1

Req[0]  TaskCount: 1  Partition: ALL
Network: [NONE]  Memory >= 0  Disk >= 0  Swap >= 0
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]


IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 0
PartitionMask: [ALL]
Flags:       RESTARTABLE

PE:  1.00  StartPriority:  1138
cannot select job 630 for partition DEFAULT (job 630 violates active HARD MAXPROC limit of 14 for user user0000001  (R: 1, U: 16)
)
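
One way to read the checkjob message, as a minimal sketch: assuming R is
the processor count requested by the job and U is the usage Maui
currently attributes to the user (an interpretation of the message, not
something confirmed here), the rejection amounts to U + R exceeding the
hard limit. violates_hard_limit is a hypothetical helper, not part of
Maui.

```python
def violates_hard_limit(requested: int, usage: int, hard_limit: int) -> bool:
    """Assumed check: reject when usage plus the new request exceeds the limit."""
    return usage + requested > hard_limit


HARD_MAXPROC = 14  # hard value from MAXPROC=4,14

# Values reported by checkjob 630: R: 1, U: 16
print(violates_hard_limit(requested=1, usage=16, hard_limit=HARD_MAXPROC))

# Usage as counted from showq: 8 running single-processor jobs
print(violates_hard_limit(requested=1, usage=8, hard_limit=HARD_MAXPROC))
```

Under that assumption the first call flags a violation and the second
does not, which highlights that the U of 16 reported by checkjob differs
from the 8 processors showq lists for user0000001.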

-- 
Mike Renfro  / R&D Engineer, Center for Manufacturing Research,
931 372-3601 / Tennessee Technological University -- renfro at tntech.edu

