[torqueusers] [Mauiusers] priority job failing to get reservation
Naveed Near-Ansari
naveed at caltech.edu
Fri Apr 20 18:13:01 MDT 2012
On 04/20/2012 04:23 PM, Lyn Gerner wrote:
> Naveed,
>
> It looks like your setup is only showing 1056 procs, not 3552:
>> PE: 1501.00 StartPriority: 144235
>> job cannot run in partition DEFAULT (insufficient idle procs available:
>> 1056 < 1501)
> You might play w/diagnose -t (partition) and diagnose -j (job) to see
> what they tell you. Also, you could try to explicitly make a
> reservation for the job, and maybe then you could get info from
> diagnose -r (though attempting the setres may give enough error info).
>
> Good luck,
> Lyn
Thanks for looking.
I think it is configured for 3768 procs (I said 3552 because the queue it
was sent to has that many available to it). I didn't see anything clear in
either diagnose command. I attempted to create a reservation, but it
failed.
# setres -u ortega -d 4:00:00:00 TASKS==1501
ERROR: 'setres' failed
ERROR: cannot select 1501 tasks for reservation for 3:13:33:56
ERROR: cannot select requested tasks for 'TASKS==1501'
# diagnose -t
Displaying Partition Status

System Partition Settings: PList: DEFAULT PDef: DEFAULT

Name     Procs
DEFAULT   3768

Partition  Configured          Up      U/C   Dedicated      D/U     Active      A/U
NODE----------------------------------------------------------------------------
DEFAULT           314         313   99.68%         297   94.89%        297   94.89%
PROC----------------------------------------------------------------------------
DEFAULT          3768        3756   99.68%        3564   94.89%       3000   79.87%
MEM----------------------------------------------------------------------------
DEFAULT      15156264    15107978   99.68%    14335282   94.89%          0    0.00%
SWAP----------------------------------------------------------------------------
DEFAULT      30227950    30131665   99.68%    28590985   94.89%    1400704    4.65%
DISK----------------------------------------------------------------------------
DEFAULT           314         313   99.68%         297   94.89%          0    0.00%

Class/Queue State
[<CLASS> <AVAIL>:<UP>]...
DEFAULT [shared 3756:3756][debug 3756:3756][default 477:3756][gpu 3756:3756]
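
As a rough cross-check of the PROC row above, here is the subtraction done by hand in shell. This is only a sketch: the variable names are made up for illustration, the values are copied from the output above, and it is plain arithmetic rather than anything Maui itself runs. Both results come out well below the 1501 procs the job asks for.

```shell
#!/bin/sh
# Values copied by hand from the PROC row of the diagnose -t output.
up=3756
dedicated=3564
active=3000
requested=1501   # procs requested by job 220559

# Hypothetical variable names; plain arithmetic, not a Maui command.
not_dedicated=$((up - dedicated))
not_active=$((up - active))

echo "up but not dedicated: $not_dedicated"
echo "up but not active:    $not_active"

# Compare the procs not currently dedicated against the request.
if [ "$not_dedicated" -lt "$requested" ]; then
    echo "short by $((requested - not_dedicated)) procs"
fi
```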
# diagnose -j 220559
Name    State  Par  Proc  QOS  WCLimit     R  Min   User    Group   Account  QueuedTime  Network  Opsys   Arch    Mem  Disk  Procs  Class        Features
220559  Idle   ALL  1501  ded  4:00:00:00  0  1501  ortega  simons  -        1:23:34:41  [NONE]   [NONE]  [NONE]  >=0  >=0   NC0    [default:1]  [default]