[torqueusers] [Mauiusers] priority job failing to get reservation

Naveed Near-Ansari naveed at caltech.edu
Fri Apr 20 18:13:01 MDT 2012



On 04/20/2012 04:23 PM, Lyn Gerner wrote:
> Naveed,
>
> It looks like your setup is only showing 1056 procs, not 3552:
>> PE:  1501.00  StartPriority:  144235
>> job cannot run in partition DEFAULT (insufficient idle procs available:
>> 1056 < 1501)
> You might play w/diagnose -t (partition) and diagnose -j (job) to see
> what they tell you.  Also, you could try to explicitly make a
> reservation for the job, and maybe then you could get info from
> diagnose -r (though attempting the setres may give enough error info).
>
> Good luck,
> Lyn

Thanks for looking.

I think it is configured for 3768 (I said 3552 because the queue the job
was sent to has that many available to it). I didn't see anything
conclusive in the output of either diagnose command. I attempted to
create a reservation, but it failed.

# setres -u ortega -d 4:00:00:00 TASKS==1501
ERROR:    'setres' failed
ERROR:    cannot select 1501 tasks for reservation for 3:13:33:56
ERROR:    cannot select requested tasks for 'TASKS==1501'



# diagnose -t
Displaying Partition Status

System Partition Settings:  PList: DEFAULT PDef: DEFAULT

Name                    Procs

DEFAULT                  3768

Partition    Configured         Up     U/C  Dedicated     D/U     Active     A/U

NODE----------------------------------------------------------------------------
DEFAULT             314        313  99.68%        297  94.89%        297  94.89%
PROC----------------------------------------------------------------------------
DEFAULT            3768       3756  99.68%       3564  94.89%       3000  79.87%
MEM----------------------------------------------------------------------------
DEFAULT        15156264   15107978  99.68%   14335282  94.89%          0   0.00%
SWAP----------------------------------------------------------------------------
DEFAULT        30227950   30131665  99.68%   28590985  94.89%    1400704   4.65%
DISK----------------------------------------------------------------------------
DEFAULT             314        313  99.68%        297  94.89%          0   0.00%
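
As a rough cross-check of the PROC row in the table above, the short
Python sketch below recomputes the three percentage columns from the raw
counts. It assumes, from the column headings only, that U/C means
Up/Configured, D/U means Dedicated/Up, and A/U means Active/Up; the
helper name pct is made up for this illustration and is not part of Maui.

```python
# PROC row of the `diagnose -t` output: configured, up, dedicated, active.
configured, up, dedicated, active = 3768, 3756, 3564, 3000


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Assumed meaning of the ratio columns, inferred from the headings.
ratios = {
    "U/C": pct(up, configured),   # Up / Configured
    "D/U": pct(dedicated, up),    # Dedicated / Up
    "A/U": pct(active, up),       # Active / Up
}

for name, value in ratios.items():
    print(f"{name}: {value:.2f}%")
```

Under that reading the recomputed values agree with the 99.68%, 94.89%
and 79.87% printed in the PROC row.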

Class/Queue State

             [<CLASS> <AVAIL>:<UP>]...

     DEFAULT [shared 3756:3756][debug 3756:3756][default 477:3756][gpu 3756:3756]
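
To compare the job's request with the per-class numbers above, here is a
minimal Python sketch that parses the Class/Queue State line using the
[<CLASS> <AVAIL>:<UP>] legend and subtracts. The function name
parse_class_state is hypothetical, and whether Maui actually uses the
AVAIL figure when deciding if the job can reserve processors is an
assumption, not something the output confirms.

```python
import re

# Class/Queue State line copied from the `diagnose -t` output.
line = ("DEFAULT [shared 3756:3756][debug 3756:3756]"
        "[default 477:3756][gpu 3756:3756]")


def parse_class_state(text):
    """Map each class name to its (avail, up) processor counts."""
    entries = re.findall(r"\[(\S+)\s+(\d+):(\d+)\]", text)
    return {name: (int(avail), int(up)) for name, avail, up in entries}


classes = parse_class_state(line)

requested = 1501       # Proc count of job 220559 in `diagnose -j`
job_class = "default"  # class listed for job 220559 in `diagnose -j`

avail, up = classes[job_class]
# Assumption: AVAIL is the number of procs the class can still hand out.
shortfall = max(0, requested - avail)
print(f"{job_class}: avail={avail} up={up} "
      f"requested={requested} shortfall={shortfall}")
```

Read this way, the default class shows 477 processors available against
a request of 1501, while the other three classes show all 3756.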



# diagnose -j 220559
Name                  State Par Proc QOS     WCLimit R  Min     User    Group  Account  QueuedTime  Network  Opsys   Arch    Mem   Disk  Procs       Class Features

220559                 Idle ALL 1501 ded  4:00:00:00 0 1501   ortega   simons        -  1:23:34:41   [NONE] [NONE] [NONE]    >=0    >=0    NC0 [default:1] [default]


