[Mauiusers] Requesting a given number of processors

Jan Ploski Jan.Ploski at offis.de
Fri Sep 28 10:03:38 MDT 2007


"Toni L. Harbaugh-Blackford [Contr]" <harbaugh at ncifcrf.gov> wrote on
09/28/2007 04:14:45 PM:

> On Fri, 28 Sep 2007, Jan Ploski wrote:
> 
>   > "Toni L. Harbaugh-Blackford [Contr]" <harbaugh at ncifcrf.gov> wrote on
>   > 09/28/2007 02:35:50 PM:
>   > 
>   > > 
>   > > Jan-
>   > > 
>   > > Depending on how you have things configured, you may be able to use
>   > > '-l ncpus=100'.
>   > 
>   > Toni,
>   > 
>   > Thanks for the tip. Unfortunately it doesn't work in our cluster. (I am
>   > the administrator, so if you happen to know any options that influence
>   > it, please share.) I think ncpus=100 makes it look for a machine with 100
>   > processors. I get 'rejected : CPU' lines in the output of checkjob -v.
>   >
>   > It's amazing that these trivial matters seem not to be documented
>   > anywhere.
> 
> It may be something in your configuration, although I have never tried
> asking for 100 cpus, only 8, and was able to get that to work.

Ok, I'm unable to get it working with 9, so it's not about the big number.

> Are you using "nodes=", either in your torque queue configurations or
> on the qsub command line?

I'm not using nodes= in the queue configuration. When I use nodes=9 on the 
command line I get one error (see my latest message). When I use ncpus=9 
on the command line, then I get another error (see my previous message).

> If you submit a job and it stays queued, what does the qstat -f output
> look like?

I suppose you only wish to see it for the job that is not running, not
for all jobs? Here it is, for the ncpus=9 variant:

jploski at srvgrid01:~/torque> qstat -f 346784.srvgrid01
Job Id: 346784.srvgrid01.offis.uni-oldenburg.de
    Job_Name = jpl1.jb
    Job_Owner = jploski at srvgrid01.offis.uni-oldenburg.de
    job_state = Q
    queue = verylong
    server = srvgrid01.offis.uni-oldenburg.de
    Checkpoint = u
    ctime = Fri Sep 28 18:00:18 2007
    Error_Path = srvgrid01:/home/jploski/torque/jpl1.ERR
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = n
    mtime = Fri Sep 28 18:00:18 2007
    Output_Path = srvgrid01:/home/jploski/torque/jpl1.OUT
    Priority = 0
    qtime = Fri Sep 28 18:00:18 2007
    Rerunable = True
    Resource_List.ncpus = 9
    Resource_List.nodect = 176
    Variable_List = PBS_O_HOME=/home/jploski,PBS_O_LANG=en_US,
        PBS_O_LOGNAME=jploski,
 PBS_O_PATH=/home/jploski/bin:/home/jploski/bin:/usr/local/bin:/usr/bi
 n:/usr/X11R6/bin:/bin:/usr/games:/opt/gnome/bin:/opt/kde3/bin:/opt/ofe
 d-1.1/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:/opt/ofed-1.1/sbin:/opt/p
 gi/linux86-64/6.2/bin:/opt/pgi/linux86-64/6.2/mpi/mpich/bin:/opt/ncl-m
 etno:/opt/netcdf-3.6.1-pgcc/bin:/opt/ncl/bin:/opt/condor/bin:/opt/cond
 or/sbin:/opt/globus-install//bin:/opt/globus-install//sbin:/opt/mpiexe
 c/bin:/opt/ncview-1.93c/bin:/opt/nco/bin:/opt/cdo/bin:/opt/bashdb/bin,
        PBS_O_MAIL=/var/mail/jploski,PBS_O_SHELL=/bin/bash,
        PBS_O_HOST=srvgrid01.offis.uni-oldenburg.de,
        PBS_O_WORKDIR=/home/jploski/torque,PBS_O_QUEUE=verylong
    etime = Fri Sep 28 18:00:18 2007
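
For anyone who wants to compare what the server recorded for the ncpus= and
nodes= submissions, here is a minimal sketch in Python that pulls the
Resource_List.* entries out of saved qstat -f text. It assumes the layout
shown above (attributes indented by four spaces as "name = value", wrapped
values continued on more deeply indented lines). The names parse_qstat_full,
requested_resources and SAMPLE are made up for this illustration and are not
part of TORQUE or Maui; SAMPLE is an abbreviated copy of the output above.

```python
import re

# An attribute line: exactly four spaces of indent, then "name = value".
# Continuation lines of a wrapped value do not match this pattern.
ATTR_RE = re.compile(r"^ {4}(\S+) = (.*)$")


def parse_qstat_full(text):
    """Parse the text of one job from `qstat -f` into a dict (hypothetical helper)."""
    attrs = {}
    last_key = None
    for line in text.splitlines():
        if line.startswith("Job Id:"):
            attrs["Job Id"] = line.split(":", 1)[1].strip()
            last_key = None
            continue
        match = ATTR_RE.match(line)
        if match:
            last_key = match.group(1)
            attrs[last_key] = match.group(2).strip()
        elif last_key is not None and line.strip():
            # Wrapped value: glue the continuation onto the previous attribute.
            attrs[last_key] += line.strip()
    return attrs


def requested_resources(attrs):
    """Return only the Resource_List.* entries, with the prefix removed."""
    prefix = "Resource_List."
    return {k[len(prefix):]: v for k, v in attrs.items() if k.startswith(prefix)}


# Abbreviated copy of the qstat -f output for job 346784 shown above.
SAMPLE = """Job Id: 346784.srvgrid01.offis.uni-oldenburg.de
    Job_Name = jpl1.jb
    job_state = Q
    queue = verylong
    Resource_List.ncpus = 9
    Resource_List.nodect = 176
    Variable_List = PBS_O_HOME=/home/jploski,PBS_O_LANG=en_US,
        PBS_O_LOGNAME=jploski
"""

attrs = parse_qstat_full(SAMPLE)
print(attrs["job_state"], requested_resources(attrs))
```

Running the same helper on the qstat -f output of the nodes=9 job would make
it easy to put the two resource lists side by side.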

Best regards,
Jan Ploski

