[Mauiusers] Partition problem

Bas van der Vlies basv at sara.nl
Mon Apr 3 01:00:56 MDT 2006


You forgot to make use of the node properties in the queue definitions;
this was in my first email:

  serial:
  set queue q_serial resources_default.neednodes = serial

  parallel:
  set queue q_parallel resources_default.neednodes = parallel

Then the queue is selected automatically if a job uses more than one node.
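
Applied to the queue names in the configuration quoted below (serial and
parallel rather than q_serial and q_parallel), this would presumably look
like the following qmgr input. It is only a sketch derived from the quoted
settings and has not been tested on that cluster:

```
# Assumption: queue names and node properties are both called
# "serial" and "parallel", as in the quoted qmgr output and nodes file.
set queue serial resources_default.neednodes = serial
set queue parallel resources_default.neednodes = parallel
```

The value on the right-hand side has to match the property written next
to each host in $PBS_HOME/server_priv/nodes.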

Regards

On Apr 3, 2006, at 8:41 AM, luxun wrote:

> Thanks for your help.
>
> I tried setting node properties in $PBS_HOME/server_priv/nodes.
> Then I submitted some parallel jobs: some ran on the parallel queue's
> nodes, others ran on the serial queue's nodes. Serial jobs behave the
> same way.
> Accounting logs:
> 04/03/2006 09:35:56;E;3.i159.ascc;user=wzlu group=wzlu jobname=cpi  
> queue=parallel ctime=1144028142 qtime=1144028142 etime=1144028142  
> start=1144028142 exec_host=i153.ascc/0+i152.ascc/0  
> Resource_List.neednodes=2 Resource_List.nodect=2  
> Resource_List.nodes=2 session=0 end=1144028156 Exit_status=271  
> resources_used.cput=00:00:00 resources_used.mem=0kb  
> resources_used.vmem=0kb resources_used.walltime=00:00:14
> (This parallel job ran on i153.ascc and i152.ascc, which are
> defined for the serial queue.)
>
> maui.log has the following messages:
> 04/03 14:14:51 MPBSNodeUpdate(i154.ascc,i154.ascc,Idle,base)
> 04/03 14:14:51 MPBSLoadQueueInfo(base,i154.ascc,SC)
> 04/03 14:14:51 INFO:     queue 'batch' started state set to True
> 04/03 14:14:51 INFO:     class to node not mapping enabled for queue 'batch' adding class to all nodes
> 04/03 14:14:51 INFO:     queue 'serial' started state set to True
> 04/03 14:14:51 INFO:     class to node not mapping enabled for queue 'serial' adding class to all nodes
> 04/03 14:14:51 INFO:     queue 'parallel' started state set to True
> 04/03 14:14:51 INFO:     class to node not mapping enabled for queue 'parallel' adding class to all nodes
>
> When I add "#PBS -l nodes=2:parallel" to the job script, all the
> parallel jobs run on the parallel queue's nodes.
> Accounting logs:
> 04/03/2006 09:55:40;E;4.i159.ascc;user=wzlu group=wzlu jobname=cpi  
> queue=parallel ctime=1144029318 qtime=1144029318 etime=1144029318  
> start=1144029319 exec_host=i156.ascc/0+i155.ascc/0  
> Resource_List.neednodes=2:parallel Resource_List.nodect=2  
> Resource_List.nodes=2:parallel session=0 end=1144029340  
> Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=616kb  
> resources_used.vmem=5276kb resources_used.walltime=00:00:22
> (This parallel job ran on i156.ascc and i155.ascc, which are
> defined for the parallel queue.)
>
> Adding "#PBS -l nodes=2:parallel" to every job script is inconvenient
> for most users.
> I think something is missing in my setup.
>
> Any ideas? Thanks.
>
> My environment is:
> OS - RHEL 4 WS 64 bit
> torque - 2.0.0p8
> maui - 3.2.6p14
> serial queue - i151.ascc i152.ascc i153.ascc i154.ascc
> parallel queue - i155.ascc i156.ascc
>
> The torque configuration is as follows:
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue batch
> #
> create queue batch
> set queue batch queue_type = Route
> set queue batch route_destinations = serial
> set queue batch route_destinations += parallel
> set queue batch enabled = True
> set queue batch started = True
> #
> # Create and define queue serial
> #
> create queue serial
> set queue serial queue_type = Execution
> set queue serial resources_max.nodect = 1
> set queue serial resources_default.nodect = 1
> set queue serial resources_default.nodes = 1:ppn=1
> set queue serial enabled = True
> set queue serial started = True
> #
> # Create and define queue parallel
> #
> create queue parallel
> set queue parallel queue_type = Execution
> set queue parallel resources_max.nodect = 64
> set queue parallel resources_min.nodect = 2
> set queue parallel resources_default.nodect = 2
> set queue parallel resources_default.nodes = 2:ppn=1
> set queue parallel enabled = True
> set queue parallel started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server acl_host_enable = False
> set server acl_user_enable = False
> set server default_queue = batch
> set server log_events = 511
> set server mail_from = adm
> set server query_other_jobs = True
> set server resources_default.neednodes = 1
> set server resources_default.nodes = 1:ppn=1
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server default_node = 1
> set server pbs_version = 2.0.0p8-1cri
>
> nodes
> i151.ascc serial
> i152.ascc serial
> i153.ascc serial
> i154.ascc serial
> i155.ascc parallel
> i156.ascc parallel
> i157.ascc parallel
>
> maui.cfg
> # maui.cfg 3.2.6p14
>
> SERVERHOST            i159.ascc
> # primary admin must be first in list
> ADMIN1                root
>
> # Resource Manager Definition
>
> RMCFG[base] TYPE=PBS
>
> # Allocation Manager Definition
>
> #AMCFG[bank]  TYPE=NONE
>
> # full parameter docs at http://clusterresources.com/mauidocs/a.fparameters.html
> # use the 'schedctl -l' command to display current configuration
>
> RMPOLLINTERVAL        00:00:30
>
> SERVERPORT            42559
> SERVERMODE            NORMAL
>
> # Admin: http://clusterresources.com/mauidocs/a.esecurity.html
>
> LOGFILE               maui.log
> LOGFILEMAXSIZE        10000000
> LOGLEVEL              3
>
> # Job Priority: http://clusterresources.com/mauidocs/5.1jobprioritization.html
>
> QUEUETIMEWEIGHT       1
>
> # FairShare: http://clusterresources.com/mauidocs/6.3fairshare.html
>
> #FSPOLICY              PSDEDICATED
> #FSDEPTH               7
> #FSINTERVAL            86400
> #FSDECAY               0.80
>
> # Throttling Policies: http://clusterresources.com/mauidocs/6.2throttlingpolicies.html
>
> # NONE SPECIFIED
>
> # Backfill: http://clusterresources.com/mauidocs/8.2backfill.html
>
> BACKFILLPOLICY        FIRSTFIT
> RESERVATIONPOLICY     CURRENTHIGHEST
>
> #NODEALLOCATIONPOLICY  MINRESOURCE
> NODEALLOCATIONPOLICY  CPULOAD
>
> DEFERTIME             0
>
> NODECFG[i151.ascc] PARTITION=SERIAL
> NODECFG[i152.ascc] PARTITION=SERIAL
> NODECFG[i153.ascc] PARTITION=SERIAL
> NODECFG[i154.ascc] PARTITION=SERIAL
> NODECFG[i155.ascc] PARTITION=PARALLEL
> NODECFG[i156.ascc] PARTITION=PARALLEL
>
> CLASSCFG[serial]     MAXJOBPERUSER=4
> CLASSCFG[parallel]   MAXJOBPERUSER=4
> CLASSCFG[parallel]   MAXPROCPERUSER=16
> USERCFG[DEFAULT]     MAXJOB=6 MAXPROC=20
>
> SRPARTITION[serial]  SERIAL
> SRTASKCOUNT[serial]  4
> SRRESOURCES[serial]  PROCS=-1
> SRCLASSLIST[serial]  serial
> SRPERIOD[serial]     INFINITY
>
> SRPARTITION[parallel]  PARALLEL
> SRTASKCOUNT[parallel]  2
> SRRESOURCES[parallel]  PROCS=-1
> SRCLASSLIST[parallel]  parallel
> SRPERIOD[parallel]     INFINITY
>
>
> 2006/3/31, Bas van der Vlies <basv at sara.nl>:
>
> I do not use PARTITIONS, but I solved the problem by setting node
> properties, e.g.:
> node1 serial
> node2 serial
> node3 parallel
> node4 parallel
>
> Then, in torque, create the queues:
>    parallel...
>    set queue q_parallel resources_default.neednodes = parallel
>    set queue q_parallel resources_default.nodect = 2
>    ...
>
>    serial...
>    set queue q_serial resources_default.neednodes = serial
>    set queue q_serial resources_max.nodect = 1
>    set queue q_serial resources_default.ncpus = 1
>    set queue q_serial resources_default.nodect = 1
>    set queue q_serial resources_default.nodes = 1
>
> --
> ********************************************************************
> *                                                                  *
> *  Bas van der Vlies                     e-mail: basv at sara.nl      *
> *  SARA - Academic Computing Services    phone:  +31 20 592 8012   *
> *  Kruislaan 415                         fax:    +31 20 6683167    *
> *  1098 SJ Amsterdam                                               *
> *                                                                  *
> ********************************************************************
>

--
Bas van der Vlies
basv at sara.nl


