[torqueusers] How do I configure Torque and Maui to submit serial jobs to different nodes?

lars at hesdorf.dk lars at hesdorf.dk
Wed Dec 6 01:15:52 MST 2006


You need to change a setting in the file
/var/spool/PBS/sched_priv/sched_config

Reading through this file is time well spent for other reasons as well.

#
# smp_cluster_dist
#
#       This option allows you to decide how to distribute jobs to all the
#       nodes on your systems.
#
#       pack        - pack as many jobs onto a node that will fit before
#                     running on another node
#       round_robin - run one job on each node in a cycle
#       lowest_load - run the job on the lowest loaded node
#
#       PRIME OPTION

## smp_cluster_dist: pack
smp_cluster_dist: round_robin

This is taken from our PBSpro installation, but I think Torque has the same
option (I haven't checked yet).
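
To make the difference between the three settings concrete, here is a minimal
Python sketch that mimics the descriptions in the comment block above. It is a
toy model only, not the scheduler's actual code: the function name distribute()
and its parameters are made up for illustration, and the node names c1501 and
c1503 are borrowed from the question below. It assumes every job needs exactly
one CPU.

```python
def distribute(jobs, nodes, cpus_per_node, policy):
    """Toy model: place single-CPU jobs on nodes, return {job: node}.

    Hypothetical helper for illustration; it follows the wording of the
    sched_config comments, not any real scheduler implementation.
    """
    load = {n: 0 for n in nodes}   # CPUs in use per node
    placement = {}
    rr = 0                         # next position in the round_robin cycle
    for job in jobs:
        free = [n for n in nodes if load[n] < cpus_per_node]
        if not free:
            break                  # no free CPU left; remaining jobs stay queued
        if policy == "pack":
            # fill the first node that still has room before using another
            node = free[0]
        elif policy == "round_robin":
            # walk the nodes in a cycle, skipping nodes that are full
            while load[nodes[rr % len(nodes)]] >= cpus_per_node:
                rr += 1
            node = nodes[rr % len(nodes)]
            rr += 1
        elif policy == "lowest_load":
            # pick the node with the fewest CPUs in use
            node = min(free, key=lambda n: load[n])
        else:
            raise ValueError("unknown policy: " + policy)
        load[node] += 1
        placement[job] = node
    return placement


if __name__ == "__main__":
    for policy in ("pack", "round_robin", "lowest_load"):
        print(policy, distribute([101, 102, 103], ["c1501", "c1503"], 2, policy))
```

With two dual-CPU nodes, "pack" puts jobs 101 and 102 on c1501 and 103 on
c1503, which is the behaviour described in the question, while "round_robin"
places 102 on c1503.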


> When I submit serial jobs to the cluster through torque+maui, the jobs always
> run on the same node until all the CPUs of that node are in use.
> For example, I use the "lx" account to submit the serial job "dfdf". Every node
> has two CPUs, and at this point every CPU is free and no job is running.
> [lx@console ~]$ qsub -l nodes=1:ppn=1 dfdf
> 101.console
> [lx@console ~]$ qstat -an
>
> console:
>                                                                    Req'd  Req'd   Elap
> Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
> -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
> 101.console          lx       dpool    dfdf         4012     1  --    --    --  R   --
>    c1501/0
> [lx@console ~]$ qsub -l nodes=1:ppn=1 dfdf
> 102.console
> [lx@console ~]$ qstat -an
>
> console:
>                                                                    Req'd  Req'd   Elap
> Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
> -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
> 101.console          lx       dpool    dfdf         4012     1  --    --    --  R   --
>    c1501/0
> 102.console          lx       dpool    dfdf         4102     1  --    --    --  R   --
>    c1501/1
> [lx@console ~]$ qsub -l nodes=1:ppn=1 dfdf
> 103.console
> [lx@console ~]$ qstat -an
>
> console:
>                                                                    Req'd  Req'd   Elap
> Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
> -------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
> 101.console          lx       dpool    dfdf         4012     1  --    --    --  R   --
>    c1501/0
> 102.console          lx       dpool    dfdf         4012     1  --    --    --  R   --
>    c1501/1
> 103.console          lx       dpool    dfdf         3543     1  --    --    --  R   --
>    c1503/0
>
> As you can see, the first two jobs are running on the same node. This is not
> load balancing. I want job 102 to run on another node, not on node c1501.
> Only after all the CPUs of node c1501 are in use does job 103 start to run
> on the other node, c1503.
> I have configured the torque server with "node_pack=false", but it does not
> work.
> I also configured the maui.cfg file, adding "NODEALLOCATIONPOLICY MAXBALANCE
> NODEACCESSPOLICY SINGLEUSER", but it still does not work.
>
> I am very disappointed. What can I do?
>
> This is my server's configuration:
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue dpool
> #
> create queue dpool
> set queue dpool queue_type = Execution
> set queue dpool max_queuable = 50
> set queue dpool max_running = 50
> set queue dpool resources_default.neednodes = dpool
> set queue dpool enabled = True
> set queue dpool started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server acl_host_enable = False
> set server managers = root@console
> set server operators = root@console
> set server default_queue = dpool
> set server log_events = 127
> set server mail_from = adm
> set server scheduler_iteration = 300
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server node_pack = False
> set server torque_version = 2.0.0p8
>


