[torqueusers] queue to node mapping is wrong when use '-l procs' option

Xiangqian Wang jascha.wang at gmail.com
Tue Feb 7 18:51:18 MST 2012


It seems that the '-l feature' option has no effect with maui-3.2.6p21: the
jobs run on nodes that do not have the requested feature.
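
One thing that may be worth trying here: TORQUE's own syntax for requesting a node property is to append the property to the nodes specification, instead of using a separate '-l feature=' resource. The sketch below only builds such a specification string. The helper name nodes_spec and the property name "fluent" are hypothetical, and whether maui-3.2.6p21 then places the job correctly is not confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: build a TORQUE nodes specification of the form
#   nodes=<count>:ppn=<procs per node>:<property>
# Arguments: node count, processors per node, node property name.
nodes_spec() {
    printf 'nodes=%s:ppn=%s:%s\n' "$1" "$2" "$3"
}
```

In a job script the result would be used as, for example, "#PBS -l nodes=1:ppn=3:fluent", which asks for three processors on one node carrying that property.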

2012/2/7 Sreedhar Manchu <sm4082 at nyu.edu>

> Hi,
>
> Instead of using
>
> set queue fluent acl_host_enable = False
> set queue fluent acl_hosts = cnode01
>
>
> I assigned a feature to the nodes where I wanted the jobs to run (the nodes
> I wanted under a special queue) and requested that feature in the PBS
> script like this:
>
> #PBS -l feature=<feature name>
>
> Moab can put the jobs on the nodes with those features. I'm not sure how
> maui does it. I have a qsub wrapper that adds this feature line depending
> on users' requests.
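
The qsub wrapper mentioned above is not shown in the thread. A minimal sketch of what such a wrapper might do is given here, assuming the job script arrives on standard input. The function name add_feature_line is hypothetical; it only inserts the '#PBS -l feature=' line and does not submit anything.

```shell
#!/bin/sh
# Hypothetical helper: read a job script on stdin and print it with a
# "#PBS -l feature=<name>" directive added right after the shebang line,
# or at the very top if the script has no shebang.
add_feature_line() {
    feature=$1
    awk -v f="$feature" '
        NR == 1 && /^#!/ { print; print "#PBS -l feature=" f; next }
        NR == 1          { print "#PBS -l feature=" f }
        { print }
    '
}
```

A wrapper could then pipe the result to the real qsub, for example "add_feature_line chassis0 < job.sh | qsub", since qsub accepts a script on standard input.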
>
> To give features to nodes, I used
>
> qmgr -c 'set node <node name> properties += <feature name>'
>
> For example, our p48 nodes have features like chassis0, chassis1, etc. to
> indicate the chassis they belong to. Since we ask for a specific queue
> with specific features, jobs always land on the right nodes with the right
> feature.
>
> Sreedhar.
>
> On Feb 7, 2012, at 4:33 AM, Xiangqian Wang wrote:
>
> I could not get the queue-to-node mapping feature of the torque/maui system
> to work. I use torque 2.5.8 and maui 3.2.6p21. The simple job script
> contains a procs option:
>
> #!/bin/sh
> #PBS -N simple-job
> #PBS -l procs=3
> #PBS -q fluent
> #PBS -d /opt/share/job
> cd $PBS_O_WORKDIR
> date
> sleep 30
> date
>
> The 'fluent' queue is mapped to the node 'cnode01', which has 4 processors.
> The settings are shown below:
>
> # Create and define queue batch
> #
> create queue batch
> set queue batch queue_type = Execution
> set queue batch resources_default.nodes = 1
> set queue batch resources_default.walltime = 01:00:00
> set queue batch enabled = True
> set queue batch started = True
> #
> # Create and define queue fluent
> #
> create queue fluent
> set queue fluent queue_type = Execution
> set queue fluent acl_host_enable = False
> set queue fluent acl_hosts = cnode01
> set queue fluent enabled = True
> set queue fluent started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server acl_hosts = snode01
> set server acl_roots = root@*
> set server managers = root@snode01
> set server operators = root@snode01
> set server default_queue = batch
> set server log_events = 511
> set server mail_from = adm
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server mom_job_sync = True
> set server keep_completed = 300
> set server auto_node_np = True
> set server next_job_number = 94
> set server display_job_server_suffix = False
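
For comparison with the settings above: the TORQUE documentation describes acl_hosts on a queue as the list of hosts allowed to submit to it, and describes mapping a queue to nodes through a node property plus the queue's resources_default.neednodes. The sketch below prints the two qmgr commands for that approach as a dry run. The function name queue_mapping_cmds and the property name "fluent" are hypothetical, and whether this mapping is honoured together with '-l procs' under maui 3.2.6p21 is not established in this thread.

```shell
#!/bin/sh
# Dry run: print the qmgr commands instead of running them.
# Arguments: queue name, node name, node property (the property is a
# hypothetical label chosen by the administrator).
queue_mapping_cmds() {
    queue=$1 node=$2 prop=$3
    # Tag the node with the property.
    printf "qmgr -c 'set node %s properties += %s'\n" "$node" "$prop"
    # Make jobs in the queue require nodes that carry the property.
    printf "qmgr -c 'set queue %s resources_default.neednodes = %s'\n" "$queue" "$prop"
}
```

Calling "queue_mapping_cmds fluent cnode01 fluent" prints the two commands so they can be reviewed before being run by hand.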
>
> The job should run only on the single node 'cnode01', but the allocation
> also includes another node, 'snode01'. See this part of the 'qstat -f' output:
>
>     exec_host = snode01/1+snode01/0+cnode01/0
>     ...
>     Resource_List.neednodes = cnode01
>     Resource_List.procs = 3
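
The exec_host value above can be checked mechanically. A small sketch, assuming the "host/slot+host/slot" layout seen in this output; the function name exec_hosts is hypothetical.

```shell
#!/bin/sh
# Print the distinct host names found in an exec_host value such as
# "snode01/1+snode01/0+cnode01/0", one per line, sorted.
exec_hosts() {
    printf '%s\n' "$1" | tr '+' '\n' | sed 's|/.*||' | sort -u
}
```

For the value shown above, this lists both cnode01 and snode01, which confirms that the three requested processors were spread over two hosts rather than placed on cnode01 alone.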
>
> Can anyone give me some suggestions? Any help would be greatly appreciated.
>
> Xiangqian
>  _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>
>  ---
> Sreedhar Manchu
> HPC Support Specialist
> New York University
> 251 Mercer Street
> New York, NY 10012-1110
>