[torqueusers] Spread config to submit host

Gus Correa gus at ldeo.columbia.edu
Tue Apr 20 12:22:50 MDT 2010


Hi Fernando

Did you do this?

qmgr -c 'set server allow_node_submit = True'

You don't seem to have it in your 'print server' output.
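
A minimal sketch of how that check could be scripted, assuming the dump uses
the usual "set server <attribute> = <value>" lines. The helper names
has_allow_node_submit and missing_settings and the file names are
hypothetical, not part of TORQUE. The second helper lists settings that
appear in the dump taken on the master but not in the one taken on the
submit host, which is the discrepancy described in the quoted message below.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a 'print server' dump read from stdin
# contains a "set server allow_node_submit = True" line (case-insensitive).
has_allow_node_submit() {
    grep -Eqi '^[[:space:]]*set server allow_node_submit[[:space:]]*=[[:space:]]*true[[:space:]]*$'
}

# Hypothetical helper: print the non-comment lines of dump $1 that do not
# appear verbatim in dump $2 (-F fixed strings, -x whole line, -v invert).
missing_settings() {
    grep -v '^#' "$1" | grep -Fxv -f "$2"
}

# Intended use (not run here; file names are placeholders):
#   qmgr -c 'print server' | has_allow_node_submit \
#       || qmgr -c 'set server allow_node_submit = True'
#   qmgr -c 'print server' > master.conf    # on the master node
#   qmgr -c 'print server' > submit.conf    # on the submit host
#   missing_settings master.conf submit.conf
```

If missing_settings prints the resources_default.neednodes lines, the two
hosts are not showing the same configuration, and that is worth resolving
before tuning the queues any further.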

Anyway, you may need to use the Maui scheduler to achieve what you
want: queues associated with different node attributes
(COREDUO and XEON).
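
Until a scheduler-side policy is in place, the fallback you describe (use
XEON when no COREDUO node is free) could be approximated on the client side.
This is only a rough sketch of a substitute technique, not a TORQUE or Maui
feature, and it is racy: a node can be taken between the check and the
submission. It assumes "pbsnodes -a" prints each node name on an unindented
line, followed by indented "state = ..." and "properties = ..." lines in that
order. The helper names and job.sh are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count nodes whose state is exactly "free" and that
# carry the property $1, reading 'pbsnodes -a'-style text from stdin.
count_free_nodes() {
    awk -v prop="$1" '
        /^[^ \t]/ { state = "" }                  # unindented line: new node record
        $1 == "state" && $2 == "=" { state = $3 }
        $1 == "properties" && $2 == "=" {
            n = split($3, props, ",")             # properties are comma-separated
            for (i = 1; i <= n; i++)
                if (props[i] == prop && state == "free") count++
        }
        END { print count + 0 }
    '
}

# Hypothetical helper: print preferred property $1 if at least one such node
# is free, otherwise print fallback property $2. Reads the listing from stdin.
pick_property() {
    if [ "$(count_free_nodes "$1")" -gt 0 ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Intended use (not run here):
#   qsub -l nodes=1:"$(pbsnodes -a | pick_property COREDUO XEON)" job.sh
```

Whether an explicit "-l nodes=1:<property>" request takes precedence over a
queue's resources_default.neednodes should be checked on your installation.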

I hope this helps.
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Fernando Campos wrote:
> I just realized that when I submit jobs from the master node, they
> get filtered to the proper nodes correctly, but when I submit them from
> another submit_host, they are queued and run but ignore the
> kind of node:
> 
> 41507-04/19/2010 18:16:04;0040;PBS_Server;Req;set_nodes;allocating
> nodes for job 1209.master.node.com with node expression 'COREDUO'
>
> 41508-04/19/2010 18:16:04;0008;PBS_Server;Job;1209.master.node.com;could
> not locate requested resources 'COREDUO' (node_spec failed) cannot
> allocate node '06.node.com' to job - node not currently available (nps
> needed/free: 1/0,  joblist: 1029.master.node.com:0,1208.master.node.com:1)
> 
> Obviously, master.node.com is a fake name, but the point is that when
> I try to launch a job to the /short/ queue, torque realizes that the
> job needs a node of type COREDUO, but none are available, so it
> doesn't allocate any node and the job stays waiting.
> 
> I would like, as I said in my previous mail, the jobs to use the other
> type of node, XEON, if all the COREDUO nodes are busy. But at least I
> can see the queue is filtering the allocation of nodes depending on the type.
> 
> Any idea why this doesn't work when submitting the jobs from the submit_host?
> 
> 
> Thanks a lot again!
> 
> 
> Fernando.
> 
> 2010/4/19 Fernando Campos <fernando.campos at uam.es>
> 
>     Hi all!!
> 
>     I'm having trouble configuring the torque server. The setup is,
>     let's say, 10 nodes running pbs_mom, 1 master node running
>     pbs_server and pbs_sched (plus the NFS server and other services),
>     and 1 submit host with torque-client installed to launch jobs and
>     check the queues.
> 
>     The /nodes/ file defines two sets of nodes depending on the type of
>     processor: COREDUO and XEON.
>     I've added the bold lines to my queue configuration, so executing
>     /$ qmgr -c "p s"/ on the master node running pbs_server I get:
> 
>     #
>     # Create queues and set their attributes.
>     #
>     #
>     # Create and define queue long
>     #
>     create queue long
>     set queue long queue_type = Execution
>     *set queue long resources_default.neednodes = XEON*
>     set queue long enabled = True
>     set queue long started = True
>     #
>     # Create and define queue short
>     #
>     create queue short
>     set queue short queue_type = Execution
>     set queue short resources_max.cput = 24:00:00
>     set queue short resources_max.walltime = 25:00:00
>     *set queue short resources_default.neednodes = COREDUO*
>     set queue short enabled = True
>     set queue short started = True
> 
> 
>     So when I submit a job to the /short/ queue, it is supposed to be
>     executed on a COREDUO node, and when I submit a job to the /long/
>     queue, on a XEON node. Obviously it's not working like that, and I
>     noticed that when I execute /$ qmgr -c "p s"/ from the submit
>     machine I get a different answer:
> 
>     #
>     # Create queues and set their attributes.
>     #
>     #
>     # Create and define queue long
>     #
>     create queue long
>     set queue long queue_type = Execution
>     set queue long enabled = True
>     set queue long started = True
>     #
>     # Create and define queue short
>     #
>     create queue short
>     set queue short queue_type = Execution
>     set queue short resources_max.cput = 24:00:00
>     set queue short resources_max.walltime = 25:00:00
>     set queue short enabled = True
>     set queue short started = True
> 
>     NO "set queue <queue> resources_default.neednodes = <NODE_GROUP>"
>     LINES AT ALL!
>     I've already used the submit host to submit jobs to the master
>     node, and they are executed on the nodes. I have also checked node
>     status with pbsnodes and everything seems to work fine except this:
>     the jobs don't honor "neednodes".
> 
>     Does anybody have any idea why it is working this way?
> 
>     BTW, I would also like to send jobs in the short queue to XEON nodes
>     if all the COREDUO nodes are busy, and send jobs in the long queue to
>     COREDUO nodes if all the XEON nodes are busy. Any hints?
> 
>     Thank you very much.
> 
>     Cheers.
> 
>     Fernando.
> 
> 
> -- 
> ---------------------------------------------------------------------------------------------------------
> Fernando Campos Del Pozo
> Departamento de Física Teórica
> Facultad de Ciencias / Módulo 15 (C-XI) / Despacho 512
> Universidad Autónoma de Madrid
> Tlf.: +34-914974893
> e-mail: fernando.campos at uam.es
> 
> 


