[torqueusers] Spread config to submit host

Fernando Campos fernando.campos at uam.es
Mon Apr 19 09:16:48 MDT 2010


Hi all!!

I'm having trouble configuring a TORQUE server. The setup is, let's say,
10 nodes running pbs_mom, 1 master node running pbs_server and pbs_sched
(plus the NFS server and other services), and 1 submit host with
torque-client installed to launch jobs and check the queues.

The *nodes* file defines two sets of nodes depending on the type of
processor: COREDUO and XEON.
I've added the resources_default.neednodes lines to my queue configuration
so that, when I run *$ qmgr -c "p s"* on the master node running pbs_server,
I get:

#
# Create queues and set their attributes.
#
#
# Create and define queue long
#
create queue long
set queue long queue_type = Execution
set queue long resources_default.neednodes = XEON
set queue long enabled = True
set queue long started = True
#
# Create and define queue short
#
create queue short
set queue short queue_type = Execution
set queue short resources_max.cput = 24:00:00
set queue short resources_max.walltime = 25:00:00
set queue short resources_default.neednodes = COREDUO
set queue short enabled = True
set queue short started = True
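
For context, node sets like these are normally declared as properties in
pbs_server's nodes file (commonly /var/spool/torque/server_priv/nodes). The
fragment below is only a sketch of what such a file might look like: the
host names and np counts are hypothetical, and only the COREDUO and XEON
property names come from the setup described here.

```
# server_priv/nodes (hypothetical hosts and core counts)
# format: <hostname> np=<slots> <property> [<property> ...]
node01 np=2 COREDUO
node02 np=2 COREDUO
node03 np=8 XEON
node04 np=8 XEON
```

The property name at the end of each line is what a queue's
resources_default.neednodes value is expected to match.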


So a job submitted to the *short* queue is supposed to run on a COREDUO
node, and a job submitted to the *long* queue on a XEON node. Obviously it's
not working like that, and I noticed that when I run *$ qmgr -c "p s"* from
the submit machine I get a different answer:

#
# Create queues and set their attributes.
#
#
# Create and define queue long
#
create queue long
set queue long queue_type = Execution
set queue long enabled = True
set queue long started = True
#
# Create and define queue short
#
create queue short
set queue short queue_type = Execution
set queue short resources_max.cput = 24:00:00
set queue short resources_max.walltime = 25:00:00
set queue short enabled = True
set queue short started = True

There are NO *set queue <queue> resources_default.neednodes = <NODE_GROUP>*
lines at all!
I've already used the submit host to submit jobs to the master node, and
they do run on the nodes. I have also checked the node status with pbsnodes
and everything seems to work fine except for this: the jobs ignore
"neednodes".
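
One way to pin down exactly which lines differ between the two hosts is to
save both dumps to files and compare them. The sketch below assumes the two
outputs have been saved as master.txt and submit.txt (hypothetical file
names); the helper name missing_lines is made up for illustration too.

```shell
#!/bin/sh
# Print every line of the first dump that has no exact match in the second.
# grep flags: -F fixed strings, -x whole-line match, -v invert the match,
# -f read the patterns (one per line) from a file.
missing_lines() {
    grep -Fxv -f "$2" "$1"
}

# Example use (file names are hypothetical):
#   qmgr -c "p s" > master.txt     # run on the master node
#   qmgr -c "p s" > submit.txt     # run on the submit host
#   missing_lines master.txt submit.txt
```

With the two listings shown here, this should print only the two
resources_default.neednodes lines.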

Does anybody have any idea why it behaves this way?
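
One thing that may be worth ruling out is that the client tools on the
submit host are contacting a different pbs_server than the one on the master
node. The sketch below prints the server a TORQUE client would use by
default. It assumes that the PBS_DEFAULT environment variable, when set,
takes precedence over the server_name file, and that the TORQUE home
directory is /var/spool/torque unless another path is passed in; the
function name and its argument are hypothetical.

```shell
#!/bin/sh
# Print the pbs_server name a TORQUE client on this host would contact by
# default. Assumption: PBS_DEFAULT overrides the server_name file.
# $1 (optional): TORQUE home directory; /var/spool/torque is a common
# default, but a local build may use a different prefix.
default_pbs_server() {
    torque_home=${1:-/var/spool/torque}
    if [ -n "${PBS_DEFAULT:-}" ]; then
        printf '%s\n' "$PBS_DEFAULT"
    elif [ -r "$torque_home/server_name" ]; then
        head -n 1 "$torque_home/server_name"
    else
        echo "no default server configured" >&2
        return 1
    fi
}

# Example use on the submit host (needs the TORQUE client tools):
#   default_pbs_server
#   qmgr -c "p s" "$(default_pbs_server)"
```

If the name printed on the submit host is not the master node, the two
qmgr dumps may simply be coming from two different servers.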

BTW, I would also like jobs in the short queue to go to XEON nodes if all
the COREDUO nodes are busy, and jobs in the long queue to go to COREDUO
nodes if all the XEON nodes are busy. Any hints?

Thank you very much.

Cheers.

Fernando.






-- 
---------------------------------------------------------------------------------------------------------
Fernando Campos Del Pozo
Departamento de Física Teórica
Facultad de Ciencias / Módulo 15 (C-XI) / Despacho 512
Universidad Autónoma de Madrid
Tlf.: +34-914974893
e-mail: fernando.campos at uam.es