[torqueusers] Spread config to submit host

Fernando Campos fernando.campos at uam.es
Tue Apr 20 18:25:56 MDT 2010


Hi Gus!!

I'm at home, so I don't remember the exact server options; I just
pasted the important ones. But if that option is needed to submit jobs
from another host, then yes, I have it enabled, because submitting jobs
from a different machine is working.
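
For reference, a minimal sketch of how the allow_node_submit setting could
be confirmed on the master node. It assumes the 'print server' dump shows
the setting in the same form as the qmgr command quoted below
("set server allow_node_submit = True"); the function name
check_allow_node_submit is hypothetical, not part of Torque.

```shell
#!/bin/sh
# Reads a `qmgr -c 'print server'` dump on stdin and reports whether
# allow_node_submit is set to True. The function name is illustrative.
check_allow_node_submit() {
    if grep -Eiq '^set server allow_node_submit = true[[:space:]]*$'; then
        echo "allow_node_submit is enabled"
    else
        echo "allow_node_submit is not set to True"
        return 1
    fi
}

# Usage on the master node (requires a running pbs_server):
#   qmgr -c 'print server' | check_allow_node_submit
```

The function only inspects text, so it can also be run against a saved copy
of the dump.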

Since nobody but you answered, and most people use Torque with either
Maui or Moab, I'll probably install Maui tomorrow. This is probably a bug
that nobody has noticed because nobody tries to do this with pbs_sched.

Anyway, THANK YOU SO MUCH for replying!!! You were the only one, after three
long mails with non-trivial questions!! Maybe I should contact the Torque
development mailing list.
I appreciate your effort!! Thanks again!!

Fer.

On Tue, Apr 20, 2010 at 20:22, Gus Correa <gus at ldeo.columbia.edu> wrote:

> Hi Fernando
>
> Did you do this?
>
> qmgr -c 'set server allow_node_submit = True'
>
> You don't seem to have it in your 'print server' output.
>
> Anyway, you may need to use the Maui scheduler to achieve what you
> want: queues associated with different node attributes
> (COREDUO and XEON).
>
> I hope this helps.
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
> Fernando Campos wrote:
> > I just realized that when I submit jobs from the master node, they
> > get filtered to the proper nodes correctly, but when I submit them from
> > another submit_host, they are queued and run without regard to the
> > kind of node:
> >
> > 41507-04/19/2010 18:16:04;0040;PBS_Server;Req;set_nodes;allocating
> > nodes for job 1209.master.node.com with node expression 'COREDUO'
> >
> > 41508-04/19/2010 18:16:04;0008;PBS_Server;Job;1209.master.node.com;could
> > not locate requested resources 'COREDUO' (node_spec failed) cannot
> > allocate node '06.node.com' to job - node not currently available (nps
> > needed/free: 1/0,  joblist: 1029.master.node.com:0,1208.master.node.com:1)
> >
> > Obviously, master.node.com is a fake name, but the point is that when
> > I try to launch a job to the /short/ queue, Torque sees that the job
> > needs a node of type COREDUO ("neednodes"), but none are available, so
> > it doesn't allocate any node and keeps waiting.
> >
> > As I said in my previous mail, I would like the other type of node,
> > XEON, to be used if all the COREDUO nodes are busy. But at least I can
> > see the queue is filtering the allocation of nodes depending on the type.
> >
> > Any idea why this doesn't work when submitting the jobs from the
> > submit_host?????
> >
> >
> > Thanks a lot again!
> >
> >
> > Fernando.
> >
> > 2010/4/19 Fernando Campos <fernando.campos at uam.es>
> >
> >     Hi all!!
> >
> >     I'm having trouble configuring the Torque server. The setup is,
> >     let's say, 10 nodes running pbs_mom, 1 master node running
> >     pbs_server and pbs_sched (plus an NFS server and other things), and
> >     1 submit host with torque-client installed to launch jobs and check
> >     the queues.
> >
> >     The /nodes/ file defines two sets of nodes depending on the type of
> >     processor: COREDUO and XEON.
> >     I've added the lines marked in bold to my queue configuration, so
> >     executing /$ qmgr -c "p s"/ on the master node running pbs_server
> >     I get:
> >
> >     #
> >     # Create queues and set their attributes.
> >     #
> >     #
> >     # Create and define queue long
> >     #
> >     create queue long
> >     set queue long queue_type = Execution
> >     *set queue long resources_default.neednodes = XEON*
> >     set queue long enabled = True
> >     set queue long started = True
> >     #
> >     # Create and define queue short
> >     #
> >     create queue short
> >     set queue short queue_type = Execution
> >     set queue short resources_max.cput = 24:00:00
> >     set queue short resources_max.walltime = 25:00:00
> >     *set queue short resources_default.neednodes = COREDUO*
> >     set queue short enabled = True
> >     set queue short started = True
> >
> >
> >     So a job submitted to the /short/ queue is supposed to run on a
> >     COREDUO node, and a job submitted to the /long/ queue on a XEON
> >     node. Obviously it's not working like that, and I noticed that when
> >     I execute /$ qmgr -c "p s"/ from the submit machine I get a
> >     different answer:
> >
> >     #
> >     # Create queues and set their attributes.
> >     #
> >     #
> >     # Create and define queue long
> >     #
> >     create queue long
> >     set queue long queue_type = Execution
> >     set queue long enabled = True
> >     set queue long started = True
> >     #
> >     # Create and define queue short
> >     #
> >     create queue short
> >     set queue short queue_type = Execution
> >     set queue short resources_max.cput = 24:00:00
> >     set queue short resources_max.walltime = 25:00:00
> >     set queue short enabled = True
> >     set queue short started = True
> >
> >
> >     NO *set queue <queue> resources_default.neednodes = <NODE_GROUP>*
> >     LINES AT ALL!!!!
> >     I've already used the submit host to submit jobs to the master
> >     node, and they are executed on the nodes. I have also checked the
> >     node status with pbsnodes, and everything seems to work fine except
> >     this: the jobs ignore "neednodes".
> >
> >     Does anybody have any idea why it works this way???
> >
> >     BTW, I would also like jobs in the short queue to go to XEON nodes
> >     if all the COREDUO nodes are busy, and jobs in the long queue to go
> >     to COREDUO nodes if all the XEON nodes are busy. Any hint??
> >
> >     Thank you very much.
> >
> >     Cheers.
> >
> >     Fernando.
> >
> >
> > --
> >
> > ---------------------------------------------------------------------------------------------------------
> > Fernando Campos Del Pozo
> > Departamento de Física Teórica
> > Facultad de Ciencias / Módulo 15 (C-XI) / Despacho 512
> > Universidad Autónoma de Madrid
> > Tlf.: +34-914974893
> > e-mail: fernando.campos at uam.es
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
>


-- 
---------------------------------------------------------------------------------------------------------
Fernando Campos Del Pozo
Departamento de Física Teórica
Facultad de Ciencias / Módulo 15 (C-XI) / Despacho 512
Universidad Autónoma de Madrid
Tlf.: +34-914974893
e-mail: fernando.campos at uam.es