[torqueusers] Torque scheduler not informed about new jobs?
gus at ldeo.columbia.edu
Fri Sep 27 08:52:41 MDT 2013
Thank you Nico.
Glad to know Torque+Maui is working for you.
Incidentally, yesterday Matthew Ezell posted a message
with a fix for pbs_sched, although as far as I know,
it is not in Torque 4.2.5 yet:
It is good news anyway.
(Thanks Matthew also!).
On 09/27/2013 04:25 AM, Dr. Nico van Eikema Hommes (CCC) wrote:
> Hello Gus,
> excellent suggestion: I installed the Maui scheduler and now jobs start immediately when nodes are free.
> Thanks a lot!
> Best regards,
> On 20 Sep 2013, at 17:35, Gus Correa <gus at ldeo.columbia.edu> wrote:
>> Hi Nico
>> If by "standard scheduler" you mean pbs_sched,
>> please read these recent threads:
>> To answer your question, yes, others and I have seen this behavior.
>> I would suggest trying Maui instead of pbs_sched.
>> My two cents,
>> Gus Correa
>> On 09/19/2013 08:26 PM, Dr. Nico van Eikema Hommes (CCC) wrote:
>>> We recently set up a new master node, running openSUSE 12.3,
>>> and installed version 4.1.6 of Torque. Everything seemed to be
>>> running fine, but we noticed that freshly submitted jobs are
>>> not run immediately even when suitable nodes are free.
>>> Instead, they wait in the queue until the scheduler
>>> (we use the standard scheduler at the moment) runs on its periodic cycle.
>>> The logfile contains the lines
>>> PBS_Server.59003;Svr;servernode;Scheduler was sent the command time
>>> but no lines like
>>> PBS_Server.54088;Svr;servernode;Scheduler was sent the command new
>>> that appear on another cluster system (with Torque 4.1.0)
>>> immediately when jobs are submitted. On that cluster,
>>> jobs start immediately when free nodes are available.
>>> Jobs are submitted from the login nodes.
>>> However, the problem occurs as well when jobs
>>> are submitted on the master node.
>>> The server parameters are the same on both clusters.
>>> Our current workaround is to set the scheduler
>>> interval to 60 seconds, but this isn't really a solution.
>>> Has anybody seen this behaviour?
>>> Or better, does anybody know how to solve this problem?
>>> Thanks in advance for any help!
>>> Best regards,
>>> Nico van Eikema Hommes
> Dr. N.J.R. van Eikema Hommes
> Computer-Chemie-Centrum, Universitaet Erlangen-Nuernberg
> Naegelsbachstr. 25, 91052 Erlangen, Germany
> E-Mail: nico.hommes at fau.de
> Phone: +49-9131-8526532
> FAX: +49-9131-8526565
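
For anyone who wants to check their own server for the symptom described in the original report, here is a minimal Python sketch that counts how often each "Scheduler was sent the command ..." message shows up in a set of server log lines. The function name count_scheduler_commands and the regular expression are only an illustration, not part of Torque, and the pattern assumes the message text looks exactly like the two log lines quoted in this thread.

```python
import re
from collections import Counter

# Assumption: the server log message has the exact wording quoted in the
# thread. The captured word is the command name (e.g. "time" or "new").
SCHED_CMD = re.compile(r"Scheduler was sent the command (\w+)")


def count_scheduler_commands(lines):
    """Count how often each scheduler command appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = SCHED_CMD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

The function accepts any iterable of strings, so an open log file can be passed in directly. If the counts show "time" entries but no "new" entries for a period in which jobs were submitted, the log has the same pattern as in the report. That only confirms the symptom; it says nothing about the cause.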