[torqueusers] Re: Job remains in state R

Adil Mughal adil.m.mughal at gmail.com
Mon Feb 25 06:44:07 MST 2008


I feel silly for answering my own question, but I found that

> service iptables stop

solved my problems!!
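
Stopping iptables altogether leaves the machine without a firewall, so a narrower rule may be preferable. Here is a minimal sketch, assuming the stock TORQUE ports: 15001 is the pbs_server port that appears in the mom_log message quoted below, while 15002 and 15003 are assumed defaults for pbs_mom and should be checked against /etc/services. The helper name torque_rules is made up for illustration, and it only prints the iptables commands so they can be reviewed before being run as root:

```shell
#!/bin/sh
# Print (do not run) one ACCEPT rule per TCP port given as an argument.
# torque_rules is a hypothetical helper, not part of TORQUE or iptables.
torque_rules() {
    for port in "$@"; do
        echo "iptables -I INPUT -p tcp --dport $port -j ACCEPT"
    done
}

# 15001 appears in the mom_log error; 15002 and 15003 are assumed defaults.
torque_rules 15001 15002 15003
```

If the printed rules look right, they can be run as root and, on Red Hat style systems, kept across reboots with "service iptables save".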

On Mon, Feb 25, 2008 at 1:35 PM, Adil Mughal <adil.m.mughal at gmail.com> wrote:
> I had a closer look at my mom_log file on one of the slaves and there
>  is the following repeated error message:
>
>  pbs_mom;Req;jobobit;No contact with server at hostaddr 907c3092, port
>  15001, jobid 165.dphpc1011.dph.$
>  $1.dph.aber.ac.uk errno 113
>
>
>  Does that help?
>
>  Adil
>
>
>
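
The two numbers in that log line can be decoded by hand. On Linux, errno 113 is EHOSTUNREACH ("No route to host"), which is what a connection attempt typically gets back when a firewall rejects it with an ICMP host-prohibited reply, so it fits the iptables fix. The hostaddr can be turned into a dotted address with a small sketch like the one below, assuming it is the server's IPv4 address printed as eight hex digits, most significant byte first (an assumption about the log format, not something taken from the TORQUE documentation). The name hex_to_ip is made up for illustration:

```shell
#!/bin/sh
# Convert eight hex digits into dotted-quad notation.
# Assumes the most significant byte comes first; hex_to_ip is a hypothetical helper.
hex_to_ip() {
    hex=$1
    b1=$(printf '%s' "$hex" | cut -c1-2)
    b2=$(printf '%s' "$hex" | cut -c3-4)
    b3=$(printf '%s' "$hex" | cut -c5-6)
    b4=$(printf '%s' "$hex" | cut -c7-8)
    # printf accepts 0x-prefixed arguments for %d and prints them in decimal.
    printf '%d.%d.%d.%d\n' "0x$b1" "0x$b2" "0x$b3" "0x$b4"
}

hex_to_ip 907c3092   # prints 144.124.48.146
```

Checking that address against the master's IP confirms which machine the slave failed to reach.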
>  On Mon, Feb 25, 2008 at 1:17 PM, Adil Mughal <adil.m.mughal at gmail.com> wrote:
>  > Dear Experts
>  >
>  >  I recently had to reboot my master computer.
>  >
>  >  After rebooting I went through the usual steps to set up - i.e.
>  >
>  >  > qterm
>  >  > pbs_server
>  >  > pbs_sched
>  >
>  >  The problem is that now when I submit a basic job like:
>  >
>  >   echo "sleep 5" | qsub
>  >
>  >  or
>  >
>  >   echo "touch testfile" | qsub
>  >
>  >  the job remains in the run state; that is, typing qstat gives something
>  >  like this:
>  >
>  >  Job id              Name             User            Time Use S Queue
>  >  ------------------- ---------------- --------------- -------- - -----
>  >  165.dphpc1011       STDIN            guest1                 0 R batch
>  >  166.dphpc1011       STDIN            guest1          00:00:00 R batch
>  >  167.dphpc1011       STDIN            guest1                 0 R batch
>  >  168.dphpc1011       STDIN            guest1          00:00:00 R batch
>  >
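
The ids of jobs sitting in state R can be pulled out of output like the table above with a short awk filter. This is only a sketch, assuming the default qstat column layout shown here, where the state is the field just before the queue name. The name running_jobs is made up for illustration:

```shell
#!/bin/sh
# Read qstat output on stdin and print the ids of jobs whose state column is R.
# Assumes two header lines and the state in the second-to-last field;
# running_jobs is a hypothetical helper, not a TORQUE command.
running_jobs() {
    awk 'NR > 2 && NF >= 2 && $(NF-1) == "R" { print $1 }'
}

# Typical use on the master (requires a working qstat):
# qstat | running_jobs
```

Each id it prints can then be inspected with "qstat -f" followed by the job id.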
>  >  Whereas previously the jobs were running and then dequeuing.
>  >
>  >  Any ideas what I might have missed?
>  >
>  >  adil
>  >
>

