[torqueusers] Re: Job remains in state R

Adil Mughal adil.m.mughal at gmail.com
Mon Feb 25 06:35:06 MST 2008


I had a closer look at my mom_log file on one of the slaves and there
is the following repeated error message:

pbs_mom;Req;jobobit;No contact with server at hostaddr 907c3092, port
15001, jobid 165.dphpc1011.dph.$
$1.dph.aber.ac.uk errno 113
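
For anyone reading that line later, here is a small sketch of how the two numeric fields could be decoded. It assumes the hostaddr is the server's IPv4 address printed as a hex number in host order, and that the errno value is a Linux errno; neither is confirmed by the log itself. The helper name decode_hostaddr is made up for illustration.

```python
import errno
import ipaddress
import os


def decode_hostaddr(hex_addr: str) -> str:
    """Turn a hex hostaddr such as '907c3092' into a dotted-quad IPv4 string.

    Assumption: the value is the 32-bit address written most significant
    byte first, so 0x90 0x7c 0x30 0x92 becomes 144.124.48.146.
    """
    return str(ipaddress.IPv4Address(int(hex_addr, 16)))


if __name__ == "__main__":
    # Address the mom was trying to reach, taken from the log line above.
    print(decode_hostaddr("907c3092"))  # 144.124.48.146

    # Symbolic name and message for errno 113 on the machine running this.
    # On Linux, 113 is EHOSTUNREACH ("No route to host"); other platforms
    # number their errno values differently.
    print(errno.errorcode.get(113, "unknown"), "-", os.strerror(113))
```

If those assumptions hold, the mom is reporting that it cannot reach 144.124.48.146 on port 15001, so it would be worth checking that this is still the master's address after the reboot and that nothing such as a firewall is blocking that port.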


Does that help?

Adil

On Mon, Feb 25, 2008 at 1:17 PM, Adil Mughal <adil.m.mughal at gmail.com> wrote:
> Dear Experts
>
>  I recently had to reboot my master computer.
>
>  After rebooting I went through the usual steps to set up - i.e.
>
>  > qterm
>  > pbs_server
>  > pbs_sched
>
>  The problem is that now when I submit a basic job like:
>
>   echo "sleep 5" | qsub
>
>  or
>
>   echo "touch testfile" | qsub
>
>  the job remains in the run state; that is, typing qstat gives something
>  like this:
>
>  Job id              Name             User            Time Use S Queue
>  ------------------- ---------------- --------------- -------- - -----
>  165.dphpc1011       STDIN            guest1                 0 R batch
>  166.dphpc1011       STDIN            guest1          00:00:00 R batch
>  167.dphpc1011       STDIN            guest1                 0 R batch
>  168.dphpc1011       STDIN            guest1          00:00:00 R batch
>
>  Whereas previously the jobs were running and then dequeuing.
>
>  Any ideas what I might have missed?
>
>  adil
>

