[torqueusers] ssh problem
Kevin Van Workum
vanw at tticluster.com
Mon Feb 7 14:01:11 MST 2005
Make sure that you can scp from blade* to your pbs_server machine without
being asked for a password (SSH key authentication).
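
A minimal sketch of how one might test this, assuming OpenSSH on the
compute nodes. The function name check_stageout, the SCP override variable
and the default test file path are hypothetical and only for illustration;
the host name hal is taken from the original post. BatchMode=yes makes scp
fail instead of prompting, which roughly approximates a copy started
without a terminal.

```shell
# Sketch: check that a non-interactive scp from a compute node to the
# pbs_server host succeeds. Source this file as the job owner on a blade.
# check_stageout, SCP and the default test file path are hypothetical names.

check_stageout() {
    server="$1"                        # e.g. hal, the pbs_server host
    testfile="${2:-/tmp/scp_test.$$}"  # throwaway file, created locally

    echo "stage-out test" > "$testfile" || return 1

    # BatchMode=yes: never prompt for a password or for host key
    # confirmation; fail immediately instead.
    # SCP may be set to override the scp binary (assumption, for testing).
    "${SCP:-scp}" -o BatchMode=yes "$testfile" "$server:$testfile"
    status=$?

    # Remove the local copy; the copy on the server is left behind.
    rm -f "$testfile"
    return "$status"
}
```

For example, "check_stageout hal" run on blade08 as the job owner should
return 0 without any prompt. If it prints "Host key verification failed.",
the host key for that exact host name is probably missing from the job
owner's known_hosts on that node. One way to add it is
"ssh-keyscan hal >> ~/.ssh/known_hosts", after checking the key
fingerprint.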
On Mon, 7 Feb 2005, Guillaume Alleon wrote:
> Hi,
>
> I am running torque-1.2.0p0 on an x86_64 machine. It is configured with
> the --with-scp option.
> A job runs fine until stage-out, when I get the following message:
>
> Host key verification failed.
> lost connection
>
> The notification email is the following:
>
> -------------------------------------------------------------------------------------------------------
> Message 1:
> From adm at hal Mon Feb 7 19:57:23 2005
> Date: Mon, 7 Feb 2005 19:57:23 +0100
> From: adm <adm at hal>
> To: alleon at hal
> Subject: PBS JOB 39.hal
> Precedence: bulk
>
> PBS Job Id: 39.hal
> Job Name: zaza
> File stage in failed, see below.
> Job will be retried later, please investigate and correct problem.
> Post job file processing error; job 39.hal on host
> blade08/1+blade08/0+blade07/1+blade07/0+blade06/1+blade06/0+blade05/1+blade05/0
>
> Unable to copy file 39.hal.OU to hal:/home/alleon/test/zaza.o39
> -------------------------------------------------------------------------------------------------------
>
> When I do the copy as the job owner, it works. The mom logs do not
> tell me much:
>
> 02/07/2005 19:52:40;0100; pbs_mom;Req;;Type QueueJob request received from PBS_Server at hal, sock=10
> 02/07/2005 19:52:40;0100; pbs_mom;Req;;Type JobScript request received from PBS_Server at hal, sock=10
> 02/07/2005 19:52:40;0100; pbs_mom;Req;;Type ReadyToCommit request received from PBS_Server at hal, sock=10
> 02/07/2005 19:52:40;0100; pbs_mom;Req;;Type Commit request received from PBS_Server at hal, sock=10
> 02/07/2005 19:52:40;0100; pbs_mom;Req;;Type StatusJob request received from PBS_Server at hal, sock=13
> 02/07/2005 19:52:40;0001; pbs_mom;Job;TMomFinalizeJob3;job 39.hal started, pid = 10881
> 02/07/2005 19:52:40;0001; pbs_mom;Svr;pbs_mom;tm_eof, matching task located, marking interface closed
> 02/07/2005 19:52:40;0100; pbs_mom;Req;;Type StatusJob request received from PBS_Server at hal, sock=10
> 02/07/2005 19:52:41;0008; pbs_mom;Job;39.hal;start_process: task started, tid 2, sid 10934, cmd /bin/sh
> 02/07/2005 19:52:41;0001; pbs_mom;Svr;pbs_mom;tm_eof, matching task located, marking interface closed
> 02/07/2005 19:52:41;0008; pbs_mom;Job;39.hal;start_process: task started, tid 3, sid 10936, cmd /bin/sh
> 02/07/2005 19:53:12;0100; pbs_mom;Req;;Type StatusJob request received from PBS_Server at hal, sock=11
> 02/07/2005 19:54:40;0100; pbs_mom;Req;;Type StatusJob request received from PBS_Server at hal, sock=11
> 02/07/2005 19:55:40;0100; pbs_mom;Req;;Type StatusJob request received from PBS_Server at hal, sock=11
> 02/07/2005 19:56:35;0080; pbs_mom;Job;39.hal;scan_for_terminated: job 39.hal task 2 terminated, sid 10934
> 02/07/2005 19:56:35;0080; pbs_mom;Job;39.hal;scan_for_terminated: job 39.hal task 3 terminated, sid 10936
> 02/07/2005 19:56:35;0001; pbs_mom;Svr;pbs_mom;tm_eof, matching task located, marking interface closed
> 02/07/2005 19:56:36;0008; pbs_mom;Job;39.hal;kill_task: killing pid 10882 task 1 with sig 9
> 02/07/2005 19:56:36;0080; pbs_mom;Job;39.hal;scan_for_terminated: job 39.hal task 1 terminated, sid 10881
> 02/07/2005 19:56:36;0008; pbs_mom;Job;39.hal;Terminated
> 02/07/2005 19:56:36;0100; pbs_mom;Req;;Type CopyFiles request received from PBS_Server at hal, sock=10
> 02/07/2005 19:56:58;0100; pbs_mom;Req;;Type DeleteJob request received from PBS_Server at hal, sock=10
>
> Do you have any idea what I am doing wrong?
> Yours
>
> Guillaume
--
Kevin Van Workum, Ph.D.
Vice President
Senior System Administrator
www.clusterondemand.com
ONLINE COMPUTER CLUSTERS