[torqueusers] error on qsub/mpirun jobs
Zhiliang Hu
zhu at iastate.edu
Mon Sep 8 09:28:26 MDT 2008
I have an mpiblast job that runs fine from the command line with "mpirun",
but it fails with errors when I submit it with "qsub":
qsub -l nodes=6:ppn=2 \
     -e /path/to/locationA \
     -o /path/to/locationA \
     /path/to/program
----------------------------------------------------------
Unable to copy file /var/spool/torque/spool/658.nagrp2..ER to
hu@hist:/raid/pub/ncbi/blast/www/mpiblast.tmp
>>> error from copy
Host key verification failed.
lost connection
>>> end error output
Output retained on that host in: /var/spool/torque/undelivered/658.nagrp2..ER
----------------------------------------------------------
-- When I check manually, the "retained" file "/var/spool/torque/undelivered/658.nagrp2..ER" is not there.
-- Why would "Host key verification failed" occur? I can ssh to all nodes and run the job with mpirun without any problem. I suspect something in TORQUE may be producing the above, possibly misleading, errors.
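One check that may help narrow this down, sketched under assumptions: as far as I understand, the output files are copied back from the execution node to the destination host over ssh/scp without a terminal, so an unknown host key cannot be confirmed interactively there. The node name "node01" below is a hypothetical placeholder, and "hu@hist" is the destination taken from the error message. The ssh option BatchMode=yes makes ssh fail instead of prompting, which roughly mimics a non-interactive copy.

```shell
#!/bin/sh
# Sketch only: node01 is a hypothetical compute node name;
# hu@hist is the copy destination reported in the error output.
NODE=${NODE:-node01}
DEST=${DEST:-hu@hist}

# Build a command that logs in to the compute node and, from there,
# attempts a non-interactive ssh back to the destination host.
check_cmd() {
    printf 'ssh %s ssh -o BatchMode=yes %s true\n' "$1" "$2"
}

cmd=$(check_cmd "$NODE" "$DEST")
echo "$cmd"

# Only run the check when explicitly requested with RUN=1.
if [ "${RUN:-0}" = 1 ]; then
    if $cmd; then
        echo "node -> destination ssh works without prompting"
    else
        echo "node -> destination ssh failed (compare with the error above)"
    fi
fi
```

Running it with RUN=1 for each allocated node would show whether the node-to-destination direction works without a prompt, which is a different direction from ssh-ing to the nodes from the login host.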
Any hints on where to look further would be appreciated.
Zhiliang