[torqueusers] All *.ER and *.OU stucked in undelivered

Hristo Iliev hristo at mc.phys.uni-sofia.bg
Mon Mar 20 09:17:53 MST 2006


On Tue, 2006-03-14 at 12:47 +0100, Torsten Bruhn wrote:
> Hello,
> 
> I am new to Torque and Maui and am trying to set them up on our cluster.
> Torque is installed with the --with-scp option, and scp works in
> both directions. Jobs in the queue start and finish normally,
> but the *.ER and *.OU files are not copied to the user's directory;
> they get stuck in the undelivered directory. There are no error
> messages in the log files and no mails with an error message, and I
> have no clue what is wrong. Perhaps someone here can help?
> 
> Greets,
> --
> Dipl.-Chem. Torsten Bruhn


Hello,

Do you use public keys for SSH, or do you enter a password each time?
For scp to work unattended, you need to use public key authentication
with an *empty* key passphrase.

You can generate a key pair by executing the following command:

ssh-keygen -t dsa -b 1024
(just hit Enter when asked for passphrase)

You will then have two files in the .ssh subdirectory of your home directory:
id_dsa (keep this file safe; it is your secret key)
id_dsa.pub (your public key file)

Now all you need to do is to append the content of id_dsa.pub to the end
of the authorized_keys file (found once again in the .ssh subdir) on
each computing node:

(execute the following commands from your .ssh subdir)

cat id_dsa.pub >> authorized_keys
cat id_dsa.pub | ssh login@hostname "cat - >> ~/.ssh/authorized_keys"
(Substitute your own login for "login" and the name of each computing
node for "hostname". You can of course also copy id_dsa.pub to the remote
hosts, log in there and append its content to authorized_keys.)
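
If you have many nodes, the per-node command above can be wrapped in a
small loop. This is only a sketch: push_key and the SSH_CMD override are
names I made up for illustration, and the node names you pass in are your
own. The chmod is there because sshd normally ignores an authorized_keys
file with loose permissions.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to authorized_keys on each node.
# SSH_CMD is an illustrative override (e.g. "echo" for a dry run);
# it defaults to the real ssh client.
push_key() {
    pubkey=$1
    shift
    for node in "$@"; do
        # Same idea as the one-liner above, repeated for every node.
        # The key file is fed to the remote "cat" on standard input.
        ${SSH_CMD:-ssh} "$node" \
            'mkdir -p ~/.ssh && cat - >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys' \
            < "$pubkey" || echo "failed: $node" >&2
    done
}
```

You would call it as "push_key ~/.ssh/id_dsa.pub node01 node02" and enter
your password once per node, for the last time.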

Now you have to log in via SSH from each of your compute nodes back to the
machine from which you submit your job files and accept the server key
fingerprint. This is necessary because scp asks you to confirm
that you trust the SSH server fingerprint the first time you connect to
the server from a compute node. Everything is set up correctly
when the command "scp somefile mainnode:~/", run on a compute node,
completes without asking for a password, passphrase or fingerprint trust.

In my opinion it is easier to set up NFS-shared home directories and to use
NIS for central administration of user accounts. Then Torque can use
simple "cp" to transfer files back and forth.
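
To make pbs_mom use cp for the shared directory, there is the $usecp
directive in the mom configuration file. The sketch below assumes that
/home is the NFS-mounted tree on every node and that the file lives in
/var/spool/torque/mom_priv/config; both are assumptions about your
installation, and add_usecp is just a name I chose.

```shell
#!/bin/sh
# Hypothetical helper: add a $usecp line to a pbs_mom config file,
# unless exactly that line is already present.
add_usecp() {
    config_file=$1
    # Assumption: /home is the same NFS mount on the server and all nodes.
    line='$usecp *:/home /home'
    grep -qxF "$line" "$config_file" 2>/dev/null \
        || printf '%s\n' "$line" >> "$config_file"
}
```

On each node you would run "add_usecp /var/spool/torque/mom_priv/config"
as root and then restart pbs_mom so that it rereads the file.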


