[torqueusers] Post job file processing error / permission denied

David Beer dbeer at adaptivecomputing.com
Thu Aug 15 09:46:39 MDT 2013


Mark,

Are the keys set up for the user that you're getting the error with? I
believe using ssh keys is the most common setup for people that aren't
doing everything via nfs, and I know it is used / has been used by our QA
team on their small test cluster here.
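
One way to check this outside of Torque is to run the same kind of copy
non-interactively, as the affected user, from the compute node. Below is a
minimal Python sketch of that check. The helper names (build_scp_check,
run_check) and the test file path are made up for illustration, and the
user, host and destination are simply the ones from the error message in
the original post. BatchMode=yes is a standard ssh option that makes scp
fail instead of prompting, which is presumably closer to how a daemon
without a terminal would run it than an interactive login is.

```python
import shlex
import subprocess


def build_scp_check(src, user, host, dest):
    """Build an scp command line that fails rather than asking for a password."""
    return [
        "scp",
        "-o", "BatchMode=yes",       # never prompt; key-based auth must work on its own
        "-o", "ConnectTimeout=10",   # give up quickly if the host is unreachable
        src,
        f"{user}@{host}:{dest}",
    ]


def run_check(argv):
    """Run the command and return (exit code, stderr text)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, result.stderr.strip()


if __name__ == "__main__":
    # Hypothetical test file; user/host/path mirror the error in the original post.
    argv = build_scp_check(
        "/tmp/torque_scp_test", "user", "host.local", "/home/user/torque_scp_test"
    )
    print(shlex.join(argv))
    # Uncomment on a compute node, logged in as the affected user:
    # code, err = run_check(argv)
    # print(code, err)
```

If this fails with the same "Permission denied" while an interactive scp
works, the difference is likely in how the key is found or unlocked (for
example an agent or a passphrase), not in Torque itself.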

David


On Wed, Aug 14, 2013 at 8:57 PM, <glen.beane at gmail.com> wrote:

> You're right, it should be a viable option. If passwordless ssh works from
> the compute nodes to the submit node, then the scp should succeed. I think
> in most cases Torque users have NFS home directories, which is why I
> suggested $usecp.
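>
> For reference, a minimal sketch of what that looks like in the pbs_mom
> config file (mom_priv/config under the Torque home directory). The /home
> paths are an assumption about how the shared filesystem is mounted, not
> something taken from this thread:
>
> ```
> # Assumption: /home is the same NFS mount on the submit host and on the
> # compute nodes, so output can be delivered with cp instead of scp.
> $usecp *:/home /home
> ```
>
> pbs_mom needs to reread its config afterwards (a restart will do).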
>
>
> On Aug 14, 2013, at 9:08 PM, "Mark Christiansen" <mchristi at uw.edu> wrote:
>
> Glen,
>
> Thank you for your suggestion.  By moving my files to an NFS share and
> configuring the $usecp parameter, I was able to get things to work without
> an error.  However, I still don’t understand why using ssh keys didn’t
> work.  Perhaps it is because of the use of the Scheduler@ user that I
> remember seeing in one of the logs.  It seems like using password-less ssh
> keys should be a viable solution to this problem.
>
> Best regards,
>
> Mark
>
> From: torqueusers-bounces at supercluster.org
> On Behalf Of: glen.beane at gmail.com
> Sent: Wednesday, August 14, 2013 5:05 PM
> To: Torque Users Mailing List
> Subject: Re: [torqueusers] Post job file processing error / permission denied
>
> You don't need to use ssh keys if this is an NFS filesystem. Check the
> $usecp pbs_mom config file parameter.
>
> Sent from my iPhone
>
>
> On Aug 14, 2013, at 2:52 PM, "Mark Christiansen" <mchristi at uw.edu> wrote:
>
> Hi Everyone,
>
> I am trying to set up a torque cluster and I get the following error:
>
> An error has occurred processing your job, see below.
> Post job file processing error; job 4
>
> Unable to copy file /var/lib/torque/spool/4.localhost.localdomain.OU to
> user at host.local:/home/user/job.out
> *** error from copy
> Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password,hostbased).
> lost connection
> *** end error output
> Output retained on that host in:
> /var/lib/torque/undelivered/4.localhost.localdomain.OU
>
> I have looked through the archives, and there are some threads around
> using ssh keys to fix this problem.  However, I did configure password-less
> ssh logins and it works well.  If I log into the machine giving me the
> above error, I can perform a copy using “scp” without any problem.  I can
> also ssh from any machine to any machine in the cluster without getting
> prompted for a password.
>
> My cluster is configured to use LDAP for authenticating users.
>
> Has anyone run into this problem after configuring their ssh keys?
>
> Thank you in advance,
>
> Mark
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers


-- 
David Beer | Senior Software Engineer
Adaptive Computing