[torqueusers] Q:Torque&maui

Garrick garrick at usc.edu
Mon Oct 13 19:44:28 MDT 2008


No, it runs scp as the user.  The error message is telling you that the
user can't scp without a password.
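
One way to confirm this is to try an unattended login as the job owner, from the node that ran the job to the host named in the error. The sketch below is only an illustration: the function name and the overridable SSH variable are made up here, and the user and host in the usage line are taken from the quoted error message. BatchMode=yes makes ssh fail instead of prompting, which is roughly the situation a copy with no terminal is in.

```shell
#!/bin/sh
# Hypothetical helper, not part of TORQUE.
# SSH can be overridden (assumption, for dry runs); it defaults to the real ssh client.
SSH=${SSH:-ssh}

can_copy_unattended() {
    # $1 = user, $2 = host
    # BatchMode=yes: never prompt for a password or for host-key confirmation.
    # ConnectTimeout=5: give up quickly if the host is unreachable.
    "$SSH" -o BatchMode=yes -o ConnectTimeout=5 "$1@$2" true >/dev/null 2>&1
}

# Usage, run as the job owner on the compute node:
# can_copy_unattended admin1 ls21-03.clusters.com && echo "ok" || echo "would prompt or fail"
```

If this fails for the fully qualified name but works for a short name such as node1, the key or known_hosts setup probably covers only the short name.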

HPCC/Linux Systems Admin

On Oct 13, 2008, at 6:42 PM, Ye YC Cui <cuiye at cn.ibm.com> wrote:

> Hi All
> Error message is:
> Unable to copy file /var/spool/torque/spool/14.ls21-03.clusters.com.ER
> to admin1 at ls21-03.clusters.com:/home/admin1/STDIN.e14
> >>> error from copy
> Host key verification failed.
> lost connection
> >>> end error output
> Output retained on that host in:
> /var/spool/torque/undelivered/14.ls21-03.clusters.com.ER
>
> Is it possible that the cluster uses root to copy the STDIN.* file to
> the admin1 user's folder?
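
"Host key verification failed" usually means the destination name is not in the calling user's known_hosts file under the exact name being used. A rough check, assuming an unhashed known_hosts file, is sketched here; the function name is hypothetical. On a real system `ssh-keygen -F <hostname>` does the same lookup and also handles hashed entries.

```shell
#!/bin/sh
# Hypothetical helper: is a host name listed in an unhashed known_hosts file?
host_is_known() {
    # $1 = host name to look for, $2 = path to a known_hosts file
    # The first field of each entry is a comma-separated list of names and addresses.
    awk -v h="$1" '
        { n = split($1, names, ",")
          for (i = 1; i <= n; i++) if (names[i] == h) found = 1 }
        END { exit !found }
    ' "$2"
}

# Usage, run as admin1 on the node that executed the job:
# host_is_known ls21-03.clusters.com "$HOME/.ssh/known_hosts" || echo "host key missing"
```

The match is exact on purpose: an entry for a short name does not satisfy a lookup of the fully qualified name, and the error above uses the fully qualified one.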
>
> Simon Cui ( 崔野)
> IBM China Software Development LAB, Beijing
> CSTL HPC System Management Development
> Tel: 86-10-82782244 ext 54955 E-mail: cuiye at cn.ibm.com
> Address: 2/F,IBM ZGC Campus. Ring Building 28, ZhongGuanCun Software  
> Park,
> No.8 Dong Bei Wang West Road Haidian District,
> Beijing P.R.China 100193
> MSN: cuiye_forevery at hotmail.com
>
> Ye YC Cui/China/Contr/IBM at IBMCN
> Sent by: torqueusers-bounces at supercluster.org
> 10/13/2008 04:06 PM
> To: torqueusers at supercluster.org
> Subject: [torqueusers] Q:Torque&maui
>
> Hi All
> We have installed torque2.3.2.tar.gz and Maui 3.2.6 - Patch 19 on
> 2 nodes running SLES 10.
> We have a root user and an admin1 user. node1 is the master and node2 is
> the compute node.
> On node1 and node2, root can ssh to the other node without a password.
> On node1 and node2, admin1 can ssh to the other node without a password.
>
> set:
> qmgr -c "set server operators = root at node1.clusters.com"
> qmgr -c "set server operators += admin1 at node1.clusters.com"
> qmgr -c "create queue batch"
> qmgr -c "set queue batch queue_type = Execution"
> qmgr -c "set queue batch started = True"
> qmgr -c "set queue batch enabled = True"
> qmgr -c "set server default_queue = batch"
> qmgr -c "set server resources_default.nodes = 1"
> qmgr -c "set server scheduling = True"
> qmgr -c "set server scheduling = True"
> Qmgr: set server managers += root at node1.clusters.com
> Qmgr: set server resources_default.nodect = 1
> Qmgr: set server resources_default.walltime = 00:05:00
>
> We tried to submit jobs on node1 with:
> cat comm.sh |qsub -l nodes=node1
> cat comm.sh |qsub -l nodes=node2
> PASS
> cat comm.sh |qsub
> Failed
>
>
> We tried to submit jobs on node2 with:
> cat comm.sh |qsub -l nodes=node1
> cat comm.sh |qsub -l nodes=node2
> Failed
> cat comm.sh |qsub
> PASS
>
> Error message: cannot copy STDIN.* to node1 or to node2
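
When the copy back fails, the job's output is kept on the execution host, as the "Output retained" line in the error shows. A small sketch for counting what is waiting there follows. The directory is the one named in the error message; the function name and the UNDELIVERED_DIR override are hypothetical, and the .OU suffix for stdout files is assumed by analogy with the .ER suffix seen in the error.

```shell
#!/bin/sh
# Directory taken from the error message; override it for other installs (assumption).
UNDELIVERED_DIR=${UNDELIVERED_DIR:-/var/spool/torque/undelivered}

count_undelivered() {
    # Count retained stdout (.OU, assumed) and stderr (.ER) files.
    # Prints 0 if the directory is missing or unreadable.
    find "$UNDELIVERED_DIR" -type f \( -name '*.OU' -o -name '*.ER' \) 2>/dev/null \
        | wc -l | tr -d ' '
}

# Usage, on each node after running the failing submissions:
# echo "undelivered files: $(count_undelivered)"
```

Comparing the count on node1 and node2 after each test submission shows which direction of copy is the one failing.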
>
>
>
> Could you give me some tips about this problem?
>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers