[torqueusers] scp unreliability

Chris Samuel csamuel at vpac.org
Fri Jun 6 05:37:15 MDT 2008


----- "Darren Platt" <darren at 23andme.com> wrote:

> Just to elaborate on my earlier comments on the scp mechanism for file
> transfer. Here's a simple test that breaks it on our (modestly small)
> test cluster:

Two questions:

1) Is this with TORQUE 2.3?

2) Can you check your syslog and mom logs for things like:

pbs_mom: No such file or directory (2) in open_std_file, cannot open/create stdout/stderr file '/usr/spool/PBS/spool/253428.tango-m.vpac.org.OU'

as we're seeing this occasionally on some nodes, and some
extra debugging I added suggests that the O_CREAT flag is
going missing from the open() call.
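
If it helps with checking the logs, here is a rough sketch of a scan for
the message quoted above. The regular expression is built only from that
one sample line, so the wording in other versions is an assumption, and
the function name is made up for illustration. It takes any iterable of
lines, so you can pass it an open log file from wherever your syslog and
mom logs live.

```python
import re

# Pattern derived from the single log line quoted above; the exact wording
# emitted by other TORQUE versions is an assumption.
PATTERN = re.compile(
    r"pbs_mom: (?P<error>.+?) \((?P<errno>\d+)\) in open_std_file, "
    r"cannot open/create stdout/stderr file '(?P<path>[^']+)'"
)


def find_open_std_file_errors(lines):
    """Return a list of (errno, path) for each matching log line.

    The function name is hypothetical; `lines` is any iterable of strings,
    for example an open log file.
    """
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append((int(match.group("errno")), match.group("path")))
    return hits


# Usage with the sample line from this thread plus one unrelated line.
sample = [
    "pbs_mom: No such file or directory (2) in open_std_file, "
    "cannot open/create stdout/stderr file "
    "'/usr/spool/PBS/spool/253428.tango-m.vpac.org.OU'",
    "some unrelated log line",
]
hits = find_open_std_file_errors(sample)
```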

I have just recompiled my pbs_moms with extra code to print out
where that might happen, to see whether the flag is being
dropped deliberately, but it may take a little time to work
out what's going on.
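
For anyone wondering why a missing O_CREAT would fit that log line: the
"(2)" is ENOENT, which is what open() returns for a file that does not
exist when O_CREAT is not set. The sketch below only demonstrates that
general behaviour in a temporary directory. It is not the pbs_mom code,
and the helper and file name are made up for illustration.

```python
import errno
import os
import tempfile


def try_open(path, flags):
    """Open `path` with `flags`; return None on success, else the errno."""
    try:
        fd = os.open(path, flags, 0o600)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return None


with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name, loosely modelled on a spool output file.
    path = os.path.join(tmpdir, "example.OU")

    # The file does not exist yet: without O_CREAT the open fails.
    without_creat = try_open(path, os.O_WRONLY)

    # With O_CREAT the same call creates the file and succeeds.
    with_creat = try_open(path, os.O_WRONLY | os.O_CREAT)
```

So if the flag really is being lost before the open, the error in the
logs is exactly what you would expect to see.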

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency

