[torqueusers] problems receiving output files from jobs
nathaniel.x.woody at gsk.com
Tue Mar 13 09:32:50 MDT 2007
torqueusers-bounces at supercluster.org wrote on 03/13/2007 11:06:14 AM:
> On Mon, Mar 12, 2007 at 09:05:36PM +0100, Thomas Blum alleged:
> > Hello,
> >
> > we experience problems with torque-2.1.7(with maui-3.2.6p17) when
> > submitting 200 jobs in a for loop and getting back the output/error
> > files for that jobs. The job only does date, hostname and uptime. From
> > submitted 200 jobs e.g. we get back 198 e and 196 o files. We tried
> > using scp and rcp but both have the same effects. The missing files
> > stuck on the clients in the undelivered directories and the serverlog
> > says:
> >
> > Post job file processing error
> >
> > Does anybody has an idea what we can do?
>
> The actual error message would have been emailed to the user. Also, I'd
> check syslog for errors.
>
> The first thing off the top of my head is "MaxStartups" in sshd_config.
>
The other one worth checking is ConnectionAttempts in ssh_config. I believe
that rests on the same assumption as Garrick's suggestion: that what gets
reported back looks like an ssh connection failure.
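
As a rough sketch of where those two settings live (the values are
illustrative guesses, not tested recommendations, and I'm assuming the
output files are copied by scp from the compute nodes to the host that
receives them):

    # sshd_config on the host receiving the output files.
    # MaxStartups limits concurrent unauthenticated connections;
    # the value 200 here is a hypothetical choice to match the job count.
    MaxStartups 200

    # ssh_config on the compute nodes doing the copy.
    # ConnectionAttempts is the number of tries (one per second)
    # before giving up; 5 is an arbitrary example value.
    ConnectionAttempts 5

sshd has to reread its configuration before a MaxStartups change takes
effect.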
Best,
Nate