[torqueusers] output staying on nodes - pbs_mom problem ?

Ronny T. Lampert telecaadmin at uni.de
Tue Sep 20 02:46:05 MDT 2005


Hi,

> Running 1.2.0p5, I am having a similar problem with the job staying in the
> "E" state and eventually clearing out reporting the same Post job file
> processing error.  However it is intermittent and I am running an Epilogue
> script as well but all processes have completed for the job and the Epilogue
> script.  Sometimes I get the Standard Out file but the Standard Error is
> still in the spool directory on the mother superior.  We never saw this
> behavior in 1.2.0p4.

State "E" means the job has finished and the postprocessing is being done;
this includes delivering the output back to the submitter.

The behaviour you mention indicates that the moms can't deliver the output
back. By default the moms use rcp to do this (TORQUE's own version, pbs_rcp,
somewhere in the torque directory); they can be built to use scp instead
(configure option --with-scp).

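If you have networked filesystems (e.g. /home is mounted on the nodes), you
should use the $usecp directive in mom_priv/config, so the moms deliver the
output with a plain local cp instead of rcp/scp: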

$usecp *.<your>.<domain>:/home /home
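
To make clear what that line does, here is a rough Python sketch of the
mapping as I understand it. It is not the pbs_mom code: the function name
usecp_target, the rule tuples and the example.org hostnames are made up for
illustration, and I am assuming shell-style wildcard matching on the host
and a simple prefix match on the path.

```python
from fnmatch import fnmatch


def usecp_target(dest, rules):
    """Return the local path to cp to, or None if a remote copy is needed.

    dest  -- output destination in "host:/path" form
    rules -- list of (host_pattern, remote_prefix, local_prefix) tuples,
             one per $usecp line (hypothetical representation)
    """
    host, _, path = dest.partition(":")
    for host_pattern, remote_prefix, local_prefix in rules:
        # Assumption: host is matched as a wildcard, path as a plain prefix.
        if fnmatch(host, host_pattern) and path.startswith(remote_prefix):
            # Swap the remote prefix for the locally mounted one.
            return local_prefix + path[len(remote_prefix):]
    return None


# Equivalent of: $usecp *.cluster.example.org:/home /home
rules = [("*.cluster.example.org", "/home", "/home")]

print(usecp_target("login1.cluster.example.org:/home/alice/job.o123", rules))
print(usecp_target("login1.other.example.org:/home/alice/job.o123", rules))
```

The first call gives a local path, so the mom can just cp the file; the
second gives None, meaning no rule matched and the file would still go
through rcp/scp.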

Hope that helps?

Cheers,
Ronny



