[torqueusers] Torque set up problem: simple jobs not executing and files undelivered
Adil Mughal
adil.m.mughal at gmail.com
Thu Feb 14 05:34:04 MST 2008
Dear Garrick and other Torque experts,
I managed to get jobs running on my network. As you rightly said, it
was not a Torque problem; it had something to do with my
.bash_profile file. I reproduce a copy of my modified .bash_profile
file below for any other novice who may be having the same problem.
Unfortunately, I still cannot get the .OU and .ER files delivered to the
right place. Furthermore, I don't know where to find the error
messages which ought to be e-mailed to me. Typing "mail" as either
root or a non-root user does not show any messages generated by Torque.
At what point in the TORQUE setup should I have configured the mail?
As always, many thanks for any help.
adil
#
# .bash_profile called at start of login shell
#
##source /users/PROTOUSER/profile.bash
#
# Users may add their own commands here
#
if [ -f ~/.bashrc ]
then
    . ~/.bashrc
fi
## . /users/PROTOUSER/common_profile
bind '"\eOP": dynamic-complete-history'
xto ()
{
    DISPLAY=$1:0.0; export DISPLAY
    echo DISPLAY set to $DISPLAY
}
echo "You are now running on $HOSTNAME in a BASH environment."
echo ""
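
The banner line from this profile shows up in the job's .OU file, which suggests the batch job's shell reads .bash_profile too. One possible way to keep terminal-only setup out of job output is to run it only when the shell is interactive. The following is a minimal sketch under that assumption, not a tested TORQUE recipe; is_interactive is a hypothetical helper name, and the optional argument exists only so the check can be exercised by hand.

```shell
# Sketch: run terminal-only setup in interactive shells only.
# is_interactive is a hypothetical helper, not part of bash or TORQUE.
is_interactive ()
{
    # The shell lists its option flags in $-; an "i" marks an
    # interactive shell. An explicit argument overrides $- for testing.
    case "${1:-$-}" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

if is_interactive
then
    # Key bindings and banners only make sense with a terminal attached.
    bind '"\eOP": dynamic-complete-history'
    echo "You are now running on $HOSTNAME in a BASH environment."
    echo ""
fi
```

With a guard like this, a job submitted through qsub should skip the bind and echo lines, while an ordinary login still sees them.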
On Tue, Feb 12, 2008 at 7:20 PM, Garrick Staples <garrick at usc.edu> wrote:
> On Tue, Feb 12, 2008 at 05:26:55PM +0000, Adil Mughal alleged:
>
> > at no point does the job status register as "R" - it appears to be stuck in "E".
> >
> > I also found that the .ER and .OU files for the jobs are not being
> > delivered and are piling up in /var/spool/torque/undelivered. Here is
>
> The errors for failing to deliver output files are emailed to the user.
>
> Errors for failing to setup the initial job env are sent to syslog or the MOM
> log.
>
>
>
> > the content of these files as a result of running
> > echo "sleep 30" | qsub
> >
> > .ER
> >
> > stdin: is not a tty
> >
> > and in .OU I get
> >
> > Terminal type (default=dumb) : Terminal type
> > /var/spool/torque/mom_priv/jobs/94.dphpc101.SC invalid - using dumb
> > You are now running on dphpc1001 in a BASH environment.
>
> These are not TORQUE errors. These are generated by the job, the shell, or
> something else in the OS.
>
>
>
> > Also I am using an nfs system - here is the content of my mom_priv/config file:
> >
> > $pbsserver dphpc1011.dph.xxxx.xx.xx
> >
> > $usecp dphpc1011.dph.xxxx.xx.xx:/home /home
>
> Verify that 'df' actually shows the filesystem mounted from
> 'dphpc1011.dph.xxxx.xx.xx:/home'. Since you aren't using a wildcard, the exact
> string must match.
>
>
>
> > $logevent 255
> >
> > Any ideas why the .ER and .OU files are not going to the right places??
>
> You'll need to check the logs and the email that should have been sent.
>
>
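
Garrick's suggestion to compare the df output with the $usecp line can be sketched as a small check. This is only an illustration: usecp_source_mounted is a hypothetical helper, not a TORQUE command, and it assumes the POSIX `df -P` output format, where the first column holds the filesystem source. The host string is copied from the mom_priv/config quoted above.

```shell
# usecp_source_mounted is a hypothetical helper, not a TORQUE command.
# It reads `df -P` style output on stdin and succeeds only if the first
# column of some data line equals the given source string exactly.
usecp_source_mounted ()
{
    awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Source string as written on the $usecp line in mom_priv/config.
source_spec='dphpc1011.dph.xxxx.xx.xx:/home'

if df -P | usecp_source_mounted "$source_spec"
then
    echo "df lists $source_spec"
else
    echo "df does not list $source_spec exactly; compare it with the \$usecp line"
fi
```

If df reports the mount under a different spelling, for example a short hostname instead of the fully qualified one, the exact comparison fails, which is the mismatch the reply warns about.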