[torqueusers] Torque set up problem: simple jobs not executing and files undelivered

Adil Mughal adil.m.mughal at gmail.com
Thu Feb 14 05:34:04 MST 2008


Dear Garrick and other Torque experts,

I managed to get jobs running on my network. As you rightly said, it
was not a Torque problem; it was something to do with my
.bash_profile file. I reproduce a copy of my modified .bash_profile
file below for any other novice who may be having the same problem.

Unfortunately, I still cannot get the .OU and .ER files to go to the
right place. Furthermore, I don't know where to find the error
messages which ought to be e-mailed to me. Typing "mail" as either
root or a non-root user does not show any messages generated by Torque.

At what point in the TORQUE set up should I have configured the mail?

As always, many thanks for any help.

adil






#
#       .bash_profile called at start of login shell
#
##source /users/PROTOUSER/profile.bash
#
#       Users may add their own commands here
#

if [ -f ~/.bashrc ]
then
        . ~/.bashrc
fi
## . /users/PROTOUSER/common_profile
bind '"\eOP": dynamic-complete-history'
xto ()
{
        DISPLAY=$1:0.0; export DISPLAY
        echo "DISPLAY set to $DISPLAY"
}
echo "You are now running on $HOSTNAME in a BASH environment."
echo ""
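
For anyone adapting the file above: the banner line still shows up in the
.OU file of a batch job, so one option is to run the terminal-only commands
only in interactive sessions. This is a minimal sketch, not a tested
recipe for every site. It assumes the batch job starts a non-interactive
login shell and that PBS_ENVIRONMENT is set to PBS_BATCH for batch jobs;
the helper name interactive_session is hypothetical.

```shell
# Hypothetical helper: succeed only for an interactive shell that is not
# running as a batch job (assumes PBS_ENVIRONMENT=PBS_BATCH in batch jobs).
interactive_session() {
    [ "${PBS_ENVIRONMENT:-}" != "PBS_BATCH" ] || return 1
    case $- in
        *i*) return 0 ;;   # "i" in $- marks an interactive shell
        *)   return 1 ;;
    esac
}

if interactive_session; then
    # Key bindings and banners only make sense at a terminal.
    bind '"\eOP": dynamic-complete-history'
    echo "You are now running on $HOSTNAME in a BASH environment."
    echo ""
fi
```

With this guard, a job such as echo "sleep 30" | qsub should skip the bind
and echo lines, while an ordinary login still sees them.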


On Tue, Feb 12, 2008 at 7:20 PM, Garrick Staples <garrick at usc.edu> wrote:
> On Tue, Feb 12, 2008 at 05:26:55PM +0000, Adil Mughal alleged:
>
> > at no point does the job status register as "R" - it appears to be stuck in "E".
>  >
>  > I also found that the .ER and .OU files for the jobs are not being
>  > delivered and are piling up in /var/spool/torque/undelivered. Here is
>
>  The errors for failing to deliver output files are emailed to the user.
>
>  Errors for failing to setup the initial job env are sent to syslog or the MOM
>  log.
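
A sketch of how one might look in the MOM log on the node that ran the job.
It assumes the TORQUE home is /var/spool/torque (the tree that holds the
undelivered directory mentioned in this thread) and that MOM logs are kept
under mom_logs in files named by date (YYYYMMDD). The helper name
mom_log_path is hypothetical, and the grep is a loose text match rather
than a parser for the log format.

```shell
# Hypothetical helper: print the MOM log path for a given day (default: today).
# Assumes TORQUE_HOME is /var/spool/torque unless it is already set.
mom_log_path() {
    day=${1:-$(date +%Y%m%d)}
    printf '%s/mom_logs/%s\n' "${TORQUE_HOME:-/var/spool/torque}" "$day"
}

# Job number taken from the example in this thread; change it as needed.
jobid=94
log=$(mom_log_path)

if [ -r "$log" ]; then
    # Loose match on "<jobid>." anywhere in a log line.
    grep -F -- "${jobid}." "$log" || echo "no lines mention job $jobid in $log"
else
    echo "cannot read $log (try as root on the node that ran the job)" >&2
fi
```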
>
>
>
>  > the content of these files as a result of running > echo "sleep 30" |
>  > qsub
>  >
>  > .ER
>  >
>  > stdin: is not a tty
>  >
>  > and in .OU    I get
>  >
>  > Terminal type (default=dumb) : Terminal type
>  > /var/spool/torque/mom_priv/jobs/94.dphpc101.SC invalid - using dumb
>  > You are now running on dphpc1001 in a BASH environment.
>
>  These are not TORQUE errors.  These are generated by the job, the shell, or
>  something else in the OS.
>
>
>
>  > Also I am using an nfs system - here is the content of my mom_priv/config file:
>  >
>  > $pbsserver dphpc1011.dph.xxxx.xx.xx
>  >
>  > $usecp dphpc1011.dph.xxxx.xx.xx:/home  /home
>
>  Verify that 'df' actually shows the filesystem mounted from
>  'dphpc1011.dph.xxxx.xx.xx:/home'.  Since you aren't using a wildcard, the exact
>  string must match.
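
The check suggested here can be sketched as follows. It assumes POSIX
"df -P" output, where the first column is the mount source, and it uses the
masked host name from the config line as it appears in this thread. The
helper name has_mount_source is hypothetical.

```shell
# Hypothetical helper: succeed if df-style text on stdin has a first column
# exactly equal to the given string (no wildcard or substring matching).
has_mount_source() {
    awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Source string copied from the $usecp line (host name masked in the thread).
src='dphpc1011.dph.xxxx.xx.xx:/home'

if df -P /home 2>/dev/null | has_mount_source "$src"; then
    echo "df reports $src, matching the \$usecp line"
else
    echo "df does not report $src; compare the Filesystem column with \$usecp"
fi
```

Run this on the compute node, since that is where mom_priv/config is read.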
>
>
>
>  > $logevent       255
>  >
>  > Any ideas why the .ER and .OU files are not going to the right places??
>
>  You'll need to check the logs and the email that should have been sent.
>
>
> _______________________________________________
>  torqueusers mailing list
>  torqueusers at supercluster.org
>  http://www.supercluster.org/mailman/listinfo/torqueusers
>
>

