[torqueusers] Torque set up problem: simple jobs not executing and files undelivered

Adil Mughal adil.m.mughal at gmail.com
Tue Feb 12 10:26:55 MST 2008


Dear Experts

I am having trouble getting even the simplest job to run with Torque. I enter

> echo "sleep 30" | qsub

on the master (not as root, of course).

Typing

> qstat

on the master as root gives:

Job id              Name             User            Time Use S Queue
------------------- ---------------- --------------- -------- - -----
94.dphpc1011        STDIN            guest1          00:00:00 E batch

At no point does the job status register as "R"; it appears to be stuck in "E".

I also found that the .ER and .OU files for the jobs are not being
delivered and are piling up in /var/spool/torque/undelivered. Here are
the contents of these files after running

> echo "sleep 30" | qsub

In .ER I get

stdin: is not a tty

and in .OU I get

Terminal type (default=dumb) : Terminal type
/var/spool/torque/mom_priv/jobs/94.dphpc101.SC invalid - using dumb
You are now running on dphpc1001 in a BASH environment.


Any ideas what this means? How can I get my jobs to execute?
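
One possibility worth checking, offered as an unconfirmed assumption: messages such as "stdin: is not a tty" and the "Terminal type" prompt often come from terminal-setup commands in a shell startup file that also run for batch jobs, where no terminal is attached. The sketch below shows a guard for that case. The function name is_interactive_session is hypothetical, the startup file involved on this cluster is unknown, and the check relies on Torque setting PBS_ENVIRONMENT to PBS_BATCH for batch jobs.

```shell
# Hypothetical guard for a shell startup file (e.g. ~/.bashrc or /etc/profile).
# Assumption: the messages in .ER/.OU come from terminal-setup commands
# that are executed even when the shell is started for a batch job.

# Succeeds only when stdin is a terminal and we are not inside a batch job.
# Torque sets PBS_ENVIRONMENT=PBS_BATCH for non-interactive jobs.
is_interactive_session() {
    [ -t 0 ] && [ "${PBS_ENVIRONMENT:-}" != "PBS_BATCH" ]
}

if is_interactive_session; then
    # Terminal-only setup would go here (placeholder), for example
    # whatever currently prompts for the terminal type.
    :
fi
```

With a guard like this, anything that needs a terminal is skipped when the job shell starts, so it cannot write to the job's output files or consume the job script as input.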

I am also using NFS. Here are the contents of my mom_priv/config file:

$pbsserver dphpc1011.dph.xxxx.xx.xx

$usecp dphpc1011.dph.xxxx.xx.xx:/home  /home

$logevent       255

Any ideas why the .ER and .OU files are not being delivered to the right place?

As usual, many thanks in advance!

adil

