[torqueusers] Batch not running : Things to check

Zarnoch, Dave dave.zarnoch at sykes.com
Wed Sep 14 10:48:01 MDT 2011


James,
 
I tried entering:
qsub -V -I  -l nodes=1 -q dn
and it just hangs there.
 
Do I have a problem with "mom"?
Here's some files in mom_priv:
 
usphl1ora002@/var/spool/torque/mom_priv>ls -l jobs
total 0
 
usphl1ora002@/var/spool/torque/mom_priv>more config
$pbsserver      usphl1ora002.amer.sykes.com      # note: hostname running pbs_server
$logevent       255     # bitmap of which events to log
 
usphl1ora002@/var/spool/torque/mom_priv>more mom.lock
25994
 
usphl1ora002@/var/spool/torque/mom_priv>ps -ef | grep 25994 | grep -v grep
root     25994     1  0 Sep12 ?        00:01:03 /usr/local/sbin/pbs_mom -p
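
The mom.lock / ps check above can be wrapped in a small helper. This is only a sketch: the function name lock_pid_alive is made up for illustration and is not part of Torque, and the /proc fallback assumes a Linux host.

```shell
#!/bin/sh
# Hypothetical helper (not a Torque command): report whether the PID
# stored in a lock file belongs to a live process.
lock_pid_alive() {
    lockfile=$1
    if [ ! -r "$lockfile" ]; then
        echo "cannot read $lockfile"
        return 2
    fi
    # First line of the lock file, with any whitespace stripped
    pid=$(head -n 1 "$lockfile" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*)
            echo "no numeric PID in $lockfile"
            return 2
            ;;
    esac
    # kill -0 sends no signal; it only succeeds for processes we are
    # allowed to signal, so fall back to /proc (Linux assumption) for
    # daemons owned by another user such as root.
    if kill -0 "$pid" 2>/dev/null || [ -d "/proc/$pid" ]; then
        echo "PID $pid is running"
    else
        echo "PID $pid is not running"
        return 1
    fi
}
```

For example: lock_pid_alive /var/spool/torque/mom_priv/mom.lock
A running PID only shows that the daemon process exists; it does not show that the server can reach it.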
 
Not really familiar with "mom"
 
I also don't have a lot of documentation on Torque...
Do you know of any good web pages?
Thanks!
Dave
 
Dave Zarnoch
UNIX Systems Administration
(215)200-0911
Dave.Zarnoch at sykes.com
 
________________________________

From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Coyle, James J [ITACD]
Sent: Wednesday, September 14, 2011 12:00 PM
To: Torque Users Mailing List
Subject: Re: [torqueusers] Batch not running : Things to check
 
Dave,
 
  Welcome to Torque. I switched from NQS some time ago, and Torque/PBS
has been a good replacement for me.
Things to check:
  What does the error output say? (Probably in file dn_test.txt.e[0-9]*)
  Permissions on /home/zarnocda/torque/scripts_test/dn_test.sh: is it executable?
You may need:    chmod u+x /home/zarnocda/torque/scripts_test/dn_test.sh
 
  I'd also check whether /home/zarnocda/torque/scripts_test/dn_test.sh
even exists on the compute node,
e.g.  ls /home/zarnocda/torque/scripts_test/dn_test.sh
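
The checks above can be sketched as two small shell helpers, to be run on the compute node. The function names check_script and show_job_errors are invented for this illustration; the only assumption taken from the thread is that error files are named <jobname>.e<number> and sit in the current directory.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Torque.

# Check that a script exists and has an execute bit set.
check_script() {
    script=$1
    if [ ! -e "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script"
        return 1
    fi
    echo "ok: $script"
}

# Print every error file for a job name, e.g. dn_test.txt.e7592.
show_job_errors() {
    jobname=$1
    found=0
    for f in "$jobname".e[0-9]*; do
        [ -f "$f" ] || continue    # the glob matched nothing
        found=1
        echo "== $f =="
        cat "$f"
    done
    if [ "$found" -eq 0 ]; then
        echo "no error files for $jobname"
        return 1
    fi
}
```

For example:
check_script /home/zarnocda/torque/scripts_test/dn_test.sh
show_job_errors dn_test.txt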
 
  I usually use the interactive option ( qsub -I ) to debug these kinds
of problems.
You could issue:
qsub -V -I  -l nodes=1 -q dn
which will start an interactive job and log you into the mother
superior node for that job,
where you can then try issuing the commands from the job that is not
working.
 
James Coyle, PhD
High Performance Computing Group        
 Iowa State Univ.          
web: http://jjc.public.iastate.edu/ <http://www.public.iastate.edu/~jjc>

 
 
 
From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Zarnoch, Dave
Sent: Wednesday, September 14, 2011 8:42 AM
To: torqueusers at supercluster.org
Subject: [torqueusers] Batch not running
Importance: High
 
Hello folks,
New to Torque, used to run NQS....
Concerning Torque...
I have a small script:
$ more dn_test.sh
#!/bin/sh
#
PATH=/bin:/usr/bin:/usr/local/bin:/etc:/usr/sbin:/usr/ucb:$HOME/bin:/usr/bin/X11:/sbin:.
export PATH
DATE=`date +%H%M`
echo "Hello"
touch /tmp/dn_test_${DATE}
sleep 90
 
When I submit the script:
 
qsub -V -l nodes=1 -q dn dn_test.sh
 
It runs fine.
 
But I need to run batch...
 
I created a text file "dn_test.txt"
 
That contains:
 
/home/zarnocda/torque/scripts_test/dn_test.sh
 
When I run:
 
qsub -V -l nodes=1 -q dn dn_test.txt
 
It appears to process the file:
qstat -s
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
7592.usphl1ora002.amer    dn_test.txt      zarnocda               0 R dn

 
But it doesn't execute the script within:
/home/zarnocda/torque/scripts_test/dn_test.sh
 
Any help!
 
Thanks!
 
Dave
Dave Zarnoch
UNIX Systems Administration
(215)200-0911
Dave.Zarnoch at sykes.com
 