[torqueusers] Batch not running : Things to check
Zarnoch, Dave
dave.zarnoch at sykes.com
Wed Sep 14 11:31:13 MDT 2011
Thanks!
I'll give that a shot!
Dave
Dave Zarnoch
UNIX Systems Administration
(215)200-0911
Dave.Zarnoch at sykes.com
________________________________
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Sreedhar
Manchu
Sent: Wednesday, September 14, 2011 1:25 PM
To: Torque Users Mailing List
Subject: Re: [torqueusers] Batch not running : Things to check
Hi Dave,
This line specifies which directories on the host should be staged to
the destination directory on the compute node. This is what I found in
the Torque documentation; the link to the page is below. Hopefully
this helps.
$usecp
Format: <HOST>:<SRCDIR> <DSTDIR>
Description: Specifies which directories should be staged (see TORQUE
Data Management,
http://www.clusterresources.com/torquedocs21/6.2filesystems.shtml#torquedm)
Example: $usecp *.fte.com:/data /usr/local/data
http://www.clusterresources.com/torquedocs21/a.cmomconfig.shtml
Best,
Sreedhar.
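As a rough illustration of the $usecp format quoted above, the sketch
below maps a host:path destination onto a local directory in the way the
<HOST>:<SRCDIR> <DSTDIR> rule describes. It is not Torque's actual
implementation. The function name map_usecp and the host node1.fte.com
are made up for the example; the host pattern and directories are taken
from the documentation's example line.

```shell
#!/bin/sh
# Illustration only: map_usecp is a hypothetical helper, not part of Torque.
# It prints the local path a destination would map to under one $usecp rule,
# or returns non-zero if the rule does not apply.
map_usecp() {
    rule_host=$1   # host pattern from the rule, e.g. *.fte.com
    rule_src=$2    # source directory from the rule, e.g. /data
    rule_dst=$3    # local destination directory, e.g. /usr/local/data
    target=$4      # destination written as host:path

    host=${target%%:*}
    path=${target#*:}

    # The host must match the (possibly wildcarded) pattern.
    case $host in
        $rule_host) ;;
        *) return 1 ;;
    esac

    # The path must be the source directory itself or lie beneath it.
    case $path in
        "$rule_src"|"$rule_src"/*)
            rest=${path#"$rule_src"}
            printf '%s\n' "$rule_dst$rest"
            ;;
        *) return 1 ;;
    esac
}

# Prints /usr/local/data/run1/out.txt
map_usecp '*.fte.com' /data /usr/local/data node1.fte.com:/data/run1/out.txt
```

With the rule from this thread, the same call would be
map_usecp crunch.its.nyu.edu /home /home <host>:<path>, which leaves
paths under /home unchanged.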
On Sep 14, 2011, at 1:18 PM, Zarnoch, Dave wrote:
Sreedhar.
Thanks for your suggestion!
Just a question....
The second line:
$usecp crunch.its.nyu.edu:/home /home
Is this because the script that I'm running is located in /home or is
the location "/home" used for something else?
Thanks!
Dave
Dave Zarnoch
UNIX Systems Administration
(215)200-0911
Dave.Zarnoch at sykes.com
________________________________
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Sreedhar
Manchu
Sent: Wednesday, September 14, 2011 1:11 PM
To: Torque Users Mailing List
Subject: Re: [torqueusers] Batch not running : Things to check
Hi Dave,
This is what I have in my config file.
$pbsserver crunch.local
$usecp crunch.its.nyu.edu:/home /home
$spool_as_final_name true
I think you need to mention the second line.
Best,
Sreedhar.
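A quick way to see whether a given pbs_mom config already has a $usecp
line is sketched below. The helper name has_usecp and the MOM_CONFIG
environment variable are invented for this example; the default path is
the mom_priv/config location shown elsewhere in this thread, and reading
it may require root.

```shell
#!/bin/sh
# has_usecp is a hypothetical helper, not a Torque command.
# It prints any $usecp lines (with line numbers) from the given file and
# succeeds only if at least one is present.
has_usecp() {
    grep -n '^[[:space:]]*\$usecp[[:space:]]' "$1"
}

# MOM_CONFIG is an assumed override for this sketch; the default is the
# config path used in this thread.
config=${MOM_CONFIG:-/var/spool/torque/mom_priv/config}

if [ ! -r "$config" ]; then
    echo "cannot read $config"
elif has_usecp "$config"; then
    echo "usecp rule(s) found in $config"
else
    echo "no usecp rule in $config"
fi
```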
On Sep 14, 2011, at 12:48 PM, Zarnoch, Dave wrote:
James,
I tried entering:
qsub -V -I -l nodes=1 -q dn
and it just hangs there
Do I have a problem with "mom"?
Here's some files in mom_priv:
usphl1ora002@/var/spool/torque/mom_priv>ls -l jobs
total 0
usphl1ora002@/var/spool/torque/mom_priv>more config
$pbsserver usphl1ora002.amer.sykes.com # note: hostname running pbs_server
$logevent 255 # bitmap of which events to log
usphl1ora002@/var/spool/torque/mom_priv>more mom.lock
25994
usphl1ora002@/var/spool/torque/mom_priv>ps -ef | grep 25994 | grep -v grep
root 25994 1 0 Sep12 ? 00:01:03 /usr/local/sbin/pbs_mom -p
Not really familiar with "mom"
I also don't have a lot of documentation on Torque...
Do you know of any good web pages?
Thanks!
Dave
Dave Zarnoch
UNIX Systems Administration
(215)200-0911
Dave.Zarnoch at sykes.com
________________________________
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Coyle, James
J [ITACD]
Sent: Wednesday, September 14, 2011 12:00 PM
To: Torque Users Mailing List
Subject: Re: [torqueusers] Batch not running : Things to check
Dave,
Welcome to Torque. I switched from NQS some time ago, and Torque/PBS
has been a good replacement for me.
Things to check:
What does the error output say? (Probably in file dn_test.txt.e[0-9]*)
Permissions on /home/zarnocda/torque/scripts_test/dn_test.sh , is it
executable?
You may need: chmod u+x /home/zarnocda/torque/scripts_test/dn_test.sh
I'd also check if /home/zarnocda/torque/scripts_test/dn_test.sh
even exists on the compute node.
e.g. ls /home/zarnocda/torque/scripts_test/dn_test.sh executable
I usually use the interactive option ( qsub -I ) to debug these kinds
of problems.
You could issue:
qsub -V -I -l nodes=1 -q dn
which will start an interactive job and log you into the mother
superior node for that job,
where you can then try issuing the commands from the job that is not
working.
James Coyle, PhD
High Performance Computing Group
Iowa State Univ.
web: http://jjc.public.iastate.edu/ <http://www.public.iastate.edu/~jjc>
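The existence and permission checks listed above can be sketched as a
small script to run on the compute node, for example from inside the
interactive job. The function name check_job_script is made up for this
example; the default path is the script named in this thread.

```shell
#!/bin/sh
# check_job_script is a hypothetical helper, not a Torque command.
# It reports whether the given script exists and is executable by the
# current user on the node where this is run.
check_job_script() {
    script=$1
    if [ ! -e "$script" ]; then
        echo "missing: $script"
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "not executable: $script (try: chmod u+x $script)"
        return 2
    fi
    echo "ok: $script"
}

# Check the path given as the first argument, or the script from this thread.
check_job_script "${1:-/home/zarnocda/torque/scripts_test/dn_test.sh}" ||
    echo "fix this on the compute node before resubmitting"
```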
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of Zarnoch, Dave
Sent: Wednesday, September 14, 2011 8:42 AM
To: torqueusers at supercluster.org
Subject: [torqueusers] Batch not running
Importance: High
Hello folks,
New to Torque, used to run NQS....
Concerning Torque...
I have a small script:
$ more dn_test.sh
#!/bin/sh
#
PATH=/bin:/usr/bin:/usr/local/bin:/etc:/usr/sbin:/usr/ucb:$HOME/bin:/usr/bin/X11:/sbin:.
export PATH
DATE=`date +%H%M`
echo "Hello"
touch /tmp/dn_test_${DATE}
sleep 90
When I submit the script:
qsub -V -l nodes=1 -q dn dn_test.sh
It runs fine.
But I need to run batch...
I created a text file "dn_test.txt"
That contains:
/home/zarnocda/torque/scripts_test/dn_test.sh
When I run:
qsub -V -l nodes=1 -q dn dn_test.txt
It appears to process the file:
qstat -s
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
7592.usphl1ora002.amer    dn_test.txt      zarnocda               0 R dn
But it doesn't execute the script within:
/home/zarnocda/torque/scripts_test/dn_test.sh
Any help!
Thanks!
Dave
Dave Zarnoch
UNIX Systems Administration
(215)200-0911
Dave.Zarnoch at sykes.com
_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers