[torqueusers] Batch not running : Things to check

Sreedhar Manchu sm4082 at nyu.edu
Wed Sep 14 11:10:55 MDT 2011


Hi Dave,

This is what I have in my config file.

$pbsserver crunch.local
$usecp crunch.its.nyu.edu:/home /home
$spool_as_final_name true

I think you need something like the second line ($usecp) in your mom_priv/config, with your own host name and paths.
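
For your machine it might look roughly like the sketch below. I am assuming the submit host is usphl1ora002.amer.sykes.com (taken from your $pbsserver line) and that /home is the same directory on the node; adjust both if that is not the case. The sketch writes to a scratch file, not to the live config.

```shell
# Sketch only: build a sample mom config in a scratch file for review.
# Assumptions: the host name comes from the quoted $pbsserver line, and
# /home on the submit host is the same directory as /home on the node.
sample=$(mktemp)
cat > "$sample" <<'EOF'
$pbsserver      usphl1ora002.amer.sykes.com
$logevent       255
$usecp          usphl1ora002.amer.sykes.com:/home /home
EOF
# Show the result so it can be compared with /var/spool/torque/mom_priv/config.
cat "$sample"
```

As far as I know, pbs_mom only reads this file at startup or on a SIGHUP, so it has to be restarted or signalled after the real config is edited.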

Best,
Sreedhar.



On Sep 14, 2011, at 12:48 PM, Zarnoch, Dave wrote:

> James,
> 
>  
> 
> I tried entering:
> 
> qsub -V -I -l nodes=1 -q dn
> 
> and it just hangs there
> 
>  
> 
> Do I have a problem with “mom”?
> 
> Here’s some files in mom_priv:
> 
>  
> 
> usphl1ora002@/var/spool/torque/mom_priv>ls -l jobs
> total 0
>  
> usphl1ora002@/var/spool/torque/mom_priv>more config
> $pbsserver      usphl1ora002.amer.sykes.com      # note: hostname running pbs_server
> $logevent       255     # bitmap of which events to log
>  
> 
> usphl1ora002@/var/spool/torque/mom_priv>more mom.lock
> 25994
>  
> usphl1ora002@/var/spool/torque/mom_priv>ps -ef | grep 25994 | grep -v grep
> root     25994     1  0 Sep12 ?        00:01:03 /usr/local/sbin/pbs_mom -p
>  
> 
> Not really familiar with “mom”
> 
>  
> 
> I also don’t have a lot of documentation on Torque…
> 
> Do you know of any good web pages?
> 
> Thanks!
> 
> Dave
> 
>  
> 
> Dave Zarnoch
> 
> UNIX Systems Administration
> 
> (215)200-0911
> 
> Dave.Zarnoch at sykes.com
> 
>  
> 
> From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Coyle, James J [ITACD]
> Sent: Wednesday, September 14, 2011 12:00 PM
> To: Torque Users Mailing List
> Subject: Re: [torqueusers] Batch not running : Things to check
>  
> 
> Dave,
> 
>  
> 
>   Welcome to Torque. I switched from NQS some time ago, and Torque/PBS has been a good replacement for me.
> 
> Things to check:
> 
>   What does the error output say? ( Probably in file dn_test.txt.e[0-9]* )
> 
>   Permissions on  /home/zarnocda/torque/scripts_test/dn_test.sh , is it executable? 
> 
> You may need:    chmod u+x /home/zarnocda/torque/scripts_test/dn_test.sh
> 
>  
> 
>   I’d also check if /home/zarnocda/torque/scripts_test/dn_test.sh
> 
> even exists on the compute node.
> 
> e.g.  ls /home/zarnocda/torque/scripts_test/dn_test.sh  executable
> 
>  
> 
>   I usually use the interactive option ( qsub -I ) to debug these kinds of problems.
> 
> You could issue:
> 
> qsub -V -I -l nodes=1 -q dn
> 
> which will start an interactive job and log you into the mother superior node for that job,
> 
> where you can then try issuing the commands from the job that is not working.
> 
>  
> 
> James Coyle, PhD
> High Performance Computing Group       
>  Iowa State Univ.         
> web: http://jjc.public.iastate.edu/
>  
> 
>  
> 
>  
> 
> From: torqueusers-bounces at supercluster.org [mailto:torqueusers-bounces at supercluster.org] On Behalf Of Zarnoch, Dave
> Sent: Wednesday, September 14, 2011 8:42 AM
> To: torqueusers at supercluster.org
> Subject: [torqueusers] Batch not running
> Importance: High
>  
> 
> Hello folks,
> 
> New to Torque, used to run NQS….
> 
> Concerning Torque…
> 
> I have a small script:
> 
> $ more dn_test.sh
> #!/bin/sh
> #
> PATH=/bin:/usr/bin:/usr/local/bin:/etc:/usr/sbin:/usr/ucb:$HOME/bin:/usr/bin/X11:/sbin:.
> export PATH
> DATE=`date +%H%M`
> echo "Hello"
> touch /tmp/dn_test_${DATE}
> sleep 90
>  
> When I submit the script:
>  
> qsub -V -l nodes=1 -q dn dn_test.sh
>  
> It runs fine.
>  
> But I need to run batch…
>  
> I created a text file “dn_test.txt”
>  
> That contains:
>  
> /home/zarnocda/torque/scripts_test/dn_test.sh
>  
> When I run:
>  
> qsub -V -l nodes=1 -q dn dn_test.txt
>  
> 
> It appears to process the file:
> 
> qstat -s
> 
> Job id                    Name             User            Time Use S Queue
> ------------------------- ---------------- --------------- -------- - -----
> 7592.usphl1ora002.amer    dn_test.txt      zarnocda               0 R dn
> 
>  
> 
> But it doesn’t execute the script within:
> 
> /home/zarnocda/torque/scripts_test/dn_test.sh
>  
> 
> Any help!
> 
>  
> 
> Thanks!
> 
>  
> 
> Dave
> 
> Dave Zarnoch
> 
> UNIX Systems Administration
> 
> (215)200-0911
> 
> Dave.Zarnoch at sykes.com
> 
>  
> 
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
