[torqueusers] PBS in Cluster

Tim timlee126 at yahoo.com
Mon Feb 1 15:15:19 MST 2010


Thanks, Coyle. I was going to ask why Garrick said to put wait at the end of the script for background jobs, but you just answered it.

--- On Mon, 2/1/10, Coyle, James J [ITACD] <jjc at iastate.edu> wrote:

> From: Coyle, James J [ITACD] <jjc at iastate.edu>
> Subject: RE: [torqueusers] PBS in Cluster
> To: "Axel Kohlmeyer" <akohlmey at cmm.chem.upenn.edu>, "Tim" <timlee126 at yahoo.com>
> Date: Monday, February 1, 2010, 11:14 AM
> Re: Using & in a PBS or Torque batch job:
> --------------------------------------------
> If you use & at the end of commands in the batch script, ensure that
> you have the command
>   wait
> after all of the commands you put in the background. Wait will
> cause the batch job to wait until all the backgrounded jobs have
> finished. Otherwise, the script will exit, and your jobs will either:
> 1) Continue running on the nodes which PBS or Torque considers free.
>    This will make other users mad at you, whether you intended to
>    defeat the batch scheduling system or not.
> 2) Be killed as part of the exit process of the batch job.
> 
> (I instituted #2 at my installation because of #1.)
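
A minimal sketch of the pattern described above, assuming a plain sh job script. The `sleep` calls stand in for real executables, and `run_task` and `LOG` are hypothetical names used only for illustration. The `#PBS` line is an ordinary comment to the shell and only matters when the script is passed to qsub.

```shell
#!/bin/sh
#PBS -l nodes=1:ppn=2
# Sketch: start two tasks in the background, then wait for both.

run_task() {
    # Hypothetical workload: pause briefly, then record completion.
    sleep 1
    echo "task $1 done" >> "$LOG"
}

LOG=$(mktemp)

run_task A &   # starts in the background
run_task B &   # runs at the same time as A

wait           # block until every background child has exited

# Both tasks have written their line by the time wait returns.
FINISHED=$(wc -l < "$LOG" | tr -d ' ')
echo "finished tasks: $FINISHED"
rm -f "$LOG"
```

Without the `wait` line the script would reach its end while both tasks are still sleeping, which is the situation the two outcomes above describe.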
> 
> - Jim C. 
> -------------------------------------------------------------------------------
>  James Coyle, PhD
>  Xeon and Opteron Cluster Manager
>  High Performance Computing Group     
>  115 Durham Center           
> 
>  Iowa State Univ.          
>  Ames, Iowa 50011       
>    web: http://www.public.iastate.edu/~jjc
> 
> 
> 
> -----Original Message-----
> From: torqueusers-bounces at supercluster.org
> [mailto:torqueusers-bounces at supercluster.org]
> On Behalf Of Axel Kohlmeyer
> Sent: Monday, February 01, 2010 8:30 AM
> To: Tim
> Cc: torqueusers at supercluster.org
> Subject: Re: [torqueusers] PBS in Cluster
> 
> On Sat, Jan 30, 2010 at 7:58 PM, Tim <timlee126 at yahoo.com> wrote:
> > Hi,
> 
> tim,
> 
> > I am learning and have some questions about using PBS to submit
> > jobs in a cluster.
> >
> > (1) If not using qsub to submit a job, will the job be running only
> > on the single node where it is submitted?
> >
> > Even if the job is parallelized by MPI and run by mpirun, is it
> > still running on the single node, not the others?
> >
> > So is qsub used to submit a job to run on other nodes in the
> > cluster?
> 
> qsub will ask to reserve nodes but it knows nothing about MPI.
> similarly mpirun can be set up so that it will ask the batch system
> about which nodes are reserved for it, but this is not necessarily so.
> you can use mpirun interactively and without batch/qsub by providing
> a host list. some MPI libraries always need this, hence $PBS_NODEFILE.
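
As a rough illustration of the $PBS_NODEFILE point, here is a sketch that reads the host list and builds an mpirun command line from it. Outside a real job the variable is unset, so the sketch falls back to a made-up list (node01 and node02 are hypothetical, as is ./my_mpi_app). The command is only printed, not executed, and the `-machinefile` spelling is an assumption that should be checked against the documentation of the MPI library in use.

```shell
#!/bin/sh
# Inside a Torque job, PBS_NODEFILE names a file with one line per
# allocated processor slot. Fall back to a made-up list otherwise.
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    printf 'node01\nnode01\nnode02\nnode02\n' > "$PBS_NODEFILE"
fi

# Total slots = number of lines; distinct hosts = number of unique lines.
NPROCS=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')
NNODES=$(sort -u "$PBS_NODEFILE" | wc -l | tr -d ' ')
echo "allocated: $NPROCS slots on $NNODES nodes"

# Build (but do not run) the launch command. ./my_mpi_app is a placeholder.
MPI_CMD="mpirun -np $NPROCS -machinefile $PBS_NODEFILE ./my_mpi_app"
echo "$MPI_CMD"
```

An MPI library built with Torque support may pick up the allocation by itself, in which case the host file argument is unnecessary.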
> 
> >
> > (2) In a pbs script that is submitted by qsub, are all the commands
> > executed one after the other?
> 
> it is a regular shell script.
> 
> > If I want to run several executables at the same time, do I make
> > these calls run in the background by adding "&" at the end?
> 
> if you background them, they will execute in parallel. if you need
> them in sequence, you can try using job dependencies
> (-W depend=afterok:####).
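
A sketch of chaining two jobs with a dependency, where the option is written in full as `-W depend=afterok:<jobid>`. qsub prints the new job's ID on standard output, which is what gets captured here. The `submit` helper, the `DRY_RUN` switch, the fake job ID, and the script names step1.pbs and step2.pbs are all assumptions for illustration. By default nothing is submitted.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints a made-up job ID instead of
# calling qsub, so the sketch can be tried without a batch system.
DRY_RUN=${DRY_RUN:-1}

submit() {
    # Hypothetical wrapper around qsub.
    if [ "$DRY_RUN" = "0" ]; then
        qsub "$@"
    else
        echo "12345.headnode"   # fake ID, for illustration only
    fi
}

# Capture the ID of the first job, then make the second depend on it.
FIRST=$(submit step1.pbs)
SECOND=$(submit -W depend=afterok:"$FIRST" step2.pbs)

echo "step1: $FIRST"
echo "step2: $SECOND (runs only after step1 exits with status 0)"
```

Run with `DRY_RUN=0` on a cluster where step1.pbs and step2.pbs exist to perform real submissions.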
> 
> > (3) In the pbs script, if the several calls to run the executables
> > are running at the same time in the background, should the number of
> > nodes and processors per node specified be the total needed by all
> > those calls?
> 
> the script submitted to qsub will run as a regular shell script on the
> first(!) host assigned to the job.
> no job distribution or limitation is done. this is exactly why one
> would submit each execution as a separate job. if you use OpenMPI,
> you can consider using an appfile to scatter multiple jobs across the
> allocated nodes, but keep in mind that when those jobs take different
> amounts of time, you'll have idle nodes that cannot be used by other
> jobs.
> 
> > If yes, and the number of nodes and processors per node specified
> > are not completely available but enough to run some of the calls,
> > will some of these calls be run first or delayed until the requested
> > total resources are completely available?
> 
> again, this kind of load balancing is exactly what batch systems
> themselves are for.
> just submit all requests in individual qsub statements and all this
> will happen.
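
One way the "individual qsub statements" advice might look in practice, as a sketch only: generate a small job script per executable and submit each one separately, so the scheduler can start them as resources become free. The names app1, app2, app3 and the `DRY_RUN` switch are hypothetical, and the resource request of one processor on one node is an assumption. By default the qsub commands are printed rather than run.

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}
JOBDIR=$(mktemp -d)

# Write one job script per executable (app1..app3 are placeholders).
for APP in app1 app2 app3; do
    cat > "$JOBDIR/$APP.pbs" <<EOF
#!/bin/sh
#PBS -N $APP
#PBS -l nodes=1:ppn=1
cd "\$PBS_O_WORKDIR"
./$APP
EOF
done

# Submit each script as its own job, or just show what would be run.
NJOBS=0
for SCRIPT in "$JOBDIR"/*.pbs; do
    if [ "$DRY_RUN" = "0" ]; then
        qsub "$SCRIPT"
    else
        echo "would run: qsub $SCRIPT"
    fi
    NJOBS=$((NJOBS + 1))
done
echo "prepared $NJOBS jobs in $JOBDIR"
```

Each job asks only for what one executable needs, so a short job does not leave nodes idle while a longer one finishes.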
> 

> cheers,
>     axel.
> 
> > Thanks and regards!
> >
> >
> >
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
> >
> 
> 
> 
> -- 
> Dr. Axel Kohlmeyer    akohlmey at gmail.com
> Institute for Computational Molecular Science
> College of Science and Technology
> Temple University, Philadelphia PA, USA.


      

