[torqueusers] PBS in Cluster
timlee126 at yahoo.com
Tue Feb 2 11:58:44 MST 2010
Why is it important for the user to have a good guess of how long a job will run?
If I specify a time shorter than actually needed, my job will be terminated before it can finish, so I now often specify a very long walltime, which shows up in qstat -q as infinite.
I guess the longer the walltime I specify, the longer my job will have to wait in the queue before it can actually start running?
Once the job starts running, will the long walltime I specified still affect the priority of my job?
--- On Mon, 2/1/10, Axel Kohlmeyer <akohlmey at cmm.chem.upenn.edu> wrote:
> From: Axel Kohlmeyer <akohlmey at cmm.chem.upenn.edu>
> Subject: Re: [torqueusers] PBS in Cluster
> To: "Tim" <timlee126 at yahoo.com>
> Cc: torqueusers at supercluster.org
> Date: Monday, February 1, 2010, 5:33 PM
> On Mon, Feb 1, 2010 at 5:12 PM, Tim <timlee126 at yahoo.com> wrote:
> > Thanks, Axel!
> > Are you saying that if I submit several background jobs, then for better
> > control of resource assignment via PBS commands, it is better not to put
> > them into a single PBS file and submit that file just once, but to put
> > each job in its own PBS file and submit the PBS files one by one?
> yes. unless your local cluster has certain restrictions. on the machine
> that i manage, users are encouraged to always use "full" nodes, i.e.
> multiples of 8 cores (e.g. via -l
> and then it would make sense to put up to 8 jobs into one submission.
> the trick with using wait is important, or else your jobs will be
> killed immediately, since the job terminates when the qsub script
> terminates. nevertheless, i recommend that people use an appfile with
> OpenMPI instead of backgrounding jobs. this gives more control (one can
> also "package" multiple parallel jobs, e.g. 4x 2-MPI tasks in the above
> example) and would also allow scattering jobs across multiple nodes.
> if you want to get really fancy, you can write your own wrapper that
> you give a long list of command lines, and it would then execute each
> of them on the next free node in your reservation. in general, this is
> not a good idea; it only helps if you need to deal with jobs that have
> relatively short execution times, but you run on a heavily used
> machine where large parallel jobs are favored.
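
A minimal sketch of the background-and-wait pattern described above, for illustration only. The job name, the resource request for one full 8-core node, the walltime value, the run_task function, and the log directory are all assumptions made for this example and are not taken from the thread; a real script would start actual programs instead of run_task.

```shell
#!/bin/bash
# Assumed job name, resource request (one full 8-core node) and walltime.
#PBS -N packed_jobs
#PBS -l nodes=1:ppn=8
#PBS -l walltime=00:10:00

# Run from the submission directory under PBS, else stay where we are.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# Hypothetical stand-in for a real serial program.
run_task() {
  sleep 1
  echo "task $1 done"
}

# Directory for the per-task logs (illustrative choice).
outdir=$(mktemp -d "$PWD/task_logs.XXXXXX")

# Start 8 tasks in the background, one per core.
for i in 1 2 3 4 5 6 7 8; do
  run_task "$i" > "$outdir/task_$i.log" &
done

# Without this, the script would exit at once and the batch system
# would end the job while the tasks are still running.
wait

echo "all tasks finished, logs in $outdir"
```

Because the #PBS lines are ordinary comments to bash, the same file can be tried out directly with bash before it is submitted with qsub.
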
> > If yes, I think I do not need to put the jobs in the background, since
> > the reason I wanted to background them is that I have these several
> > jobs in a single file and I want to run them in parallel instead of
> > one starting after another finishes.
> yes. with the one caveat from above (on our cluster the performance of
> a single job can be affected by up to a factor of two depending on
> whether a second job of a special kind is present on the same node,
> hence to keep execution time predictable i request users to always
> reserve full nodes), it is better to let the batch scheduler do the
> work. maui/moab in particular do a pretty good job of packing smaller
> jobs into the "holes" that larger jobs leave, through "backfilling".
> for that, however, it is important to submit your job with a good
> guess (plus a safety margin) of how long it will execute.
> because of these "rules" our local cluster tends to have a
> utilization per month of over 90%.
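
One way to turn "a good guess plus safety" into a concrete walltime request is sketched below. The helper name walltime_with_margin, the 90-minute estimate, the 20 percent margin, and the file name job.pbs are all hypothetical; the script only prints the qsub command rather than running it.

```shell
#!/bin/bash
# Hypothetical helper: take an estimated runtime in seconds and a safety
# margin in percent, and print the padded time as HH:MM:SS, the format
# accepted by "-l walltime=".
walltime_with_margin() {
  local est_seconds=$1 margin_percent=$2
  local total=$(( est_seconds + est_seconds * margin_percent / 100 ))
  printf '%02d:%02d:%02d\n' \
    $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

# Example: a job measured at about 90 minutes, padded by 20 percent.
request=$(walltime_with_margin 5400 20)

# Print the submission command instead of executing it.
echo "qsub -l walltime=${request} job.pbs"
# prints: qsub -l walltime=01:48:00 job.pbs
```

A request that is close to the real runtime gives the scheduler a realistic gap to fit the job into, whereas a very large request leaves it little room to backfill.
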
> Dr. Axel Kohlmeyer akohlmey at gmail.com
> Institute for Computational Molecular Science
> College of Science and Technology
> Temple University, Philadelphia PA, USA.