[torqueusers] have enough nodes,but job is not running
Jozef Káčer
quickparser at gmail.com
Wed Apr 16 06:41:29 MDT 2008
I often find myself in situations in which jobs should have enough
resources and should be running. I submit jobs using a PBS script.
Nevertheless, if a job hangs in the queue for a long time, I try to force
it to run using "runjob" or "qrun". This usually works, provided that there
are enough free resources available.
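
For what it's worth, the workaround can be sketched as a small shell helper.
This is only a rough sketch: the function names job_state and force_if_queued
are made up for illustration, it assumes "qstat -f" prints a "job_state = Q"
line for a queued job, and "qrun" normally needs operator or manager
privileges on the server.

```shell
#!/bin/sh
# Print the state letter (Q, R, ...) that "qstat -f" reports for a job.
job_state() {
    qstat -f "$1" | awk '$1 == "job_state" { print $3; exit }'
}

# Force a job to run with qrun, but only if it is still queued.
# (force_if_queued is a hypothetical helper name, not a Torque command.)
force_if_queued() {
    jobid=$1
    state=$(job_state "$jobid")
    if [ "$state" = "Q" ]; then
        qrun "$jobid"
    else
        echo "job $jobid is not queued (state: $state)"
    fi
}
```

It would be called as "force_if_queued <jobid>", with the job id taken from
the qsub output. Forcing a job bypasses the scheduler's decision, so it is
probably best kept for cases where the free resources have been checked by
hand first.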
Jozef
2008/4/16 <pat.o'bryant at exxonmobil.com>:
>
> Zhyang,
> Here is something you might try. Code up a Torque "job_script" with the
> following "#PBS" control cards. Note that "#PBS" control cards can take
> the place of command line arguments and they follow the same format.
> Submit the job using "qsub job_script". If you specify ppn > (number of
> cpus/node), Maui (for some parameter settings) will look for a matching
> node with at least that number of cpus. So for example, if you use
> "#PBS -l nodes=8:ppn=4", Maui will look for nodes with 4 cpus. If it can't
> find a node like that, the job will remain queued. The thing to keep in
> mind is that Torque queues your job and Maui (in your case) actually
> decides where and when your job will execute. Most execution problems will
> be due to Maui/Moab parameter settings. Here are some links to check as
> well:
>
> http://www.clusterresources.com/wiki/doku.php?id=torque:2.1_job_submission
> http://www.clusterresources.com/products/mwm/docs/a.fparameters.shtml
>
> Contents of "job_script"
> ----------------------------------
> #!/bin/bash
> #PBS -N Short
> #PBS -l nodes=8:ppn=2,walltime=00:02:00
> pwd
> hostname
>
> End of "job_script"
> ---------------------------
>
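
The ppn rule Pat describes can be sketched as a quick pre-submission check.
This is a rough sketch under assumptions: fits_cluster is a made-up helper,
the cluster is treated as homogeneous (the same number of cpus on every
node), a missing ppn is taken as 1, and only the static limits are checked.
It says nothing about which nodes are currently free or about Maui policy
settings.

```shell
#!/bin/sh
# fits_cluster SPEC TOTAL_NODES CPUS_PER_NODE
# SPEC is a "-l" style request such as "nodes=8:ppn=4".
# (fits_cluster is a hypothetical helper, not part of Torque or Maui.)
fits_cluster() {
    spec=$1 total_nodes=$2 cpus_per_node=$3

    # Extract the node count: text after "nodes=" up to the next ":" or ","
    nodes=${spec#nodes=}
    nodes=${nodes%%[,:]*}

    # Extract ppn if present; assume 1 when the request does not set it
    case $spec in
        *ppn=*) ppn=${spec##*ppn=}; ppn=${ppn%%[,:]*} ;;
        *)      ppn=1 ;;
    esac

    if [ "$ppn" -gt "$cpus_per_node" ]; then
        echo "stays queued: ppn=$ppn exceeds $cpus_per_node cpus per node"
        return 1
    fi
    if [ "$nodes" -gt "$total_nodes" ]; then
        echo "stays queued: $nodes nodes requested, only $total_nodes exist"
        return 1
    fi
    echo "can fit: $nodes nodes x $ppn cpus"
}
```

For the cluster in this thread (56 nodes, 2 cpus per node),
"fits_cluster nodes=8:ppn=4 56 2" reports that the request stays queued,
while the request from the job_script above, "nodes=8:ppn=2", can fit.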
> Thanks,
> Pat
>
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022
>
> From: zhyang at lzu.edu.cn
> To: pat.o'bryant at exxonmobil.com
> Cc: torqueusers at supercluster.org
> Date: 04/15/08 07:19 AM
> Subject: Re: Re: [torqueusers] have enough nodes,but job is not running
>
> Hi Pat,
>
> I am not using the PBS control cards. I have 56 nodes, with 2 CPUs per node.
>
>
> >-----Original Message-----
> > From: pat.o'bryant at exxonmobil.com
> > Sent: 2008-04-15 20:09:27
> > To: zhyang at lzu.edu.cn
> > Cc:
> > Subject: Re: [torqueusers] have enough nodes,but job is not running
> > Zhyang,
> >
> > What do your #PBS control cards look like? Also, how many cpus/node do
> > you have?
> >
> > Thanks,
> >
> > Pat
> >
> > Hi,
> >
> > I have a cluster of 56 nodes with Torque and Maui installed. Recently I
> > found that when showq shows 34 nodes active and a user submits a 5-node
> > job, the job stays in state Q and does not run. From the showq output
> > there should be enough free nodes (at least 5), so why is the job not
> > running?
> >
> > If I submit a 2-node job, it runs fine. Can anyone help me? Thanks!
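
One way to cross-check the "enough nodes" assumption against what Torque
itself reports is to count the nodes that pbsnodes lists as free. A rough
sketch, assuming "pbsnodes -a" prints a "state = free" line for each idle
node; free_nodes and has_free_nodes are made-up helper names.

```shell
#!/bin/sh
# Count nodes whose state line in "pbsnodes -a" is exactly "free".
# Nodes in states such as job-exclusive, down or offline are not counted.
free_nodes() {
    pbsnodes -a | awk '$1 == "state" && $3 == "free" { n++ } END { print n + 0 }'
}

# Succeed if at least N nodes are free (N is the first argument).
has_free_nodes() {
    [ "$(free_nodes)" -ge "$1" ]
}
```

If "has_free_nodes 5" succeeds and the 5-node job still sits in state Q, the
cause is more likely a scheduler setting than a shortage of nodes.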
> >
> > --
> > _______________________________________________
> > torqueusers mailing list
> > torqueusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torqueusers
> >
>
> --
> Best regards,
> Zhang Yang
> Communication Network Center, Lanzhou University
> Address: 222 Tianshui Road, Lanzhou, Gansu, China
> Tel: (0931) 8912011  Fax: (0931) 8912022  Postal code: 730000
> Email: zhyang at lzu.edu.cn
>