[torquedev] TORQUE 2.2.0 Defaults

Dave Jackson jacksond at clusterresources.com
Thu Aug 16 17:17:50 MDT 2007


Garrick,

> > 3) set resources_available.nodect to automatically allow jobs up to the
> > number of procs in the cluster
> 
> "setting" resources_available.nodect would be incorrect because then it would
> never be set again.  The point of resources_available.nodect is override what
> server thinks is correct.
> 
> Can we make this depend on node_pack?

  I don't fully understand your comment that 'it would never be set
again'.  My main concern is a user of a new cluster of 32 quad-core
nodes (128 procs) submitting a job with 'qsub -l nodes=128', relying on
PBS's overly flexible definition of nodes, and not being able to run
the job because of a 'mysterious' queue constraint.  I believe sites
should be able to force a tighter node definition, but as a default
this type of failure will be confusing to a novice.
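
  To make the arithmetic concrete, here is a minimal Python sketch.  It
is not TORQUE code: the function names and the two policies ("procs"
and "hosts") are hypothetical, and only illustrate how a limit tied to
the host count would reject the job while a limit tied to the proc
count would admit it.

```python
# Hypothetical illustration only; these names are not TORQUE APIs.

def default_nodect(procs_per_host, policy="procs"):
    """Return an assumed default node-count limit for a cluster.

    procs_per_host: one entry per host, e.g. [4] * 32.
    policy: "procs" counts every processor as a node (the flexible
            reading); "hosts" counts physical hosts only (the tight one).
    """
    if policy == "procs":
        return sum(procs_per_host)
    if policy == "hosts":
        return len(procs_per_host)
    raise ValueError(f"unknown policy: {policy!r}")


def job_fits(requested_nodes, limit):
    """True if a 'nodes=N' request does not exceed the limit."""
    return requested_nodes <= limit


cluster = [4] * 32  # 32 quad-core hosts

procs_limit = default_nodect(cluster, "procs")  # 32 * 4 = 128
hosts_limit = default_nodect(cluster, "hosts")  # 32

print(job_fits(128, procs_limit))  # True: the job is admitted
print(job_fits(128, hosts_limit))  # False: the job is rejected
```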

> > 4) modify configure to not build the GUI by default (configure
> > --disable-gui)
> 
> configure doesn't default to "on", it looks for the required deps and only
> builds it if it can.  What is wrong with that?

  Not a problem if it is working.  Yesterday I saw a CentOS 4.4 system
attempt to build the GUI by default and then fail due to a Tcl library
issue.  I take it your preference would be to improve the dependency
auto-detection?  What will you need?  config.log, config.status,
anything else?
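
  If config.log turns out to be the useful piece, something like the
following Python sketch could trim it down to the lines that mention
the GUI dependencies.  The keyword list is an assumption on my part and
the match is deliberately crude (a plain substring search), so treat it
as a starting point rather than a precise filter.

```python
import re
import sys

# Assumed keywords for the GUI dependencies; adjust for the actual checks.
GUI_PATTERN = re.compile(r"tcl|tk|wish", re.IGNORECASE)


def gui_related_lines(log_text):
    """Return (line_number, line) pairs from the log that match GUI_PATTERN."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if GUI_PATTERN.search(line)
    ]


# Usage: python filter_log.py config.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        for number, line in gui_related_lines(handle.read()):
            print(f"{number}: {line}")
```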


> > 5) modify pbs_mom to recover jobs by default (ie, default to 'pbs_mom
> > -r')
> 
> That would be incorrect.  At boot, jobs can't be recovered.

  pbs_mom should be able to detect that quite easily, since after a
reboot the job's processes are gone.  If the processes are still there,
the most correct 'default' behavior is to try to recover the job.  What
exceptions should there be to this?  Again, this is default behavior
and can be overridden by any advanced site.
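
  As a rough illustration of the check I have in mind, here is a
Python sketch of the decision.  It is not pbs_mom code: startup_action
and its return values are hypothetical, and the existence test is the
standard POSIX trick of sending signal 0.

```python
import os


def process_alive(pid):
    """Return True if a process with this PID currently exists (POSIX)."""
    if pid <= 0:
        return False  # 0 and negative values address process groups
    try:
        os.kill(pid, 0)  # signal 0: error checking only, nothing is delivered
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def startup_action(job_pids):
    """Hypothetical default: recover only if some job process survives."""
    if any(process_alive(pid) for pid in job_pids):
        return "recover"
    return "discard"


print(startup_action([os.getpid()]))  # this process exists: "recover"
print(startup_action([]))             # nothing recorded: "discard"
```

  One caveat: PIDs are reused, so after a reboot an unrelated process
may hold a recorded PID.  A real check would have to compare something
more than the PID, such as the session ID or the process start time.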

Dave

On Thu, 2007-08-16 at 14:40 -0700, Garrick Staples wrote:
> On Thu, Aug 16, 2007 at 03:41:23PM -0600, Dave Jackson alleged:
> > 3) set resources_available.nodect to automatically allow jobs up to the
> > number of procs in the cluster
> 
> "setting" resources_available.nodect would be incorrect because then it would
> never be set again.  The point of resources_available.nodect is override what
> server thinks is correct.
> 
> Can we make this depend on node_pack?
> 
>  
> > 4) modify configure to not build the GUI by default (configure
> > --disable-gui)
> 
> configure doesn't default to "on", it looks for the required deps and only
> builds it if it can.  What is wrong with that?
> 
>  
> > 5) modify pbs_mom to recover jobs by default (ie, default to 'pbs_mom
> > -r')
> 
> That would be incorrect.  At boot, jobs can't be recovered.
> 
>  
> >   Are there issues with these defaults?  Are there additional defaults
> > which should be set?
> > 
> > Thanks,
> > Dave
> > 
> > _______________________________________________
> > torquedev mailing list
> > torquedev at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/torquedev


