[torquedev] pbsdsh and number of processors allocated

Garrick garrick at usc.edu
Wed Jun 17 19:15:35 MDT 2009


A few words of caution: one day we will have dynamically sized jobs, and
putting values like number of nodes or jobs into env vars wouldn't be
valid, since an env var is fixed at the moment the process is spawned.
These values need to be queryable from pbs_mom or read from disk (like
$PBS_NODEFILE).
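
To illustrate the read-from-disk approach, here is a rough Python sketch
that counts how many processors the job holds on one host by reading the
node file. It assumes the usual layout of one hostname per line per
allocated processor, and that the file named by $PBS_NODEFILE is readable
where the process runs (which may not hold on sister nodes without a shared
filesystem). The function names slots_per_host and local_slots are made up
for this example, not part of TORQUE.

```python
import os
import socket
from collections import Counter


def slots_per_host(nodefile_path):
    """Count node-file lines per host (assumed: one line per allocated processor)."""
    with open(nodefile_path) as fh:
        hosts = [line.strip() for line in fh if line.strip()]
    return Counter(hosts)


def local_slots(nodefile_path, hostname=None):
    """Return how many processors the job holds on one host (default: this host)."""
    hostname = hostname or socket.gethostname()
    # Node files may hold short or fully qualified names, so compare short names.
    short = hostname.split(".")[0]
    counts = slots_per_host(nodefile_path)
    return sum(n for host, n in counts.items() if host.split(".")[0] == short)


if __name__ == "__main__":
    # Re-read the file on each query rather than caching a value at spawn time.
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        print(local_slots(nodefile))
```

Because the count is recomputed from the file each time it is asked for, it
would keep working if the allocation changed during the job, provided the
node file were kept up to date.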

HPCC/Linux Systems Admin

On Jun 17, 2009, at 10:17 AM, Craig Macdonald <craigm at dcs.gla.ac.uk>  
wrote:

> A suggestion for pbsdsh improvement:
>
> pbsdsh allows processes to be launched on either:
> (a) specified hosts in the job
> (b) once for every allocated processor on every allocated node in the job
> (c) all unique nodes in the job
>
> I'd like to suggest an improvement to the (c) case. Some job programs
> manage the number of processors to use on a given node (e.g. the Hadoop
> task tracker). However, if you allocate only processors, not whole
> nodes, then this can end up with too many processes running on a given
> node, because the program has to make assumptions about the number of
> allocated processors per sister (e.g. my job asked for 12 processors.
> Nodes have 4 procs each, but one node already had a single-processor job
> running - how should the spawned process know this?)
>
> Instead, I'd like to propose that pbsdsh -u sets an environment variable
> in the resulting spawned processes, detailing the number of allocated
> processors. This should be fairly easy, as tm_spawn accepts an argument
> to alter the target environment of the spawned process.
>
> Craig

