[torquedev] pbsdsh and number of processors allocated

Joshua Bernstein jbernstein at penguincomputing.com
Wed Jun 17 17:38:43 MDT 2009


Hello Craig,

	This has been logged as bug #6, as an enhancement:

http://www.clusterresources.com/bugzilla/show_bug.cgi?id=6

-Joshua Bernstein
Senior Software Engineer
Penguin Computing

Craig Macdonald wrote:
> A suggestion for pbsdsh improvement:
> 
> pbsdsh allows processes to be launched on either:
>  (a) specified hosts in the job
>  (b) once for every allocated processor on every allocated node in the job
>  (c) all unique nodes in the job
>  
> I'd like to suggest an improvement to the (c) case.  Some job programs 
> manage the number of processors to use on a given node themselves (e.g. 
> the Hadoop task tracker). However, if you allocate only processors, not 
> whole nodes, this can end up with too many processes running on a given 
> node, because such programs make assumptions about the number of 
> processors allocated per sister. For example, my job asked for 12 
> processors and nodes have 4 procs each, but one node already had a 
> single-processor job running. How should the spawned process know this?
> 
> Instead, I'd like to propose that pbsdsh -u sets an environment variable 
> in each spawned process, giving the number of processors allocated to 
> the job on that node. This should be fairly easy, as tm_spawn accepts an 
> argument to alter the target environment of the spawned process.
> 
> Craig
> _______________________________________________
> torquedev mailing list
> torquedev at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torquedev
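
For illustration only, here is a rough Python sketch of the bookkeeping the
proposal implies. It assumes the job's node file ($PBS_NODEFILE) lists one
hostname per allocated processor, and it uses a made-up variable name,
PBSDSH_ALLOCATED_PROCS, since the suggestion above does not name one. It
does not call tm_spawn; it only shows what value each per-node process
would be handed.

```python
from collections import Counter

# Hypothetical variable name; the original suggestion does not specify one.
ENV_VAR = "PBSDSH_ALLOCATED_PROCS"


def procs_per_host(nodefile_text):
    """Count allocated processors per host.

    Assumes one hostname per line, repeated once per allocated processor.
    Blank lines are ignored.
    """
    hosts = [line.strip() for line in nodefile_text.splitlines() if line.strip()]
    return Counter(hosts)


def spawn_env(host, counts):
    """Extra environment for the single process launched on `host`."""
    return {ENV_VAR: str(counts[host])}


# The scenario from the message: 12 processors requested on 4-processor
# nodes, where one node (n3 here) already runs a single-processor job,
# so the allocation spills over onto a fourth node.
example_nodefile = "\n".join(["n1"] * 4 + ["n2"] * 4 + ["n3"] * 3 + ["n4"] * 1)
counts = procs_per_host(example_nodefile)

for host in sorted(counts):
    print(host, spawn_env(host, counts))
```

A program such as a task tracker could then read the variable from its
environment instead of assuming it owns every processor on the node.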

