[torquedev] pbsdsh and number of processors allocated

Craig Macdonald craigm at dcs.gla.ac.uk
Thu Jun 18 05:05:56 MDT 2009


Hi Garrick,

Would there be some mechanism for notifying running processes of a 
change? Or would they be expected to poll a file? Let's say it's called 
$PBS_CPUCOUNTFILE, and just contains the number of processors that the 
job has been allocated.
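
To make the polling option concrete, here is a rough sketch of what a
running process might do. It assumes the hypothetical $PBS_CPUCOUNTFILE
suggested above (TORQUE does not provide it today) and assumes the file
holds a single integer. The function names, the polling interval and the
max_polls parameter are made up for illustration.

```python
import os
import time


def read_cpu_count(path):
    """Return the integer stored in the count file, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # File missing, or caught in the middle of being rewritten.
        return None


def poll_cpu_count(path, on_change, interval=5.0, max_polls=None):
    """Re-read the file every `interval` seconds.

    Calls on_change(old, new) whenever the stored count differs from the
    last value seen. Runs forever unless max_polls is given. Returns the
    last value seen.
    """
    last = read_cpu_count(path)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        polls += 1
        current = read_cpu_count(path)
        if current is not None and current != last:
            on_change(last, current)
            last = current
    return last
```

A job would call something like
poll_cpu_count(os.environ["PBS_CPUCOUNTFILE"], callback) from a
background thread. The obvious drawback is latency: a process only
notices a change at its next poll, which is why I'm asking whether a
notification mechanism is planned.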

Craig

Garrick wrote:
> A few words of caution: one day we will have dynamically sized jobs, 
> and putting values like the number of nodes or jobs into env vars 
> wouldn't be valid. These values need to be queryable from pbs_mom or 
> read from disk (like $PBS_NODEFILE).
>
> HPCC/Linux Systems Admin
>
> On Jun 17, 2009, at 10:17 AM, Craig Macdonald <craigm at dcs.gla.ac.uk> 
> wrote:
>
>> A suggestion for pbsdsh improvement:
>>
>> pbsdsh allows processes to be launched on either:
>> (a) specified hosts in the job
>> (b) once for every allocated processor on every allocated node in 
>> the job
>> (c) all unique nodes in the job
>>
>> I'd like to suggest an improvement to the (c) case.  Some job programs
>> manage the number of processors to use on a given node themselves
>> (e.g. the Hadoop task tracker). However, if you allocate only
>> processors, not whole nodes, such a program can end up running too many
>> processes on a given node, because it has to assume how many processors
>> were allocated on each sister. For example, my job asked for 12
>> processors and nodes have 4 procs each, but one node already had a
>> single-processor job running - how should the spawned process know this?
>>
>> Instead, I'd like to propose that pbsdsh -u sets an environment variable
>> in the spawned processes, giving the number of processors allocated
>> on that node. This should be fairly easy, as tm_spawn accepts an
>> argument to alter the target environment of the spawned process.
>>
>> Craig
>> _______________________________________________
>> torquedev mailing list
>> torquedev at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torquedev


