[torqueusers] [PATCH 0/3] use cgroup to limit the cpu and memory usage of jobs

Lukasz Flis l.flis at cyf-kr.edu.pl
Fri Nov 30 05:37:57 MST 2012


Hi Andre,

> There are some MPI implementations that don't support TM API,
> so afaik there is no real choice besides using SSH to launch
> the siblings (I'm open to suggestions if that is wrong).

We use a few tricks to avoid ssh between a job's nodes:
 - an rsh emulator script built on top of pbs_dsh.
   Most MPI implementations still speak 'rsh', including
   HP-MPI (Platform MPI) and old mpich and mvapich.
   This approach is useful for scientific packages with dozens
   of wrappers (like Turbomole): we don't have to modify all the
   scripts, we just set MPI_REMSH and it works.

 - mpiexec from OSU (mpich, mvapich, intelmpi)
 - mpiexec.hydra compiled with TM support (mpich, mvapich, intelmpi)
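
To make the first trick more concrete, here is a minimal sketch of such
an rsh emulator. It is not our production script. It assumes the
launcher calls it as "rsh host [-l user] [-n] command [args...]", that
pbs_dsh is in PATH and accepts "-h <hostname>", and that the command
should be interpreted by /bin/sh on the remote side, as rsh would do.
The function names are made up for illustration.

```python
#!/usr/bin/env python3
"""Sketch of an rsh-compatible wrapper that launches through pbs_dsh."""
import os
import sys


def build_pbs_dsh_argv(rsh_args):
    """Translate rsh-style arguments into an argv list for pbs_dsh.

    Assumed input form: [options] host [options] command [args...],
    where the only options handled are "-n" and "-l user".
    """
    args = list(rsh_args)
    host = None
    while args:
        arg = args[0]
        if arg == "-n":
            # rsh: take stdin from /dev/null; nothing to pass on here.
            args.pop(0)
        elif arg == "-l":
            # rsh: remote user name; the job already runs as its owner.
            if len(args) < 2:
                raise ValueError("-l requires a user name")
            del args[:2]
        elif host is None:
            host = args.pop(0)
        else:
            break  # first word of the remote command
    if host is None or not args:
        raise ValueError("usage: rsh host [-l user] [-n] command [args...]")
    # rsh joins the remaining words and hands them to a remote shell,
    # so the sketch does the same via /bin/sh -c.
    return ["pbs_dsh", "-h", host, "/bin/sh", "-c", " ".join(args)]


def main(argv):
    """Replace this process with pbs_dsh (assumes pbs_dsh is in PATH)."""
    cmd = build_pbs_dsh_argv(argv)
    os.execvp(cmd[0], cmd)
```

A real wrapper would end with main(sys.argv[1:]) and be installed as an
executable; pointing MPI_REMSH at it then keeps the remote processes
under pbs_mom without touching the package's own scripts.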


Still, the ssh approach is interesting, as it would allow users to
log in to nodes where their jobs are running, spawn debug sessions,
view output, etc., while still remaining under pbs_mom control.

You can find me on #torque and #hpc at irc.freenode.net or we can
discuss the details via e-mail.

I would also like to invite all the Torque patch-makers to visit #torque
from time to time so we can exchange ideas and solutions.

Cheers,
--
Lukasz Flis
ACC Cyfronet AGH






> Lukasz, 
> 
> that PAM approach sounds very interesting. 
> There are some MPI implementations that don't support TM API, 
> so afaik there is no real choice besides using SSH to launch
> the siblings (I'm open to suggestions if that is wrong).
> 
> Rather than prompting interactively, I'd prefer checking on the
> siblings for the same jobid that the job has on the mother
> superior, but that is an implementation detail.
> 
> I wouldn't know how to start, but maybe we can collaborate?
> 
> Greetings
> André
> 
> ----- Original Message -----
>> Some sites use ssh to spawn processes on the sibling nodes. This
>> obviously causes the new ssh-spawned processes to run outside pbs_mom
>> control, making resource accounting and limits impossible.
>>
>> I think this could be easily solved by using a modified PAM module
>> for torque.
>>
>> Such a module needs to be rewritten to do the following:
>>  * check whether the incoming user has an active job on the node
>>    (already present)
>>
>>  * if yes: find the jobid of the youngest job belonging to the user
>>    on the node. If PAM is able to determine whether the session is
>>    interactive, it could ask the user to choose the desired jobid
>>    to attach to
>>
>>  * use the tm_adopt call (libtorque) to make pbs_mom aware of the new
>>    session and processes
>>
>> tm_adopt in theory should attach the new ssh-spawned session and
>> processes to a cpuset (2.5.12), and to a cgroup in later versions of
>> torque. Quick tests and code digging showed that tm_adopt is not
>> cpuset aware in 2.5.12, but it should be easy to fix.
>>
>> Unfortunately I haven't yet had time to implement this in the PAM
>> module, but maybe there is someone more experienced with PAM
>> development who's willing to implement this? :)
> 
> 
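
The job-selection step from the quoted PAM proposal can be sketched
roughly as follows. This only models the decision logic; a real module
would be written in C against the PAM API and would then hand the
chosen jobid to tm_adopt. The LocalJob record and the function name are
hypothetical, and "youngest" is assumed to mean the most recently
started job.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LocalJob:
    """Hypothetical record for a job currently running on this node."""
    jobid: str
    owner: str
    start_time: float  # seconds since the epoch


def pick_job_to_adopt(jobs: Iterable[LocalJob],
                      user: str,
                      requested_jobid: Optional[str] = None) -> Optional[str]:
    """Return the jobid the new ssh session should be attached to.

    Returns None when the user has no active job on the node (the PAM
    module would deny the login) or when the requested jobid does not
    belong to the user.
    """
    mine = [job for job in jobs if job.owner == user]
    if not mine:
        return None
    if requested_jobid is not None:
        # Interactive case: the user named the job to attach to.
        for job in mine:
            if job.jobid == requested_jobid:
                return job.jobid
        return None
    # Non-interactive case: fall back to the most recently started job.
    return max(mine, key=lambda job: job.start_time).jobid
```

Matching on the mother superior's jobid, as André suggests, would
correspond to always passing requested_jobid instead of relying on the
youngest-job fallback.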


