[torqueusers] process using more CPUs than requested
knielson at adaptivecomputing.com
Tue Mar 1 09:24:35 MST 2011
On 02/28/2011 07:02 PM, Michael Jennings wrote:
> On Friday, 18 February 2011, at 16:53:39 (-0700),
> David Beer wrote:
>> There has been talk of adding some sort of rogue process killing
>> functionality to TORQUE. From the suggestions I've heard, it would
>> work something like this:
>> 1. It would be configurable.
>> 2. It would check which users have jobs active on the pbs_mom, and
>> it would kill all processes from other users that shouldn't be on
>> What do you all think of such a feature?
> We would use and appreciate such a feature, but we seem to be in the
> minority. :-]
I would like to see this go into TORQUE as well.
We need to talk about what the policies for its use would be. For
example, it is easy if you know that each job will have exclusive
access to the machine when it starts: we would do a search and destroy
on all user processes before we start the job. It becomes more difficult
when a node is shared, because processes belonging to other users'
active jobs must be left alone. It becomes even more difficult when the
same user has multiple jobs running on the same node, since the owner
of a process no longer tells you which job it belongs to.
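
To make the per-user check concrete, here is a rough sketch in Python of
how the candidates for killing could be selected. This is not TORQUE
code: the names (Proc, find_rogue_pids, min_user_uid) are hypothetical,
and the cutoff of 1000 for ordinary user accounts is an assumption that
varies between sites. It only decides which PIDs look rogue; gathering
the process list and actually sending signals are left out.

```python
from typing import Iterable, List, NamedTuple, Set


class Proc(NamedTuple):
    """One process on the node (hypothetical record, not a TORQUE type)."""
    pid: int
    uid: int


def find_rogue_pids(procs: Iterable[Proc],
                    job_uids: Set[int],
                    min_user_uid: int = 1000) -> List[int]:
    """Return PIDs owned by ordinary users who have no active job here.

    procs        -- processes currently on the node
    job_uids     -- UIDs of users with a job active on this node
    min_user_uid -- assumed cutoff; lower UIDs are treated as system
                    accounts and are never selected
    """
    return [p.pid for p in procs
            if p.uid >= min_user_uid and p.uid not in job_uids]


# Example process table: one root process, one process from user 1001,
# two processes from user 1002.
procs = [Proc(1, 0), Proc(200, 1001), Proc(300, 1002), Proc(301, 1002)]

# Shared node: only user 1001 has an active job here.
shared_node_rogues = find_rogue_pids(procs, {1001})

# Exclusive node, just before a job starts: no user should have anything
# running yet, so every ordinary user process is a candidate.
exclusive_node_rogues = find_rogue_pids(procs, set())
```

Note that a check keyed on UID alone cannot handle the last case above:
when the same user has several jobs on one node, a leftover process from
a finished job looks identical to one that belongs to a running job.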
Please give us an idea of the use cases for which this feature would be
useful.