[torqueusers] torque does not kill jobs when wall_time or cpu_time reached

Ken Nielson knielson at adaptivecomputing.com
Mon Jun 7 09:50:34 MDT 2010


On 06/07/2010 09:29 AM, Glen Beane wrote:
> On Mon, Jun 7, 2010 at 11:21 AM, Ken Nielson
> <knielson at adaptivecomputing.com>  wrote:
>    
>> On 06/07/2010 09:10 AM, Glen Beane wrote:
>>      
>>> On Mon, Jun 7, 2010 at 11:02 AM, Ken Nielson
>>> <knielson at adaptivecomputing.com>    wrote:
>>>
>>>        
>>>> On 06/04/2010 08:14 PM, Glen Beane wrote:
>>>>
>>>>          
>>>>> On Fri, Jun 4, 2010 at 5:37 PM, David Singleton
>>>>> <David.Singleton at anu.edu.au>      wrote:
>>>>>
>>>>>
>>>>>
>>>>>            
>>>>>> If procs is going to mean processors/cpus then I would suggest there needs
>>>>>> to be a lot of code added to align nodes and procs - they are specifying
>>>>>> the same thing.
>>>>>>
>>>>>>
>>>>>>              
>>>>> Moab treats them the same if you do not specify ppn with your nodes
>>>>> request, however TORQUE is pretty much unaware of what -l procs=X
>>>>> means - it just passes the info along to Moab. I would like to see
>>>>> procs become a real torque resource that means give me X total
>>>>> processors on anywhere from 1 to X nodes.
>>>>> _______________________________________________
>>>>> torqueusers mailing list
>>>>> torqueusers at supercluster.org
>>>>> http://www.supercluster.org/mailman/listinfo/torqueusers
>>>>>
>>>>>
>>>>>            
>>>> Currently Moab interprets procs to mean give me all the processors on X
>>>> nodes.
>>>>
>>>>          
>>> that doesn't seem correct.  I use procs all the time and I do not get
>>> this behavior from Moab (I've tried it with 5.3 and 5.4).  The
>>> behavior I expect and see is for Moab to give me X total processors
>>> spread across any number of nodes (the processors could all be on the
>>> same node, or they could be spread across many nodes depending on what
>>> is free at the time the job is scheduled to run).
>>>
>>>        
>> Glen
>>
>> Try doing a qsub -l procs=1 <job.sh>. Then do a qstat -f and see what
>> the exec_host is set to.
>>
>> I am running Moab 5.4.
>>
>>      
> you must have some TORQUE defaults set, like ncpus that are
> interfering with procs.  Since -l procs does not set ncpus, your
> default is getting applied.
>
> gbeane at wulfgar:~>  echo "sleep 60" | qsub -l procs=1,walltime=00:01:00
> 69760.wulfgar.jax.org
> qstat -f 69760
> ...
> exec_host = cs-short-2/0
> ...
Glen,

You are right. I set those defaults while working through my last set of
syntax problems. Ironically, they did not affect those resources.

Ken
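
For anyone repeating the check discussed above, here is a minimal sketch of
a helper that counts how many slots a job received on each host. It assumes
the exec_host value has the form shown in Glen's output (host/slot entries
joined by '+'); range-style values such as host/0-3 are not handled. The
function name slots_per_host is made up for illustration and is not part of
TORQUE.

```shell
# Summarize an exec_host value as "<slot count> <host>", one host per line.
# Assumes entries look like host/slot and are joined by '+'.
slots_per_host() {
    printf '%s\n' "$1" |
        tr '+' '\n' |
        awk -F/ 'NF { count[$1]++ } END { for (h in count) print count[h], h }' |
        sort -k2
}

# Example with the value from the qstat -f output quoted above:
#   slots_per_host "cs-short-2/0"
# To look for server-wide defaults (such as ncpus) that may be applied
# to a procs request:
#   qmgr -c 'print server' | grep resources_default
```

A procs=1 request should report a single slot on a single host. If every
slot on a node shows up instead, a default set on the server or queue is a
likely cause.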

