[torqueusers] cpu-time limit specified, but wall-time enforced

Neelesh Arora narora at Princeton.EDU
Tue Jun 6 10:00:49 MDT 2006


garrick at speculation.org wrote:
> On Fri, Jun 02, 2006 at 03:55:29PM -0400, Neelesh Arora alleged:
> 
>>Hi,
>>
>>We have a torque-2.0.0p2 and maui-3.2.6p13 based setup. There are 2 
>>execution queues, distinguished by different cpu-time limits. There is 
>>one route queue which routes the jobs based on the requested cpu-time.
>>
>>The queue definitions have appropriate resources_max.cput and 
>>resources_min.cput declarations. And the users are required to specify 
>>-l cput=<time> option to qsub.
>>
>>The jobs get submitted to the right queues, based on the cput parameter 
>>value. But then, Torque/Maui seem to be enforcing wall-time instead and 
>>a job is killed if the resources_used.walltime exceeds Resource_List.cput !!
>>
>>For example, I submit a job with qsub -l cput=1:0:0 pbs-script. And the 
>>job takes more than 1hr in wallclock time, while the cpu-time usage is 
>>still less than 1hr. This job would be killed with a "MOAB_INFO:  job 
>>exceeded wallclock limit" message.
>>
>>We have not specified any walltime parameters in queue/server 
>>definitions or during job submission.
>>
>>While the job is running, qstat reports both cputime and walltime usage, 
>>whereas checkjob only reports walltime usage:
>>_________________
>>qstat:
>>Job Id: 37414
>>    resources_used.cput = 00:00:00
>>    resources_used.mem = 2568kb
>>    resources_used.vmem = 9264kb
>>    resources_used.walltime = 00:37:10
>>
>>checkjob:
>>checking job 37414
>>State: Running
>>Creds:  user:narora  group:staff  class:medium  qos:DEFAULT
>>WallTime: 00:37:48 of 1:00:00
>>_________________
>>where, clearly Maui has set the max allowed walltime to be the same as 
>>the cput value I specified to qsub !!
>>
>>Can someone please suggest what's going wrong here?
> 
> 
> Well that's an annoying maui bug!
> 
> Double check your maui config and disable the resource enforcement.
> 

Our maui config does not explicitly define any resource limits: the 
cpu-time limits are set only in the PBS queue config.
Are you perhaps referring to some default maui parameters? The only 
related ones I could find in the docs are RESOURCELIMITPOLICY and 
WCVIOLATIONACTION, and neither has a straightforward 'disable' option.

Can you please elaborate on what you mean by disabling the resource 
enforcement?
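
To spell out what we expected versus what we are seeing, here is a small 
Python sketch. It is only an illustration, not anything Torque or Maui 
actually runs: the helper names (to_seconds, over_limit) are made up for 
this example, and the 01:00:01 figure is a hypothetical later reading. The 
other numbers are the ones from job 37414 above.

```python
def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string, as printed by qstat/checkjob, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def over_limit(used: str, limit: str) -> bool:
    """Return True if the used time is strictly greater than the limit."""
    return to_seconds(used) > to_seconds(limit)


# Limit passed to qsub as -l cput=1:0:0; no walltime limit was given anywhere.
cput_limit = "1:0:0"

# Usage reported by qstat for job 37414.
used_cput = "00:00:00"
used_walltime = "00:37:10"

# Expected: only CPU time is compared with the cput limit, so the job is safe.
expected_violation = over_limit(used_cput, cput_limit)

# Observed: walltime is compared with the cput limit instead. Once walltime
# passes one hour (hypothetical reading below), the job is killed even though
# its CPU time is still far under the limit.
observed_violation = over_limit("01:00:01", cput_limit)

print("expected violation:", expected_violation)
print("observed violation:", observed_violation)
```

In other words, the comparison we want is the first one, and the behaviour 
we get looks like the second.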

Thanks.

-Neel

