[torqueusers] cpu-time limit specified, but wall-time enforced

garrick at speculation.org
Fri Jun 2 14:45:08 MDT 2006


On Fri, Jun 02, 2006 at 03:55:29PM -0400, Neelesh Arora alleged:
> Hi,
> 
> We have a setup based on torque-2.0.0p2 and maui-3.2.6p13. There are 2 
> execution queues, distinguished by different cpu-time limits, and one 
> route queue which routes jobs based on the requested cpu-time.
> 
> The queue definitions have appropriate resources_max.cput and 
> resources_min.cput declarations, and users are required to pass the 
> -l cput=<time> option to qsub.
> 
> The jobs get submitted to the right queues, based on the cput parameter 
> value. But then, Torque/Maui seem to be enforcing wall-time instead and 
> a job is killed if the resources_used.walltime exceeds Resource_List.cput !!
> 
> For example, I submit a job with qsub -l cput=1:0:0 pbs-script. The job 
> takes more than 1hr in wallclock time, while its cpu-time usage is 
> still less than 1hr. The job is then killed with a "MOAB_INFO:  job 
> exceeded wallclock limit" message.
> 
> We have not specified any walltime parameters in queue/server 
> definitions or during job submission.
> 
> While the job is running, qstat reports both cputime and walltime usage, 
> whereas checkjob only reports walltime usage:
> _________________
> qstat:
> Job Id: 37414
>     resources_used.cput = 00:00:00
>     resources_used.mem = 2568kb
>     resources_used.vmem = 9264kb
>     resources_used.walltime = 00:37:10
> 
> checkjob:
> checking job 37414
> State: Running
> Creds:  user:narora  group:staff  class:medium  qos:DEFAULT
> WallTime: 00:37:48 of 1:00:00
> _________________
> Here, Maui has clearly set the max allowed walltime to the cput value I 
> specified to qsub !!
> 
> Can someone please suggest what's going wrong here?

Well, that's an annoying Maui bug!

Double-check your Maui config and disable the resource enforcement there.
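
To confirm which counter is actually running into the limit, here is a minimal sketch in Python. It only parses text in the format of the qstat output quoted above and compares it with the requested cput; it does not call Torque or Maui, and the function names (to_seconds, parse_resources_used, limit_hit) are hypothetical helpers made up for illustration.

```python
import re


def to_seconds(hms: str) -> int:
    """Convert an [[HH:]MM:]SS string, as printed by qstat, to seconds."""
    seconds = 0
    for part in hms.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def parse_resources_used(qstat_text: str) -> dict:
    """Collect resources_used.cput and resources_used.walltime in seconds."""
    pattern = r"resources_used\.(cput|walltime)\s*=\s*(\S+)"
    return {name: to_seconds(value) for name, value in re.findall(pattern, qstat_text)}


def limit_hit(used: dict, cput_limit: str) -> dict:
    """For each counter, report whether it exceeds the requested cput value."""
    limit = to_seconds(cput_limit)
    return {name: secs > limit for name, secs in used.items()}


# Sample copied from the qstat output quoted in this thread.
SAMPLE = """\
Job Id: 37414
    resources_used.cput = 00:00:00
    resources_used.mem = 2568kb
    resources_used.vmem = 9264kb
    resources_used.walltime = 00:37:10
"""

if __name__ == "__main__":
    used = parse_resources_used(SAMPLE)
    print(used)
    print(limit_hit(used, "1:0:0"))
```

If the walltime entry turns True while cput stays False at the moment the job is killed, the scheduler is applying the cput request as a wallclock limit, which matches the symptom reported above.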


