[torqueusers] cpu-time limit specified, but wall-time enforced

Neelesh Arora narora at Princeton.EDU
Fri Jun 2 13:55:29 MDT 2006


Hi,

We have a setup based on torque-2.0.0p2 and maui-3.2.6p13. There are 2 
execution queues, distinguished by different cpu-time limits, and one 
route queue which routes jobs based on the requested cpu-time.

The queue definitions have appropriate resources_max.cput and 
resources_min.cput declarations, and users are required to pass the 
-l cput=<time> option to qsub.
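
A setup of this kind could look roughly like the qmgr sketch below. Apart 
from "medium" (the class that appears in the checkjob output further down), 
the queue names and all time limits are assumed purely for illustration 
and are not the actual values from this site.

```text
# Hypothetical queue layout; names "short" and "route" and all limits are assumed.
create queue short
set queue short queue_type = Execution
set queue short resources_max.cput = 00:59:59
set queue short enabled = True
set queue short started = True

create queue medium
set queue medium queue_type = Execution
set queue medium resources_min.cput = 01:00:00
set queue medium resources_max.cput = 24:00:00
set queue medium enabled = True
set queue medium started = True

# The route queue tries each destination in order and places the job in the
# first one whose cput limits accept the request.
create queue route
set queue route queue_type = Route
set queue route route_destinations = short
set queue route route_destinations += medium
set queue route enabled = True
set queue route started = True

set server default_queue = route
```

Note that nothing in a layout like this mentions walltime at all.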

The jobs get submitted to the right queues, based on the cput value. 
But then Torque/Maui seems to enforce wall-time instead: a job is killed 
if resources_used.walltime exceeds Resource_List.cput.

For example, I submit a job with qsub -l cput=1:0:0 pbs-script. The job 
takes more than 1 hour of wallclock time, while its cpu-time usage is 
still under 1 hour. The job is then killed with a "MOAB_INFO:  job 
exceeded wallclock limit" message.

We have not specified any walltime parameters in queue/server 
definitions or during job submission.

While the job is running, qstat reports both cputime and walltime usage, 
whereas checkjob reports only walltime usage:
_________________
qstat:
Job Id: 37414
     resources_used.cput = 00:00:00
     resources_used.mem = 2568kb
     resources_used.vmem = 9264kb
     resources_used.walltime = 00:37:10

checkjob:
checking job 37414
State: Running
Creds:  user:narora  group:staff  class:medium  qos:DEFAULT
WallTime: 00:37:48 of 1:00:00
_________________
where Maui has clearly set the maximum allowed walltime to the cput 
value I specified to qsub.
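
The mismatch can be made concrete by converting the values above to 
seconds. The following is only a small sketch: the helper name to_seconds 
is made up for illustration and is not part of Torque or Maui.

```shell
#!/bin/sh
# Convert an H:M:S duration (as printed by qstat/checkjob) to seconds.
# to_seconds is a hypothetical helper, not a Torque or Maui command.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

cput_limit=$(to_seconds "1:0:0")      # requested with qsub -l cput=1:0:0
cput_used=$(to_seconds "00:00:00")    # resources_used.cput from qstat
wall_used=$(to_seconds "00:37:10")    # resources_used.walltime from qstat

# The job has used almost none of its cpu-time budget...
echo "cput used:     ${cput_used}s of ${cput_limit}s"
# ...but checkjob measures walltime against that same 1:00:00 figure.
echo "walltime used: ${wall_used}s of ${cput_limit}s"
```

With the numbers shown above, the job has used no measurable cpu-time but 
has already consumed well over half of the limit that is actually being 
enforced.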

Can someone please suggest what's going wrong here?

Thanks.

-Neel

