[torqueusers] wait for job completion

Tony Schreiner schreian at bc.edu
Fri May 29 07:45:20 MDT 2009


On May 28, 2009, at 11:23 PM, Glen Beane wrote:

> On Wed, May 27, 2009 at 3:47 PM, Tony Schreiner <schreian at bc.edu> wrote:
>> I'm transitioning a cluster from Platform/LSF to Torque, and one of
>> the ways the cluster is used is to run jobs submitted over a web cgi
>> front end. In the cases that the run is short enough, the results
>> would be returned on a web page. This relied on the bsub -K option in
>> LSF which doesn't return control to the shell until the job is
>> finished.
>>
>> Is there something equivalent in Torque? Or has anybody hacked another
>> way to do this? I can imagine setting up a semaphore file on a shared
>> directory, but I'm hoping for something simpler.
>
> How does LSF behave if you kill bsub before the job finishes? Does
> the job still finish, or does it get canceled? Does the output still
> go to a file?

Here's example output from submitting a job and then pressing Ctrl-C
after half a minute.

The job uses mpiexec, but it requests only one processor.

 > bsub -K -oo domp.log2 ./domp
Job <278959> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
Job <278959> is being terminated

bjobs shows that it exited with abnormal status.

The log file contains the following lines (among others):

TERM_OWNER: job killed by owner.
Exited with exit code 143.

and

mpiexec: killing job...

mpiexec noticed that job rank 0 with PID 11899 on node linux08 exited on
signal 15 (Terminated).
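
The exit code 143 and the "signal 15" message above describe the same
event: by the usual POSIX shell convention, a process killed by signal N
is reported with exit status 128 + N, and SIGTERM is signal 15. A small
sketch of that arithmetic, assuming a POSIX system (the helper name
shell_style_exit_code is only for illustration):

```python
import subprocess
import sys


def shell_style_exit_code(returncode: int) -> int:
    """Convert a Python subprocess return code to the shell's 128+N form.

    Python reports "killed by signal N" as -N; shells report it as 128+N.
    """
    return 128 - returncode if returncode < 0 else returncode


# Start a child Python process that sends SIGTERM to itself (POSIX only).
child = subprocess.run(
    [sys.executable, "-c",
     "import os, signal; os.kill(os.getpid(), signal.SIGTERM)"]
)

code = shell_style_exit_code(child.returncode)
print(code)
```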


And thanks for putting this on the feature request list.
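
In the meantime, one way to approximate the blocking behavior of bsub -K
would be to poll the batch system for the job's state. The following is
only a rough sketch, not a tested Torque recipe. It assumes that
"qstat -f <jobid>" exits non-zero once the server no longer knows the
job, and that a completed job still held by the server shows
"job_state = C"; both depend on the local configuration. The function
names and the poll and timeout values are made up for illustration.

```python
import subprocess
import time


def job_finished(job_id, run=subprocess.run):
    """Return True if qstat no longer knows the job or reports state C."""
    result = run(["qstat", "-f", job_id], capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: a non-zero status means the job has left the queue.
        return True
    # Assumption: a completed job that is still listed shows state C.
    return "job_state = C" in result.stdout


def wait_for_job(job_id, poll_seconds=5.0, timeout=300.0,
                 run=subprocess.run, sleep=time.sleep):
    """Poll until the job finishes; return False if the timeout expires."""
    waited = 0.0
    while waited <= timeout:
        if job_finished(job_id, run=run):
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False


# Hypothetical usage from a CGI script, with job_id taken from qsub's output:
#     if wait_for_job(job_id, timeout=120):
#         ...read the job's output file and render the page...
```

The timeout matters for the web case: if the job does not finish in
time, the page can fall back to telling the user where the results will
appear rather than hanging.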
Tony Schreiner




