[torqueusers] wait for job completion
Tony Schreiner
schreian at bc.edu
Fri May 29 07:45:20 MDT 2009
On May 28, 2009, at 11:23 PM, Glen Beane wrote:
> On Wed, May 27, 2009 at 3:47 PM, Tony Schreiner <schreian at bc.edu>
> wrote:
>> I'm transitioning a cluster from Platform/LSF to Torque, and one of
>> the ways the cluster is used is to run jobs submitted over a web CGI
>> front end. When a run is short enough, the results are returned on
>> a web page. This relied on the bsub -K option in LSF, which doesn't
>> return control to the shell until the job is finished.
>>
>> Is there something equivalent in Torque? Or has anybody hacked
>> another way to do this? I can imagine setting up a semaphore file on
>> a shared directory, but I'm hoping for something simpler.
>
> How does LSF behave if you kill bsub before the job finishes? Does
> the job still finish, or does it get canceled? Does the output still
> go to a file?
Here's example output from submitting a job and then pressing Ctrl-C
after half a minute. The job uses mpiexec, but it only requests one
processor.
> bsub -K -oo domp.log2 ./domp
Job <278959> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
Job <278959> is being terminated
bjobs then shows that the job exited with abnormal status.
The log file contains the following lines (among others):
TERM_OWNER: job killed by owner.
Exited with exit code 143.
and
mpiexec: killing job...
mpiexec noticed that job rank 0 with PID 11899 on node linux08 exited on signal 15 (Terminated).
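
The exit code 143 and the "signal 15" line in the log appear to describe the same event: by the common shell convention, a process killed by signal N is reported with exit code 128 + N. A minimal Python sketch of that arithmetic follows, assuming the convention applies to this log. The function name signal_from_exit_code is a hypothetical helper, not part of LSF or Torque.

```python
import signal

def signal_from_exit_code(code):
    """Return the signal name for a 128+N style exit code, or None.

    Assumes the usual shell convention that exit codes above 128
    encode "killed by signal (code - 128)".
    """
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        # No known signal with that number on this platform.
        return None

# 143 - 128 = 15, which is SIGTERM
print(signal_from_exit_code(143))
```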
And thanks for putting this on the feature request list.
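
Until something like bsub -K is available, the semaphore-file fallback mentioned in the original question could be sketched roughly as below. This is only an illustration under stated assumptions: the job script is assumed to create a marker file on a shared directory as its last step, and the name wait_for_marker, the marker path, and the timeout values are all hypothetical, not part of Torque.

```python
import os
import time

def wait_for_marker(marker_path, timeout=300.0, poll_interval=2.0):
    """Poll until marker_path exists or timeout seconds have passed.

    Returns True if the marker appeared, False on timeout.
    Assumes the job itself creates the marker when it finishes.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(marker_path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

The CGI front end would call this after submitting the job and return the results page only when it yields True. On a shared filesystem such as NFS, attribute caching can delay when the marker becomes visible, so the poll interval and timeout would need some slack.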
Tony Schreiner