[torqueusers] Removing the "exec_host" attribute from a queued job ?

Andrew J Caird acaird at umich.edu
Tue Sep 20 06:41:55 MDT 2005


I'm not sure if this is exactly the same thing, but we sometimes see
something similar in Maui; the job shows:
    Allocated Nodes:
    [nyx020:1]
but if nyx020 has crashed, the job just sits there in the queue.  When that
happens we use the Maui command:
    runjob -c <jobid>
According to the help for runjob:
    [ -c ] // CLEAR (clear stale job attributes)
and this seems to prompt Maui to re-evaluate where the job should be run.
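
Before clearing anything, it can help to see which nodes the stale
allocation actually points at, so you can check whether one of them is the
crashed node.  Below is a minimal sketch in Python, assuming the exec_host
value has the "host/index+host/index" form shown in the quoted message.  The
function names parse_exec_host and slots_per_host are made up for this
illustration and are not part of Torque, Maui, or Moab.

```python
from collections import Counter


def parse_exec_host(value):
    """Split an exec_host string into (host, index) pairs.

    Assumes the "host/index+host/index" form seen in the quoted message;
    other layouts are not handled by this sketch.
    """
    pairs = []
    for entry in value.strip().split("+"):
        host, _, index = entry.partition("/")
        pairs.append((host, int(index)))
    return pairs


def slots_per_host(value):
    """Count how many entries in the exec_host string name each host."""
    return Counter(host for host, _ in parse_exec_host(value))


# Value copied from the quoted message.
exec_host = "edda010/0+edda007/3+edda007/2+edda007/1"

for host, count in sorted(slots_per_host(exec_host).items()):
    print(host, count)
```

Each host it lists can then be checked by hand to see whether it is still up
before deciding to clear the job's stale attributes.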

----
Andrew Caird               Manager of High Performance Computing
University of Michigan Engineering/Center for Advanced Computing
http://cac.engin.umich.edu     acaird at umich.edu     734.647.5273

On Tue, 20 Sep 2005, Chris Samuel wrote:

> Hi folks,
>
> I've got a job that's queued and obviously tried to start and failed and has
> ended up with the following attribute set on it:
>
>   exec_host = edda010/0+edda007/3+edda007/2+edda007/1
>
> I suspect it's stopping Moab or Torque from running it again on other nodes,
> and I'd like to clear that attribute, but it doesn't appear to be accessible
> through qalter or qmgr.
>
> Any clues ?
>
> cheers,
> Chris
> -- 
> Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
> Victorian Partnership for Advanced Computing http://www.vpac.org/
> Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
>
>

