[torqueusers] Removing the "exec_host" attribute from a queued job?
Andrew J Caird
acaird at umich.edu
Tue Sep 20 06:41:55 MDT 2005
I'm not sure this is exactly the same thing, but we sometimes see
something similar in Maui; it looks like:
Allocated Nodes:
[nyx020:1]
but if nyx020 has crashed, the job just sits there. We use the Maui command:
runjob -c <jobid>
According to the help for runjob:
[ -c ] // CLEAR (clear stale job attributes)
and this seems to prompt Maui to re-evaluate where the job should run.
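
To see which nodes a stuck job is still pinned to before clearing it, a
minimal Python sketch follows. It assumes the exec_host value has the
"node/slot+node/slot" form shown in the quoted message below (other Torque
versions may format it differently, e.g. with slot ranges, which this does
not handle). The function names and the down_nodes input are hypothetical,
for illustration only; the only real command referenced is the
"runjob -c <jobid>" call mentioned above, which the sketch builds but does
not execute.

```python
def parse_exec_host(exec_host):
    """Map each node name to its sorted slot indices.

    Assumes entries of the form "node/slot" joined by "+",
    as in the exec_host value quoted in this thread.
    """
    nodes = {}
    for entry in exec_host.split("+"):
        node, _, slot = entry.partition("/")
        nodes.setdefault(node, []).append(int(slot))
    return {node: sorted(slots) for node, slots in nodes.items()}


def stale_nodes(exec_host, down_nodes):
    """Return the nodes in exec_host that are also in down_nodes.

    down_nodes is a caller-supplied collection of node names known
    to be down (hypothetical input; how you obtain it is site-specific).
    """
    return sorted(set(parse_exec_host(exec_host)) & set(down_nodes))


def clear_command(job_id):
    """Build, but do not run, the Maui command from this message."""
    return ["runjob", "-c", str(job_id)]


if __name__ == "__main__":
    value = "edda010/0+edda007/3+edda007/2+edda007/1"
    print(parse_exec_host(value))
    print(stale_nodes(value, {"edda007"}))
    print(" ".join(clear_command("<jobid>")))
```

If stale_nodes() returns anything, the job is a candidate for the
runjob -c treatment described above.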
----
Andrew Caird Manager of High Performance Computing
University of Michigan Engineering/Center for Advanced Computing
http://cac.engin.umich.edu acaird at umich.edu 734.647.5273
On Tue, 20 Sep 2005, Chris Samuel wrote:
> Hi folks,
>
> I've got a job that's queued; it has obviously tried to start, failed, and
> ended up with the following attribute set on it:
>
> exec_host = edda010/0+edda007/3+edda007/2+edda007/1
>
> I suspect it's stopping Moab or Torque from running it again on other nodes,
> and I'd like to clear that attribute, but it doesn't appear to be accessible
> through qalter or qmgr.
>
> Any clues ?
>
> cheers,
> Chris
> --
> Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
> Victorian Partnership for Advanced Computing http://www.vpac.org/
> Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
>
>