[torqueusers] [Mauiusers] Moving jobs from one node to another

Fernando Caba fcaba at uns.edu.ar
Mon Aug 13 13:07:26 MDT 2012

Gus, thanks for your answer.
I'll send it to our cluster's users.


On 13/08/2012 03:48 PM, Gus Correa wrote:
> On 08/13/2012 02:17 PM, Denis wrote:
>> 2012/8/13 Fernando Caba<fcaba at uns.edu.ar>:
>>> Hi, I want to know something about moving jobs from one node to another.
>>> Suppose I need to do some maintenance on one node that has a certain number of
>>> running jobs (they cannot be killed).
>>> Can I move all of those jobs (or specific ones) to another node (free or not)? If
>>> so, how?
>>> Sorry for asking the same thing again; is it a dumb question?
>> Hello, Fernando.
>> You cannot move a running job to another node. That would be possible
>> with Condor if you link your code against its libraries when
>> compiling.
>> D.
> Hi Fernando
> The best thing is to use algorithms and programs that can be restarted
> from a given state/configuration,
> and run them for a relatively short time [hours, not days, weeks, or
> months], restarting as needed.
> Not all programs are written this way, but they often have this
> capability, and users simply don't know about it
> or how to use it.
> This way, if the user loses one job, [s]he doesn't lose too much, and
> can restart from the state/configuration
> saved by the previous job in the sequence.
> Also, you won't feel too guilty for killing a job that has been running
> for a few hours only,
> but your user may become very upset if you kill her/his job that has
> been running for three weeks.
> Our queues here have a maximum walltime of 12h, but 6h is common
> on many public computers.
> A modest job runtime also improves the overall throughput of the cluster,
> and prevents hogging of the cluster nodes by one or a few users.
> I hope this helps,
> Gus Correa
>>> Regards
>>> Fernando
>>> --
>>> ----------------------------------------------------
>>> Ing. Fernando Caba
>>> Director General de Telecomunicaciones
>>> Universidad Nacional del Sur
>>> http://www.dgt.uns.edu.ar
>>> Tel/Fax: (54)-291-4595166
>>> Tel: (54)-291-4595101 int. 2050
>>> Avda. Alem 1253, (B8000CPB) Bahía Blanca - Argentina
>>> ----------------------------------------------------
>>> _______________________________________________
>>> mauiusers mailing list
>>> mauiusers at supercluster.org
>>> http://www.supercluster.org/mailman/listinfo/mauiusers
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
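
The restart-from-saved-state approach Gus describes can be sketched roughly
as below. This is only an illustration under stated assumptions, not code
from any program discussed in this thread: the checkpoint file name, the
state layout, the function names and the "work" done per step are all
hypothetical. Each batch job would run one segment and then exit, and the
next job in the sequence would pick up from the saved state.

```python
import json
import os
import tempfile

CHECKPOINT = "state.json"  # hypothetical checkpoint file name


def load_state(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def save_state(state, path):
    """Write to a temporary file, then rename it over the checkpoint,
    so a job killed mid-write does not leave a truncated checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_segment(path, steps_per_job, last_step):
    """Advance by at most steps_per_job steps, save, and return the state."""
    state = load_state(path)
    stop = min(state["step"] + steps_per_job, last_step)
    while state["step"] < stop:
        state["step"] += 1
        state["total"] += state["step"]  # stand-in for the real computation
    save_state(state, path)
    return state


if __name__ == "__main__":
    # One short job: do a bounded amount of work, then exit.
    print(run_segment(CHECKPOINT, steps_per_job=4, last_step=10))
```

In this sketch the state is saved only at the end of a segment, so a job
killed for maintenance loses at most one segment of work. A real program
would likely save more often and size the segment to fit the queue walltime.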

