[torqueusers] pbs_mom request, was Re: PBS_MOM kills running jobs when restarted

Glen Beane glen.beane at gmail.com
Thu Jul 30 04:38:49 MDT 2009


You do not want -p as the default behavior when you reboot a node.
After a reboot, pbs_mom could find pids that match those of its last
known jobs and attempt to take ownership of them, when in fact those
pids have no relation to the previous jobs.  -p should not be in any
startup script, except perhaps for a "restart" option.
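
To illustrate why a matching pid alone proves nothing after a reboot,
here is a minimal Python sketch.  It is not how pbs_mom works
internally; JobRecord, is_same_process and the boot_time parameter are
hypothetical names made up for this example.  The idea is only that a
pid gets reused, so a saved record is trustworthy only if it is newer
than the last boot and the process start time also matches.

```python
from typing import NamedTuple


class JobRecord(NamedTuple):
    """What a daemon might save about a job before it stops (hypothetical)."""
    job_id: str
    pid: int
    started_at: int  # process start time, seconds since the epoch


def is_same_process(record: JobRecord, live_pid: int, live_started_at: int,
                    boot_time: int) -> bool:
    """Return True only if the live process can plausibly be the recorded job."""
    if record.started_at < boot_time:
        # The record predates the last boot, so the job cannot have survived.
        return False
    # Same pid AND same start time; a reused pid has a later start time.
    return record.pid == live_pid and record.started_at == live_started_at
```

With a record saved before a reboot, a new process that happens to get
the same pid is rejected, which is the check that a bare pid comparison
would miss.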

-p should only be used in the rare cases when you need to (re)start
pbs_mom on a node that is already running jobs, and _never_ at boot
time.  That is why it is not the default, and why it is not in most
sites' startup scripts.
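
As a rough sketch of that policy, the Python helper below picks the
pbs_mom arguments per action.  The function name mom_command and the
action names "start" and "restart" are assumptions that mirror common
init script conventions, and it assumes pbs_mom is on the PATH.  Only
the -p flag itself comes from the man page.

```python
from typing import List


def mom_command(action: str) -> List[str]:
    """Build the pbs_mom command line for an init-script style action."""
    if action == "start":
        # Boot-time start: no -p, so stale pids are never adopted.
        return ["pbs_mom"]
    if action == "restart":
        # Restart on a live node: -p lets running jobs continue.
        return ["pbs_mom", "-p"]
    raise ValueError("unsupported action: %r" % action)
```

For example, mom_command("restart") yields the argument list for
"pbs_mom -p", while mom_command("start") never includes -p.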



On Thu, Jul 30, 2009 at 5:13 AM, Jacques
Foury<Jacques.Foury at math.u-bordeaux1.fr> wrote:
> George Wm Turner wrote:
>> use the -p option when you restart pbs_mom; from pbs_mom man page
>>
>> <snip>
>>   -p     Specifies the impact on jobs which were in execution when the
>>          mini-server shut down.  On any restart of MOM, the new
>>          mini-server will not be the parent of any running jobs, MOM has
>>          lost control of her offspring (not a new situation for a
>>          mother).  With the -p option, Mom will allow the jobs to
>>          continue to run and monitor them indirectly via polling.  The
>>          -p option is mutually exclusive with the -r option.
>>
>
> I experienced serious trouble because this is not the default behaviour
> for mom...
>
> I always pass the -p parameter, and I wonder why I would want anything
> else! Why would someone want their jobs killed when restarting mom?
>
> I urge the developers to put "-p" as the default behaviour for mom... or
> at least to put this option in the startup script for the different
> packages (rpm or deb...)
>
> --
> Jacques Foury
> administrateur systemes, reseaux, clusters
> Institut de Mathematiques de Bordeaux
> http://www.math.u-bordeaux1.fr/maths/cellule
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>

