[torqueusers] Restarting pbs_mom question
Steve Young
chemadm at hamilton.edu
Tue Jun 24 09:01:15 MDT 2008
Hi,
I've wondered about this but haven't had to do it much. Looking at the
man page for pbs_mom I see:
-p   Specifies the impact on jobs which were in execution when the
     mini-server shut down. On any restart of MOM, the new mini-server
     will not be the parent of any running jobs, MOM has lost control
     of her offspring (not a new situation for a mother). With the -p
     option, Mom will allow the jobs to continue to run and monitor
     them indirectly via polling. The -p option is mutually exclusive
     with the -r option.
Would this do it? And I assume this means the new pbs_mom would be the
parent of any new jobs that arrive on the node?
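
If -p behaves the way the man page describes, the restart might look
roughly like the sketch below. This is untested and rests on assumptions
that are not in the man page excerpt: that "momctl -s" is how the running
MOM gets shut down, that pbs_mom is on the PATH, and that the old MOM has
fully exited before the new one starts. The helper names are made up for
illustration. With DRY_RUN=1 (the default) it only prints the commands.

```shell
#!/bin/sh
# Sketch: restart pbs_mom while letting running jobs continue (-p).
# Assumption: "momctl -s" stops the running MOM; check how your TORQUE
# version treats running jobs on shutdown before using this for real.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop the current MOM, then start a new one with -p.
restart_mom_preserving_jobs() {
    run momctl -s     # ask the running MOM to shut down
    run pbs_mom -p    # new MOM leaves jobs running and polls them
}
```

Calling restart_mom_preserving_jobs on a node prints the two commands;
setting DRY_RUN=0 would actually run them, so try it on a node with a
throwaway job first.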
-Steve
On Jun 24, 2008, at 10:38 AM, Rob Lines wrote:
> We need to restart the pbs_mom to implement the fix found here:
> http://www.clusterresources.com/pipermail/torqueusers/2007-March/005360.html
> We have never restarted the pbs_mom process while there were jobs
> running on a node (at least ones that we cared about keeping), so I
> am wondering what the results would be of restarting them on
> machines with active jobs. We have restarted the maui process before
> with no problem, but its part in the process is different.
>
> Our backup plan is to drain all the nodes, restart pbs_mom on any
> that have no jobs running, and put those nodes back in service.
> Once the jobs on the remaining nodes finish, we would restart their
> pbs_mom and put them back in service too. I had hoped to avoid that
> because it means I have to keep watching them, and some of the
> currently running jobs are multi-day runs.
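
For what it's worth, the drain-and-restart fallback could be scripted so
it needs less babysitting. The sketch below is untested and assumes that
"pbsnodes -o" and "pbsnodes -c" set and clear the offline state, that a
node with running jobs shows a "jobs = ..." line in "pbsnodes <node>"
output, and that pbs_mom can be restarted over ssh. RESTART_CMD and the
function names are hypothetical and site-specific. With DRY_RUN=1 (the
default) it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the drain-then-restart plan; prints commands when DRY_RUN=1.
DRY_RUN="${DRY_RUN:-1}"
# Hypothetical, site-specific restart command run on the node via ssh.
RESTART_CMD="${RESTART_CMD:-/etc/init.d/pbs_mom restart}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Reads "pbsnodes <node>" output on stdin; succeeds if a jobs line exists.
node_has_jobs() {
    grep -q '^[[:space:]]*jobs = '
}

# drain_node NODE: mark the node offline so no new jobs land on it.
drain_node() {
    run pbsnodes -o "$1"
}

# restart_if_idle NODE: reads the node's pbsnodes output on stdin.
# Restarts MOM and clears the offline flag only if no jobs are listed.
restart_if_idle() {
    if node_has_jobs; then
        echo "$1: still has jobs, leaving offline"
    else
        run ssh "$1" "$RESTART_CMD"
        run pbsnodes -c "$1"
    fi
}
```

A pass over the cluster would call drain_node on each node once, then
repeat "pbsnodes $n | restart_if_idle $n" from cron or a loop until every
node has been restarted and returned to service.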
>
> Thanks for the help,
> Rob