[Pacemaker] Should monitor operations be stopped after a resource is unmanaged?
Ron Kerry
rkerry at sgi.com
Sun Apr 3 18:29:36 UTC 2011
Tim Serong wrote:
> On 4/2/2011 at 09:42 PM, Ron Kerry <rkerry at sgi.com> wrote:
> > Serge Dubrouski wrote:
> > > On Fri, Apr 1, 2011 at 2:09 PM, Ron Kerry <rkerry at sgi.com> wrote:
> > > > Pavel Levshin wrote:
> > > >>
> > > >> 01.04.2011 18:36, Ron Kerry:
> > > >> > Folks -
> > > >> >
> > > >> > Consider a running cluster with all resources managed. We want to stop
> > > >> > and quickly restart a particular resource without impacting other
> > > >> > resources. The software stack running on the system can deal with this
> > > >> > sort of temporary outage. We perform the following actions:
> > > >> > * unmanage the resource
> > > >> > * stop the resource
> > > >> > * start the resource
> > > >> > * manage the resource
> > > >> >
> > > >> > The above procedure is sometimes successful. However, we will also
> > > >> > sometimes get a resource monitor failure after stopping the resource.
> > > >> > It is clear that the monitor operation was not stopped (at least not
> > > >> > immediately) by unmanaging the resource.
> > > >>
> > > >> Unmanaged resource cannot be started and stopped, but can still be
> > > >> monitored.
> > > >
> > > > So unmanaged really means the resource is still being managed to some
> > > > degree?
> > >
> > > It means that Pacemaker still wants to know its state. What kind of
> > > problem does it create?
> > >
> >
> > An unmanaged resource whose monitor is still running will cause a monitor
> > failure when the resource is stopped. Pacemaker then takes the 'on-fail'
> > action defined for the monitor operation. In other words, the resource is
> > still being managed to some degree. If the monitor operation were still
> > running but no action were taken as a result of its outcome, there would
> > be no issue.
>
> Try "crm configure property maintenance-mode=true". Admittedly this
> affects the entire cluster, but it will ensure no starts, stops or
> monitors...
>
> Regards,
>
> Tim
Tim -
Thanks, this does work, but it is rather like using a sledgehammer to do the work of a ball-peen
hammer. It unmanages ALL resources and stops all the monitor operations.
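For the archives, here is a minimal sketch of how the cluster-wide workaround might be scripted. It assumes the service behind the resource can be stopped and started by hand through an init script; the function names and the script path are hypothetical, and only the 'crm configure property maintenance-mode=...' command comes from this thread. DRY_RUN defaults to 1, so the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: restart one service by hand while the whole cluster is in
# maintenance mode. With DRY_RUN=1 (the default) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: init script of the service behind the resource (hypothetical path).
restart_in_maintenance() {
    init_script="$1"

    # Suspend starts, stops and monitors for ALL resources in the cluster.
    run crm configure property maintenance-mode=true || return 1

    # Restart the service outside of Pacemaker's control.
    run "$init_script" stop
    run "$init_script" start
    rc=$?

    # Hand control back to the cluster, even if the restart failed.
    run crm configure property maintenance-mode=false || return 1
    return "$rc"
}
```

Calling 'restart_in_maintenance /etc/init.d/myservice' in dry-run mode prints the four commands in order. Note that every resource goes unmonitored for the duration of the restart, which is exactly the drawback described above.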
How do we go about requesting a change to Pacemaker to achieve the desired behavior? As I see it,
there are two options:
1. fix 'crm resource unmanage <rsc>' to also stop the individual resource's monitor
-or-
2. create a 'crm resource maintenance <rsc>' command to unmanage the resource and stop its monitor
--
Ron Kerry rkerry at sgi.com
Global Product Support - SGI Federal