[Pacemaker] Enable remote monitoring

Mon Dec 3 16:32:14 EST 2012

----- Original Message -----
> From: "Yan Gao" <ygao at suse.com>
> To: pacemaker at oss.clusterlabs.org
> Sent: Monday, December 3, 2012 5:49:31 AM
> Subject: Re: [Pacemaker] Enable remote monitoring
> 
> Hi,
> 
> On 11/13/12 03:52, David Vossel wrote:
> > ----- Original Message -----
> >> From: "Lars Marowsky-Bree" <lmb at suse.com>
> >> To: "The Pacemaker cluster resource manager"
> >> <pacemaker at oss.clusterlabs.org>
> >> Sent: Monday, November 12, 2012 1:16:49 PM
> >> Subject: Re: [Pacemaker] Enable remote monitoring
> >>
> >> On 2012-11-12T14:03:24, David Vossel <dvossel at redhat.com> wrote:
> >>
> >>>> We want "A" to be restarted if "B" fails. (If A->B are also
> >>>> collocated, we'd also get fail-over after migration-threshold
> >>>> triggers. That may not always be desired.)
> >>> I'm not sure I follow why we'd be concerned about
> >>> migration-threshold here.  The only situation I can think of
> >>> where migration-threshold could cause weird behavior is if
> >>> someone sets a migration threshold on the children but not on
> >>> the parent, but that seems like a configuration problem to me.
> >>
> >> I was saying that we don't need to concern ourselves about them,
> >> actually.
> >>
> >> If rsc-vm1 and vm1-http are collocated (in addition to being
> >> ordered with the magic flag), the Nth failure of the web service
> >> will trigger the rsc-vm1 to be moved along with vm1-http, which
> >> is desired.
> >>
> >>>> A "restart-origin" attribute, perhaps?
> >>> Would this attribute need to be exposed through the
> >>> configuration?
> >>>
> >>> I was thinking this constraint would be an implied relationship
> >>> between the container parent and members internally.  We probably
> >>> already have the right set of flags internally in the pengine to
> >>> represent this sort of constraint.  If we don't need to expose
> >>> this logic to the config my vote is to limit it to the container
> >>> use case for now.
> >>
> >> I was thinking that the constraint - either as a flag to the order
> >> constraint or a new one - *would* be the configuration syntax.
> >>
> >> I don't so much like a new container object. That was one of the
> >> things that were wrong with "groups", design-wise. The
> >> grouping/relationships of objects belong in the constraints
> >> section.
> >>
> >> A new attribute to the order constraint is also fully and
> >> completely backwards compatible with any and all tools we're
> >> using today.
> > 
> > Yes, introducing the new order constraint attribute would allow all
> > this to be possible without the container object, but all the
> > dependencies between the vm and the children would have to be
> > generated in the constraint section (order and colocation
> > constraints).  I'm not sure how I feel about that.  It is easier
> > from an implementation standpoint, but puts a larger burden on the
> > user.
> > 
> > Perhaps we introduce the order constraint attribute and the new
> > lrmd work so remote monitoring will be technically possible (large
> > configuration burden though).  Then we approach the container
> > object as a syntactic shortcut similar to the group object later
> > on if we want to.
> >
> Okay. We can think about the "container" later, let's introduce the
> attribute for the order constraint first. Perhaps like:
>
> diff --git a/xml/constraints-1.1.rng b/xml/constraints-1.1.rng
> index e224600..5c4ef73 100644
> --- a/xml/constraints-1.1.rng
> +++ b/xml/constraints-1.1.rng
> @@ -141,6 +141,9 @@
>  	  </attribute>
>  	</choice>
>        </optional>
> +      <optional>
> +	<attribute name="restart-origin"><data type="boolean"/></attribute>
> +      </optional>
>        <choice>
>  	<oneOrMore>
>  	  <ref name="element-resource-set"/>

I don't feel strongly about this. Here's what comes to mind for me.

force-recover - force recovery of both sides of the constraint if either side fails
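
To make the syntax concrete: with the schema patch above and no group or container, each service inside the VM would need its own order/colocation pair, roughly like the fragment below.  This is only a sketch.  The resource names are the ones Lars used earlier in the thread, the ids are made up, and restart-origin is the proposed attribute, not something the current schema accepts.

```xml
<constraints>
  <!-- Proposed: restart-origin="true" asks for rsc-vm1 (the "first"
       resource) to be restarted when vm1-http fails. This attribute
       only exists in the patch quoted above. -->
  <rsc_order id="order-rsc-vm1-then-vm1-http"
             first="rsc-vm1" then="vm1-http"
             restart-origin="true"/>
  <!-- Existing syntax: keep vm1-http on the node that runs rsc-vm1. -->
  <rsc_colocation id="colo-vm1-http-with-rsc-vm1"
                  rsc="vm1-http" with-rsc="rsc-vm1"
                  score="INFINITY"/>
</constraints>
```

Every additional monitored service would mean another pair like this, which is the configuration burden I mentioned earlier.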

> If we are ok with it, I'm happy to take it if nobody has started
> working on it yet. :-)

Here's a thought.  Add the new constraint flag as well as a new option on the primitive that escalates failures to the parent resource.  (I'm pretty sure this idea isn't mine; maybe Andrew threw it at me a few weeks ago.)

Then you could do something like this.

primitive vm
group vm-resources
    primitive nagios-monitor-foo
    primitive nagios-monitor-bar

order vm then vm-resources restart-origin
colocation vm vm-resources

It isn't as simple (configuration-wise, not implementation-wise) as the container concept, but at least this way you don't have to explicitly build relationships between the vm and every resource in it.  Leveraging groups here seems like a good idea.
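
Spelled out as CIB XML, the sketch above might look something like the following.  Treat it as a rough illustration, not a tested config: the ids are made up, VirtualDomain is just one possible agent for the vm, the class and type on the monitor primitives are placeholders for whatever the lrmd work ends up providing, and restart-origin is the attribute name from the schema patch, which doesn't exist yet.

```xml
<configuration>
  <resources>
    <!-- The VM itself. VirtualDomain is an assumption; any VM agent
         would do here. -->
    <primitive id="vm" class="ocf" provider="heartbeat"
               type="VirtualDomain"/>
    <!-- Remote monitors for services inside the VM. The class and
         type values are placeholders, not real agents. -->
    <group id="vm-resources">
      <primitive id="nagios-monitor-foo" class="nagios"
                 type="check_foo"/>
      <primitive id="nagios-monitor-bar" class="nagios"
                 type="check_bar"/>
    </group>
  </resources>
  <constraints>
    <!-- Proposed attribute: restart vm when the group fails. -->
    <rsc_order id="order-vm-then-vm-resources"
               first="vm" then="vm-resources"
               restart-origin="true"/>
    <!-- Existing syntax: run the group where vm runs. -->
    <rsc_colocation id="colo-vm-resources-with-vm"
                    rsc="vm-resources" with-rsc="vm"
                    score="INFINITY"/>
  </constraints>
</configuration>
```

One pair of constraints covers the whole group, no matter how many monitors end up in it.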

-- Vossel

> Regards,
>   Gao,Yan
> --
> Gao,Yan <ygao at suse.com>
> Software Engineer
> China Server Team, SUSE.