[Pacemaker] New System Health feature

Wed May 6 17:32:25 EDT 2009

beekhof at gmail.com wrote on 04/28/2009 10:31:43 AM:
>
> On Mon, Apr 27, 2009 at 22:25, Mark Hamzy <hamzy at us.ibm.com> wrote:
> > beekhof at gmail.com wrote on 04/24/2009 11:00:01 AM:
> >>
> >> On Thu, Apr 23, 2009 at 17:49, Mark Hamzy <hamzy at us.ibm.com> wrote:
> >> >
> > Not only does this have to be done for all of the resources, but new
> > health metrics must be known to the administrators.
>
> I'd suggest that they should know this sort of thing - much like they
> do with pingd.
>
> This is particularly true when there are multiple health devices as
> the admin needs to decide how to combine the "scores" (and rules are
> designed for exactly this sort of function).

If you give administrators the choice of how to combine scores, it seems
that there is another potential point of failure being introduced.  Lucky
admins will not see problems where unlucky admins will have to wait until
their machines truly fail before HA moves resources over.

How about a PE option for pacemaker to automatically calculate health?
Admins could then enable these calculations if they do not want to go
through the effort to investigate all of the health variables and how
they might affect system operations.

>
> >
> > This is a logistical nightmare.  What I am proposing is that pacemaker
> > add health scores to nodes.  Currently, nodes with no rules applied to
> > them start at zero.
> >
> > We want the constraints left alone.
>
> But it is a constraint that you want the system to observe.
> Granted we may not expose it as a constraint (eg. migration-failure or
> standby), but that doesn't change the fact that it is one.

By setting a PE option, does this now reinforce the notion that constraints
are being implicitly added?

>
> > <constraints>
> >   <rsc_location id="rsc_location_apache_1" rsc="apache_1">
> >     <rule id="prefered_location_apache_1" score="100">
> >       <expression attribute="#uname" id="prefered_location_apache_1_expr"
> >                   operation="eq" value="hs21c"/>
> >     </rule>
> >   </rsc_location>
> > </constraints>
> >
> > How were you proposing that this should be done under the current
> > rsc_location?
>
> In 1.0.x, rules are reusable, so in the worst case, each additional
> resource "only" need have
>
>   <rsc_location id="other_health" rsc="other">
>     <rule id-ref="health_location_1_apache_1"/>
>     <rule id-ref="health_location_2_apache_1"/>
>   </rsc_location>

Let me work out what it would look like for a system with two resources
and two health variables.  Currently, an admin would have to write out
four constraints to weight certain resources on certain nodes.  They would
then have to add the two health variables to each resource weighting.

<constraints>
  <rsc_location id="prefered_location_apache_1" node="hs21c" rsc="apache_1"
score="100"/>
  <rsc_location id="health_apache_1" rsc="apache_1">
    <rule id="health_location_1_apache_1" score-attribute="#health-ipmi">
      <expression attribute="#health-ipmi" id="apache_1_ipmi_expr"
operation="defined"/>
    </rule>
    <rule id="health_location_2_apache_1" score-attribute="#health-smart">
      <expression attribute="#health-smart" id="apache_1_smart_expr"
operation="defined"/>
    </rule>
  </rsc_location>
  <rsc_location id="prefered_location_nfs_1" node="hs21c" rsc="nfs_1"
score="100"/>
  <rsc_location id="health_nfs_1" rsc="nfs_1">
    <rule id="health_location_1_nfs_1" score-attribute="#health-ipmi">
      <expression attribute="#health-ipmi" id="nfs_1_ipmi_expr"
operation="defined"/>
    </rule>
    <rule id="health_location_2_nfs_1" score-attribute="#health-smart">
      <expression attribute="#health-smart" id="nfs_1_smart_expr"
operation="defined"/>
    </rule>
  </rsc_location>
</constraints>

This is a lot of work for N resources and M health variables.

In 1.0.x, you can reference previously defined rule ids from other
resources.  So, you leave the first resource "as is" and then replace the
longer health rules with health rule references in the other resources.

<constraints>
  <rsc_location id="prefered_location_apache_1" node="hs21c" rsc="apache_1"
score="100"/>
  <rsc_location id="health_apache_1" rsc="apache_1">
    <rule id="health_location_1" score-attribute="#health-ipmi">
      <expression attribute="#health-ipmi" id="apache_1_ipmi_expr"
operation="defined"/>
    </rule>
    <rule id="health_location_2" score-attribute="#health-smart">
      <expression attribute="#health-smart" id="apache_1_smart_expr"
operation="defined"/>
    </rule>
  </rsc_location>
  <rsc_location id="prefered_location_nfs_1" node="hs21c" rsc="nfs_1"
score="100"/>
  <rsc_location id="health_nfs_1" rsc="nfs_1">
    <rule id-ref="health_location_1"/>
    <rule id-ref="health_location_2"/>
  </rsc_location>
</constraints>

While a little typing is saved, you still have to enter information
for each N x M combination.
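
The N x M growth can be illustrated with a short Python sketch that
generates health constraints in the style of the listing above.  The
function name build_health_constraints and the generated expression ids
are made up for illustration; only the element and attribute names come
from the examples in this thread.

```python
import xml.etree.ElementTree as ET


def build_health_constraints(resources, health_attrs):
    """Build a <constraints> element with one health rule per pair.

    The first resource defines each health rule in full.  Every later
    resource refers to it with id-ref, yet still needs one <rule> entry
    per health attribute.
    """
    constraints = ET.Element("constraints")
    for r_index, rsc in enumerate(resources):
        loc = ET.SubElement(constraints, "rsc_location",
                            {"id": f"health_{rsc}", "rsc": rsc})
        for a_index, attr in enumerate(health_attrs, start=1):
            rule_id = f"health_location_{a_index}"
            if r_index == 0:
                # Full definition, written out once.
                rule = ET.SubElement(loc, "rule",
                                     {"id": rule_id, "score-attribute": attr})
                ET.SubElement(rule, "expression",
                              {"attribute": attr,
                               "id": f"{rule_id}_expr",  # hypothetical id
                               "operation": "defined"})
            else:
                # Reference to the rule defined under the first resource.
                ET.SubElement(loc, "rule", {"id-ref": rule_id})
    return constraints


constraints = build_health_constraints(
    ["apache_1", "nfs_1"], ["#health-ipmi", "#health-smart"])
rule_count = len(constraints.findall("./rsc_location/rule"))
print(rule_count)  # one <rule> entry per resource/health-attribute pair
```

Adding a third resource or a third health variable still adds a full
row or column of rule entries, references or not.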

> I've also been mulling over the idea of allowing one rule to match
> multiple resources somehow...
> Something like  <rsc_location id="rsc_location_apache_3"
> rsc-pattern="*">...
>
> Which would again significantly reduce the configuration overhead.
>
> Is this starting to sound a little less aggravating?

Does this mean that you would have to create a fake resource with
health rules in it?  Something like this:

<constraints>
  <rsc_location id="other_health" rsc="*">
    <rule id="health_location_1" score-attribute="#health-ipmi">
      <expression attribute="#health-ipmi" id="health_location_1_ipmi_expr"
operation="defined"/>
    </rule>
    <rule id="health_location_2" score-attribute="#health-smart">
      <expression attribute="#health-smart"
id="health_location_2_smart_expr" operation="defined"/>
    </rule>
  </rsc_location>
  <rsc_location id="prefered_location_apache_1" node="hs21c" rsc="apache_1"
score="100"/>
  <rsc_location id="prefered_location_nfs_1" node="hs21c" rsc="nfs_1"
score="100"/>
</constraints>

That is a little bit cleaner.  The health calculations are consolidated
and every resource does not have to bring them in.  However, why should
you have to list them all out if you don't want to?  It could introduce
errors if you forget one.

> > lmb at suse.de wrote:
> >> Basically your mechanism modifies the "base score" for a node, somewhat
> >> similar to the cluster-wide default set by symmetric-cluster true (0) or
> >> false (-INFINITY).
> >>
> >> Sure, but I'd go even beyond this and just add the mechanism for setting
> >> the base score; translating health scores into these can be done outside
> >> the core system.
>
> Also an interesting way to think about it.
> I could see this working if there was only ever one health agent per
> node, but from what you're saying it sounds as if there is likely to
> be many.
>
> Actually, it would still work if the entity responsible for updating
> the node health combined the readings from the different sources into
> a single value.
> However, then you start to require a daemon and some way to configure
> it (in order to specify how the sources should be combined).
> And of course eventually people will want more detail than the
> combined score...  "why is the health red? ".

Yes.  There is really only one overall health status of a system.  It
can be summed from multiple, independent reporting mechanisms (ipmi,
smart, mcelog, etc).

I don't think that complex rules need to exist in order to combine
scores.  Any "red" health forces -INF.  All "yellow" healths
keep weighting the resource more and more to a different node.  -INF
plus anything equals -INF.  So a simple merge_weights would work.
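
A minimal Python sketch of that merge, under stated assumptions: the
numeric value of INFINITY follows Pacemaker's documented 1000000, but
the -10 penalty for "yellow" and the names add_scores and node_health
are invented here purely for illustration.

```python
INFINITY = 1000000  # Pacemaker's documented score limit

# Hypothetical mapping from a reported health colour to a score.
HEALTH_SCORES = {"green": 0, "yellow": -10, "red": -INFINITY}


def add_scores(a, b):
    """Add two scores; -INFINITY plus anything stays -INFINITY."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    # Clamp ordinary sums so they never exceed the limits.
    return max(-INFINITY, min(INFINITY, a + b))


def node_health(readings):
    """Sum the scores of all #health-* readings for one node."""
    total = 0
    for colour in readings.values():
        total = add_scores(total, HEALTH_SCORES[colour])
    return total


print(node_health({"#health-ipmi": "yellow", "#health-smart": "yellow"}))
print(node_health({"#health-ipmi": "yellow", "#health-smart": "red"}))
```

Each extra "yellow" pushes the node's score further down, while a
single "red" pins it at -INFINITY no matter what the other sources say.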

I envision different daemons for each reporting interface.  For example,
if your system supports ipmi, then it will run a daemon that listens to
ipmi events, determines whether each event deals with system health, and
then sets a #health-ipmi variable to notify HA.
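
As a rough sketch of the last step of such a daemon, the Python below
turns an event into the command that would set the node attribute.  The
event names, the score chosen for each event, and the function name
health_update_command are all hypothetical.  It also assumes that
attrd_updater accepts the --name and --update options, which should be
checked against the installed version.  The command is only built, not
executed.

```python
# Hypothetical ipmi event names mapped to the score each one should set.
IPMI_HEALTH_SCORES = {
    "temperature_normal": "0",
    "fan_failure": "-10",
    "temperature_critical": "-INFINITY",
}


def health_update_command(event, attribute="#health-ipmi"):
    """Return the attrd_updater argv for a health event, else None.

    Events that do not deal with system health are ignored, so HA is
    only notified when the health variable actually needs to change.
    """
    score = IPMI_HEALTH_SCORES.get(event)
    if score is None:
        return None  # not a health event, nothing to tell HA
    return ["attrd_updater", "--name", attribute, "--update", score]


print(health_update_command("temperature_critical"))
print(health_update_command("chassis_intrusion"))
```

A smart or mcelog daemon would follow the same shape with its own
attribute name and its own event table.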

With this architecture, you don't want another daemon summing up
all #health-* variables that it sees.  There would be a delay, since an
event has to be reported, then detected and summed, before HA detects
the change and acts.  Also, HA would detect changes to all of the
sub-health variables even though it does not really care about them.
And what happens if the combining daemon dies?

> And of course eventually people will want more detail than the
> combined score...  "why is the health red? ".

Yes.  I think adequate logging would work here.  For example, the ipmi
daemon would log the fault that it detected when it sets the HA
#health-ipmi variable.

Mark