[Pacemaker] chain/cascade stonith agents?

Andrew Beekhof andrew at beekhof.net
Wed Aug 15 23:37:33 UTC 2012


On Thu, Aug 16, 2012 at 1:59 AM, Bob Haxo <bhaxo at sgi.com> wrote:
> Hi All,
>
> Is chaining/cascading of stonith agents implemented?

Yes.  But you'll want to use the current git HEAD.

> If yes, would
> someone please point me to the documentation?

Um, I'm sorry to say that it's not actually documented yet :-(

I can provide an example though; it should be reasonably self-explanatory:

<cib crm_feature_set="3.0.6" validate-with="pacemaker-1.2"
     admin_epoch="1" epoch="0" num_updates="0">
  <configuration>
...
    <fencing-topology>
      <!-- try poison-pill and fall back to power -->
      <fencing-level id="f-p1.1" target="pcmk-1" index="1" devices="poison-pill"/>
      <fencing-level id="f-p1.2" target="pcmk-1" index="2" devices="power"/>

      <!-- try disk and network, and fall back to power -->
      <fencing-level id="f-p2.1" target="pcmk-2" index="1" devices="disk,network"/>
      <fencing-level id="f-p2.2" target="pcmk-2" index="2" devices="power"/>
    </fencing-topology>
  </configuration>
  <status/>
</cib>
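
For the chain you describe (IPMI first, then APC if that fails), the same
pattern would look roughly like the fragment below. This is only a sketch:
the node name "node-1" and the device ids "fence-ipmi" and "fence-apc" are
placeholders I made up, and each device id has to match the id of a stonith
resource you have already defined in the resources section.

```xml
<fencing-topology>
  <!-- level 1: try the IPMI device first ("fence-ipmi" is a placeholder id) -->
  <fencing-level id="f-n1.1" target="node-1" index="1" devices="fence-ipmi"/>
  <!-- level 2: tried only if level 1 fails ("fence-apc" is a placeholder id) -->
  <fencing-level id="f-n1.2" target="node-1" index="2" devices="fence-apc"/>
</fencing-topology>
```

Levels are tried in index order, and you would repeat the pair of
fencing-level entries for each node that needs the fallback.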

> I'd like to implement a stonith chain in which stonith_ipmilan is the
> first stonith agent, and if that fails, a second stonith agent gets
> called (for example stonith_apc).
>
> ((In short, I find it tiresome to pull the power cable(s) for a HA
> failover demonstration only to have the failover, well, fail, when
> stonith_ipmilan goes into a failure loop when it doesn't get a response
> from the powered-off BMC.))
>
> Is there a way of setting stonith_ipmilan to give up and return a
> "stonith success"?  I was thinking that I would chain stonith_ipmilan
> with the ever popular stonith_null to achieve this end.

For a demo, sure.
But in production, how do you tell the difference between "I can't
reach the BMC because it's powered off" and "I can't reach the BMC
because my network link to it is disrupted"?

Note there is also 'stonith_admin --confirm $node' which will tell
stonith-ng and the rest of pacemaker that $node is safely down.

>
> Cheers,
> Bob Haxo
> bhaxo at sgi.com



