[Pacemaker] chain/cascade stonith agents?
Bob Haxo
bhaxo at sgi.com
Thu Aug 16 16:21:01 UTC 2012
On Thu, 2012-08-16 at 09:37 +1000, Andrew Beekhof wrote:
> On Thu, Aug 16, 2012 at 1:59 AM, Bob Haxo <bhaxo at sgi.com> wrote:
> > Hi All,
> >
> > Is chaining/cascading of stonith agents implemented?
>
> Yes. But you'll want to use the current git HEAD
>
> > If yes, would
> > someone please point me to the documentation?
>
> Um, I'm sorry to say that it's not actually documented yet :-(
>
> I can provide an example though, it should be reasonably self explanatory
>
> <cib crm_feature_set="3.0.6" validate-with="pacemaker-1.2"
>      admin_epoch="1" epoch="0" num_updates="0">
>   <configuration>
>     ...
>     <fencing-topology>
>       <!-- try poison-pill and fall back to power -->
>       <fencing-level id="f-p1.1" target="pcmk-1" index="1"
>                      devices="poison-pill"/>
>       <fencing-level id="f-p1.2" target="pcmk-1" index="2" devices="power"/>
>
>       <!-- try disk and network, and fall back to power -->
>       <fencing-level id="f-p2.1" target="pcmk-2" index="1"
>                      devices="disk,network"/>
>       <fencing-level id="f-p2.2" target="pcmk-2" index="2" devices="power"/>
>     </fencing-topology>
>   </configuration>
>   <status/>
> </cib>
> .
>
> > I'd like to implement a stonith chain in which stonith_ipmilan is the
> > first stonith agent, and if that fails, a second stonith agent gets
> > called (for example stonith_apc).
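Following the pattern of the fencing-topology example above, a chain of this kind might look roughly like the sketch below. The names "node1", "fence-ipmi" and "fence-apc" are hypothetical placeholders, not taken from this thread; the device names are assumed to be the ids of stonith resources already defined elsewhere in the CIB.

```xml
<fencing-topology>
  <!-- Hypothetical names: "node1" is the node to fence; "fence-ipmi" and
       "fence-apc" stand for stonith resource ids defined elsewhere. -->
  <!-- level 1: try the IPMI device first -->
  <fencing-level id="f-node1.1" target="node1" index="1" devices="fence-ipmi"/>
  <!-- level 2: fall back to the APC power switch if level 1 fails -->
  <fencing-level id="f-node1.2" target="node1" index="2" devices="fence-apc"/>
</fencing-topology>
```

As in the example above, levels are tried in ascending index order, so the second device is only used when the first level fails.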
> >
> > ((In short, I find it tiresome to pull the power cable(s) for a HA
> > failover demonstration only to have the failover, well, fail, when
> > stonith_ipmilan goes into a failure loop when it doesn't get a response
> > from the powered-off BMC.))
> >
> > Is there a way of setting stonith_ipmilan to give up and return a
> > "stonith success"? I was thinking that I would chain stonith_ipmilan
> > with the ever popular stonith_null to achieve this end.
>
> For a demo, sure.
> But in production, how do you tell the difference between "I can't
> reach the BMC because it's powered off" and "I can't reach the BMC
> because my network link to it is disrupted"?
>
> Note there is also 'stonith_admin --confirm $node' which will tell
> stonith-ng and the rest of pacemaker that $node is safely down.

Yes, it is a trade-off. Certainly during development, I'm less
concerned about a corrupted virt than about the hang that occurs when
the BMC of the powered-off system never responds. The virt can easily
be re-imaged.

Is there an easier way of forcing stonith_ipmilan to give up than
chaining to stonith_null?

Thanks,
Bob Haxo
>
> >
> > Cheers,
> > Bob Haxo
> > bhaxo at sgi.com
> >
> >
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org