[Pacemaker] trigger STONITH for testing purposes
Bob Haxo
bhaxo at sgi.com
Mon May 18 18:12:31 UTC 2009
OK, I've set the stonith action to "poweroff" and I already had the quorum
action set to "ignore". The "poweroff" makes it much easier to re-set
"stonith-enabled" to "false" so that I can get two systems online
again. ;-)
However, what I was really hoping for is to be able to reboot the fenced
system without triggering a reboot (or halt) of the working system. Here
are some specifics:
SLES11 HAE (GA)
external/ipmi
two HA servers
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.0.3-0080ec086ae9c20ad5c4c3562000c0ad68374f0a"/>
<nvpair id="cib-bootstrap-options-expected-quorum-votes" name="expected-quorum-votes" value="2"/>
<nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1242661586"/>
<nvpair id="cib-bootstrap-options-no_quorum_policy" name="no_quorum_policy" value="ignore"/>
<nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="true"/>
<nvpair id="nvpair-a8fa01f7-fd6c-4e9b-adf6-0e48250691f1" name="stonith-action" value="poweroff"/>
<nvpair id="nvpair-1d2c923d-7619-4b45-989a-698357f9f8cb" name="no-quorum-policy" value="ignore"/>
</cluster_property_set>
</crm_config>
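As a quick sanity check on the properties above, here is a minimal Python sketch that lists the cluster options and flags any option that is set under both an underscore and a hyphenated spelling (as with no_quorum_policy and no-quorum-policy here). The helper names `cluster_properties` and `duplicated_spellings` are illustrative only, and treating the underscore spelling as a leftover from an older configuration is an assumption, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the crm_config section posted above, closed so that it
# parses on its own. Only the options relevant to fencing and quorum are kept.
CRM_CONFIG = """
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <nvpair id="cib-bootstrap-options-no_quorum_policy" name="no_quorum_policy" value="ignore"/>
    <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="true"/>
    <nvpair id="nvpair-a8fa01f7-fd6c-4e9b-adf6-0e48250691f1" name="stonith-action" value="poweroff"/>
    <nvpair id="nvpair-1d2c923d-7619-4b45-989a-698357f9f8cb" name="no-quorum-policy" value="ignore"/>
  </cluster_property_set>
</crm_config>
"""


def cluster_properties(xml_text):
    """Return {name: value} for every nvpair in the fragment."""
    root = ET.fromstring(xml_text)
    return {nv.get("name"): nv.get("value") for nv in root.iter("nvpair")}


def duplicated_spellings(props):
    """Return option names containing '_' whose hyphenated form is also set."""
    return sorted(n for n in props if "_" in n and n.replace("_", "-") in props)


props = cluster_properties(CRM_CONFIG)
for name, value in sorted(props.items()):
    print(f"{name} = {value}")
print("duplicated spellings:", duplicated_spellings(props))
```

With the values posted above, the only duplicated spelling reported is no_quorum_policy. Both entries are set to "ignore", so they do not conflict with each other.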
And the two stonith resources:
<primitive class="stonith" id="ipmi_stonith_hikari" type="external/ipmi">
<meta_attributes id="ipmi_stonith_hikari-meta_attributes"/>
<operations id="ipmi_stonith_hikari-operations">
<op id="ipmi_stonith_hikari-op-monitor-15" interval="30" name="monitor" start-delay="30" timeout="30"/>
</operations>
<instance_attributes id="ipmi_stonith_hikari-instance_attributes">
<nvpair id="nvpair-d95c4018-1ebc-447b-9028-050e68c9929c" name="hostname" value="hikari"/>
<nvpair id="nvpair-3aca66aa-bb82-43ec-8b63-e936b2507fa3" name="ipaddr" value="172.16.1.247"/>
<nvpair id="nvpair-3f623098-c266-4132-8d9c-77744e0e8713" name="userid" value="ADMIN"/>
<nvpair id="nvpair-04e6a6d7-6541-45d4-8d36-9768e240e79d" name="passwd" value="ADMIN"/>
<nvpair id="nvpair-1a90ef3c-3b67-41c2-98cf-58b8a2f9cfe0" name="interface" value="lanplus"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="ipmi_stonith_hikari2" type="external/ipmi">
<meta_attributes id="ipmi_stonith_hikari2-meta_attributes">
<nvpair id="nvpair-88049439-39e2-459d-9820-78cdeb9ae282" name="target-role" value="started"/>
</meta_attributes>
<operations id="ipmi_stonith_hikari2-operations">
<op id="ipmi_stonith_hikari2-op-monitor-15" interval="30" name="monitor" start-delay="30" timeout="30"/>
</operations>
<instance_attributes id="ipmi_stonith_hikari2-instance_attributes">
<nvpair id="nvpair-c4b4e4ce-6f9a-4a8d-a7fb-b8726f09ccf0" name="hostname" value="hikari2"/>
<nvpair id="nvpair-e9d42aca-110f-4308-a3dd-645d793e49d3" name="ipaddr" value="172.16.1.248"/>
<nvpair id="nvpair-31b086de-5209-4361-a4b8-55460cad95a8" name="userid" value="ADMIN"/>
<nvpair id="nvpair-5b3c6b97-a49e-4d18-beea-6d7aaec000fa" name="passwd" value="ADMIN"/>
<nvpair id="nvpair-6f98c068-7b2e-4309-8f5b-2c7c2527cc93" name="interface" value="lanplus"/>
</instance_attributes>
</primitive>
And the relevant pair of constraints:
<rsc_location id="stonith_hikari_on_hikari2" node="hikari" rsc="ipmi_stonith_hikari" score="-INFINITY"/>
<rsc_location id="stonith_hikari2_on_hikari" node="hikari2" rsc="ipmi_stonith_hikari2" score="-INFINITY"/>
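The intent of these two constraints (each stonith resource is kept off the node it is meant to fence) can be checked mechanically. The following is a rough Python sketch under stated assumptions: the `<cib_fragment>` wrapper element and the helper names are made up for illustration, and the `hostname` parameter is taken to name the node a resource fences. It only confirms that the constraints are consistent with the resources. It does not explain the reboot loop.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the resources and constraints posted above. The
# <cib_fragment> wrapper is hypothetical; it only gives the parser one root.
CIB_FRAGMENT = """
<cib_fragment>
  <primitive class="stonith" id="ipmi_stonith_hikari" type="external/ipmi">
    <instance_attributes id="ipmi_stonith_hikari-instance_attributes">
      <nvpair id="nvpair-d95c4018-1ebc-447b-9028-050e68c9929c" name="hostname" value="hikari"/>
    </instance_attributes>
  </primitive>
  <primitive class="stonith" id="ipmi_stonith_hikari2" type="external/ipmi">
    <instance_attributes id="ipmi_stonith_hikari2-instance_attributes">
      <nvpair id="nvpair-c4b4e4ce-6f9a-4a8d-a7fb-b8726f09ccf0" name="hostname" value="hikari2"/>
    </instance_attributes>
  </primitive>
  <rsc_location id="stonith_hikari_on_hikari2" node="hikari" rsc="ipmi_stonith_hikari" score="-INFINITY"/>
  <rsc_location id="stonith_hikari2_on_hikari" node="hikari2" rsc="ipmi_stonith_hikari2" score="-INFINITY"/>
</cib_fragment>
"""


def fence_targets(root):
    """Map each stonith resource id to the node named in its 'hostname' parameter."""
    targets = {}
    for prim in root.iter("primitive"):
        if prim.get("class") != "stonith":
            continue
        for nv in prim.iter("nvpair"):
            if nv.get("name") == "hostname":
                targets[prim.get("id")] = nv.get("value")
    return targets


def banned_nodes(root):
    """Map each resource id to the set of nodes it is banned from (-INFINITY)."""
    banned = {}
    for loc in root.iter("rsc_location"):
        if loc.get("score") == "-INFINITY":
            banned.setdefault(loc.get("rsc"), set()).add(loc.get("node"))
    return banned


def unprotected(root):
    """Return stonith resources that are still allowed to run on their own target."""
    banned = banned_nodes(root)
    return sorted(
        rsc for rsc, node in fence_targets(root).items()
        if node not in banned.get(rsc, set())
    )


root = ET.fromstring(CIB_FRAGMENT)
print("fence targets:", fence_targets(root))
print("allowed on own target:", unprotected(root))
```

For the configuration as posted, the list of resources allowed on their own target comes back empty, so the location constraints themselves look consistent.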
Any suggestions as to what needs changing so that the stonith deathmarch
can be avoided?
Cheers and thanks,
Bob Haxo
SGI
On Fri, 2009-05-15 at 20:26 -0500, Karl Katzke wrote:
> Bob, as we've discussed a few other times recently, when you're
> testing (and depending on your preference in production), you may want
> to set the stonith policy to 'poweroff' as opposed to 'reboot'.
> Also, if you have a two-node cluster, pacemaker depends on quorum and
> the loss thereof creates another stonith event. You'll want to set the
> loss of quorum action to 'ignore'.
> ... in short, RTFM: http://www.clusterlabs.org/wiki/Documentation --
> Pacemaker Configuration Explained 1.0 has *everything* you need to
> know in it.
>
>
> -K
>
>
> ---
> Karl Katzke
> Systems Analyst II
> TAMU - DRGS
>
> >>> On 5/15/2009 at 7:22 PM, in message
> <1242433367.21186.4.camel at nalu.engr.sgi.com>, Bob Haxo <bhaxo at sgi.com> wrote:
>
> > Ok, never mind this question. "ifdown interface" works nicely to
> > trigger STONITH action.
> >
> > Unfortunately (if I may ask a new question) ... I now have one server
> > rebooting, then the other rebooting, and back to the first rebooting in
> > what looks to be an endless loop of reboots.
> >
> > Suggestions?
> >
> > Cheers,
> > Bob Haxo
> > SGI
> >
> > On Fri, 2009-05-15 at 16:53 -0700, Bob Haxo wrote:
> >
> > > Greetings,
> > >
> > > What manual administrative actions can be used to trigger STONITH
> > > action?
> > >
> > > I have created a pair of STONITH resources (external/ipmi) and would
> > > like to test that these resources work as expected (which, if I
> > > understand the default correctly, is to reboot the node).
> > >
> > > Thanks,
> > > Bob Haxo
> > > SGI
> > >
> > > SLES11 HAE
> > >
> > > _______________________________________________
> > > Pacemaker mailing list
> > > Pacemaker at oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >