[Pacemaker] trigger STONITH for testing purposes

Andrew Beekhof andrew at beekhof.net
Fri May 22 06:33:03 EDT 2009


On Wed, May 20, 2009 at 6:39 PM, Bob Haxo <bhaxo at sgi.com> wrote:
> Hi Andrew,
>
> I'd say you removed no-quorum-policy=ignore
>
> Actually, both no_quorum_policy and no-quorum-policy are set to
> "ignore", and expected-quorum-votes is set to "2":
>
>   <crm_config>
>     <cluster_property_set id="cib-bootstrap-options">
>       ...
>       <nvpair id="cib-bootstrap-options-expected-quorum-votes"
> name="expected-quorum-votes" value="2"/>
>       <nvpair id="cib-bootstrap-options-no_quorum_policy"
> name="no_quorum_policy" value="ignore"/>
>       <nvpair id="nvpair-1d2c923d-7619-4b45-989a-698357f9f8cb"
> name="no-quorum-policy" value="ignore"/>
>       ...
>       </cluster_property_set>
>    </crm_config>
>
> Removing the no-quorum-policy=ignore and no_quorum_policy=ignore (as in,
> deleting the variables) left the cluster unable to fail over after either
> an ifdown of the interface or a node reboot.  The state displayed by the
> GUI did not agree with the state displayed by crm_mon (the GUI showed the
> downed or rebooted node as still controlling resources, whereas crm_mon
> showed the resources unavailable ... both showed the inaccessible node as
> offline).

Assuming stonith-enabled was set to false, crm_mon is correct as the
cluster assumes that the node is cleanly down*.
You should file a bug for the GUI in that case.

* Which is clearly insane and going to cause data corruption some day,
but it's also the only way the cluster can continue if STONITH is
disabled.
For this reason SUSE won't support any cluster without a valid STONITH setup.
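To make that concrete, here is a minimal sketch of enabling fencing with the crm shell; the device type, node name, and IPMI parameters below are illustrative placeholders, not values taken from this thread:

```shell
# Turn fencing back on (the cluster refuses to assume failed nodes are
# cleanly down once this is true):
crm configure property stonith-enabled=true

# Hypothetical STONITH resource using an IPMI-based fencing device;
# hostname, ipaddr, userid and passwd are placeholder values:
crm configure primitive st-node1 stonith:external/ipmi \
    params hostname=node1 ipaddr=192.168.1.10 userid=admin passwd=secret
```

With a working STONITH resource in place, the cluster can recover resources from a node it has actually fenced, instead of guessing that the node is down.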

>
> Setting no-quorum-policy=stop had the same results: the resources did not
> migrate to the working system until no-quorum-policy was returned to
> ignore.  One of the tests led to filesystem corruption.

Without STONITH I can easily believe this happened.
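Since the subject of this thread is triggering STONITH for testing, one common way to exercise fencing once it is configured is to make a node look dead by killing its cluster stack; this sketch assumes an OpenAIS-based stack (the daemon name may differ on other stacks):

```shell
# Run on the node you want fenced.  Killing the cluster stack outright
# (no clean shutdown) makes the surviving node declare this one failed,
# which should trigger a STONITH operation against it:
killall -9 aisexec
```

If fencing is set up correctly, the other node should power-cycle or reset this one rather than merely marking it offline.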

> Very messy.  (this is a test-only setup, so no real data is present)
>
> So, no, the change that I made was neither deleting nor setting
> no-quorum-policy=stop.

Strange.

> Setting no-quorum-policy=ignore seems to be required
> for the cluster to support migrations and failovers.

For two node clusters, yes.

Heartbeat pretends that 2-node clusters always have quorum, but this is
not the case when using OpenAIS.
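For a two-node cluster the property can be set with crm_attribute, which ships with Pacemaker; a sketch (the exact option spelling may vary between versions):

```shell
# Keep resources running when quorum is lost.  Only sensible for
# two-node clusters, and only safe with a working STONITH setup:
crm_attribute --type crm_config --name no-quorum-policy --update ignore
```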



