[Pacemaker] Split brain and STONITH behavior (VMware fencing)

Andrew Beekhof andrew at beekhof.net
Wed Oct 29 21:24:05 UTC 2014


> On 29 Oct 2014, at 7:48 pm, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
> 
> On Wed, Oct 29, 2014 at 10:46 AM, Ariel S <ariel_bis2030 at yahoo.co.id> wrote:
>> Hello,
>> 
>> I'm trying to understand how this STONITH works.
>> 
>> I have 2 VMware VMs (moon1a, moon1b) on two different hosts. Each has 2 NICs
>> assigned: eth0 for heartbeat, while eth1 is used for everything else.
>> 
>> This is my testing configuration:
>> 
>>    node $id="168428034" moon1a
>>    node $id="168428035" moon1b
>>    primitive Foo ocf:heartbeat:Dummy
>>    primitive stonith_moon1a stonith:fence_vmware_soap \
>>            params ipaddr="192.168.1.134" login="foo" \
>>                    uuid="42053b22-d3fd-25fe-6fb3-7cb2c7cd2c63" \
>>                    action="off" verbose="true" passwd="bar" \
>>                    ssl="true" \
>>            op monitor interval="60s"
>>    primitive stonith_moon1b stonith:fence_vmware_soap \
>>            params ipaddr="192.168.1.134" login="foo" \
>>                    uuid="4205b986-4426-5de4-1069-b10a77123bc4" \
>>                    action="off" verbose="true" passwd="bar" \
>>                    ssl="true" \
>>            op monitor interval="60s"
>>    clone FooClones Foo
>>    location loc_stonith_moon1a stonith_moon1a -inf: moon1a
>>    location loc_stonith_moon1b stonith_moon1b -inf: moon1b
>>    property $id="cib-bootstrap-options" \
>>            dc-version="1.1.10-42f2063" \
>>            cluster-infrastructure="corosync" \
>>            stonith-enabled="true" \
>>            last-lrm-refresh="1414565715"
>>    rsc_defaults $id="rsc-options" \
>>            resource-stickiness="200"
>> 
>> 
>> The vCenter is at 192.168.1.134 and the UUIDs were taken from a list
>> generated by fence_vmware_soap.
>> 
>> When I do fencing manually using:
>> 
>>    # fence_vmware_soap -z -a 192.168.1.134 \
>>                        -l foo -p bar \
>>                        -U 4205b986-4426-5de4-1069-b10a77123bc4 \
>>                        -o off
>> 
>> from moon1a, as expected the moon1b (4205b986-4426-5de4-1069-b10a77123bc4)
>> VM died, so the configuration should be right, I think.
>> 
>> But so far I can't emulate split brain by killing corosync like this:
>> 
>>    # killall -9 corosync
>> 
> 
> Killing corosync is not, strictly speaking, split-brain; it is an emulation
> of (partial) node failure.

It's almost the opposite of split-brain.
For split-brain, both sides need to believe they are fully functional and that it is the other side that has a problem.

> 
>> 
>> My questions:
>> 
>>    1.    Is my configuration correct?
> 
> On a two-node cluster you also need no-quorum-policy=ignore, otherwise the
> remaining node won't initiate fencing.
> 
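
For reference, the property suggested above could be expressed in the same crm syntax as the posted configuration roughly as follows. This is only a sketch: it reuses the existing cib-bootstrap-options property set and shows only the added line, and whether it is required depends on how quorum is configured for the cluster.

```
property $id="cib-bootstrap-options" \
        stonith-enabled="true" \
        no-quorum-policy="ignore"
```
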
>>    2.    How does one cause a split-brain to trigger the expected STONITH
>> behavior?

Use a firewall to block corosync's ports on both hosts.
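
A minimal sketch of blocking corosync traffic with iptables, under some assumptions: corosync is using its default UDP ports 5404 and 5405 (check mcastport in corosync.conf), and iptables is the firewall in use on the nodes. The helper names run, block_corosync and unblock_corosync are made up for illustration. By default the script only prints the commands; set DRY_RUN=0 and run as root to actually apply them.

```shell
#!/bin/sh
# Assumption: corosync uses its default UDP ports 5404-5405.
# Override CORO_PORTS if mcastport in corosync.conf differs.
CORO_PORTS="${CORO_PORTS:-5404:5405}"

# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or execute a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Drop corosync traffic in both directions so that each node stays up
# but can no longer see its peer.
block_corosync() {
    run iptables -I INPUT -p udp --dport "$CORO_PORTS" -j DROP
    run iptables -I OUTPUT -p udp --dport "$CORO_PORTS" -j DROP
}

# Remove the rules again once the test is finished.
unblock_corosync() {
    run iptables -D INPUT -p udp --dport "$CORO_PORTS" -j DROP
    run iptables -D OUTPUT -p udp --dport "$CORO_PORTS" -j DROP
}
```

Calling block_corosync on both moon1a and moon1b at roughly the same time leaves both nodes running while each loses contact with the other, which is the condition described above. Call unblock_corosync afterwards on whichever node survives.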

>> 
>> 
>> 
>> Thank you,
>> Ariel
>> 
>> 
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> 
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org