[Pacemaker] Split brain and STONITH behavior (VMware fencing)
Andrew Beekhof
andrew at beekhof.net
Wed Oct 29 21:24:05 UTC 2014
> On 29 Oct 2014, at 7:48 pm, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
>
> On Wed, Oct 29, 2014 at 10:46 AM, Ariel S <ariel_bis2030 at yahoo.co.id> wrote:
>> Hello,
>>
>> I'm trying to understand how this STONITH works.
>>
>> I have 2 VMware VMs (moon1a, moon1b) on two different hosts. Each has 2 NICs
>> assigned: eth0 for heartbeat, while eth1 is used for everything else.
>>
>> This is my testing configuration:
>>
>> node $id="168428034" moon1a
>> node $id="168428035" moon1b
>> primitive Foo ocf:heartbeat:Dummy
>> primitive stonith_moon1a stonith:fence_vmware_soap \
>>     params ipaddr="192.168.1.134" login="foo" \
>>     uuid="42053b22-d3fd-25fe-6fb3-7cb2c7cd2c63" \
>>     action="off" verbose="true" passwd="bar" \
>>     ssl="true" \
>>     op monitor interval="60s"
>> primitive stonith_moon1b stonith:fence_vmware_soap \
>>     params ipaddr="192.168.1.134" login="foo" \
>>     uuid="4205b986-4426-5de4-1069-b10a77123bc4" \
>>     action="off" verbose="true" passwd="bar" \
>>     ssl="true" \
>>     op monitor interval="60s"
>> clone FooClones Foo
>> location loc_stonith_moon1a stonith_moon1a -inf: moon1a
>> location loc_stonith_moon1b stonith_moon1b -inf: moon1b
>> property $id="cib-bootstrap-options" \
>>     dc-version="1.1.10-42f2063" \
>>     cluster-infrastructure="corosync" \
>>     stonith-enabled="true" \
>>     last-lrm-refresh="1414565715"
>> rsc_defaults $id="rsc-options" \
>>     resource-stickiness="200"
>>
>>
>> The vCenter is at 192.168.1.134 and the UUIDs were taken from a list generated by
>> fence_vmware_soap.
>>
>> When I do fencing manually using:
>>
>> # fence_vmware_soap -z -a 192.168.1.134 \
>> -l foo -p bar \
>> -U 4205b986-4426-5de4-1069-b10a77123bc4 \
>> -o off
>>
>> from moon1a, as expected the moon1b VM (4205b986-4426-5de4-1069-b10a77123bc4)
>> died, so the configuration should be right, I think.
>>
>> But so far I can't emulate split brain by killing corosync like this:
>>
>> # killall -9 corosync
>>
>
> Killing corosync is not, strictly speaking, split-brain; it is an emulation
> of (partial) node failure.
It's almost the opposite of split-brain.
For split-brain, both sides need to believe they are fully functional and that it is the other side that has a problem.
>
>>
>> My questions:
>>
>> 1. Is my configuration correct?
>
> On a two-node cluster you also need no-quorum-policy=ignore; otherwise the
> remaining node won't initiate fencing.
>
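As a sketch of what that might look like, here is the property block from the configuration quoted above with the quorum policy added. The only new line is no-quorum-policy; the other values are copied from the original post. Running "crm configure property no-quorum-policy=ignore" should have the same effect.

```
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-42f2063" \
    cluster-infrastructure="corosync" \
    stonith-enabled="true" \
    no-quorum-policy="ignore" \
    last-lrm-refresh="1414565715"
```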
>> 2. How does one cause a split-brain to trigger the expected stonith
>> behavior?
Use a firewall to block corosync's ports on both hosts, so that each node stays up but can no longer see the other.
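A rough sketch of one way to do the blocking with iptables follows. It assumes corosync is on its default UDP ports (5404-5405) and that eth0 is the heartbeat interface, as in the setup described above; check mcastport in corosync.conf if yours differs. The function name, IFACE and PORTS are made up for this example. It only prints the command, so you can review it before running it as root.

```shell
#!/bin/sh
# Assumptions: default corosync UDP ports and eth0 as the heartbeat NIC.
# Override IFACE/PORTS in the environment if your cluster differs.
IFACE="${IFACE:-eth0}"
PORTS="${PORTS:-5404:5405}"

# Print the iptables command for an action:
#   -I inserts the DROP rule (block), -D deletes it again (unblock).
corosync_block_rule() {
    echo "iptables $1 INPUT -i $IFACE -p udp --dport $PORTS -j DROP"
}

# Dry run: show the blocking command without touching the firewall.
corosync_block_rule -I
```

Run the printed command on both nodes at roughly the same time, so that each side loses sight of the other while still considering itself healthy. Use the -D form afterwards to remove the rule.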
>>
>>
>>
>> Thank you,
>> Ariel
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org