[Pacemaker] DRBD primary/primary + Pacemaker goes into split brain after crm node standby/online

Alexis de BRUYN alexis.mailinglist at de-bruyn.fr
Wed Jun 11 14:16:10 UTC 2014


On 10.06.2014 05:28, Digimer wrote:
> On 09/06/14 07:44 PM, Andrew Beekhof wrote:
>>
>> On 10 Jun 2014, at 4:07 am, Alexis de BRUYN
>> <alexis.mailinglist at de-bruyn.fr> wrote:
>>
>>> Hi Everybody,
>>>
>>> I have an issue with a 2-node Debian Wheezy primary/primary DRBD
>>> Pacemaker/Corosync configuration.
>>>
>>> After a 'crm node standby' then a 'crm node online', the DRBD volume
>>> stays in a 'split brain state' (cs:StandAlone ro:Primary/Unknown).
>>>
>>> A soft or hard reboot of one node gets rid of the split brain and/or
>>> doesn't create one.
>>>
>>> I have followed http://www.drbd.org/users-guide-8.3/ and keep my tests
>>> as simple as possible (no activity and no filesystem on the DRBD
>>> volume).
>>>
>>> I don't see what I am doing wrong. Could anybody help me with this
>>> please.
>>
>> There could be a pacemaker bug.
>> Master/slave resources are quite complex internally and have received
>> many improvements in the years since 1.1.7.
>> So simply upgrading pacemaker could be the answer.
> 
> In addition, setup/test stonith in pacemaker, then hook DRBD's fencing
> into pacemaker (set 'fencing resource-and-stonith;' and 'fence-handler
> /path/to/crm-fence-peer.sh'). This way, if DRBD is about to split-brain,
> it will instead block and call a fence, and stay blocked until the fence
> succeeds. It will only resume when the peer is in a known state (off),
> thus avoiding split-brains entirely.
Thanks Digimer for your suggestion, but unfortunately I don't have IPMI
hardware on my test machines right now.
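
For reference, the DRBD side of the suggestion above might look roughly
like the fragment below. This is only a sketch under assumptions: the
resource name "r0" is hypothetical, and the /usr/lib/drbd/ paths are where
the helper scripts are commonly installed, so they may differ on a given
system. In the DRBD 8.3 configuration the handler is named "fence-peer",
and it only helps once stonith is configured and tested in Pacemaker.

```
# Sketch only: "r0" and the script paths are assumptions, adjust to the
# local setup.
resource r0 {
  disk {
    # Freeze I/O and call the fence-peer handler when the peer is lost.
    fencing resource-and-stonith;
  }
  handlers {
    # Asks Pacemaker to keep the peer from being promoted.
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    # Removes that restriction again once the resync has finished.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

The remaining sections of the resource (net, on <host>, and so on) are
left out here and would stay as they already are.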

> 
> And, as Andrew said, upgrade pacemaker. :)
> 

-- 
Alexis de BRUYN



