[Pacemaker] DRBD primary/primary + Pacemaker goes into split brain after crm node standby/online
Digimer
lists at alteeve.ca
Tue Jun 10 03:28:48 UTC 2014
On 09/06/14 07:44 PM, Andrew Beekhof wrote:
>
> On 10 Jun 2014, at 4:07 am, Alexis de BRUYN <alexis.mailinglist at de-bruyn.fr> wrote:
>
>> Hi Everybody,
>>
>> I have an issue with a 2-node Debian Wheezy primary/primary DRBD
>> Pacemaker/Corosync configuration.
>>
>> After a 'crm node standby' then a 'crm node online', the DRBD volume
>> stays in a 'split brain state' (cs:StandAlone ro:Primary/Unknown).
>>
>> A soft or hard reboot of one node gets rid of the split brain and/or
>> doesn't create one.
>>
>> I have followed http://www.drbd.org/users-guide-8.3/ and keep my tests
>> as simple as possible (no activity and no filesystem on the DRBD volume).
>>
>> I don't see what I am doing wrong. Could anybody help me with this please.
>
> There could be a pacemaker bug.
> Master/slave resources are quite complex internally and have received many improvements in the years since 1.1.7.
> So simply upgrading pacemaker could be the answer.
In addition, set up and test stonith in pacemaker, then hook DRBD's fencing
into pacemaker (set 'fencing resource-and-stonith;' and point the
'fence-peer' handler at /path/to/crm-fence-peer.sh). This way, if DRBD is
about to split-brain,
it will instead block and call a fence, and stay blocked until the fence
succeeds. It will only resume when the peer is in a known state (off),
thus avoiding split-brains entirely.
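
For reference, the DRBD side of that hookup might look roughly like the
fragment below. This is only a sketch for DRBD 8.3/8.4-style config, not a
tested setup: the resource name 'r0' is a placeholder, the script paths are
left as /path/to/ because they depend on your distribution's packaging, and
the 'after-resync-target' unfence line is the usual companion from the DRBD
User's Guide rather than something mentioned above.

```
# Sketch only; 'r0' is a hypothetical resource name.
resource r0 {
  disk {
    # On loss of the peer, freeze I/O and call the fence-peer handler.
    fencing resource-and-stonith;
  }
  handlers {
    # Asks pacemaker to fence the peer; adjust the path for your install.
    fence-peer "/path/to/crm-fence-peer.sh";
    # Assumed companion: clears the fencing constraint once resync finishes.
    after-resync-target "/path/to/crm-unfence-peer.sh";
  }
}
```

The fence-peer handler only helps if stonith is actually configured and
working in pacemaker, so test that first.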
And, as Andrew said, upgrade pacemaker. :)
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?