[Pacemaker] DRBD primary/primary + Pacemaker goes into split brain after crm node standby/online
Alexis de BRUYN
alexis.mailinglist at de-bruyn.fr
Wed Jun 11 16:13:18 CEST 2014
On 10.06.2014 01:44, Andrew Beekhof wrote:
>
> On 10 Jun 2014, at 4:07 am, Alexis de BRUYN <alexis.mailinglist at de-bruyn.fr> wrote:
>
>> Hi Everybody,
>>
>> I have an issue with a 2-node Debian Wheezy primary/primary DRBD
>> Pacemaker/Corosync configuration.
>>
>> After a 'crm node standby' then a 'crm node online', the DRBD volume
>> stays in a 'split brain state' (cs:StandAlone ro:Primary/Unknown).
>>
>> A soft or hard reboot of one node gets rid of the split brain and/or
>> doesn't create one.
>>
>> I have followed http://www.drbd.org/users-guide-8.3/ and keep my tests
>> as simple as possible (no activity and no filesystem on the DRBD volume).
>>
>> I don't see what I am doing wrong. Could anybody help me with this please.
>
> There could be a pacemaker bug.
> Master/slave resources are quite complex internally and have received many improvements in the years since 1.1.7.
> So simply upgrading pacemaker could be the answer.

Hi Andrew,

I have followed your advice and updated Pacemaker/Corosync by installing
a fresh Debian Sid, but I still have the issue with the following packages:
# uname -a
Linux testvm1 3.13-1-amd64 #1 SMP Debian 3.13.10-1 (2014-04-15) x86_64 GNU/Linux
# cat /etc/issue && dpkg -l | egrep "corosync|pacemaker|drbd"
Debian GNU/Linux jessie/sid \n \l
ii  corosync             1.4.6-1               amd64  Standards-based cluster framework (daemon and modules)
ii  crmsh                1.2.6+git+e77add-1.2  amd64  CRM shell for the pacemaker cluster manager
ii  drbd8-utils          2:8.4.4-1             amd64  RAID 1 over TCP/IP for Linux (user utilities)
ii  pacemaker            1.1.10+git20130802-4  amd64  HA cluster resource manager
ii  pacemaker-cli-utils  1.1.10+git20130802-4  amd64  Command line interface utilities for Pacemaker
And with the "experimental" packages, I also cannot connect to the cluster
via crmsh:
# cat /etc/issue && dpkg -l | egrep "corosync|pacemaker|drbd"
Debian GNU/Linux jessie/sid \n \l
ii  corosync             2.3.3-1               amd64  Standards-based cluster framework (daemon and modules)
ii  crmsh                1.2.6+git+e77add-1.2  amd64  CRM shell for the pacemaker cluster manager
ii  drbd8-utils          2:8.4.4-1             amd64  RAID 1 over TCP/IP for Linux (user utilities)
ii  libcorosync-common4  2.3.3-1               amd64  Standards-based cluster framework, common library
ii  pacemaker            1.1.11-1              amd64  HA cluster resource manager
ii  pacemaker-cli-utils  1.1.11-1              amd64  Command line interface utilities for Pacemaker
I will try to build the latest versions of Pacemaker/Corosync on a Debian
Wheezy before reporting my issue via Bugzilla.
Thanks for your help.
--
Alexis de BRUYN