[Pacemaker] [Cluster-devel] [PATCH] Fix pacemaker's wrong quorum view in a CMAN+pacemaker cluster

Andrew Beekhof andrew at beekhof.net
Mon Mar 14 06:41:51 EDT 2011


On Mon, Mar 14, 2011 at 11:04 AM, Simone Gotti <simone.gotti at gmail.com> wrote:
> On 03/14/2011 08:36 AM, Andrew Beekhof wrote:
>> On Sun, Mar 13, 2011 at 2:17 PM, Simone Gotti <simone.gotti at gmail.com> wrote:
>>> Hi all,
>>>
>>> Testing a cman+pacemaker cluster on rhel6 I noticed a very nasty
>>> behavior when some nodes were leaving and rejoining the cluster. When a
>>> node starts leaving and rejoining the cluster, pacemaker's quorum view
>>> sometimes starts to differ from cman's quorum view. The one not telling
>>> the truth was pacemaker.
>> Do they ever start agreeing again?  In other words, is the situation
>> transient or is Pacemaker always 1 (or more) behind after that?
>
> It looks like it will always be behind, and will fall further behind at
> every new cluster change. I don't see any other part of the code where
> CMAN's events are dequeued other than this one.

Ok, it looks like the functions behind G_main_add_fd() don't work as
one would expect.
This is a very surprising thing to find out after all these years.

Patch applied. Thanks!
Could I trouble you to create a RHEL6 bug for this though?
That will allow me to fix it there too.

>
> The event just says whether or not we have quorum. So it can happen that
> an old dequeued message says the same as the current real state, but
> that's just a coincidence.
>
>
>
>>> I reproduced the problem with a simple test case made of 2 nodes using
>>> cman (no two_node flag) and pacemaker (started only on the first node:
>>> pcmk01).
>>>
>>> For the tests I was using the latest version of pacemaker (1.1.5) while
>>> keeping the original versions of corosync and cluster (cman) packages
>>> provided by rhel6 (corosync-1.2.3-21.el6.x86_64,
>>> cman-3.0.12-23.el6.4.x86_64)
>>>
>>> The problem is that when a node joins a cluster (starting cman), the cman
>>> on the other nodes emits not one but two events (I haven't investigated
>>> whether this is normal or present only in some versions of cman). But when
>>> crmd calls cman_dispatch it uses the flag CMAN_DISPATCH_ONE, so only one of
>>> the two events is dequeued. At the next cluster event, the leftover old
>>> event is dequeued instead of the new one.
>>>
>>> The fix I tried uses CMAN_DISPATCH_ALL instead of CMAN_DISPATCH_ONE, and
>>> it looks like it's working.
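
The difference between the two dispatch modes described above can be sketched with a toy model. This is a simulation in Python, not libcman or Pacemaker code: ToyCman, emit, dispatch and backlog_after_joins are hypothetical names, the strings standing in for CMAN_DISPATCH_ONE and CMAN_DISPATCH_ALL are placeholders, and it assumes (as the report suggests) that each join produces two events while the dispatch call runs once per cluster change.

```python
from collections import deque

# Placeholders for the two libcman flags discussed in this thread.
DISPATCH_ONE = "one"
DISPATCH_ALL = "all"


class ToyCman:
    """Toy event queue standing in for the cman connection (hypothetical)."""

    def __init__(self):
        self.queue = deque()   # events emitted but not yet delivered
        self.delivered = []    # events handed to the callback so far

    def emit(self, quorate, copies=1):
        """Enqueue `copies` identical events carrying the quorum state."""
        for _ in range(copies):
            self.queue.append(quorate)

    def dispatch(self, mode):
        """Deliver one pending event, or all of them, depending on mode."""
        if mode == DISPATCH_ALL:
            count = len(self.queue)
        else:
            count = min(1, len(self.queue))
        for _ in range(count):
            self.delivered.append(self.queue.popleft())
        return count


def backlog_after_joins(mode, joins):
    """Number of undelivered events left after `joins` node joins."""
    cman = ToyCman()
    for _ in range(joins):
        cman.emit(True, copies=2)  # assumption: a join emits two events
        cman.dispatch(mode)        # assumption: one dispatch call per change
    return len(cman.queue)
```

In this model, backlog_after_joins(DISPATCH_ONE, n) grows by one stale event per join, while backlog_after_joins(DISPATCH_ALL, n) stays at zero, which is the behavior the proposed change relies on.
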
>>>
>>> I'm CCing the cluster-devel list as they may be interested in the double
>>> event emitted by cman.
>>>
>>>
>>> Thanks.
>>>
>>> Bye!
>>>
>>>
>>> == Test case ==
>>>
>>> === Without the patch ===
>>>
>>> Start with both nodes with cman started (so the cluster is quorate).
>>>
>>>
>>> Now stop cman on pcmk02. Output on pcmk01:
>>>
>>> pcmk01 corosync[16793]:   [CMAN  ] quorum lost, blocking activity
>>> pcmk01 corosync[16793]:   [QUORUM] This node is within the non-primary
>>> component and will NOT provide any services.
>>> pcmk01 corosync[16793]:   [QUORUM] Members[1]: 1
>>> pcmk01 corosync[16793]:   [TOTEM ] A processor joined or left the
>>> membership and a new membership was formed.
>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 1
>>> pcmk01 corosync[16793]:   [CPG   ] chosen downlist from node r(0)
>>> ip(192.168.200.71)
>>> pcmk01 corosync[16793]:   [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>> pcmk01 crmd: [16993]: notice: cman_event_callback: Membership 668:
>>> quorum lost
>>>
>>> Only one event is enqueued.
>>>
>>> Now start cman again on pcmk02. Output on pcmk01:
>>>
>>> pcmk01 corosync[16793]:   [TOTEM ] A processor joined or left the
>>> membership and a new membership was formed.
>>> pcmk01 corosync[16793]:   [CMAN  ] quorum regained, resuming activity
>>> pcmk01 corosync[16793]:   [QUORUM] This node is within the primary
>>> component and will provide service.
>>> pcmk01 corosync[16793]:   [QUORUM] Members[2]: 1 2
>>> pcmk01 corosync[16793]:   [QUORUM] Members[2]: 1 2
>>> pcmk01 crmd: [16993]: notice: cman_event_callback: Membership 672:
>>> quorum acquired
>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 0
>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 0
>>> pcmk01 corosync[16793]:   [CPG   ] chosen downlist from node r(0)
>>> ip(192.168.200.71)
>>> pcmk01 corosync[16793]:   [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>>
>>> As you can see, two events are enqueued and only one is dequeued (due to
>>> the CMAN_DISPATCH_ONE flag passed to cman_dispatch).
>>>
>>> Quorum is regained on both cman and crmd. But there's another event
>>> still in the queue saying that quorum is regained.
>>>
>>>
>>> Now stop cman again on pcmk02. Output on pcmk01:
>>>
>>> pcmk01 corosync[16793]:   [CMAN  ] quorum lost, blocking activity
>>> pcmk01 corosync[16793]:   [QUORUM] This node is within the non-primary
>>> component and will NOT provide any services.
>>> pcmk01 corosync[16793]:   [QUORUM] Members[1]: 1
>>> pcmk01 corosync[16793]:   [TOTEM ] A processor joined or left the
>>> membership and a new membership was formed.
>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 1
>>> pcmk01 corosync[16793]:   [CPG   ] chosen downlist from node r(0)
>>> ip(192.168.200.71)
>>> pcmk01 corosync[16793]:   [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>> pcmk01 crmd: [16993]: info: cman_event_callback: Membership 676: quorum
>>> retained
>>>
>>> CMAN says that quorum is lost and only one event is emitted. But crmd
>>> dequeued the leftover event from the previous step and thinks that we
>>> still have quorum.
>>>
>>>
>>> Now start cman again on pcmk02. Output on pcmk01:
>>>
>>> pcmk01 corosync[16793]:   [TOTEM ] A processor joined or left the
>>> membership and a new membership was formed.
>>> pcmk01 corosync[16793]:   [CMAN  ] quorum regained, resuming activity
>>> pcmk01 corosync[16793]:   [QUORUM] This node is within the primary
>>> component and will provide service.
>>> pcmk01 corosync[16793]:   [QUORUM] Members[2]: 1 2
>>> pcmk01 corosync[16793]:   [QUORUM] Members[2]: 1 2
>>> pcmk01 crmd: [16993]: notice: cman_event_callback: Membership 680:
>>> quorum lost
>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 0
>>> pcmk01 corosync[16793]:   [CPG   ] downlist received left_list: 0
>>> pcmk01 corosync[16793]:   [CPG   ] chosen downlist from node r(0)
>>> ip(192.168.200.71)
>>> pcmk01 corosync[16793]:   [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>>
>>> CMAN says that quorum is regained, but crmd again dequeued an old
>>> event and now says that quorum is lost. And so on...
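
The four steps of this test case can be replayed with a small simulation to see where the two views diverge. This is only a sketch under assumptions taken from the logs above: a node leaving emits one event, a node joining emits two, and the crmd callback is entered once per cluster change. The names replay and TEST_CASE are hypothetical and nothing here calls the real cman API.

```python
from collections import deque


def replay(changes, drain_all):
    """Replay cluster changes and return (cman_view, crmd_view) per step.

    changes:   list of (quorate, number_of_events_emitted) tuples
    drain_all: True models dequeuing every pending event per wake-up,
               False models dequeuing exactly one
    """
    queue = deque()
    crmd_view = True  # both nodes start with cman running, so quorate
    views = []
    for quorate, n_events in changes:
        queue.extend([quorate] * n_events)
        # One callback invocation per cluster change.
        while queue:
            crmd_view = queue.popleft()
            if not drain_all:
                break
        views.append((quorate, crmd_view))
    return views


# stop, start, stop, start of cman on the second node
TEST_CASE = [(False, 1), (True, 2), (False, 1), (True, 2)]

if __name__ == "__main__":
    for drain_all in (False, True):
        print("drain_all =", drain_all)
        for step, (cman_view, crmd_view) in enumerate(replay(TEST_CASE, drain_all), 1):
            status = "ok" if cman_view == crmd_view else "MISMATCH"
            print(f"  step {step}: cman={cman_view} crmd={crmd_view} {status}")
```

With drain_all=False the model agrees with cman for the first two steps and disagrees on the third and fourth, matching the "quorum retained" and "quorum lost" messages logged by crmd above. With drain_all=True the two views agree at every step.
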
>>>
>>>
>>>
>>> === With the patch ===
>>>
>>> Stop cman on pcmk02. Output on pcmk01:
>>>
>>> pcmk01 corosync[13149]:   [CMAN  ] quorum lost, blocking activity
>>> pcmk01 corosync[13149]:   [QUORUM] This node is within the non-primary
>>> component and will NOT provide any services.
>>> pcmk01 corosync[13149]:   [QUORUM] Members[1]: 1
>>> pcmk01 corosync[13149]:   [TOTEM ] A processor joined or left the
>>> membership and a new membership was formed.
>>> pcmk01 corosync[13149]:   [CPG   ] downlist received left_list: 1
>>> pcmk01 corosync[13149]:   [CPG   ] chosen downlist from node r(0)
>>> ip(192.168.200.71)
>>> pcmk01 corosync[13149]:   [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>>
>>> pcmk01 crmd: [13351]: notice: cman_event_callback: Membership 648:
>>> quorum lost
>>>
>>> Only one event is enqueued.
>>>
>>>
>>> Now start cman again on pcmk02. Output on pcmk01:
>>>
>>> pcmk01 corosync[13149]:   [TOTEM ] A processor joined or left the
>>> membership and a new membership was formed.
>>> pcmk01 corosync[13149]:   [CMAN  ] quorum regained, resuming activity
>>> pcmk01 corosync[13149]:   [QUORUM] This node is within the primary
>>> component and will provide service.
>>> pcmk01 corosync[13149]:   [QUORUM] Members[2]: 1 2
>>> pcmk01 corosync[13149]:   [QUORUM] Members[2]: 1 2
>>> pcmk01 crmd: [13351]: notice: cman_event_callback: Membership 652:
>>> quorum acquired
>>> pcmk01 corosync[13149]:   [CPG   ] downlist received left_list: 0
>>> pcmk01 corosync[13149]:   [CPG   ] downlist received left_list: 0
>>> pcmk01 corosync[13149]:   [CPG   ] chosen downlist from node r(0)
>>> ip(192.168.200.71)
>>> pcmk01 corosync[13149]:   [MAIN  ] Completed service synchronization,
>>> ready to provide service.
>>> pcmk01 crmd: [13351]: info: cman_event_callback: Membership 652: quorum
>>> retained
>>>
>>> As you can see, two events are enqueued and both are dequeued.
>>>
>>>
>>>
>>> --
>>> Simone Gotti
>>>
>>>
>>>
>
>
> --
> Simone Gotti
>
>



