[Pacemaker] Recovery after lost quorum
Denis Witt
denis.witt at concepts-and-training.de
Wed Jun 5 00:43:30 UTC 2013
On 05.06.2013 at 02:15, Andrew Beekhof <andrew at beekhof.net> wrote:
>> Jun 5 01:11:06 test4 pengine: [18625]: WARN: cluster_status: We do not have quorum - fencing and resource management disabled
>> Jun 5 01:11:06 test4 pengine: [18625]: notice: LogActions: Start pingtest:0#011(test4 - blocked)
>> Jun 5 01:11:06 test4 pengine: [18625]: notice: LogActions: Start drbd:0#011(test4 - blocked)
>
> Here's your reason. We didn't get quorum until:
>> Jun 5 01:11:11 test4 crmd: [18626]: notice: ais_dispatch_message: Membership 128: quorum acquired
Hi Andrew,
I thought this meant that there is quorum. Anyway, crm status says:
root@test4:~# crm status
============
Last updated: Wed Jun 5 02:36:20 2013
Last change: Tue Jun 4 17:55:28 2013 via crm_attribute on backup3
Stack: openais
Current DC: test4 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
8 Resources configured.
============
Online: [ test4 backup3 ]
OFFLINE: [ test3 ]
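As a quick sanity check of that output, here is the arithmetic as a small sketch. It assumes a plain majority rule with one vote per node, which is my assumption and not something the logs state; the variable names are made up for illustration.

```shell
#!/bin/sh
# Majority-quorum check using the numbers from the crm status output above.
# Assumption: one vote per node and a simple "more than half" rule.
expected_votes=3   # "3 Nodes configured, 3 expected votes"
online_nodes=2     # test4 and backup3 are online, test3 is offline

# Smallest vote count that is strictly more than half of the expected votes.
quorum_threshold=$(( expected_votes / 2 + 1 ))

if [ "$online_nodes" -ge "$quorum_threshold" ]; then
    quorum_state="with quorum"
else
    quorum_state="WITHOUT quorum"
fi

echo "threshold=$quorum_threshold online=$online_nodes -> partition $quorum_state"
```

Under that assumption two of three nodes are enough, which would match the "partition with quorum" line.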
So I suspect there really is quorum, but no resources are started. Anyway, I noticed that if I start Pacemaker on the backup3 node, the services are restarted, even if it sometimes takes a while. So I might have to live with the "not installed" messages and start the backup3 node in standby mode as long as no one comes up with a better solution. Maybe I'll fake the status of the monitors on this node and add some location rules to keep resources from being moved to this node.
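A rough sketch of what the standby and location-rule idea could look like with the crm shell. The resource ID ms_drbd and the constraint ID are hypothetical placeholders, not names from my configuration, and the script only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry-run sketch: prints the crm commands unless APPLY=1 is set.
NODE="backup3"
RESOURCE="${RESOURCE:-ms_drbd}"   # hypothetical resource ID, not taken from the thread

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Put the node into standby so it stays in the cluster but runs no resources.
run crm node standby "$NODE"

# Ban the resource from the node with a -INFINITY location constraint.
run crm configure location "loc-${RESOURCE}-not-on-${NODE}" "$RESOURCE" -inf: "$NODE"
```

One location constraint per resource would be needed, so the last line would have to be repeated with each real resource ID.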
Thanks for your help.
Best regards,
Denis Witt