[Pacemaker] DC election with downed node in 2-way cluster

Miki Shapiro Miki.Shapiro at coles.com.au
Wed Jan 13 02:25:00 UTC 2010


Hi all

I'm attempting to build a 2-way cluster, SLES 11-based, with an openais/pacemaker stack. Both nodes and one resource (a DRBD volume) are up and running. What I'm not sure about is how the CRM DC election process works.

I configured a null stonith resource for each node.
I have stonith-enabled set to true (I will implement a real stonith facility once the final solution is in place).
I have no-quorum-policy set to ignore (as the cluster is expected to work with one node active).

I look at crm_mon or crm_gui, and it's all green and happy.

I now go and halt a node.

Observing crm_mon or crm_gui on node2, I expect to see:

1. Services appear as down, thanks to resource monitoring directives.

2. Quorum broken (... do I care?)

3. The surviving node elected as the new DC. The book (here: < http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-cluster-status.html > at the bottom) states that:

"The DC (Designated Controller) node is where all the decisions are made and if the current DC fails a new one is elected from the remaining cluster nodes. The choice of DC is of no significance to an administrator beyond the fact that its logs will generally be more interesting."

To me, however, the choice of DC is of significance. I want the brain, as far as the surviving node is concerned, to be running on a non-halted server.


What happens in practice is:

If I halt the DC,

1. Resources DO appear stopped and do-their-thing(tm)

2. [PROBLEM?] Quorum DOES NOT appear as broken

3. [PROBLEM?] The remaining node DOES NOT get (visibly) elected as the new DC.

If I halt the non-DC node,

1. Resources DO appear stopped and do-their-thing(tm)

2. Quorum DOES appear as broken

3. [PROBLEM?] The remaining node DOES NOT get (visibly) elected as the new DC.
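
To make "(visibly) elected" checkable rather than eyeballed, here is a minimal Python sketch that pulls the DC name and quorum state out of crm_mon text output. It assumes the summary contains a line of the form "Current DC: <node> - partition with quorum" (or "WITHOUT quorum", or "Current DC: NONE"); the function name parse_dc_status and the sample text are hypothetical and were not captured from the cluster described here.

```python
import re

# Assumed summary-line format, e.g.
#   "Current DC: node1 - partition with quorum"
#   "Current DC: node2 - partition WITHOUT quorum"
#   "Current DC: NONE"
_DC_LINE = re.compile(
    r"^Current DC:[ \t]*(\S+)"
    r"(?:[ \t]*-[ \t]*partition[ \t]+(with|WITHOUT)[ \t]+quorum)?",
    re.MULTILINE,
)


def parse_dc_status(crm_mon_text):
    """Return (dc_name, has_quorum) parsed from crm_mon text output.

    dc_name is None when no DC line is found or when it reads NONE.
    has_quorum is None when the line carries no quorum information.
    """
    match = _DC_LINE.search(crm_mon_text)
    if match is None:
        return None, None
    dc, quorum = match.group(1), match.group(2)
    dc_name = None if dc == "NONE" else dc
    has_quorum = None if quorum is None else (quorum == "with")
    return dc_name, has_quorum


# Illustrative sample only (hypothetical node names).
SAMPLE = """\
Current DC: node2 - partition WITHOUT quorum
2 Nodes configured, 2 expected votes
Online: [ node2 ]
OFFLINE: [ node1 ]
"""

if __name__ == "__main__":
    print(parse_dc_status(SAMPLE))
```

Feeding it the output of a one-shot "crm_mon -1" run on the surviving node, before and after halting the peer, would show whether the reported DC actually changes.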

Now, if my understanding serves me right, the DC is the baton-holding CRM that does the thinking for the entire cluster. If the surviving node thinks that the (DEAD) peer is the de facto brains of the cluster and doesn't take the reins, I have a dysfunctional cluster.

Can someone please offer some clarification on how one would reasonably expect this to work?

Thanks!


Miki Shapiro
Linux Systems Engineer
Infrastructure Services & Operations

745 Springvale Road
Mulgrave 3170 Australia
Email: miki.shapiro at coles.com.au
Phone: 61 3 854 10520
Fax:     61 3 854 10558


