[Pacemaker] configuration variants for 2 node cluster
Kostiantyn Ponomarenko
konstantin.ponomarenko at gmail.com
Tue Jun 24 08:36:32 UTC 2014
Hi Chrissie,
But wait_for_all doesn't help when there is no connection between the nodes.
If I then need to reboot the remaining working node, I won't get a
working cluster afterwards - both nodes will be waiting for a connection to
each other.
That's why I am looking for a solution which would let one node come up
and provide services in this situation (after a reboot).
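(For reference, this is roughly the quorum section of corosync.conf I am
talking about - only a sketch, the rest of the file is left out:

    quorum {
        provider: corosync_votequorum
        two_node: 1
        wait_for_all: 1
    }

As far as I understand, setting two_node: 1 enables wait_for_all by default
anyway, so a freshly booted node refuses to do anything until it has seen the
other node at least once - which is exactly my problem here.)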
I've been thinking about some kind of marker which could help a node
determine the state of the other node - for example an external disk and a
SCSI reservation command. Maybe you could suggest another kind of marker?
I am not sure whether we can use the presence of a file on an external SSD as
the marker. Something like: if the file is there, the other node is alive; if
not, the node is dead.
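Roughly what I have in mind, as a small sketch (the mount point and file name
are made up, and I know a stale file would lie if a node crashes without
removing it):

    import os

    # Hypothetical location on the shared external SSD - just to illustrate
    # the "file as liveness marker" idea.
    MARKER = "/mnt/external-ssd/peer-alive"

    def peer_marker_present():
        """The other node would create MARKER while it runs and remove it on shutdown."""
        return os.path.exists(MARKER)

    if __name__ == "__main__":
        if peer_marker_present():
            print("marker present - assume the other node is alive, keep waiting")
        else:
            print("marker absent - assume the other node is down, start the cluster here")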
Digimer,
Thanks for the links and information.
If I go this way, I will write my own daemon to determine the state of
the other node.
Also, the information about fence loops is new to me, thanks =)
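Just to check my understanding of the fence-loop prevention part: disabling
cluster start at boot would be something along these lines (assuming pcs; the
exact commands depend on the distro and init system):

    # pcs-managed cluster
    pcs cluster disable --all

    # or with classic init scripts
    chkconfig corosync off
    chkconfig pacemaker off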
Thank you,
Kostya
On Tue, Jun 24, 2014 at 10:55 AM, Christine Caulfield <ccaulfie at redhat.com>
wrote:
> On 23/06/14 15:49, Digimer wrote:
>
>> Hi Kostya,
>>
>> I'm having a little trouble understanding your question, sorry.
>>
>> On boot, the node will not start anything, so after booting it, you
>> log in, check that it can talk to the peer node (a simple ping is
>> generally enough), then start the cluster. It will join the peer's
>> existing cluster (even if it's a cluster on just itself).
>>
>> If you booted both nodes, say after a power outage, you will check
>> the connection (again, a simple ping is fine) and then start the cluster
>> on both nodes at the same time.
>>
>
>
> wait_for_all helps with most of these situations. If a node goes down then
> it won't start services until it's seen the non-failed node because
> wait_for_all prevents a newly rebooted node from doing anything on its own.
> This also takes care of the case where both nodes are rebooted together of
> course, because that's the same as a new start.
>
> Chrissie
>
>
>> If one of the nodes needs to be shut down, say for repairs or
>> upgrades, you migrate the services off of it and over to the peer node,
>> then you stop the cluster (which tells the peer that the node is leaving
>> the cluster). After that, the remaining node operates by itself. When
>> you turn it back on, you rejoin the cluster and migrate the services back.
>>
>> I think maybe you are looking at this as more complicated than it
>> needs to be. Pacemaker and corosync will handle most of this for you,
>> once set up properly. What operating system do you plan to use, and
>> what cluster stack? I suspect it will be corosync + pacemaker, which
>> should work fine.
>>
>> digimer
>>
>> On 23/06/14 10:36 AM, Kostiantyn Ponomarenko wrote:
>>
>>> Hi Digimer,
>>>
>>> Suppose I disable the cluster on start up - but what about the
>>> remaining node, if I need to reboot it?
>>> So even in case of a lost connection between the two nodes, I need to
>>> have one node working and providing resources.
>>> How did you solve this situation?
>>> Should it be a separate daemon which somehow checks the connection
>>> between the two nodes and decides whether to run corosync and
>>> pacemaker or to keep them down?
>>>
>>> Thank you,
>>> Kostya
>>>
>>>
>>> On Mon, Jun 23, 2014 at 4:34 PM, Digimer <lists at alteeve.ca> wrote:
>>>
>>> On 23/06/14 09:11 AM, Kostiantyn Ponomarenko wrote:
>>>
>>> Hi guys,
>>> I want to gather all possible configuration variants for a 2-node
>>> cluster, because it has a lot of pitfalls and there is not a lot of
>>> information about it on the internet. I also have some questions
>>> about the configurations and their specific problems.
>>> VARIANT 1:
>>> -----------------
>>> We can use the "two_node" and "wait_for_all" options from Corosync's
>>> votequorum, and set up fencing agents with a delay on one of them.
>>> Here is a workflow (diagram) of this configuration:
>>> 1. Node start.
>>> 2. Cluster start (Corosync and Pacemaker) at boot time.
>>> 3. Wait for all nodes. All nodes joined?
>>> No. Go to step 3.
>>> Yes. Go to step 4.
>>> 4. Start resources.
>>> 5. Split-brain situation (something wrong with the connection
>>> between the nodes).
>>> 6. The fencing agent on one of the nodes reboots the other node
>>> (there is a configured delay on one of the fencing agents).
>>> 7. The rebooted node goes to step 1.
>>> There are two (or more?) important things in this configuration:
>>> 1. The rebooted node remains waiting for all nodes to be visible
>>> (the connection has to be restored first).
>>> 2. Suppose the connection problem still exists and the node which
>>> rebooted the other one has to be rebooted as well (for some reason).
>>> After the reboot it is also stuck on step 3 because of the connection
>>> problem.
>>> QUESTION:
>>> -----------------
>>> Is it possible to somehow assign the node which won the reboot race
>>> (i.e. rebooted the other one) a status like "primary", and allow it
>>> not to wait for all nodes after a reboot? And then drop this status
>>> once the other node joins again.
>>> So is it possible?
>>> Right now that's the only configuration I know for a 2-node cluster.
>>> Other variants are very much appreciated =)
>>> VARIANT 2 (not implemented, just a suggestion):
>>> -----------------
>>> I've been thinking about using an external SSD drive (or another
>>> external drive). For example, the fencing agent could reserve the SSD
>>> using a SCSI command and after that reboot the other node.
>>> The main idea is that the first node, as soon as the cluster starts
>>> on it, reserves the SSD until the other node joins the cluster; after
>>> that the SCSI reservation is removed.
>>> 1. Node start.
>>> 2. Cluster start (Corosync and Pacemaker) at boot time.
>>> 3. Reserve the SSD. Did it manage to reserve?
>>>     No. Don't start resources (wait for all).
>>>     Yes. Go to step 4.
>>> 4. Start resources.
>>> 5. Remove the SCSI reservation when the other node has joined.
>>> 6. Split-brain situation (something wrong with the connection
>>> between the nodes).
>>> 7. The fencing agent tries to reserve the SSD. Did it manage to
>>> reserve?
>>>     No. Maybe put the node in standby mode ...
>>>     Yes. Reboot the other node.
>>> 8. Optional: a single node can keep the SSD reservation as long as it
>>> is alone in the cluster, or until it shuts down.
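[If we go the SCSI reservation route, I guess the fencing agent would use
something like sg_persist from sg3_utils - the key and device below are
placeholders, not a tested recipe:

    # register a key for this node and take the reservation
    sg_persist --out --register --param-sark=0x1 /dev/sdX
    sg_persist --out --reserve --param-rk=0x1 --prout-type=5 /dev/sdX

    # check who currently holds the reservation
    sg_persist --in --read-reservation /dev/sdX

    # drop the reservation once the other node has joined
    sg_persist --out --release --param-rk=0x1 --prout-type=5 /dev/sdX
]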
>>> I am really looking forward to finding the best solution (or a
>>> couple of them =)).
>>> Hope I am not the only person who is interested in this topic.
>>>
>>>
>>> Thank you,
>>> Kostya
>>>
>>>
>>> Hi Kostya,
>>>
>>> I only build 2-node clusters, and I've not had problems with this
>>> going back to 2009 over dozens of clusters. The tricks I found are:
>>>
>>> * Disable quorum (of course)
>>> * Set up good fencing, and add a delay to the node you prefer (or
>>> pick one at random, if of equal value) to avoid dual fences
>>> * Disable the cluster on start up, to prevent fence loops.
>>>
>>> That's it. With this, your 2-node cluster will be just fine.
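[For the fencing delay, I assume it would look something like this with pcs -
fence_ipmilan and the addresses/credentials are only placeholders, and the
delay goes on the device that fences the node we want to survive:

    pcs stonith create fence-node1 fence_ipmilan pcmk_host_list="node1" \
        ipaddr="192.168.1.11" login="admin" passwd="secret" delay="15"
    pcs stonith create fence-node2 fence_ipmilan pcmk_host_list="node2" \
        ipaddr="192.168.1.12" login="admin" passwd="secret"
]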
>>>
>>> As for your question: once a node is fenced successfully, the
>>> resource manager (pacemaker) will take over any services lost on the
>>> fenced node, if that is how you configured it. A node that either
>>> gracefully leaves or dies/is fenced should not interfere with the
>>> remaining node.
>>>
>>> The problem is when a node vanishes and fencing fails. Then, not
>>> knowing what the other node might be doing, the only safe option is
>>> to block, otherwise you risk a split-brain. This is why fencing is
>>> so important.
>>>
>>> Cheers
>>>
>>> --
>>> Digimer
>>> Papers and Projects: https://alteeve.ca/w/
>>> What if the cure for cancer is trapped in the mind of a person
>>> without access to education?
>>>
>>
>>
>