<div dir="ltr"><span style="font-family:arial,sans-serif;font-size:13px">Chrissie,</span><br><div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div><font face="arial, sans-serif">I don&#39;t want to reinvent a quorum disk =)</font></div>


<div><font face="arial, sans-serif">I know about its complexity.</font><br></div><div><font face="arial, sans-serif">That&#39;s why I think that the most reasonable decision for me is to wait till Corosync 2 gets a quorum disk :)</font></div>


<div><span style="font-family:arial,sans-serif">But meanwhile I need to deal somehow with my situation.</span><br></div>

<div>So, a possible solution for me is to create a daemon that starts the cluster stack only under certain conditions.<br></div><div><br></div><div>Here is how I see it (any improvements are appreciated):</div><div><div>


<br></div><div>The marker: a SCSI reservation on the SSD</div><div>IMPORTANT: The daemon must be able to tell which node the marker belongs to.</div><div>QUESTION: What other markers could be used?</div><div><br></div><div>--------------</div>


<div>Main workflow:</div><div>--------------</div><div>1. Node start</div><div>2. Daemon start</div><div>    2.1. Check the marker. Is marker present?</div><div>        NO:</div><div>            2.1.1. Set marker. Successful?</div>


<div>                NO: Do nothing. (Go to 2.1 and repeat a few times.)</div><div>                YES: Start cluster stack.</div><div>        YES:</div><div>            2.1.2. Ping the other node. Successful?</div>

<div>

                NO: Do nothing: the other node is probably (99%) on.</div><div>                YES:</div><div>                    Remove the marker.</div><div>                    Start cluster stack.[*]</div><div>                    P.S.: If the cluster fails to establish a connection with the other node, the fencing agent on this node is triggered and fences the other node (a fence loop is possible, but we can minimize the chance of it[1]).</div>
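<div><br></div><div>The main workflow above could be sketched roughly like this. It is only an illustration under assumptions: the function name, the Action enum and the four callables (marker_owner, set_marker, remove_marker, ping_peer) are hypothetical placeholders for whatever really talks to the SSD and the network, and the branch for a node that already owns the marker is my assumption.</div>
<pre>
```python
from enum import Enum
from typing import Callable, Optional


class Action(Enum):
    START_CLUSTER = "start cluster stack"
    DO_NOTHING = "do nothing"


def decide_startup(
    this_node: str,
    marker_owner: Callable[[], Optional[str]],  # hypothetical: owner name or None
    set_marker: Callable[[], bool],             # hypothetical: True if we got it
    remove_marker: Callable[[], None],          # hypothetical
    ping_peer: Callable[[], bool],              # hypothetical
    retries: int = 3,
) -> Action:
    """Run step 2.1 of the workflow and return the resulting action."""
    for _ in range(retries):
        owner = marker_owner()
        if owner is None:
            # 2.1.1: no marker present, so try to set it.
            if set_marker():
                return Action.START_CLUSTER
            continue  # setting failed: go back to 2.1 and try again
        if owner == this_node:
            # Assumption: a node that already owns the marker may start.
            return Action.START_CLUSTER
        # 2.1.2: the marker belongs to the peer, so ping it.
        if ping_peer():
            remove_marker()
            return Action.START_CLUSTER
        # No reply: the peer is probably up behind a broken link.
        return Action.DO_NOTHING
    return Action.DO_NOTHING
```
</pre>
<div>Passing the checks in as callables keeps the decision logic testable without a real disk or a second node.</div>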


<div><br></div><div>----------------------</div><div>Split brain situation:</div><div>----------------------</div><div>1. Fencing agent tries to set the marker. Successful?</div><div>    NO: Do nothing: this node is going to be fenced. Meanwhile, it can be put into standby mode while it waits to be fenced.</div>


<div>    YES: STONITH (reboot) the other node. Marker is kept.</div><div><br></div><div>---------</div><div>Benefits:</div><div>---------</div><div>Even after a reboot, one of the nodes still starts the cluster stack: the one that the marker belongs to.</div>


<div><br></div><div>------------------</div><div>Possible problems:</div><div>------------------</div><div>If the node that the marker belongs to is not working, we need to force-start the cluster stack on the other node.</div>


<div>It requires human interaction.</div><div><br></div><div><br></div><div>=====================</div><div>* If the ping succeeds but the cluster doesn&#39;t see the other node (is that even possible?), we can do the following:</div>


<div>    a. The daemon starts Corosync.</div><div>    b. It gets the list of nodes and checks that the other node is present. This guarantees that the nodes see each other in the cluster.</div><div>    c. It starts Pacemaker.</div>
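<div><br></div><div>Steps a to c could look something like the sketch below. Everything here is an assumption for illustration: staged_start and its parameters are hypothetical names, and the polling (attempts, delay) is added because membership may take a moment to form after Corosync starts.</div>
<pre>
```python
import time
from typing import Callable, Iterable


def staged_start(
    peer: str,
    start_corosync: Callable[[], None],          # hypothetical: step a
    list_members: Callable[[], Iterable[str]],   # hypothetical: step b
    start_pacemaker: Callable[[], None],         # hypothetical: step c
    attempts: int = 5,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Start Corosync, then start Pacemaker only once the peer is a member."""
    start_corosync()
    for attempt in range(attempts):
        if peer in set(list_members()):
            # The peer is visible in the membership list: safe to go on.
            start_pacemaker()
            return True
        if attempt + 1 != attempts:
            sleep(delay)  # give the membership time to form, then re-check
    # The peer never showed up: leave Pacemaker stopped.
    return False
```
</pre>
<div>In practice list_members would probably wrap a membership query such as the output of corosync-quorumtool; the parsing is left out here because I have not worked it out.</div>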


</div><div><br></div><div><br></div></div><div class="gmail_extra"><br clear="all"><div><div dir="ltr">Thank you,<div>Kostya</div></div></div>

<br><br><div class="gmail_quote">On Tue, Jun 24, 2014 at 11:44 AM, Christine Caulfield <span dir="ltr">&lt;<a href="mailto:ccaulfie@redhat.com" target="_blank">ccaulfie@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div class="">On 24/06/14 09:36, Kostiantyn Ponomarenko wrote:<br>

</div><div class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi Chrissie,<br>

<br>

But wait_for_all doesn&#39;t help when there is no connection between the nodes.<br>

Because in case I need to reboot the remaining working node I won&#39;t get<br>

working cluster after that - both nodes will be waiting connection<br>

between them.<br>

That&#39;s why I am looking for the solution which could help me to get one<br>

node working in this situation (after reboot).<br>

I&#39;ve been thinking about some kind of marker which could help a node to<br>

determine a state of the other node.<br>

Like external disk and SCSI reservation command. Maybe you could suggest<br>

another kind of marker?<br>

I am not sure whether we can use the presence of a file on an external SSD as the<br>

marker. Kind of: if there is a file - the other node is alive, if no -<br>

node is dead.<br>

<br>

</blockquote>

<br></div>

More seriously, that solution is harder than it might seem - which is one reason qdiskd was as complex as it became, and why votequorum is as conservative as it is when it comes to declaring a workable cluster. If someone is there to manually reboot nodes then it might be as well for a human decision to be made about which one is capable of running services.<br>


<br>

Chrissie<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">

Digimer,<br>

<br>

Thanks for the links and information.<br>

Anyway if I go this way, I will write my own daemon to determine a state<br>

of the other node.<br>

Also the information about fence loop is new for me, thanks =)<br>

<br>

Thank you,<br>

Kostya<br>

<br>

<br>

On Tue, Jun 24, 2014 at 10:55 AM, Christine Caulfield<br></div><div><div class="h5">

&lt;<a href="mailto:ccaulfie@redhat.com" target="_blank">ccaulfie@redhat.com</a> &lt;mailto:<a href="mailto:ccaulfie@redhat.com" target="_blank">ccaulfie@redhat.com</a>&gt;&gt; wrote:<br>

<br>

    On 23/06/14 15:49, Digimer wrote:<br>

<br>

        Hi Kostya,<br>

<br>

            I&#39;m having a little trouble understanding your question, sorry.<br>

<br>

            On boot, the node will not start anything, so after booting<br>

        it, you<br>

        log in, check that it can talk to the peer node (a simple ping is<br>

        generally enough), then start the cluster. It will join the peer&#39;s<br>

        existing cluster (even if it&#39;s a cluster on just itself).<br>

<br>

            If you booted both nodes, say after a power outage, you will<br>

        check<br>

        the connection (again, a simple ping is fine) and then start the<br>

        cluster<br>

        on both nodes at the same time.<br>

<br>

<br>

<br>

    wait_for_all helps with most of these situations. If a node goes<br>

    down then it won&#39;t start services until it&#39;s seen the non-failed<br>

    node because wait_for_all prevents a newly rebooted node from doing<br>

    anything on its own. This also takes care of the case where both<br>

    nodes are rebooted together of course, because that&#39;s the same as a<br>

    new start.<br>

<br>

    Chrissie<br>

<br>

<br>

            If one of the nodes needs to be shut down, say for repairs or<br>

        upgrades, you migrate the services off of it and over to the<br>

        peer node,<br>

        then you stop the cluster (which tells the peer that the node is<br>

        leaving<br>

        the cluster). After that, the remaining node operates by itself.<br>

        When<br>

        you turn it back on, you rejoin the cluster and migrate the<br>

        services back.<br>

<br>

            I think, maybe, you are looking at things more complicated<br>

        than you<br>

        need to. Pacemaker and corosync will handle most of this for<br>

        you, once<br>

        setup properly. What operating system do you plan to use, and what<br>

        cluster stack? I suspect it will be corosync + pacemaker, which<br>

        should<br>

        work fine.<br>

<br>

        digimer<br>

<br>

        On 23/06/14 10:36 AM, Kostiantyn Ponomarenko wrote:<br>

<br>

            Hi Digimer,<br>

<br>

            Suppose I disabled the cluster on start up, but what about<br>

            remaining<br>

            node, if I need to reboot it?<br>

            So, even in case of connection lost between these two nodes<br>

            I need to<br>

            have one node working and providing resources.<br>

            How did you solve this situation?<br>

            Should it be a separate daemon which checks somehow<br>

            connection between<br>

            the two nodes and decides to run corosync and pacemaker or<br>

            to keep them<br>

            down?<br>

<br>

            Thank you,<br>

            Kostya<br>

<br>

<br>

            On Mon, Jun 23, 2014 at 4:34 PM, Digimer &lt;<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a><br>

            &lt;mailto:<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>&gt;<br></div></div><div><div class="h5">

            &lt;mailto:<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a> &lt;mailto:<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>&gt;&gt;&gt; wrote:<br>

<br>

                 On 23/06/14 09:11 AM, Kostiantyn Ponomarenko wrote:<br>

<br>

                     Hi guys,<br>

                     I want to gather all possible configuration<br>

            variants for 2-node<br>

                     cluster,<br>

                     because it has a lot of pitfalls and there is not<br>

            a lot of<br>

                     information<br>

                     across the internet about it. And also I have some<br>

            questions<br>

            about<br>

                     configurations and their specific problems.<br>

                     VARIANT 1:<br>

                     -----------------<br>

                     We can use &quot;two_node&quot; and &quot;wait_for_all&quot; option<br>

            from Corosync&#39;s<br>

                     votequorum, and set up fencing agents with delay on<br>

            one of them.<br>

                     Here is a workflow(diagram) of this configuration:<br>

                     1. Node start.<br>

                     2. Cluster start (Corosync and Pacemaker) at the<br>

            boot time.<br>

                     3. Wait for all nodes. All nodes joined?<br>

                           No. Go to step 3.<br>

                           Yes. Go to step 4.<br>

                     4. Start resources.<br>

                     5. Split brain situation (something with connection<br>

            between<br>

            nodes).<br>

                     6. Fencing agent on the one of the nodes reboots<br>

            the other node<br>

                     (there<br>

                     is a configured delay on one of the Fencing agents).<br>

                     7. Rebooted node goes to step 1.<br>

                     There are two (or more?) important things in this<br>

            configuration:<br>

                     1. Rebooted node remains waiting for all nodes to<br>

            be visible<br>

                     (connection<br>

                     should be restored).<br>

                     2. Suppose connection problem still exists and the<br>

            node which<br>

                     rebooted<br>

                     the other guy has to be rebooted also (for some<br>

            reasons). After<br>

                     reboot<br>

                     he is also stuck on step 3 because of connection<br>

            problem.<br>

                     QUESTION:<br>

                     -----------------<br>

                     Is it possible somehow to assign to the guy who won<br>

            the reboot<br>

            race<br>

                     (rebooted other guy) a status like a &quot;primary&quot; and<br>

            allow him not<br>

                     to wait<br>

                     for all nodes after reboot. And neglect this status<br>

            after<br>

            other node<br>

                     joined this one.<br>

                     So is it possible?<br>

                     Right now that&#39;s the only configuration I know for<br>

            2 node<br>

            cluster.<br>

                     Other variants are very appreciated =)<br>

                     VARIANT 2 (not implemented, just a suggestion):<br>

                     -----------------<br>

                     I&#39;ve been thinking about using external SSD drive<br>

            (or other<br>

            external<br>

                     drive). So for example fencing agent can reserve<br>

            SSD using SCSI<br>

                     command<br>

                     and after that reboot the other node.<br>

                     The main idea of this is the first node, as soon as<br>

            a cluster<br>

                     starts on<br>

                     it, reserves SSD till the other node joins the<br>

            cluster, after<br>

                     that SCSI<br>

                     reservation is removed.<br>

                     1. Node start<br>

                     2. Cluster start (Corosync and Pacemaker) at the<br>

            boot time.<br>

                     3. Reserve SSD. Did it manage to reserve?<br>

                           No. Don&#39;t start resources (Wait for all).<br>

                           Yes. Go to step 4.<br>

                     4. Start resources.<br>

                     5. Remove SCSI reservation when the other node has<br>

            joined.<br>

                     6. Split brain situation (something with connection<br>

            between<br>

            nodes).<br>

                     7. Fencing agent tries to reserve SSD. Did it<br>

            manage to reserve?<br>

                           No. Maybe puts node in standby mode ...<br>

                           Yes. Reboot the other node.<br>

                     8. Optional: a single node can keep SSD reservation<br>

            till he is<br>

                     alone in<br>

                     the cluster or till his shut-down.<br>

                     I am really looking forward to find the best<br>

            solution (or a<br>

                     couple of<br>

                     them =)).<br>

                     Hope I am not the only person who is interested in<br>

            this topic.<br>

<br>

<br>

                     Thank you,<br>

                     Kostya<br>

<br>

<br>

                 Hi Kostya,<br>

<br>

                    I only build 2-node clusters, and I&#39;ve not had<br>

            problems with this<br>

                 going back to 2009 over dozens of clusters. The tricks<br>

            I found are:<br>

<br>

                 * Disable quorum (of course)<br>

                 * Setup good fencing, and add a delay to the node you<br>

            prefer (or<br>

                 pick one at random, if equal value) to avoid dual-fences<br>

                 * Disable the cluster on start up, to prevent fence loops.<br>

<br>

                    That&#39;s it. With this, your 2-node cluster will be<br>

            just fine.<br>

<br>

                    As for your question; Once a node is fenced<br>

            successfully, the<br>

                 resource manager (pacemaker) will take over any<br>

            services lost on the<br>

                 fenced node, if that is how you configured it. A node<br>

            that either<br>

                 gracefully leaves or dies/fenced should not interfere<br>

            with the<br>

                 remaining node.<br>

<br>

                    The problem is when a node vanishes and fencing<br>

            fails. Then, not<br>

                 knowing what the other node might be doing, the only<br>

            safe option is<br>

                 to block, otherwise you risk a split-brain. This is why<br>

            fencing is<br>

                 so important.<br>

<br>

                 Cheers<br>

<br>

                 --<br>

                 Digimer<br>

                 Papers and Projects: <a href="https://alteeve.ca/w/" target="_blank">https://alteeve.ca/w/</a><br>

                 What if the cure for cancer is trapped in the mind of a<br>

            person<br>

                 without access to education?<br>

<br></div></div>

                 ______________________________<u></u>_____________________<div class=""><br>

                 Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org" target="_blank">Pacemaker@oss.clusterlabs.org</a><br>

            &lt;mailto:<a href="mailto:Pacemaker@oss.clusterlabs.org" target="_blank">Pacemaker@oss.<u></u>clusterlabs.org</a>&gt;<br></div><div class="">

                 &lt;mailto:<a href="mailto:Pacemaker@oss." target="_blank">Pacemaker@oss.</a>__<a href="http://clusterlabs.org" target="_blank">cluste<u></u>rlabs.org</a><br>

            &lt;mailto:<a href="mailto:Pacemaker@oss.clusterlabs.org" target="_blank">Pacemaker@oss.<u></u>clusterlabs.org</a>&gt;&gt;<br>

            <a href="http://oss.clusterlabs.org/____mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/___<u></u>_mailman/listinfo/pacemaker</a><br>

            &lt;<a href="http://oss.clusterlabs.org/__mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/__<u></u>mailman/listinfo/pacemaker</a>&gt;<br>

<br></div><div class="">

            &lt;<a href="http://oss.clusterlabs.org/__mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/__<u></u>mailman/listinfo/pacemaker</a><br>

            &lt;<a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/<u></u>mailman/listinfo/pacemaker</a>&gt;&gt;<br>

<br>

                 Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>

                 Getting started:<br></div>

            <a href="http://www.clusterlabs.org/____doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/___<u></u>_doc/Cluster_from_Scratch.pdf</a><br>

            &lt;<a href="http://www.clusterlabs.org/__doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/__<u></u>doc/Cluster_from_Scratch.pdf</a>&gt;<div><div class="h5"><br>

<br>

            &lt;<a href="http://www.clusterlabs.org/__doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/__<u></u>doc/Cluster_from_Scratch.pdf</a><br>

            &lt;<a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/<u></u>doc/Cluster_from_Scratch.pdf</a>&gt;&gt;<br>

                 Bugs: <a href="http://bugs.clusterlabs.org" target="_blank">http://bugs.clusterlabs.org</a><br>

<br>

<br>

<br>

<br>


<br>

<br>

<br>

<br>

<br>


<br>

<br>

<br>

<br>


<br>

</div></div></blockquote><div class="HOEnZb"><div class="h5">

<br>

<br>


</div></div></blockquote></div><br></div>