[Pacemaker] Favor one node during stonith?

Digimer lists at alteeve.ca
Wed Aug 13 14:56:58 CEST 2014


On 13/08/14 08:37 AM, Andrey Borzenkov wrote:
> Hi,
>
> Sorry for may be basic question, but it is my first Linux HA project.
>
> I (will) have two node cluster in active/passive configuration -
> single application on one node and second as standby; application is
> implemented as master/slave clone. Is it possible to prioritize node,
> that have active application? So that in case of split brain passive
> node gets killed?
>
> Usually this is done using staggered delay for fencing requests. I
> think that it may be possible to implement in pacemaker using rules,
> but I'm a bit uneasy about how to express it. Rule should select a
> node where master is currently active, not fixed node.
>
> Thank you in advance!
>
> -andrei

Hi Andrei,

   "Basic questions" are how you avoid mistakes, so please never 
apologize for asking them. I sure ask my own basic questions... :P

   First up, a little semantics; "split-brain" is what happens when 
fencing fails. You're asking about what happens when the connection 
between the nodes breaks while both nodes are otherwise happy (sometimes 
called a "partitioning of the cluster", though I don't think there is an 
official term).

   You are right in guessing that a fence delay is the way to do this. 
You add the attribute 'delay="15"' to the fence method of the node you 
want to win. Here is an example;

http://clusterlabs.org/wiki/STONITH_Levels#Configuring_The_Fence_Methods

   In that example, the node called "pcmk-1" has the 'delay' set, so it 
will get a 15 second head start in fencing "pcmk-2". It works that way 
because what it does is tell the cluster "If you want to fence 'pcmk-1', 
pause for 15 seconds before doing so". So when a 2-node cluster 
partitions, both nodes would initiate a fence against the other 
immediately, but pcmk-2 would pause before fencing pcmk-1, whereas 
pcmk-1 would *not* pause before fencing pcmk-2.
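
   To make the ordering concrete, here is a toy model of that race in 
Python. It is only a sketch, not anything Pacemaker runs; the function 
name 'surviving_node' and the dictionary layout are made up for 
illustration, and it assumes a fence call takes effect as soon as its 
delay has elapsed.

```python
def surviving_node(delay_before_fencing):
    """Return the node expected to win a 2-node fence race.

    delay_before_fencing maps a node name to the number of seconds the
    cluster pauses before fencing *that* node (the 'delay' attribute).
    Returns None on a tie, where a dual-fence is possible.
    """
    if len(delay_before_fencing) != 2:
        raise ValueError("this sketch only models a 2-node cluster")
    (node_a, delay_a), (node_b, delay_b) = delay_before_fencing.items()
    if delay_a == delay_b:
        return None
    # The node protected by the longer delay gets shot last, so its own
    # fence call against the peer lands first.
    return node_a if delay_a > delay_b else node_b


print(surviving_node({"pcmk-1": 15, "pcmk-2": 0}))  # pcmk-1
```

   With equal delays the sketch returns None, which is the case where 
neither node has a head start.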

   There is another important note when using IPMI-based fencing in 
2-node clusters.

   If you have acpid running on the nodes, then when a node is fenced 
over IPMI, it takes ~4 seconds to be forced off. This is because the 
power button is effectively pressed and held. Those four seconds are 
time in which the node could get a fence call started against the peer, 
causing a dual-fence. On the surface, the 'delay="15"' should deal with 
this, because 15 > 4, but there are corner cases where the delay alone 
isn't enough.

   Consider a broadcast storm on your network that takes, say, 30 
seconds to fix. By the time that has passed, the delay has expired and 
both nodes will be sitting there trying to fence the other. When the 
storm ends, both will potentially immediately begin a fence against the 
other.

   So to reduce the chance of a dual-fence in this corner case, you want 
to disable acpid. With acpid out of the way, most servers will react to 
a power button event by powering off nearly instantly. That further 
reduces the chance of a dual fence because now, even if the delay has 
failed, there is only a fraction of a second between the slower node 
being fenced and being disabled.
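
   The same reasoning as a rough Python sketch. The ~4 second hold is 
the figure given above; the 0.2 second power-off time and the 0.5 second 
skew between the two fence calls are assumed values chosen purely for 
illustration, and 'dual_fence_possible' is a made-up helper, not part of 
any fence agent.

```python
def dual_fence_possible(skew, power_off_time):
    """True if the slower node can still fire its own fence call.

    skew: seconds between the first and the second node starting to fence.
    power_off_time: seconds between a fence call and the victim being off.
    """
    return skew < power_off_time


ACPID_HOLD = 4.0   # ~4 s press-and-hold when acpid is running
INSTANT_OFF = 0.2  # assumed "fraction of a second" without acpid

# Normal partition: the delay gives a 15 s skew, so the slow node is
# already off before it gets to fence.
print(dual_fence_possible(15.0, ACPID_HOLD))   # False
# After a long storm the delay has expired; assume only 0.5 s of skew.
print(dual_fence_possible(0.5, ACPID_HOLD))    # True
print(dual_fence_possible(0.5, INSTANT_OFF))   # False
```

   Disabling acpid shrinks the window rather than removing it; a skew 
smaller than the assumed 0.2 seconds would still come back True.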

hth

digimer

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?


