[Pacemaker] split brain situation

Dominik Klein dk at in-telegence.net
Fri Feb 6 04:17:32 EST 2009


Romi Verma wrote:
> On Fri, Feb 6, 2009 at 2:28 PM, Andrew Beekhof <beekhof at gmail.com> wrote:
> 
>> On Feb 6, 2009, at 9:53 AM, Romi Verma wrote:
>>
>> Thanks Dominik,
>> I have two questions now.
>>
>> 1) What does no-quorum-policy=suicide mean, then? Does it remove the
>> resource completely?
>>
>>
>> no, the node kills itself and any other node in the partition
>> this makes no sense in a 2 node cluster because both nodes will do this
>>
> 
> I assume the partition with fewer nodes will lose quorum, and if
> no-quorum-policy is set to suicide, its nodes will commit suicide.
> For example, in a 3-node cluster, if one node loses communication with the
> other nodes, there will be two partitions: one containing 2 nodes and
> one containing 1 node. The partition with 2 nodes will have quorum
> and will not be affected. The partition with 1 node will lose quorum and
> will kill itself. Is my understanding right?

correct
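
To make the 3-node example concrete, here is a minimal sketch in Python. It assumes a simple-majority rule (a partition has quorum only if it holds strictly more than half of all nodes) and uses the policy meanings as they are described in this thread. The names has_quorum, NO_QUORUM_ACTIONS and partition_behaviour are made up for illustration; they are not part of Pacemaker or openais.

```python
def has_quorum(partition_size: int, total_nodes: int) -> bool:
    """Assumed simple-majority rule: strictly more than half of all nodes."""
    return 2 * partition_size > total_nodes


# What a partition WITHOUT quorum does, paraphrasing the descriptions
# given in this thread (illustrative wording, not Pacemaker output).
NO_QUORUM_ACTIONS = {
    "stop": "do not run any resources",
    "ignore": "run resources even without quorum",
    "freeze": "keep managing current resources, acquire no new ones",
    "suicide": "every node in the partition kills itself",
}


def partition_behaviour(partition_size: int, total_nodes: int,
                        policy: str = "stop") -> str:
    """Describe what one partition does after the cluster splits."""
    if has_quorum(partition_size, total_nodes):
        return "has quorum: unaffected"
    return "no quorum: " + NO_QUORUM_ACTIONS[policy]


# 3-node cluster split into 2 + 1 with no-quorum-policy=suicide
for size in (2, 1):
    print(size, "node(s):", partition_behaviour(size, 3, "suicide"))

# 2-node cluster split into 1 + 1: neither half holds a majority
print("1 node(s):", partition_behaviour(1, 2, "suicide"))
```

Under this rule neither half of a split 2-node cluster has a majority, which is why suicide makes no sense there: both nodes would kill themselves.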

> I want the partition without quorum to reset its nodes instead of killing
> them. Is that possible?

define the difference between reset node and kill node?

>> 2) Why does each node think of itself as DC? As Andrew said, after a split
>> brain an election happens and one node is selected as DC.
>>
>>
>> no, I said after the split-brain is _repaired_ an election occurs.
>> clearly this can't happen during a split-brain because by definition they
>> can't communicate.
>>
> 
> OK, got it. So how do we repair this split-brain condition? By setting
> no-quorum-policy to reset? Or is there another way?

stonith would reboot the node. This means that if a cluster software
failure led to the loss of communication, the node reboots, restarts
the cluster software, and everything should be fine again.

If there's a network problem, you would of course have to fix that ;)

Regards
Dominik

>> This is not happening in my case.
>> I don't have any stonith configured in my cluster. Do I need stonith to
>> handle a split-brain situation?
>>
>>
>>
>> On Fri, Feb 6, 2009 at 1:59 PM, Dominik Klein <dk at in-telegence.net> wrote:
>>
>>> Romi Verma wrote:
>>>> Thanks for the fast reply.
>>>> OK, let me explain the situation. I have a two-node cluster. I pulled out
>>>> the network cable of one node, which produced a split-brain situation.
>>>> Now both nodes think that the other one is dead. Each node thinks of
>>>> itself as DC, and on each node the cluster is up and running without quorum.
>>>>
>>>> I am new to openais/pacemaker so I don't know much, but according to some
>>>> documents it seems that by default no-quorum-policy is to "stop" the
>>>> cluster. I have not specified any no-quorum-policy, which is why I expect
>>>> that my cluster should stop if it loses quorum somehow.
>>> The "stop" refers to the resources. policy=stop on a node with no quorum
>>> means: do not run any resources.
>>>
>>> "ignore" would mean: run resources even though we don't have quorum
>>> (like the old heartbeat behaviour would be)
>>>
>>> "freeze" would mean: run and manage what you did run up to this point,
>>> but don't acquire any other resources.
>>>
>>> Regards
>>> Dominik
>>>
>>>> But in the present split-brain situation, the cluster is up and running
>>>> without quorum on each node. Could you please explain why this is
>>>> happening?
>>>> Romi
>>>>
>>>>
>>>> On Fri, Feb 6, 2009 at 12:52 PM, Andrew Beekhof <beekhof at gmail.com> wrote:
>>>>> Well the no-quorum-policy option applies during the split and an election
>>>>> is held to determine the DC when the partitions reform.
>>>>> Can you be more specific please?
>>>>> On Feb 6, 2009, at 4:54 AM, Romi Verma wrote:
>>>>>
>>>>>
>>>>>
>>>>> Hi all,
>>>>>> How does an openais + pacemaker (SUSE 11) cluster handle a split-brain
>>>>>> situation? Can anyone explain?
>>>>>>
>>>>>> Thanks,
>>>>>> Romi.



