[Pacemaker] drbd connection

Digimer lists at alteeve.ca
Mon Jun 17 10:37:02 EDT 2013


If you look in your logs when you try to connect the two nodes, you will 
likely see a message like "split-brain detected, dropping connection". 
This is the result of not using fencing: you created a condition where 
both nodes went StandAlone and Primary.

To prevent this, you need to set up pacemaker to use stonith, make sure 
it works, and then link DRBD into it by setting the 'resource-and-stonith' 
fencing policy and using the 'crm-fence-peer.sh' handler to hook DRBD's 
fencing into pacemaker.
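
As a rough sketch of what that looks like in the DRBD configuration 
(assuming DRBD 8.x syntax, a hypothetical resource named 'r0', and the 
handler scripts installed under /usr/lib/drbd/; all of these may differ 
on your system):

```
# 'r0' is a placeholder resource name, not taken from this thread.
resource r0 {
    disk {
        # On loss of the peer, suspend I/O and call the fence-peer handler.
        fencing resource-and-stonith;
    }
    handlers {
        # Asks pacemaker to add a constraint that keeps the peer from
        # being promoted while its data may be outdated.
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # Removes that constraint again once the peer has resynced.
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    # The existing 'on <hostname> { ... }' sections stay as they are.
}
```

This only helps if stonith is already configured and tested in 
pacemaker; the handler relies on it.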

This way, when you fail a node like you did, both nodes will block (stop 
accessing storage) and call their fence handlers. The faster node will 
win and the slower/failed node will be forced off. Only then will the 
remaining node resume work, and it can do so confident that its peer is 
not still running, thus avoiding a split-brain.

You must use fencing.

digimer

AND PLEASE REPLY TO THE MAILING LIST! It benefits everyone when 
conversations like this are public.


On 06/17/2013 10:16 AM, andreas graeper wrote:
> i am still testing and want to learn; i am not preparing a system for
> production, not yet.
>
> but i still would like to know how i can get out of the situation in
> which two drbd-standalone nodes do not connect.
>
> the former passive node was invalidated (locally, no connection. but i
> think when both try to connect, the invalidated tells the other about
> being inconsistent ?!)
>
> between Standalone and WFConnection is there a state Unconnected ?
> if one of both nodes tries to change from Standalone to WFConnect he
> falls back to Standalone after short Unconnected, if the other node is
> in WFConnection.
>
> thanks in advance
> andreas
>
>
> 2013/6/17 Digimer <lists at alteeve.ca>
>
>     On 06/17/2013 09:53 AM, andreas graeper wrote:
>
>         hi,
>
>         i will not have a stonith-device. i can test an 'expert power
>         control 8212' for a day, but in the end i will stay without.
>
>
>     This is an extremely flawed approach. Clustering with shared storage
>     and without stonith will certainly cause data loss or corruption
>     eventually. I cannot stress this enough.
>
>     Please also keep replies on list. It helps others that these
>     conversations get archived.
>
>
>     --
>     Digimer
>     Papers and Projects: https://alteeve.ca/w/
>     What if the cure for cancer is trapped in the mind of a person
>     without access to education?
>
>


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?



