[Pacemaker] drbd connection
Digimer
lists at alteeve.ca
Mon Jun 17 13:16:33 EDT 2013
On 06/17/2013 12:30 PM, Elmar Marschke wrote:
>
> On 17.06.2013 15:59, Digimer wrote:
>> On 06/17/2013 09:53 AM, andreas graeper wrote:
>>> hi,
>>> i will not have a stonith-device. i can test for a day a 'expert power
>>> control 8212', but in the end i will stay without.
>>
>> This is an extremely flawed approach. Clustering with shared storage and
>> without stonith will certainly cause data loss or corruption eventually.
>> I can not stress this enough.
>
> hi all,
>
> just an idea, or moreover a question: what about using drbd's abilities
> to automatically handle split brain situations instead of "real
> stonithing" ; maybe like this (global_common.conf):
>
> handlers {
>     split-brain "/usr/lib/drbd/notify-split-brain.sh root";
>     pri-lost-after-sb "/usr/local/sbin/reboot.sh";
> }
>
> net {
>     after-sb-0pri discard-least-changes;
>     after-sb-1pri call-pri-lost-after-sb;
>     after-sb-2pri call-pri-lost-after-sb;
> }
>
> Couldn't this work like a "poor man's stonith device"?
> (Of course this reboots the whole node with all resources and discards
> the node with the least changes, but maybe there are situations where
> this doesn't matter?)
>
> regards
>
> Elmar
There are two issues here.

First, Pacemaker/corosync needs fencing anyway, and it supports a very
large array of fence devices. These are very well tested in the field.

Second, if you put fencing into DRBD directly, you are duplicating
effort and configs. The 'crm-fence-peer.sh' script was written to "hook"
DRBD's fencing into the existing pacemaker fencing. This way, you have
one place to configure and maintain, rather than two.
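
A minimal sketch of that hook, assuming DRBD 8.4 syntax, a resource
named "r0" (a placeholder, not taken from Andreas' setup), and the
handler scripts living under /usr/lib/drbd/ as in the quoted config.
Adjust names and paths to your own install; this only helps if stonith
is also configured and working in pacemaker itself.

```
resource r0 {
    disk {
        # Freeze I/O and call the fence-peer handler when the peer is lost.
        fencing resource-and-stonith;
    }
    handlers {
        # Ask pacemaker to keep the peer from being promoted.
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # Lift that restriction once the peer has resynced.
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
}
```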

Back to this specific case:

Andreas tested by failing corosync. This would trigger pacemaker to see
the node as failed and try to recover the services on the backup node.
All of this happens without DRBD directly knowing what is going on. Had
Andreas configured fencing, as soon as pacemaker called its fence
against the peer, the peer would have been shut down, and DRBD would
then have known something was wrong (and blocked) before a split-brain
could occur.

It also would mean that pacemaker would not have recovered/promoted the
surviving node until the peer was off, which also protects against a
split-brain.

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?