[Pacemaker] Mostly STONITH Questions / Seeking Best Practice

Lars Marowsky-Bree lmb at suse.com
Fri Sep 7 09:21:51 EDT 2012


On 2012-09-06T09:07:44, David Morton <davidmorton78 at gmail.com> wrote:

> What I'm looking for here is not a backup for the existing STONITH
> mechanism but an additional level of (storage based) protection as we are
> using non-clustered filesystems. From what I read in the documentation this
> is the purpose of sfex ? To provide a lock resource, so even if STONITH
> fails silently and / or you are in a split brain situation the storage will
> not be mounted in more than one location ... and used in a group, no
> database services will start = no risk of data corruption ?

STONITH failing silently has not been reported. I'd take this risk over
the added complexity of using sfex.

In a split-brain scenario, STONITH protects the resources; Pacemaker
will not start them unless STONITH has succeeded. (Effectively healing
the split-brain scenario.) If you're considering the case that Pacemaker
itself is faulty and doesn't check with STONITH - well, that's the same
component that you're hoping will take sfex results into account.
;-)

For this to matter, you need a network failure as well as a silent
STONITH error at the same time. That is not exactly likely.

> If this is not the purpose of sfex, what is the best mechanism to ensure
> filesystems are not mounted more than once in a cluster ? Am I just being
> paranoid ? ;)

STONITH protects you against this.

It *is* the purpose of sfex, it's just that - in scenarios with working
STONITH - it is not needed. The idea behind sfex is mostly to use it in
scenarios where you don't have STONITH.
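
For reference, the group arrangement described in the question might look
roughly like this in crm shell syntax. It is only a sketch: the device
paths, mount point, filesystem type, timeouts and resource names are
placeholders rather than anything from the actual setup, the database
resource is left out, and the lock partition is assumed to have been
initialised beforehand (sfex_init).

```
# Placeholder names and devices throughout; adjust to the real storage.
primitive p_sfex ocf:heartbeat:sfex \
    params device="/dev/sdX1" index="1" \
    op monitor interval="10s" timeout="30s"
primitive p_fs ocf:heartbeat:Filesystem \
    params device="/dev/sdX2" directory="/srv/data" fstype="ext3" \
    op monitor interval="20s" timeout="40s"
# Group members start in order, so the filesystem is mounted only after
# the sfex lock on the shared partition has been acquired.
group g_storage p_sfex p_fs
```

Because a group starts its members in order, anything placed after p_fs in
the group (the database, for instance) would not start unless the lock and
the mount both succeeded.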

(And if I had a setup that satisfied the needs for sfex, I might also
investigate sbd instead, and get a real STONITH at the same time.)
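
A minimal sbd configuration, again only as a sketch, could look like the
following. The device path is a placeholder, and the sbd_device parameter
is assumed to be accepted by the installed external/sbd plugin; check the
plugin's metadata on your version. The partition must be visible to all
nodes and initialised first (sbd -d <device> create), and the sbd daemon
must be running on every node.

```
# Placeholder device path: a small partition on the shared storage.
primitive stonith_sbd stonith:external/sbd \
    params sbd_device="/dev/disk/by-id/EXAMPLE-part1"
# Fencing must be enabled for Pacemaker to rely on it during recovery.
property stonith-enabled="true"
```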

> So I assume that if IO gets held up Pacemaker will wait until the monitor
> fails and then take down the dependent resources on the healthy node ?

Yes. But you're not using OCFS2/GFS2, so this doesn't apply to you.


Regards,
    Lars

-- 
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
