[Pacemaker] SBD Fencing daemon: explain me more clear
Lars Marowsky-Bree
lmb at novell.com
Tue Jun 15 13:12:58 UTC 2010
On 2010-06-14T17:24:16, Aleksey Zholdak <aleksey at zholdak.com> wrote:
Hi Aleksey,
> Can anybody explain the following more clearly than the official and
> (IMHO) outdated page http://www.linux-ha.org/wiki/SBD_Fencing does:
>
> What timeouts must I specify if my multipath needs from 90 to 160
> secs to switch away from a dead path? The timeouts below may be
> wrong, because sometimes node1 kills node2 (or vice versa), or a
> node commits suicide...
>
> > Timeout (watchdog) : 90
> > Timeout (allocate) : 2
> > Timeout (loop) : 10
> > Timeout (msgwait) : 180
>
> And what is the logic behind the calculation of the above timeouts?
Well, 90-160s is a very long time; it could effectively make SBD
unusable in your environment. Basically, you're introducing a delay of
at least 160s on each fail-over. (At least with the current sbd
implementation.)
If you want to eliminate spurious self-fencing completely, you need to
increase the watchdog timeout to more than 160s; 180s should probably
be good in your environment.
msgwait should be larger than the watchdog timeout, so probably 200s,
which implies a 200s latency on fail-over.
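As a rough sketch of that arithmetic (Python, purely illustrative; sbd
does not compute any of this for you): the function names are
hypothetical, and the 20s margin is an assumption picked only because
it reproduces the 180s/200s figures above. The check also shows why a
90s watchdog with a 160s MPIO stall can lead to self-fencing.

```python
# Sketch only: derive sbd timeouts from the worst-case MPIO path-switch time.
# The function names and the 20s margin are assumptions for illustration.

def suggest_sbd_timeouts(mpio_worst_case_s, margin_s=20):
    """Return (watchdog, msgwait) in seconds."""
    watchdog = mpio_worst_case_s + margin_s  # must exceed the longest I/O stall
    msgwait = watchdog + margin_s            # must exceed the watchdog timeout
    return watchdog, msgwait


def check_sbd_timeouts(watchdog_s, msgwait_s, mpio_worst_case_s):
    """Return a list of rule violations for an existing configuration."""
    problems = []
    if watchdog_s <= mpio_worst_case_s:
        problems.append("watchdog <= MPIO worst case: spurious self-fencing possible")
    if msgwait_s <= watchdog_s:
        problems.append("msgwait <= watchdog timeout")
    return problems


if __name__ == "__main__":
    print(suggest_sbd_timeouts(160))
    # The values from the original question: watchdog 90, msgwait 180.
    print(check_sbd_timeouts(90, 180, 160))
```

The fail-over latency you pay is roughly the msgwait value, so any
margin you add shows up directly as extra delay.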
Alternatively, you may want to make the timeouts lower, leading to a
faster fail-over. I assume the work-load is paused during the MPIO
downtime anyway, so fail-over may actually be faster than waiting for
MPIO to recover.
But with a ~160s MPIO latency, I'd personally be wary of using sbd
fencing. Why is the MPIO scenario so slow?
Regards,
Lars
--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde