[Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

Digimer lists at alteeve.ca
Mon Jul 1 14:35:47 EDT 2013


On 07/01/2013 01:57 PM, Lars Marowsky-Bree wrote:
> On 2013-07-01T13:52:22, Digimer <lists at alteeve.ca> wrote:
> 
>> 1. It won't (reliably) work with DRBD because.
> 
> Not by itself, no. You need shared storage for it, not replicated
> storage. (Though the shared storage can be provided by other nodes via
> iSCSI too.)

True, but it's an additional layer at questionable ROI.

>> 2. I never trust a fence method that requires the victim be in any way
>> operational.
> 
> The watchdog integration is hardware-assisted. It's similar to relying
> on the hypervisor for fencing, or the management board.

Yup, watchdog is acceptable as it doesn't rely on the node being
functional. The trick there is how long it takes for the watchdog to
bite the node. Recovery is much slower than with IPMI/PDU fencing.
Perfectly acceptable, but it doesn't suit my use case.
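
To put rough shape on the timing difference, here is a minimal sketch.
Every number and function name in it is a made-up assumption for
illustration, not a measurement or a real API: with a watchdog, the
surviving nodes have to wait out the full timeout (plus some margin)
before they can assume the victim is dead, while an IPMI/PDU fence
agent returns as soon as the device confirms the power-off.

```python
# Hypothetical timings in seconds; illustrative assumptions only,
# not measurements from any real cluster.

def watchdog_recovery_time(watchdog_timeout, safety_margin):
    """Survivors cannot confirm the reset, so they wait out the whole
    watchdog timeout plus a margin before starting recovery."""
    return watchdog_timeout + safety_margin


def power_fence_recovery_time(agent_latency):
    """An IPMI/PDU fence agent reports success once the device confirms
    the node is off, so recovery can begin right after that."""
    return agent_latency


watchdog = watchdog_recovery_time(watchdog_timeout=60, safety_margin=10)
power = power_fence_recovery_time(agent_latency=5)
print(f"watchdog: {watchdog}s before recovery, IPMI/PDU: {power}s")
```

The actual values depend entirely on how the watchdog and the fence
devices are configured; the point is only that the watchdog path has a
fixed wait built in.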

>> 3. If I have a SAN, I probably have a SAN switch and can do fabric fencing.
> 
> That is not the same. Fabric fencing only cuts the node off from shared
> storage; it does not trigger a reboot, nor does it cut the node off from
> misbehaving on the network. (Unless you can fence that too, which is
> more tricky to set up.)
> 
> One of the key advantages is that it is trivial to set up, and works
> even on shared storage that doesn't support SCSI(2/3) reservations.

Agreed, that is why I don't use fabric fencing myself. That said, I
wrote 'fence_dlink_snmp' as a way to totally isolate a failed node from
the network, as a proof of concept. I've never deployed it though.

I'd debate "trivial" though, as it has some hefty hardware requirements
(SANs aren't cheap, last I checked). If you have the hardware though,
then you have a point.

>> 4. There is a similar mechanism already in the fence_* world;
>> https://alteeve.ca/w/Watchdog_Recovery
> 
> Oh, someone reinvented the wheel. Nice ;-)

Maybe they got tired of waiting for a wrapper? Oh snap! ;)

> fence_sanlock though doesn't properly support multiple fencing devices
> (sbd supports 1 to 3), and it doesn't offer the additional level of
> pacemaker integration that sbd has.
> 
> I'm curious; you claim 'fence_sanlock' is a "real fence method", but you
> seem to suggest sb

I don't; I was just writing the docs for it. However, anything that
works outside of the node is a "real" fence device. So sbd and
fence_sanlock are equally "real".

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



