[Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

Andrew Beekhof andrew at beekhof.net
Mon Jul 1 11:14:35 UTC 2013


On 01/07/2013, at 5:32 PM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:

> 29.06.2013 02:22, Andrew Beekhof wrote:
>> 
>> On 29/06/2013, at 12:22 AM, Digimer <lists at alteeve.ca> wrote:
>> 
>>> On 06/28/2013 06:21 AM, Andrew Beekhof wrote:
>>>> 
>>>> On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree <lmb at suse.com> wrote:
>>>> 
>>>>> On 2013-06-27T12:53:01, Digimer <lists at alteeve.ca> wrote:
>>>>> 
>>>>>> primitive fence_n01_psu1_off stonith:fence_apc_snmp \
>>>>>>      params ipaddr="an-p01" pcmk_reboot_action="off" port="1" \
>>>>>>      pcmk_host_list="an-c03n01.alteeve.ca"
>>>>>> primitive fence_n01_psu1_on stonith:fence_apc_snmp \
>>>>>>      params ipaddr="an-p01" pcmk_reboot_action="on" port="1" \
>>>>>>      pcmk_host_list="an-c03n01.alteeve.ca"
>>>>> 
>>>>> So every device twice, including location constraints? I see potential
>>>>> for optimization by improving how the fence code handles this ... That's
>>>>> abhorrently complex. (And I'm not sure the 'action' parameter ought to
>>>>> be overwritten.)
>>>> 
>>>> I'm not crazy about it either because it means the device is tied to a specific command.
>>>> But it seems to be something all the RHCS people try to do...
>>> 
>>> Maybe something in the rhcs water cooler made us all mad... ;)
>>> 
>>>>> Glad you got it working, though.
>>>>> 
>>>>>> location loc_fence_n01_ipmi fence_n01_ipmi -inf: an-c03n01.alteeve.ca
>>>>> [...]
>>>>> 
>>>>> I'm not sure you need any of these location constraints, by the way. Did
>>>>> you test if it works without them?
>>>>> 
>>>>>> Again, this is after just one test. I will want to test it several more
>>>>>> times before I consider it reliable. Ideally, I would love to hear
>>>>>> Andrew or others confirm this looks sane/correct.
>>>>> 
>>>>> It looks correct, but not quite sane. ;-) That seems not to be
>>>>> something you can address, though. I'm thinking that fencing topology
>>>>> should be smart enough, if multiple fencing devices are specified, to
>>>>> know how to expand them to "first all off (if off fails anywhere, it's a
>>>>> failure), then all on (if on fails, it is not a failure)". That'd
>>>>> greatly simplify the syntax.
>>>> 
>>>> The RH agents have apparently already been updated to support multiple ports.
>>>> I'm really not keen on having stonith-ng do this.
>>> 
>>> This doesn't help people who have dual power rails/PDUs for power
>>> redundancy.
>> 
>> I'm yet to be convinced that having two PDUs is helping those people in the first place.
>> If it were actually useful, I suspect more than two/three people would have asked for it in the last decade.
> 
> I'm just silently waiting for this to happen.

Rarely a good plan.
Better to make my life so miserable that implementing it seems like a vacation in comparison :)

> Although I use a different fencing scheme (and plan to use an even more
> different one), that is a very nice fall-back path for me. And I strongly
> prefer all complexities like reboot -> off-off-on-on to be hidden from
> the configuration. Naturally, that is a task for the entity which has
> the whole picture of what to do - stonithd. Just my 'IMHO'.

If the tides of public opinion change, then yes, stonithd is the place.
But I can't justify the effort for only a handful of deployments.
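
For readers following the thread, the "reboot -> off-off-on-on" expansion discussed above (all "off" first, where any failure is a fencing failure; then all "on", where failures are tolerated) could be sketched roughly as below. This is only an illustration of the ordering and failure rules, not stonithd code: the function name, the fence callback signature and the device names are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical callback: (device, action) -> True if the action succeeded.
FenceCall = Callable[[str, str], bool]


def reboot_via_pdus(devices: Sequence[str], fence: FenceCall) -> bool:
    """Expand a 'reboot' into off-off-on-on across every power feed."""
    # Phase 1: cut every feed. If any "off" fails, the node may still
    # have power through that feed, so the whole fencing attempt fails.
    for dev in devices:
        if not fence(dev, "off"):
            return False
    # Phase 2: restore power. The node is already confirmed off, so a
    # failed "on" does not turn the fencing result into a failure.
    for dev in devices:
        fence(dev, "on")
    return True


if __name__ == "__main__":
    # Illustrative stand-in for a real fence agent; it only records calls.
    log = []

    def fake_fence(dev: str, action: str) -> bool:
        log.append((dev, action))
        return True

    print(reboot_via_pdus(["pdu1-port1", "pdu2-port1"], fake_fence))
    print(log)
```

The point of keeping the two phases separate is that a node with redundant PSUs is only safely fenced once every feed has been confirmed off; interleaving off/on per device would never leave it without power.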

> 
> And, as to PSU/PDU: I, like Digimer, always separate power circuits as much
> as possible. Of course I always use redundant PSUs.
> 
> Vladislav
> 
> 
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org




