[Pacemaker] Proposed new stonith topology syntax
Andrew Beekhof
andrew at beekhof.net
Mon Feb 6 21:22:53 UTC 2012
Stonith is never a SPOF.
Something else needs to have failed before fencing has even a chance to do so.
Unless you put all the nodes on the same PDU... but that would be silly.
On Mon, Feb 6, 2012 at 3:29 PM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
> 06.02.2012 01:55, Andrew Beekhof wrote:
>> On Sat, Feb 4, 2012 at 5:50 AM, Vladislav Bogdanov <bubble at hoster-ok.com> wrote:
>>> Hi Andrew, Dejan, all,
>>>
>>> 25.01.2012 03:24, Andrew Beekhof wrote:
>>> [snip]
>>>>>> If they're for the same host but different devices, then at most
>>>>>> you'll get the commands sent in parallel; guaranteeing simultaneity is
>>>>>> near impossible.
>>>>>
>>>>> Yes, what I meant is almost simultaneous, i.e. that both ports
>>>>> are turned "off" at the same time for a while. I'm not sure how
>>>>> it works in reality; for instance, how long the reset
>>>>> command keeps the power off on the outlet. So, it should be
>>>>> "simultaneous enough" :)
>>>>
>>>> I don't think 'reboot' is an option if you're using multiple devices.
>>>> You have to use 'off' (followed by a manual 'on') for any kind of reliability.
>>>>
>>>
>>> Why not implement subsequent 'ons' after all 'offs' are confirmed?
>>
>> That could be possible in the future.
>> However, since none of this was possible in the old stonithd, it's not
>> something I plan for the initial implementation.
>>
>> Also, you're requiring an extra level of intelligence in stonith-ng,
>> to know that even though the admin asked for 'reboot' and the devices
>> support 'reboot', that we should ignore that and do 'off' + 'on' in
>> some specific scenarios.
>>
>>> With some configurable delay, for example.
>>> That would be great for careful admins who keep fencing device lists up to date.
>>> From the admin's PoV, reset and reset-like on-off operations should not
>>> differ in their result: the offending host should be restarted if the admin says
>>> 'restart' or 'reboot' in the fencing parameters for that host (sorry, I do not
>>> remember which one is used).
>>> The need for a manual 'on' looks like a limitation to me, so I wouldn't use
>>> such a fencing mechanism. I prefer to have everything as automated and
>>> predictable as possible.
>>
>> Then don't put a node under the control of two devices.
>> Have it be two ports on the same host and you won't hit this limitation.
>
> It's a SPOF in the case of PDUs.
>
> I do not use PDUs at all. I have everything ready to short the 'reset'
> lines on servers instead of pulling power cords, and am just waiting for
> linear fencing topology to be implemented in both stonith-ng and crmsh.
>
> So, I am just thinking of the generic admin who wants to use PDUs for fencing.
>
>>
>>> If 'on' is not done, then fencing is not doing what you've specified
>>> (for 'reboot/reset' action).
>>>
>>> Even more, if we need to do a 'reset' of a host which has two PSUs
>>> connected to two different PDUs, then it should be translated to
>>> 'all-off' - 'delay' - 'all-on' automatically. I would like such a powerful
>>> fencing system very much (yes, I'm a careful admin).
>>>
>>> I understand that the implementation will require some effort (even for
>>> such a great programmer as you, Andrew), but that would be a really useful
>>> feature.
>>>
>>> Best,
>>> Vladislav
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
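
For illustration only, the 'all-off' - 'delay' - 'all-on' sequence
proposed in the quoted thread could be sketched roughly as follows. This
is not stonith-ng code and does not use any real fence-agent API: the
FakeOutlet class, the reboot_via_outlets function, its delay parameter
and the returned status strings are all hypothetical names made up for
this sketch. It assumes each outlet can confirm its own 'off' and 'on'.

```python
import time
from typing import Callable, List, Tuple


class FakeOutlet:
    """Hypothetical stand-in for one PDU outlet (not a real fence agent)."""

    def __init__(self, name: str, log: List[Tuple[str, str]]) -> None:
        self.name = name
        self.log = log
        self.powered = True

    def off(self) -> bool:
        # A real device would be queried here; this fake always confirms.
        self.powered = False
        self.log.append((self.name, "off"))
        return True

    def on(self) -> bool:
        self.powered = True
        self.log.append((self.name, "on"))
        return True


def reboot_via_outlets(
    outlets: List[FakeOutlet],
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Translate 'reboot' into all-off, delay, all-on (assumed semantics)."""
    # Phase 1: every outlet must confirm 'off', otherwise the node may
    # still be powered through its other PSU and fencing has failed.
    results = [outlet.off() for outlet in outlets]
    if not all(results):
        return "failed"

    # Keep the power off long enough for the node to really go down.
    sleep(delay)

    # Phase 2: only now turn the outlets back on. A failed 'on' does not
    # undo the fencing; it only means the node stays off.
    results = [outlet.on() for outlet in outlets]
    return "rebooted" if all(results) else "off-only"


if __name__ == "__main__":
    log: List[Tuple[str, str]] = []
    psus = [FakeOutlet("pdu1:3", log), FakeOutlet("pdu2:3", log)]
    print(reboot_via_outlets(psus, delay=0.0))
    print(log)
```

The point of the two phases is that no outlet is switched back on until
every 'off' has been confirmed, which is what distinguishes this from
sending 'reboot' to each device independently.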