[Pacemaker] Very strange behavior on asymmetric cluster
Andrew Beekhof
andrew at beekhof.net
Fri Mar 18 07:58:30 UTC 2011
On Thu, Mar 17, 2011 at 9:14 PM, Pavel Levshin <pavel at levshin.spb.ru> wrote:
> 17.03.2011 11:05, Andrew Beekhof:
>>
>> You don't need fake RAs.
>> The cluster will behave just fine if the RA is missing, just not if
>> it's present and reports bogus status
>
> To be precise, the cluster will behave fine only if the RA is missing, the
> cluster tries to "monitor", and all the infrastructure works (so the return
> code "not installed" is not lost somewhere, as it was in my case two
> weeks ago). So this cluster will not tolerate, for example, some network
> faults. That is a strange feature for a fault-tolerant cluster.
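
For readers following the return-code argument, here is a minimal sketch of how a probe ("monitor") result could be classified on a node that is not meant to run the resource. The numeric values are the standard OCF exit codes; `classify_probe` and its result strings are hypothetical names used only for illustration and are not Pacemaker's actual implementation. A result that never arrives is modeled as `None`.

```python
# Standard OCF resource agent exit codes (subset).
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_INSTALLED = 5   # "not installed": the RA or its software is missing
OCF_ERR_CONFIGURED = 6  # "not configured"
OCF_NOT_RUNNING = 7


def classify_probe(rc):
    """Hypothetical helper: map a probe exit code to a coarse state.

    Assumption (from the thread): a probe that reports "not installed"
    is harmless, because the resource simply cannot be active there.
    """
    if rc is None:
        # The result was lost (for example, by an infrastructure fault),
        # so the cluster cannot tell whether the resource is active.
        return "unknown"
    if rc == OCF_SUCCESS:
        return "running"
    if rc in (OCF_NOT_RUNNING, OCF_ERR_INSTALLED):
        return "not running"
    # Anything else, including "not configured", counts as a failure here.
    return "failed"


for code in (OCF_ERR_INSTALLED, OCF_ERR_CONFIGURED, None):
    print(code, classify_probe(code))
```

In this simplified model the disagreement is about the last two cases: an RA that is present but reports an error, and a result that is lost in transit.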
>
> Also, this design conflicts with the idea of a "quorum node", which is not
> supposed to run resources. A quorum node, by its existence alone, may cause
> resource failure!
Not at all, just don't run the Pacemaker part there.
>
> It also will not work when two resources use the same RA and one of those
> resources is not applicable to a node. The resource agent will be present,
> returning something like "not configured". This is where fake RAs come in,
> one per resource.
No, they don't.
> To be impartial,
Good one.
> I would like to know what good you see in the current
> design, for the case of an asymmetric cluster or a quorum node. What's good
> in checking a resource's status on nodes where the resource cannot exist?
> What justifies the increased resource downtime caused by monitor failures,
> which are inevitable in the real world?
>
> Is it a kind of automation? But it saves nothing, because the cluster
> administrator is forced to delete unused RAs after every installation,
> upgrade, and even some reconfigurations.
>
> You are stating that RAs must be reliable. It is a good point. But even the
> Earth is not completely round, and any program may fail, even if it is
> bug-free, due to external problems. Are we concerned about fault tolerance
> and high availability? If so, then we should think of erroneous or
> disastrous conditions.
>
>
> --
> Pavel Levshin
>
>