[Pacemaker] Very strange behavior on asymmetric cluster

Pavel Levshin pavel at levshin.spb.ru
Thu Mar 17 16:14:45 EDT 2011


17.03.2011 11:05, Andrew Beekhof:
>
> You don't need fake RAs.
> The cluster will behave just fine if the RA is missing, just not if
> it's present and reports bogus status

To be precise, the cluster will behave fine only if the RA is missing,
the cluster actually runs "monitor" on that node, and the whole
infrastructure works correctly (so that the "not installed" return code
is not lost somewhere along the way, as it was in my case two weeks
ago). So this cluster will not tolerate, for example, some network
faults. That is a strange property for a fault-tolerant cluster.

This design also conflicts with the idea of a "quorum node", which is
not supposed to run resources. A quorum node, by its mere existence,
may cause a resource failure!

It also will not work when two resources use the same RA and one of
those resources is not applicable to a node. The resource agent will be
present there, returning something like "not configured". This is where
fake RAs come in, one per resource.
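
For illustration, a fake RA of this kind could look roughly like the
Python sketch below. It is only a sketch under assumptions: the function
name fake_agent and the choice of Python are hypothetical (real agents
are usually shell scripts), and the meta-data and validate-all actions
are left out for brevity. The exit codes are the standard OCF ones.

```python
# Standard OCF exit codes used by this sketch.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_ERR_INSTALLED = 5
OCF_NOT_RUNNING = 7


def fake_agent(action):
    """Return the OCF exit code a placeholder agent would report.

    Hypothetical helper: it models an agent installed on a node where
    the resource can never run.
    """
    if action == "monitor":
        # A clean "not running", so the probe is not treated as a failure.
        return OCF_NOT_RUNNING
    if action == "stop":
        # Nothing can be running here, so stopping trivially succeeds.
        return OCF_SUCCESS
    if action == "start":
        # Refuse to start: the real software is not installed on this node.
        return OCF_ERR_INSTALLED
    # Any other action is not handled by this sketch.
    return OCF_ERR_UNIMPLEMENTED
```

A wrapper script would pass its first command-line argument to
fake_agent() and exit with the returned code.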

To be impartial, I would like to know what good you see in the current
design for the case of an asymmetric cluster or a quorum node. What is
good about checking a resource's status on nodes where the resource
cannot exist? What justifies the increased resource downtime caused by
monitor failures, which are inevitable in the real world?

Is it a kind of automation? But it saves nothing, because the cluster
administrator is forced to delete unused RAs after every installation,
upgrade, and even some reconfigurations.

You are stating that RAs must be reliable. That is a good point. But
even the Earth is not completely round, and any program may fail, even
a bug-free one, because of external problems. Are we concerned about
fault tolerance and high availability? If so, then we should plan for
erroneous and disastrous conditions.


--
Pavel Levshin




