[Pacemaker] Very strange behavior on asymmetric cluster

Serge Dubrouski sergeyfd at gmail.com
Mon Mar 21 12:10:58 EDT 2011

On Sat, Mar 19, 2011 at 4:14 PM, Pavel Levshin <pavel at levshin.spb.ru> wrote:
> 19.03.2011 19:10, Dan Frincu:
>>> Even if that is set, we need to verify that the resources are, indeed,
>>> NOT running where they shouldn't be; remember, it is our job to ensure
>>> that the configured policy is enforced. So, we probe them everywhere to
>>> ensure they are indeed not around, and stop them if we find them.
>> Again, WHY do you need to verify things which cannot happen by setup? If
>> some resource cannot, REALLY CANNOT exist on a node, and administrator can
>> confirm this, why rely on network, cluster stack, resource agents,
>> electricity in power outlet, etc. to verify that 2+2 is still 4?
> Don't want to step on any toes or anything, mainly because me stepping on
> somebody's toes without the person wearing a pair of steel-toe cap boots
> would leave them toeless, but I've been hearing the ranting go on and on and
> just felt like maybe something's missing from the picture, specifically, an
> example for why checking for resources on passive nodes is a good thing,
> which I haven't seen thus far.
> ...
> Ok, so far it sounds perfect, but what happens if on the secondary/passive
> node, someone starts the service, by user error, by upgrading the software
> and thus activating its automatic startup at the given runlevel and
> restarting the secondary node (common practice when performing upgrades in a
> cluster environment), etc. If Pacemaker were not to check all the nodes for
> the service being active or not => epic fail. Its state-based model, where
> it maintains a state of the resources and performs the necessary actions to
> bring the cluster to that state is what saves us from the "epic fail"
> moment.
>
> Surely you are right. Resources must be monitored on standby nodes to
> prevent such a scenario. You can break your setup in many other ways,
> however. And Pacemaker (1.0.10, at least) does not execute a recurring
> monitor on a passive node, so you may start your service by hand, and it
> will go unnoticed for quite some time.
>
> What I am talking about is monitoring (probing) of a resource on a node
> where this resource cannot exist. For example, say you have five nodes in
> your cluster and a DRBD resource, which can, by its nature, work on no more
> than two nodes. Then the other three nodes will occasionally be probed
> for that resource. If that action fails, the resource will be restarted
> everywhere. If that node cannot be fenced, the resource will be dead.

As far as I understand, that would require a definition of a "quorum"
node or another special kind of node where a resource cannot exist.
Figuring out such a role from location/colocation rules seems too
complex to me. The idea of a quorum node was abandoned long ago in
favor of some other features/projects that Lars mentioned earlier.

> There is still at least one case when such a failure may happen even if RA
> is perfect: misbehaving or highly overloaded node may cause RA timeout. And
> bugs or configuration errors may, of course.
> A resource should not depend on unrelated things, such as nodes which have
> no connections to the resource. Then the resource will be more stable.
> I'm trying to be impartial here, although I may be biased by my experience
> to rule in favor of Pacemaker, but here's a thought, it's a free world, we
> all have the freedom of speech, which I'm also exercising at the moment,
> want something done, do it yourself, patches are being accepted, don't have
> the time, ask people for their help, in a polite manner, wait for them to
> reply, kindly ask them again (and prayers are heard, Steven Dake released a
> patch for automatic redundant ring recovery, see
> http://www.mail-archive.com/openais@lists.linux-foundation.org/msg06072.html,
> thank you Steven), want
> something done fast, pay some developers to do it for you, say the folks
> over at www.linbit.com wouldn't mind some sponsorship (and I'm not
> affiliated with them in any way, believe it or not, I'm actually doing this
> without external incentives, from the kindness of my heart so to speak).
> My goal for now is to make the problem clear to the team. It is doubtful
> that such a patch will be accepted without that, given current reaction.
> Moreover, it is not clear how to fix the problem to the best advantage.
> This cluster stack is brilliant. It's a pity to see how it fails to keep a
> resource running while it is relatively simple to avoid unneeded downtime.
> Thank you for participating.
> P.S. There is a crude workaround: op monitor interval="0" timeout="10"
> on_fail="nothing". Obviously, it has its own deficiencies.
> --
> Pavel Levshin

Serge Dubrouski.
