[Pacemaker] Adding VIP support for the MySQL RA

Sat Nov 12 15:56:55 EST 2011

An N+1 or N+X topology might be good for that cascading scenario...  Find a
sweet spot for the evict date.  If slaves are lagging too much, scale and
tune.

I haven't read Yves' patch, but I'll check it out.  I just saw that he was
looking for slaves to work with a VIP and suggested a couple of ways I've
seen it work.

On Sat, Nov 12, 2011 at 2:51 PM, Florian Haas <florian at hastexo.com> wrote:

> Hi Yves and Michael,
>
> On 2011-11-12 19:22, Yves Trudeau wrote:
> > lol... How many large databases have you managed?  Once evicted, MySQL
> > will be restarted by Pacemaker so all the caches will be cold.
>
> If I may say so, before you start laughing at people on the list, it may
> be a good idea to actually get your facts straight and check what
> evict_outdated_slaves does. For a too-far-behind slave it bails out of
> monitor with $OCF_ERR_INSTALLED, which Pacemaker considers a hard error.
> Thus, that instance will _not_ be restarted by Pacemaker on this node
> unless an administrator intervenes.
>
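To make the monitor behaviour described above concrete, here is a minimal
Python sketch. The real agent is a shell script; the function name
monitor_slave and the max_slave_lag parameter are assumptions made for
illustration, and only the return code and the evict_outdated_slaves name come
from the discussion. The numeric values are the standard OCF return codes.

```python
# Standard OCF return codes (numeric values from the OCF resource agent API).
OCF_SUCCESS = 0
OCF_ERR_INSTALLED = 5  # Pacemaker treats this as a hard error on this node


def monitor_slave(seconds_behind_master, max_slave_lag,
                  evict_outdated_slaves=True):
    """Hypothetical stand-in for the lag check in the monitor action.

    Returns the OCF code the monitor would exit with. A hard error means
    Pacemaker will not restart the instance on this node until an
    administrator cleans up the resource.
    """
    if evict_outdated_slaves and seconds_behind_master > max_slave_lag:
        return OCF_ERR_INSTALLED
    return OCF_SUCCESS


if __name__ == "__main__":
    print(monitor_slave(10, 3600))    # within the limit
    print(monitor_slave(4000, 3600))  # too far behind: hard error
```

The point of the sketch is only that the decision is one-way: nothing in this
path lets the slave come back on its own.
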
> Still, Michael, Yves has a point that evict_outdated_slaves is not
> optimal (and I'm saying this as the guy that wrote that part of the
> agent). It's fine for a temporary problem that affects a single slave,
> but please consider this scenario:
>
> - High load on the database, across several instances.
> - Slaves start lagging behind.
> - We shut down a slave that is too far behind.
> - We now have _fewer_ instances to handle the same load.
> - Slaves fall further behind.
> - We shut down more slaves.
>
> This can turn into a cascading failure. Note, specifically, that the
> lagging slave has no real option to catch up even when the database
> isn't being hammered anymore, unless an admin has intervened and
> recovered/restarted the instance manually. And, of course, Yves' point
> about cold caches is entirely valid.
>
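The cascade can be shown with a deliberately crude model. This is a toy sketch,
not a measurement: it assumes read load is spread evenly, that a slave lags
whenever its share exceeds a fixed capacity, and that one lagging slave is
evicted per round. The function name and the numbers are made up for
illustration.

```python
def surviving_slaves(total_load, slaves, capacity_per_slave):
    """Toy model of eviction under load.

    While the per-slave share of the load exceeds what one slave can
    absorb, one more slave falls too far behind and is shut down. Evicted
    slaves stay down, because nothing restarts them automatically.
    """
    while slaves > 0 and total_load / slaves > capacity_per_slave:
        slaves -= 1
    return slaves


if __name__ == "__main__":
    # Four slaves, slightly overloaded: every eviction makes it worse.
    print(surviving_slaves(450, 4, 100))
    # One extra slave (N+1) keeps the per-slave share under capacity.
    print(surviving_slaves(450, 5, 100))
```

In this model the slightly overloaded four-slave pool drains to zero, while the
same load on five slaves never triggers an eviction, which is the argument for
keeping headroom.
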
> In Yves' approach, we wouldn't shut down MySQL, but merely shift away
> the slave's virtual IP. So while clients can't connect to the slave via
> its virtual IP anymore, the slave can still fetch updates from the
> master -- and thus, actually has a chance to catch up. Once it's
> sufficiently caught up, it gets the VIP back and clients can talk to
> that slave again. And since we never stopped MySQL, we also don't have
> the cold cache problem.
>
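The difference between the two policies can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not code from Yves'
patch: the function name, the policy labels and the lag samples are all
hypothetical, and the "evict" branch simply keeps the slave unavailable once
it has crossed the threshold.

```python
def reader_available(lag_samples, max_lag, policy):
    """Per monitor interval, can clients read from this slave?

    policy "evict": MySQL is stopped the first time the lag exceeds
    max_lag and stays down (later samples are ignored, since a stopped
    slave is not replicating).
    policy "vip": MySQL keeps running and replicating; only the virtual
    IP is withdrawn while the lag exceeds max_lag.
    """
    if policy not in ("evict", "vip"):
        raise ValueError("unknown policy: %r" % policy)
    available = []
    evicted = False
    for lag in lag_samples:
        if policy == "evict":
            evicted = evicted or lag > max_lag
            available.append(not evicted)
        else:
            available.append(lag <= max_lag)
    return available


if __name__ == "__main__":
    samples = [10, 90, 120, 80, 20]  # seconds behind master, made up
    print(reader_available(samples, 60, "evict"))
    print(reader_available(samples, 60, "vip"))
```

With the VIP policy the last sample is served again once the lag drops below
the limit; with eviction the slave never returns without manual recovery.
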
> Yves' patches are not perfect (and they're not expected to be, that's
> what a review is for), but I think his approach is sound and shouldn't
> be shot down simply because evict_outdated_slaves is already there.
>
> Cheers,
> Florian
>
> --
> Need help with High Availability?
> http://www.hastexo.com/now