[Pacemaker] Speeding up failover
Andrew Beekhof
andrew at beekhof.net
Wed Aug 7 07:11:12 UTC 2013
On 25/07/2013, at 4:31 PM, Devdas Bhagat <devdas.bhagat at booking.com> wrote:
> We have a master-slave setup for Redis, running 6 instances of Redis on
> each physical host, and one floating IP between them.
>
> Each redis instance is part of a single group.
>
> When we fail over the IP in production, I'm observing this sequence of
> events:
> 1. Pacemaker brings down the floating IP
> 2. Pacemaker demotes the master redis instance
> 3. Pacemaker stops each running redis process in sequence (essentially
>    stopping the group)
> 4. Pacemaker promotes the slave
> 5. Pacemaker brings up the floating IP on the former slave
>
> (This follows documented behaviour as I understand it, see
> http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg05344.html
> for someone else with a similar problem).
>
> Under production traffic load, each redis process takes about 4 to 5
> seconds to sync to disk and clean up.
Can they be stopped and/or started in parallel?
If so, don't put them in a group: group members are stopped one after
another, while ungrouped resources can be stopped in parallel.
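
To put rough numbers on that, here is a small sketch of the arithmetic,
using the 6 instances and 4 to 5 seconds per stop quoted above. The
function names are made up for illustration, and the parallel figure is a
best case: it assumes the cluster really does run all the stops
concurrently and ignores the demote, promote and IP steps.

```python
# Back-of-the-envelope model of the stop phase during failover.
# Function names are hypothetical; the figures come from the thread.

def sequential_stop_time(stop_times):
    """Members of a group stop one after another, so the times add up."""
    return sum(stop_times)


def parallel_stop_time(stop_times):
    """Independent resources can stop concurrently; the slowest one dominates."""
    return max(stop_times, default=0)


if __name__ == "__main__":
    instances = 6
    for per_instance in (4, 5):  # seconds per redis stop under load
        times = [per_instance] * instances
        print(
            f"{per_instance}s each: "
            f"sequential={sequential_stop_time(times)}s, "
            f"parallel={parallel_stop_time(times)}s"
        )
```

So the stop phase alone would drop from roughly 24 to 30 seconds to roughly
4 to 5 seconds, if the instances can in fact be stopped independently.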
> This means that a simple failover
> takes between 24 and 30 seconds, which is a bit too long for us.
> Acceptable failover times would be less than 5 seconds (the lower the
> better).
>
> Is there a configuration option to change the failover process to *not*
> stop the group before promoting the secondary? Alternatively,
> suggestions on how to get pacemaker to manage only the state of the
> redis process but not the process itself are welcome. (A process failure
> can be diagnosed by monitoring the response, or lack thereof, from redis
> itself, so a dead process and a non-responding one can be treated alike as
> far as monitoring goes.)
>
> Devdas Bhagat
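
For the monitoring idea in the last paragraph, a minimal sketch of a
response-based check might look like the following. It sends a Redis PING
and treats anything other than a +PONG reply (connection refused, timeout,
error reply) as a failure, so a dead process and a hung one look the same.
The function names, default host/port and the 1 second timeout are
assumptions for illustration, not part of any Pacemaker resource agent, and
an instance that requires authentication would be reported as down by this
check.

```python
import socket


def is_pong(reply: bytes) -> bool:
    """True if the raw reply is the simple-string answer to PING."""
    return reply.startswith(b"+PONG")


def redis_alive(host: str = "127.0.0.1", port: int = 6379,
                timeout: float = 1.0) -> bool:
    """Return True only if the server answers PING within the timeout.

    Host, port and timeout defaults are assumptions for this sketch.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")    # inline command form of PING
            return is_pong(conn.recv(64))
    except OSError:
        # Covers refused connections, resets and timeouts alike.
        return False


if __name__ == "__main__":
    print("alive" if redis_alive() else "dead or not responding")
```

A monitor action could call something like redis_alive() and report
failure when it returns False; how that is wired into the cluster is left
open here.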