[Pacemaker] Pacemaker issues on Amazon EC2

Mon Jun 17 17:19:45 EDT 2013

tl;dr summary: On EC2, we can't reuse IP addresses, and we need a reliable, scriptable procedure for replacing a dead (guaranteed no longer running) server with another one without needing to take the remaining cluster members down.

I'm trying to build a Pacemaker solution using Percona Replication Manager (https://github.com/jayjanssen/Percona-Pacemaker-Resource-Agents/blob/master/doc/PRM-setup-guide.rst) within our EC2 environment. Essentially, the architecture would be 3 independent MySQL servers, running in different data centers, each of which runs Pacemaker/Corosync with an agent that manages master/slave replication.

I have a script that builds a new instance from the base OS: it installs the cluster software, generates the appropriate config files, and loads the CRM configuration on boot. This is the method we use to launch servers; in the event that a server dies, we don't attempt to recover it. Instead, we launch an entirely new instance (possibly even in a different data center), which amounts to building a brand new server with a new private IP address. (Every server has a private IP address that carries traffic within the data center, and a public address whose traffic leaves the cloud only to come back in, which brings security implications, added latency, and additional cost.) Ideally, the boot script should be able to handle everything on its own -- we should be able to create the instance, and by the time the script has finished running, the new box should be in the cluster as a slave, taking the place of whichever one had previously died.

The problem I'm running into is that because we're on EC2, we don't control our IP address allocation. If we did, we'd start a new server with the same IP as the one it's replacing, and my understanding is that Pacemaker would pick right back up and let it join the cluster. Instead, because it has a new IP, we always end up in a split-brain situation: the two original members of the cluster see each other but think the third is down, while the new one thinks it's the first member of a new cluster whose other two members are down. The only way I've found to correct this is to stop pacemaker/corosync on all instances, regenerate the config files, and start them up again, which defeats the purpose of replacing a node without taking the rest of the cluster down.

Does anyone have any experience with, or suggestions for, working in this kind of situation? Moving off of EC2 is not an option; creating a private network (Amazon VPC) so that we can get static addresses has performance implications we'd rather avoid. Any ideas for solutions or reliable workarounds, especially scriptable ones, would be extremely helpful. (To be clear, we don't need a process that automatically replaces a server when one goes down. The chef boot script is kicked off manually, but once started it should be able to go from software installation to rejoining the cluster with no further intervention.)

Some options we have available, along with some things we've tried:

- We can create DNS entries for the three servers by known names (e.g. mysql-01, mysql-02, mysql-03) which point to the private network IP addresses. We can put those hostnames into the config files, or we can resolve them at boot time and put the IP addresses in directly. However, this requires that all three servers be online before running the installation scripts on any one box. The ideal solution would use only hostnames and re-resolve the IP any time the cluster needs to configure membership, thus letting any new server take over the DNS entry but not the IP address.

- We can create an Elastic IP, which provides a static public IP even before any of the servers are running. This way, the config can always reference that IP and it is always reachable, but traffic going to that IP has to leave the cloud, which we'd like to avoid. Given that pacemaker/corosync is relatively low traffic, having only those services run over the public IP would be acceptable. So far, though, that has not seemed to solve our split-brain problem.

- We can always ensure that there is only one server corresponding to each DNS entry at any given time. (That is, when we launch a new server as mysql-02, no other running server still thinks that it's mysql-02.)

- We can regenerate the corosync.conf at any time without requiring the services to be stopped, if it's possible to have that config take effect without a service restart.

- We can always determine the current IPs of all members from external scripts via DNS.
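
To make the DNS-based options above a bit more concrete, here is a minimal sketch of the piece the boot script (or an external script) could run: resolve the known hostnames to their current private IPs and render the membership section of the config from them. The hostnames come from the list above; everything else is an assumption on my part. In particular, the function names are made up for illustration, and the output uses the `nodelist` / `ring0_addr` / `nodeid` layout from corosync 2.x, which may not match the corosync version actually in use. Node IDs are tied to the position of the hostname in the list, so a replacement server that takes over a DNS entry keeps the same ID even though its IP changes.

```python
import socket

# Hostnames from the DNS option above; the order fixes each node's ID.
NODE_NAMES = ["mysql-01", "mysql-02", "mysql-03"]


def resolve_members(names, resolver=socket.gethostbyname):
    """Return {hostname: ip} for every name that currently resolves."""
    members = {}
    for name in names:
        try:
            members[name] = resolver(name)
        except OSError:
            # No DNS record yet (server not launched); skip it for now.
            continue
    return members


def render_nodelist(names, members):
    """Render a corosync 2.x style nodelist block (assumed format)."""
    lines = ["nodelist {"]
    for nodeid, name in enumerate(names, start=1):
        if name not in members:
            continue
        lines.append("    node {")
        lines.append("        ring0_addr: %s" % members[name])
        lines.append("        nodeid: %d" % nodeid)
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_nodelist(NODE_NAMES, resolve_members(NODE_NAMES)))
```

This only covers generating the membership from DNS. Whether a running corosync can be made to pick up the regenerated file without a restart is exactly the open question above, and the sketch does not answer it.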