[Pacemaker] MySQL Master-Master replication with Corosync and Pacemaker

Thu Jan 26 13:25:11 UTC 2012

Hi,

On Thu, Jan 26, 2012 at 1:43 AM, Peter Scott <Peter at psdt.com> wrote:
> Hello.  Our problem is that a Corosync restart on the idle machine in a
> 2-node cluster shuts down the mysqld process there and we need it to stay
> up for replication.  We are very new to Corosync and Pacemaker and have been
> slogging through every tutorial and document we can find.
>
> Here's the detail: We have two MySQL comasters (each is a master and a slave
> of the other).  Traffic needs to arrive at only one machine at a time
> because otherwise conflicting simultaneous updates at each machine would
> cause a problem.  There is a single IP for clients (192.168.185.50, see
> below).
>
> After much sweating, we came up with the configuration below.  It works: if
> we kill the machine that's in use we see it switch to the other one.  MySQL
> connections are seamlessly rerouted.
>
> The problem is this: Say that dev-mysql01 is the active node.  If we restart
> Corosync on dev-mysql02, it stops mysqld there and does not restart it.  We
> can of course restart it manually but we want to understand why this is
> happening because it surprises us and maybe there are other circumstances
> under which it would either stop mysqld or fail to restart it.

Corosync is the first layer in the cluster stack (membership and
messaging); Pacemaker is the second layer (cluster resource
management); your services sit on the third layer.

If you take down the bottom layer, which provides communication, the
upper layers have no way to talk to the rest of the cluster.

Bottom line: when services are controlled by the cluster and the
processes that control them are stopped by manual intervention,
everything under their control stops as well.

If the restart is intended for administrative purposes, follow Florian's advice.
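
For illustration only (this is an assumption on my part, not necessarily
what Florian suggested, and whether the option is available depends on
your Pacemaker version): one common way to keep the cluster from
stopping services during planned work on the stack is to mark the
resources as unmanaged beforehand. A minimal sketch in the same crm
configure syntax as your configuration:

```
# Assumed approach: tell Pacemaker to leave all resources alone
# while the cluster stack is being restarted.
property $id="cib-bootstrap-options" \
       maintenance-mode="true"
```

Set it back to "false" once the stack is up again. After that, the
cluster enforces the configuration as written, and a group such as
mysql_group runs all of its members on a single node.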

HTH,
Dan

>
> mysqld has to run on the inactive machine so that the active one can
> replicate all the transactions there, so that if the active one goes down
> the inactive one can come up in the current state.
>
> Why is a Corosync restart stopping mysqld?
>
> Here's our configuration:
>
> node dev-mysql01
> node dev-mysql02
> primitive DBIP ocf:heartbeat:IPaddr2 \
>        params ip="192.168.185.50" cidr_netmask="24" \
>        op monitor interval="30s"
> primitive mysql ocf:heartbeat:mysql \
>        params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" \
>        datadir="/var/lib/mysql" user="mysql" pid="/var/run/mysqld/mysqld.pid" \
>        socket="/var/lib/mysql/mysql.sock" test_passwd="secret" \
>        test_table="lbcheck.lbcheck" test_user="lbcheck" \
>        op monitor interval="20s" timeout="10s" \
>        meta migration-threshold="10"
> group mysql_group DBIP mysql
> location master-prefer-node1 mysql_group 50: dev-mysql01
> property $id="cib-bootstrap-options" \
>        dc-version="1.1.2-f059ec7ced7a86ff4a0b963bccfe" \
>        cluster-infrastructure="openais" \
>        expected-quorum-votes="2" \
>        stonith-enabled="false" \
>        no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>        resource-stickiness="100"

-- 
Dan Frincu
CCNA, RHCE