[Pacemaker] Slave does not start after failover: Mysql circular replication and master-slave resources
Attila Megyeri
amegyeri at minerva-soft.com
Thu Dec 15 14:42:30 CET 2011
Hi All,
Some time ago I exchanged a couple of posts with you here regarding MySQL active-active HA.
The best solution I have found so far is MySQL multi-master replication, also referred to as circular replication.
Basically, I set up two nodes, both capable of the master role, and changes were immediately propagated to the other node.
Still, I wanted an M/S approach with an RW master and an RO slave, mainly because I prefer to have a single master VIP that my apps can connect to.
(In the first approach I configured a two-node clone, and the master IP was always bound to one of the nodes.)
I applied the following configuration:
node db1 \
    attributes IP="10.100.1.31" \
    attributes standby="off" db2-log-file-db-mysql="mysql-bin.000021" db2-log-pos-db-mysql="40730"
node db2 \
    attributes IP="10.100.1.32" \
    attributes standby="off"
primitive db-ip-master ocf:heartbeat:IPaddr2 \
    params lvs_support="true" ip="10.100.1.30" cidr_netmask="8" broadcast="10.255.255.255" \
    op monitor interval="20s" timeout="20s" \
    meta target-role="Started"
primitive db-mysql ocf:heartbeat:mysql \
    params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" datadir="/var/lib/mysql" user="mysql" pid="/var/run/mysqld/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" test_passwd="XXXXX" \
        test_table="replicatest.connectioncheck" test_user="slave_user" replication_user="slave_user" replication_passwd="XXXXX" additional_parameters="--skip-slave-start" \
    op start interval="0" timeout="120s" \
    op stop interval="0" timeout="120s" \
    op monitor interval="30" timeout="30s" OCF_CHECK_LEVEL="1" \
    op promote interval="0" timeout="120" \
    op demote interval="0" timeout="120"
ms db-ms-mysql db-mysql \
    meta notify="true" master-max="1" clone-max="2" target-role="Started"
colocation db-ip-with-master inf: db-ip-master db-ms-mysql:Master
property $id="cib-bootstrap-options" \
    dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="0"
The setup works under basic conditions:
* After the "first" startup, both nodes start up as slaves, and shortly afterwards one of them is promoted to master.
* Updates to the master are replicated properly to the slave.
* The slave accepts updates, which is wrong, but I can live with this - I will only allow connections to the master VIP.
* If I stop the slave for some time and restart it, it catches up with the master shortly and gets back in sync.
I have, however, a serious issue:
* If I stop the current master, the slave is promoted, accepts RW queries, and the master IP is bound to it - all fine.
* BUT - when I try to bring the other node back online, it simply shows: Stopped (not installed)
Online: [ db1 db2 ]
 db-ip-master (ocf::heartbeat:IPaddr2): Started db1
 Master/Slave Set: db-ms-mysql [db-mysql]
     Masters: [ db1 ]
     Stopped: [ db-mysql:1 ]
Node Attributes:
* Node db1:
    + IP : 10.100.1.31
    + db2-log-file-db-mysql : mysql-bin.000021
    + db2-log-pos-db-mysql : 40730
    + master-db-mysql:0 : 3601
* Node db2:
    + IP : 10.100.1.32
Failed actions:
    db-mysql:0_monitor_30000 (node=db2, call=58, rc=5, status=complete): not installed
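For reference, the rc=5 in the failed action above is one of the standard OCF resource agent exit codes. A minimal sketch that maps those codes to their symbolic names follows; the function name ocf_rc_name is only an illustrative helper, not part of Pacemaker or the resource-agents package.

```shell
#!/bin/sh
# Map a standard OCF resource agent exit code to its symbolic name.
# ocf_rc_name is a hypothetical helper written for this post.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        8) echo OCF_RUNNING_MASTER ;;
        9) echo OCF_FAILED_MASTER ;;
        *) echo UNKNOWN ;;
    esac
}

# rc reported by the failed monitor operation on db2
ocf_rc_name 5
```

So the monitor on db2 returned OCF_ERR_INSTALLED, which is what crm_mon renders as "not installed".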
I checked the logs, but could not find a reason why the slave on db2 does not start.
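One thing that may be worth ruling out is whether the files and directories named in the db-mysql params actually exist on db2. This rests on the assumption that the agent reports "not installed" when one of them is absent on that node, which is not confirmed by anything in the logs here. A minimal sketch, where check_paths is a hypothetical helper and the paths are the ones from the primitive above:

```shell
#!/bin/sh
# check_paths: print one line per path and return non-zero if any is missing.
# Hypothetical helper; it only tests for existence, not ownership or permissions.
check_paths() {
    missing=0
    for p in "$@"; do
        if [ -e "$p" ]; then
            echo "ok       $p"
        else
            echo "MISSING  $p"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Paths taken from the db-mysql primitive; run this on db2.
check_paths /usr/bin/mysqld_safe /etc/mysql/my.cnf /var/lib/mysql /var/run/mysqld \
    || echo "at least one configured path is missing on $(uname -n)"
```

If everything is present, the failed action would still need to be cleared (for example with crm resource cleanup db-mysql) before the cluster retries the resource on that node.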
Any ideas, anyone?
Thanks,
Attila