Thanks very much for the link. The Percona MySQL script does pretty much exactly what I need with regard to master/slave promotion and demotion. I did run into a couple of issues with virtual IPs and how they should follow the master and slave(s) around.<div>
<br><div>The problem I'm now running into is how to tie the writer and reader VIPs to the right roles. I want writer_vip to stay with the master SQL server and reader_vip_1 to stay with a slave. I'll have N slaves, but I only need one virtual IP for them; I don't care which slave Pacemaker picks to hold it, as long as it picks one. I've tried to set up colocation constraints to map them together, but I get into states where the master SQL server ends up holding reader_vip_1. I've included my cib and what crm_mon reports as the current status. As you can see, the SQL master and writer_vip are not on the same node.</div>
<div><br></div><div>Basically, I now have a cib that looks like this (crm configure show):</div><div><br></div><div><div>node $id="0d6be727-1552-4028-ad8a-cf54b2766da0" three \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>attributes IP="172.17.0.130" standby="off"</div>
<div>node $id="7deca2cd-9a64-476c-8ea2-372bca859a4f" four \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>attributes IP="172.17.0.131" standby="off"</div><div>node $id="bb15cdbc-8bec-4f64-83bb-8bbd6d4ca1a7" seven \</div>
<div><span class="Apple-tab-span" style="white-space:pre">        </span>attributes IP="172.17.0.134" standby="off"</div><div>primitive p_sql ocf:percona:MySQL_replication \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>params reader_vip_prefix="reader_vip_" ms_replication_resource_name="ms_novaSQL" master_log_file="mysql-bin.000038" master_log_pos="106" promoted_coordinates="::" master_host="172.17.0.131" \</div>
<div><span class="Apple-tab-span" style="white-space:pre">        </span>params super_db_user="root" super_db_password="nova" \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>params repl_db_user="novaSlave" repl_db_password="nova" allowed_sbm="10" \</div>
<div><span class="Apple-tab-span" style="white-space:pre">        </span>params state_file="/var/run/heartbeat/novaSQL.state" recover_file="/var/run/heartbeat/novaSQL.recovery" \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>params p_replication_resource_name="p_sql" \</div>
<div><span class="Apple-tab-span" style="white-space:pre">        </span>params heartbeat_table="ocf.heartbeat" \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>op monitor interval="10s" role="Master" \</div>
<div><span class="Apple-tab-span" style="white-space:pre">        </span>op monitor interval="10s" role="Slave"</div><div>primitive reader_vip_1 ocf:heartbeat:IPaddr2 \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>params ip="172.17.0.97" nic="eth0" \</div>
<div><span class="Apple-tab-span" style="white-space:pre">        </span>meta target-role="Started"</div><div>primitive writer_vip ocf:heartbeat:IPaddr2 \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>params ip="172.17.0.96" nic="eth0" \</div>
<div><span class="Apple-tab-span" style="white-space:pre">        </span>meta target-role="Started"</div><div>ms ms_novaSQL p_sql \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>meta master-max="1" master-node-max="1" clone-max="3" clone-node-max="1" target-role="Master" notify="false" globally-unique="false"</div>
<div>colocation reader_vip_coloc_slave inf: ms_novaSQL:Slave reader_vip_1</div><div>colocation writer_vip_coloc_master inf: ms_novaSQL:Master writer_vip</div><div>order order_writer_vip_after_master inf: ms_novaSQL:promote writer_vip:start</div>
<div>property $id="cib-bootstrap-options" \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>dc-version="1.0.9-unknown" \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>cluster-infrastructure="Heartbeat" \</div>
<div><span class="Apple-tab-span" style="white-space:pre">        </span>stonith-enabled="false" \</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>no-quorum-policy="ignore" \</div><div>
<span class="Apple-tab-span" style="white-space:pre">        </span>last-lrm-refresh="1313615481"</div><div><br></div><div>Here is the output from crm_mon. You can see that the master SQL server has reader_vip_1 and a slave has writer_vip (it should be the other way around).</div>
<div><br></div><div><div>Node four (7deca2cd-9a64-476c-8ea2-372bca859a4f): online</div><div> p_sql:1 (ocf::percona:MySQL_replication) Master</div><div> reader_vip_1 (ocf::heartbeat:IPaddr2) Started</div>
<div>
Node three (0d6be727-1552-4028-ad8a-cf54b2766da0): online</div><div> p_sql:0 (ocf::percona:MySQL_replication) Slave</div><div> writer_vip (ocf::heartbeat:IPaddr2) Started</div><div>Node seven (bb15cdbc-8bec-4f64-83bb-8bbd6d4ca1a7): online</div>
<div> p_sql:2 (ocf::percona:MySQL_replication) Slave</div></div><div><br></div><div><br></div><div>As always, any ideas/suggestions are appreciated.</div><div><br></div><div>-Mike.</div><div><br></div><br><div class="gmail_quote">
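<div>A note on the constraints above, offered as a sketch rather than a verified fix: in crm shell syntax, a colocation of the form "colocation ID SCORE: A B" places A relative to B, so B is allocated first and A follows it. Both colocation lines in the cib list ms_novaSQL first, which asks the master/slave roles to follow the VIPs rather than the VIPs to follow the roles. Assuming the same resource names, and noting that this has not been tested on this cluster or against the Percona agent's own reader VIP handling, the reversed form would look roughly like the fragment below. The lines starting with # are annotations for the reader and may need to be removed before loading the fragment.</div>

```
# The VIP is the dependent resource, so it is listed first and follows the role.
colocation writer_vip_coloc_master inf: writer_vip ms_novaSQL:Master
colocation reader_vip_coloc_slave inf: reader_vip_1 ms_novaSQL:Slave
# Unchanged from the cib above: start writer_vip only after a promote.
order order_writer_vip_after_master inf: ms_novaSQL:promote writer_vip:start
```

<div>If the constraints take effect as intended, crm_mon should list writer_vip under the node whose p_sql instance is Master, and reader_vip_1 under one of the Slave nodes.</div>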
On Mon, Aug 15, 2011 at 1:04 PM, Viacheslav Biriukov <span dir="ltr"><<a href="mailto:v.v.biriukov@gmail.com">v.v.biriukov@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Hello.<div>Check it out <a href="https://code.launchpad.net/percona-prm" target="_blank">https://code.launchpad.net/percona-prm</a>. </div><div>And presentation: <a href="http://www.percona.com/files/presentations/percona-live/nyc-2011/PerconaLiveNYC2011-MySQL-High-Availability-with-Pacemaker.pdf" target="_blank">http://www.percona.com/files/presentations/percona-live/nyc-2011/PerconaLiveNYC2011-MySQL-High-Availability-with-Pacemaker.pdf</a><br>
<div><div><div></div><div class="h5"><br><div class="gmail_quote">2011/8/15 Michael Szilagyi <span dir="ltr"><<a href="mailto:mszilagyi@gmail.com" target="_blank">mszilagyi@gmail.com</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I'm already using the mysql RA file from <a href="https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/mysql" target="_blank">https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/mysql</a> (which also seems to have replication support in it).<div>
<br></div><div>Basically, what seems to be happening is that Pacemaker detects that the master has dropped and promotes a slave to master. However, it does not reconfigure the remaining slaves with a CHANGE MASTER TO. I can see some lines in the OCF script that relate to changing the master, but they aren't setting it up properly. If I log in to the slave and issue a CHANGE MASTER TO ... / START SLAVE, then replication starts up normally again.</div>
<div><br></div><div>Since I can see that the script does allow the master host to be set when going through set/unset_master, I'm hoping it's just something I'm missing and not a limitation of using Pacemaker to manage the SQL replication cluster.<br>
<br></div><div>Hopefully someone can point me at what I am missing.</div><div><br></div><div>-Mike.</div><div><div></div><div><div><br><div class="gmail_quote">On Mon, Aug 15, 2011 at 1:56 AM, Dan Frincu <span dir="ltr"><<a href="mailto:df.cluster@gmail.com" target="_blank">df.cluster@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<div><br>
On Sat, Aug 13, 2011 at 2:53 AM, Michael Szilagyi <<a href="mailto:mszilagyi@gmail.com" target="_blank">mszilagyi@gmail.com</a>> wrote:<br>
> I'm new to Pacemaker and trying to understand exactly what it can and can't<br>
> do.<br>
> I currently have a small, mysql master/slave cluster setup that is getting<br>
> monitored within Heartbeat/Pacemaker: What I'd like to be able to do (and<br>
> am hoping Pacemaker will do) is to have 1 node designated as Master and in<br>
> the event of a failure, automatically promote a slave to master and realign<br>
> all of the existing slaves to be slaves of the newly promoted master.<br>
> Currently what seems to be happening, however, is heartbeat correctly sees<br>
> that a node goes down and pacemaker promotes a slave to master, but<br>
> replication is not adjusted so that the new master feeds everyone else. It<br>
> seems like this should be possible to do from within Pacemaker but I feel<br>
> like I'm missing a part of the puzzle. Any suggestions would be<br>
> appreciated.<br>
<br>
</div>You could try the mysql RA => from<br>
<a href="https://github.com/fghaas/resource-agents/blob/master/heartbeat/mysql" target="_blank">https://github.com/fghaas/resource-agents/blob/master/heartbeat/mysql</a><br>
Last I heard, it had replication support.<br>
<br>
HTH.<br>
<div><div></div><div><br>
><br>
> Here's an output of my crm configure show:<br>
> node $id="7deca2cd-9a64-476c-8ea2-372bca859a4f" four \<br>
> attributes 172.17.0.130-log-file-p_sql="mysql-bin.000013"<br>
> 172.17.0.130-log-pos-p_sql="632"<br>
> node $id="9b355ab7-8c81-485c-8dcd-1facedde5d03" three \<br>
> attributes 172.17.0.131-log-file-p_sql="mysql-bin.000020"<br>
> 172.17.0.131-log-pos-p_sql="106"<br>
> primitive p_sql ocf:heartbeat:mysql \<br>
> params config="/etc/mysql/my.cnf" binary="/usr/bin/mysqld_safe"<br>
> datadir="/var/lib/mysql" \<br>
> params pid="/var/lib/mysql/novaSQL.pid" socket="/var/run/mysqld/mysqld.sock"<br>
> \<br>
> params max_slave_lag="120" \<br>
> params replication_user="novaSlave" replication_passwd="nova" \<br>
> params additional_parameters="--skip-external-locking<br>
> --relay-log=novaSQL-relay-bin --relay-log-index=relay-bin.index<br>
> --relay-log-info-file=relay-bin.info" \<br>
> op start interval="0" timeout="120" \<br>
> op stop interval="0" timeout="120" \<br>
> op promote interval="0" timeout="120" \<br>
> op demote interval="0" timeout="120" \<br>
> op monitor interval="10" role="Master" timeout="30" \<br>
> op monitor interval="30" role="Slave" timeout="30"<br>
> primitive p_sqlIP ocf:heartbeat:IPaddr2 \<br>
> params ip="172.17.0.96" \<br>
> op monitor interval="10s"<br>
> ms ms_sql p_sql \<br>
> meta target-role="Started" is-managed="true"<br>
> location l_sqlMaster p_sqlIP 10: three<br>
> location l_sqlSlave1 p_sqlIP 5: four<br>
> property $id="cib-bootstrap-options" \<br>
> dc-version="1.0.9-unknown" \<br>
> cluster-infrastructure="Heartbeat" \<br>
> stonith-enabled="false" \<br>
> no-quorum-policy="ignore" \<br>
> last-lrm-refresh="1313187103"<br>
><br>
> Thanks!<br>
> -Mike.<br>
</div></div>> _______________________________________________<br>
> Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org" target="_blank">Pacemaker@oss.clusterlabs.org</a><br>
> <a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>
><br>
> Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>
> Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
> Bugs:<br>
> <a href="http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker" target="_blank">http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker</a><br>
><br>
><br>
<br>
<br>
<br>
--<br>
Dan Frincu<br>
CCNA, RHCE<br>
<br>
</blockquote></div><br></div>
</div></div><br>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br></div></div><font color="#888888">Viacheslav Biriukov<br>BR<br><div><a href="http://biriukov.com" target="_blank">http://biriukov.com</a></div><br>
</font></div></div>
<br></blockquote></div><br></div></div>