[Pacemaker] Trouble getting two node cluster to failover when network lost

Andrew Beekhof andrew at beekhof.net
Thu Apr 3 22:18:48 EDT 2014


On 21 Mar 2014, at 4:15 am, Aaron Wilson <awilson at nautilusgrp.com> wrote:

> OK, I tried the ping RA, but my VIPs do not migrate when the ping connection is lost. I placed my two VIPs in a group, and I believe I must have something wrong with the scoring or location rules. Should I be using a clone for the ping RA?
> 
> What is a good way to check if ping is failing or succeeding and if the scoring is happening correctly?
> 
> Below is my configuration and snippets from syslog
> 
> node baymaster-67
> node baymaster-67-failover
> primitive ip1 ocf:heartbeat:IPaddr2 \
>         params ip="192.168.67.81" nic="eth0" \
>         op monitor interval="2s"
> primitive ip2 ocf:heartbeat:IPaddr2 \
>         params ip="192.168.200.1" nic="eth1" \
>         op monitor interval="2s"
> primitive ping-nodes ocf:pacemaker:ping \
>         params host_list="192.168.67.80 192.168.200.100" multiplier="100" dampen="5s" \
>         op monitor interval="60" timeout="60" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60"
> group baymaster-resources ip1 ip2
> clone c_ping-nodes ping-nodes
> location baymaster_ping baymaster-resources \
>         rule $id="ping_rule" inf: ping lte 0

This says baymaster-resources MUST ONLY run on nodes with no connectivity (ping lte 0).
Perhaps you meant -inf here.
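
A minimal sketch of what the corrected constraint might look like, assuming the intent is to keep the group off any node that has lost connectivity. One assumption to flag: unless its name parameter is set, ocf:pacemaker:ping publishes its result in a node attribute called pingd rather than ping, so the rule below tests pingd. Adjust the attribute name if your ping-nodes primitive sets name explicitly. The not_defined clause also excludes nodes where the attribute has not been written yet.

```
# Assumes the default attribute name "pingd" (no name= param on ping-nodes).
location baymaster_ping baymaster-resources \
        rule $id="ping_rule" -inf: not_defined pingd or pingd lte 0
```

Because -inf combined with inf yields -inf, this rule overrides the inf preference for baymaster-67 in baymaster_vip once that node loses connectivity. Note that with multiplier="100" and two hosts in host_list, the attribute is 200 when both hosts answer and 100 when only one does, so lte 0 fires only when both are unreachable; a threshold such as lt 200 would be needed to react to the loss of a single subnet. To check whether the attribute is being updated, crm_mon -A -1 prints the node attributes, and crm_simulate -sL (if your build ships it) shows the resulting allocation scores.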

> location baymaster_vip baymaster-resources \
>         rule $id="ip_rule" inf: #uname eq baymaster-67
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>         cluster-infrastructure="cman" \
>         no-quorum-policy="ignore" \
>         stonith-enabled="false"
> 
> Should I be concerned about this line in syslog?
> 
> 
> 
> 
> 
> On Wed, Mar 19, 2014 at 1:55 PM, Stefan Bauer <stefan.bauer at cubewerk.de> wrote:
> So you want to set up a ping resource in each subnet. If your active node cannot reach the ping node in one subnet, it's an indication that the node has lost its connectivity in that network.
> 
> Kind regards,
> 
> Stefan Bauer
> --
> Cubewerk GmbH
> Herzog-Otto-Straße 32
> 83308 Trostberg
> 08621 - 99 60 237
> HRB 22195 AG Traunstein
> GF Stefan Bauer
> 
> > On 19.03.2014 at 21:29, "Aaron Wilson" <awilson at nautilusgrp.com> wrote:
> >
> > Stefan, thanks for the reply.
> >
> > Having two nics is not for redundancy in my case. Resources on the primary server are being accessed from both subnets at the same time. The secondary server is to be a failover if the primary server goes down or if any of its Ethernet ports become disconnected for any reason. I read through the documentation and I am still not sure of the relationship between the Corosync hostnames / interfaces and Pacemaker resources. Could Corosync be configured to detect the failure and start failover of a node using rrp, or does the resource need to be monitored by Pacemaker in order to get moved from the primary to the secondary server?
> >
> > There is actually a third nic on the servers which could be used only for cluster communication if that works better.
> >
> >
> > Thanks again for your input. I will do some more reading as well.
> >
> > - Aaron
> > _______________________________________________
> >
> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> >
> >
> > Project Home: http://www.clusterlabs.org
> >
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >
> > Bugs: http://bugs.clusterlabs.org
> >
> 
> 
> -- 
> Aaron Wilson | IT Manager
> 
> 
> 
> Nautilus Group Inc.
> www.nautilusgrp.com
> 2201 Dwight Way | Berkeley, CA 94704
> M: 801.644.2533
> 
