[Pacemaker] Trouble getting two node cluster to failover when network lost

Aaron Wilson awilson at nautilusgrp.com
Thu Mar 20 13:18:30 EDT 2014


Sorry... I sent the last message early by accident.

syslog line:

Mar 20 09:59:53 baymaster-67 cib: [1846]: debug: cib_process_xpath: cib_query: //cib/status//node_state[@id='baymaster-67']//transient_attributes//nvpair[@name='pingd'] does not exist
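
A possible reading of this log line against the configuration quoted below, offered as a sketch rather than a confirmed diagnosis. The query looks for a node attribute named pingd, which is the default attribute name written by ocf:pacemaker:ping when no name parameter is set, while ping_rule tests an attribute called ping. The rule also carries a score of inf, which would attract the group to a node that has lost connectivity instead of pushing it away. With two addresses in host_list and multiplier="100", pingd should be 200 when both are reachable and 100 when only one is, so "lte 0" would match only when both are unreachable. Assuming the default attribute name, and assuming the goal is to move the group when either subnet is lost, the location constraint might look like this:

```
# Sketch only. Assumptions: the ping RA writes its default attribute "pingd",
# and 200 (2 hosts x multiplier 100) is the value when both hosts answer.
location baymaster_ping baymaster-resources \
        rule $id="ping_rule" -inf: not_defined pingd or pingd lt 200
```

To see whether ping is succeeding, crm_mon -A -1 should list the node attributes (including pingd) for each node. For the scores, crm_simulate -sL (or ptest -sL, if this 1.1.6 build still ships it) should print the allocation scores computed from the live CIB.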


On Thu, Mar 20, 2014 at 10:15 AM, Aaron Wilson <awilson at nautilusgrp.com> wrote:

> OK, I tried the ping RA, but my VIPs do not migrate when the ping connection
> is lost. I placed my two VIPs in a group, and I believe I must have something
> wrong with the scoring or location rules. Should I be using a clone for the
> ping RA?
>
> What is a good way to check whether ping is failing or succeeding and
> whether the scoring is happening correctly?
>
> Below are my configuration and snippets from syslog:
>
> node baymaster-67
> node baymaster-67-failover
> primitive ip1 ocf:heartbeat:IPaddr2 \
>         params ip="192.168.67.81" nic="eth0" \
>         op monitor interval="2s"
> primitive ip2 ocf:heartbeat:IPaddr2 \
>         params ip="192.168.200.1" nic="eth1" \
>         op monitor interval="2s"
> primitive ping-nodes ocf:pacemaker:ping \
>         params host_list="192.168.67.80 192.168.200.100" multiplier="100" dampen="5s" \
>         op monitor interval="60" timeout="60" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60"
> group baymaster-resources ip1 ip2
> clone c_ping-nodes ping-nodes
> location baymaster_ping baymaster-resources \
>         rule $id="ping_rule" inf: ping lte 0
> location baymaster_vip baymaster-resources \
>         rule $id="ip_rule" inf: #uname eq baymaster-67
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>         cluster-infrastructure="cman" \
>         no-quorum-policy="ignore" \
>         stonith-enabled="false"
>
> Should I be concerned about this line in syslog
>
>
>
>
>
> On Wed, Mar 19, 2014 at 1:55 PM, Stefan Bauer <stefan.bauer at cubewerk.de> wrote:
>
>> So you want to set up a ping resource in each subnet. If your active node
>> cannot reach the ping node in one subnet, it's an indication that the node
>> has lost its connectivity in that network.
>>
>> Kind regards
>>
>> Stefan Bauer
>> --
>> Cubewerk GmbH
>> Herzog-Otto-Straße 32
>> 83308 Trostberg
>> 08621 - 99 60 237
>> HRB 22195 AG Traunstein
>> GF Stefan Bauer
>>
>> > On 19.03.2014 at 21:29, "Aaron Wilson" <awilson at nautilusgrp.com> wrote:
>> >
>> > Stefan, thanks for the reply.
>> >
>> > Having two NICs is not for redundancy in my case. Resources on the
>> > primary server are being accessed from both subnets at the same time. The
>> > secondary server is to be a failover if the server goes down or if any of
>> > the Ethernet ports becomes disconnected for any reason. I read through the
>> > documentation and I am still not sure of the relationship between the
>> > Corosync hostnames / interfaces and Pacemaker resources. Could Corosync be
>> > configured to detect failure and start failover of a node using RRP, or
>> > does the resource need to be monitored by Pacemaker in order to get moved
>> > from the primary to the secondary server?
>> >
>> > There is actually a third NIC on the servers which could be used only
>> > for cluster communication if that works better.
>> >
>> >
>> > Thanks again for your input. I will do some more reading as well.
>> >
>> > - Aaron
>> > _______________________________________________
>> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> >
>> > Project Home: http://www.clusterlabs.org
>> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> > Bugs: http://bugs.clusterlabs.org
>> >
>>
>
>
>
> --
> Aaron Wilson | IT Manager
>
>
> Nautilus Group Inc.
> www.nautilusgrp.com
> 2201 Dwight Way | Berkeley, CA 94704
> M: 801.644.2533
>
>


-- 
Aaron Wilson | IT Manager


Nautilus Group Inc.
www.nautilusgrp.com
2201 Dwight Way | Berkeley, CA 94704
M: 801.644.2533