[Pacemaker] IPaddr2 cloned address doesn't survive node standby
Andreas Ntaflos
daff at pseudoterminal.org
Fri May 17 19:25:32 UTC 2013
In a two-node cluster I am trying to use a cloned IP address with a
cloned Bind 9 instance, in an active-active way. Why? Because simple IP
failover does not work well with Bind, as it only answers queries on the
addresses that are bound to the NIC when starting up (I know about
Bind's "interface-interval" setting, but the minimum of one minute is
far too long). Using Ubuntu 12.04.2, Corosync 1.4.2 and Pacemaker 1.1.6.
So my configuration sees to it that the cloned address is set on both
nodes and Bind is started afterwards (op params omitted for readability):
node dns01
node dns02
primitive p_bind9 lsb:bind9
primitive p_ip_service_ns ocf:heartbeat:IPaddr2 \
        params ip="192.168.114.17" cidr_netmask="24" nic="eth0" \
        clusterip_hash="sourceip-sourceport"
clone cl_bind9 p_bind9 \
        meta interleave="false"
clone cl_ip_service_ns p_ip_service_ns \
        meta globally-unique="true" clone-max="2" \
        clone-node-max="2" interleave="true"
order o_ip_before_bind9 inf: cl_ip_service_ns cl_bind9
(suggestions to improve or correct this configuration gladly accepted)
After Corosync starts up the first time everything seems correct: I can
see the cluster/cloned/service IP address and the CLUSTERIP iptables
rules on both nodes.
But after putting dns01 in standby and then bringing it online again, the
cloned address is no longer present on dns01, only on dns02. The iptables
rules are also gone from dns01.
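A check along the following lines can confirm on each node whether the
cloned address and its CLUSTERIP rule are present after a standby/online
cycle. This is only a sketch: the helper names has_cluster_ip and
has_clusterip_rule are made up for illustration, and it assumes the
iproute2 "ip" tool and the rule format printed by "iptables -S". It is not
necessarily the exact procedure used for the observations above.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Pacemaker).

# Succeeds if the address $1 appears in `ip -o -4 addr show` output on stdin.
# The trailing "/" keeps 192.168.114.17 from matching 192.168.114.170.
has_cluster_ip() {
    grep -qF -- "inet $1/"
}

# Succeeds if a CLUSTERIP rule for destination $1 appears in `iptables -S`
# output on stdin. Assumes the rule is printed as "-d <address>/32".
has_clusterip_rule() {
    awk -v dst="-d $1/32" '
        index($0, "CLUSTERIP") && index($0, dst) { found = 1 }
        END { exit !found }
    '
}

# Example use on a node (needs root for iptables), e.g. after
# `crm node standby dns01` followed by `crm node online dns01`:
#
#   ip -o -4 addr show dev eth0 | has_cluster_ip 192.168.114.17 \
#       && echo "address present" || echo "address missing"
#   iptables -S INPUT | has_clusterip_rule 192.168.114.17 \
#       && echo "CLUSTERIP rule present" || echo "CLUSTERIP rule missing"
```

Both helpers read from standard input, so they can also be run against
saved command output when comparing dns01 and dns02.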
Then, when I put dns02 into standby, the IP address is moved to dns01, and
after dns02 goes online again the address is no longer present on dns01
(neither are the iptables rules).
So the IP address is moved between the nodes, each move accompanied by a
restart of the Bind service (cl_bind9/p_bind9).
All of this doesn't seem right to me. Shouldn't the cloned IP address
always be present on *both* nodes when they are online?
Andreas
PS: In the end this configuration works since the Bind 9 service is
always available to answer queries on the cluster address (as long as
there is one node online) but it seems that the Bind 9 clones are
restarted too often and too liberally when things change. This, however,
may be a separate issue, possibly related to the order directive and the
interleave meta params.