[Pacemaker] IP Range Failover with IPaddr2 and clone / globally-unique="true"
Jake Smith
jsmith at argotec.com
Wed Jan 25 15:45:59 UTC 2012
----- Original Message -----
> From: "Anton Melser" <melser.anton at gmail.com>
> To: "Jake Smith" <jsmith at argotec.com>, "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Wednesday, January 25, 2012 9:24:09 AM
> Subject: Re: [Pacemaker] IP Range Failover with IPaddr2 and clone / globally-unique="true"
>
> > Let's try that again with something useful!
> >
> > I'm not an expert on it but...
> >
> > unique_clone_address:
> > If true, add the clone ID to the supplied value of ip to create a
> > unique address to manage (optional, boolean, default false)
> >
> > So for example:
> > primitive ClusterIP ocf:heartbeat:IPaddr2 \
> > params ip="10.0.0.1" cidr_netmask="32" clusterip_hash="sourceip"
> > \
> > op monitor interval="30s"
> > clone CloneIP ClusterIP \
> > meta globally-unique="true" clone-max="8"
> >
> > would result in 8 IPs: 10.0.0.1, 10.0.0.2, ... up to 10.0.0.8.
>
> Ok, so I have reinstalled everything and have a clean setup. However,
> it still ain't working, unfortunately. Can you explain how I'm supposed
> to use unique_clone_address? This is mentioned at the start of the
> thread but not with the command. I tried doing what you suggest here:
>
> # primitive ClusterIP.144.1 ocf:heartbeat:IPaddr2 params
> ip="10.144.1.1" cidr_netmask="32" clusterip_hash="sourceip" op
> monitor
> interval="120s"
> # clone CloneIP ClusterIP.144.1 meta globally-unique="true"
> clone-max="8"
>
As Dejan said, I missed clone-node-max="8" (it defaults to 1 instance per node, so with two nodes only 2 instances of the clone were started).
I also missed something in the primitive which would have caused it to create only one IP no matter how many instances Pacemaker reported: you also have to set unique_clone_address="true" on the primitive. When I tested without it, crm_mon showed 8 instances started on the node, but ip address show only listed one IP. Here's the example I tested successfully (pinged all 8 without issue from the other node):
root@Condor:~# crm configure show p_testIPs
primitive p_testIPs ocf:heartbeat:IPaddr2 \
params ip="192.168.2.104" cidr_netmask="29" clusterip_hash="sourceip" nic="bond0" iflabel="testing" unique_clone_address="true" \
op monitor interval="60"
root@Condor:~# crm configure show cl_testIPs
clone cl_testIPs p_testIPs \
meta globally-unique="true" clone-node-max="8" clone-max="8" target-role="Started"
root@Condor:~# crm_mon
<snip>
Clone Set: cl_testIPs [p_testIPs] (unique)
p_testIPs:0 (ocf::heartbeat:IPaddr2): Started Vulture
p_testIPs:1 (ocf::heartbeat:IPaddr2): Started Vulture
p_testIPs:2 (ocf::heartbeat:IPaddr2): Started Vulture
p_testIPs:3 (ocf::heartbeat:IPaddr2): Started Vulture
p_testIPs:4 (ocf::heartbeat:IPaddr2): Started Vulture
p_testIPs:5 (ocf::heartbeat:IPaddr2): Started Vulture
p_testIPs:6 (ocf::heartbeat:IPaddr2): Started Vulture
p_testIPs:7 (ocf::heartbeat:IPaddr2): Started Vulture
root@Vulture:~# ip a s
<snip>
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
link/ether 84:2b:2b:1a:bf:d6 brd ff:ff:ff:ff:ff:ff
inet 192.168.2.42/22 brd 192.168.3.255 scope global bond0
inet 192.168.2.104/29 brd 192.168.2.111 scope global bond0:testing
inet 192.168.2.105/29 brd 192.168.2.111 scope global secondary bond0:testing
inet 192.168.2.106/29 brd 192.168.2.111 scope global secondary bond0:testing
inet 192.168.2.107/29 brd 192.168.2.111 scope global secondary bond0:testing
inet 192.168.2.110/29 brd 192.168.2.111 scope global secondary bond0:testing
inet 192.168.2.111/29 brd 192.168.2.111 scope global secondary bond0:testing
inet 192.168.2.108/29 brd 192.168.2.111 scope global secondary bond0:testing
inet 192.168.2.109/29 brd 192.168.2.111 scope global secondary bond0:testing
inet6 fe80::862b:2bff:fe1a:bfd6/64 scope link
valid_lft forever preferred_lft forever
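For what it's worth, the "pinged all 8" check from the other node was nothing fancy; a quick loop along these lines does the job (just a sketch, the 104-111 range simply matches the addresses above):

root@Condor:~# for i in $(seq 104 111); do ping -c1 -W1 192.168.2.$i >/dev/null && echo "192.168.2.$i OK" || echo "192.168.2.$i FAILED"; done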
You may also want to look into the different options for clusterip_hash, nic, and the arp_* parameters, and think about what cidr_netmask to use for the IPaddr2 primitive.
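For example, here's roughly what a primitive with a few of those knobs set might look like (the values are only illustrative placeholders, not something I've tuned for your environment):

primitive p_exampleIP ocf:heartbeat:IPaddr2 \
    params ip="192.168.2.104" cidr_netmask="29" nic="bond0" iflabel="testing" \
        clusterip_hash="sourceip-sourceport" unique_clone_address="true" \
        arp_count="5" arp_interval="200" \
    op monitor interval="60"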
> That gave:
>
> [root@FW1 ~]# crm status
> ============
> Last updated: Wed Jan 25 13:57:51 2012
> Last change: Wed Jan 25 13:57:05 2012 via cibadmin on FW1
> Stack: openais
> Current DC: FW1 - partition with quorum
> Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
> 2 Nodes configured, 2 expected votes
> 8 Resources configured.
> ============
>
> Online: [ FW1 FW2 ]
>
> Clone Set: CloneIP.144.1 [ClusterIP.144.1] (unique)
> ClusterIP.144.1:0 (ocf::heartbeat:IPaddr2): Started FW1
> ClusterIP.144.1:1 (ocf::heartbeat:IPaddr2): Started FW2
> ClusterIP.144.1:2 (ocf::heartbeat:IPaddr2): Stopped
> ClusterIP.144.1:3 (ocf::heartbeat:IPaddr2): Stopped
> ClusterIP.144.1:4 (ocf::heartbeat:IPaddr2): Stopped
> ClusterIP.144.1:5 (ocf::heartbeat:IPaddr2): Stopped
> ClusterIP.144.1:6 (ocf::heartbeat:IPaddr2): Stopped
> ClusterIP.144.1:7 (ocf::heartbeat:IPaddr2): Stopped
>
> But none of the IPs were pingable after running the clone (just with
> the primitive it was ok).
> doing:
> crm(live)# configure property stop-all-resources=false
> Didn't get the other IPs "Started".
>
> So I got rid of this (successfully) and tried:
>
> primitive ClusterIP.144.1 ocf:heartbeat:IPaddr2 params
> ip="10.144.1.1"
> cidr_netmask="32" clusterip_hash="sourceip"
> unique_clone_address="true" op monitor interval="120s"
>
> But now I have:
>
> crm(live)# status
> ============
> Last updated: Wed Jan 25 14:57:42 2012
> Last change: Wed Jan 25 14:50:09 2012 via cibadmin on FW1
> Stack: openais
> Current DC: FW1 - partition with quorum
> Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
> 2 Nodes configured, 2 expected votes
> 1 Resources configured.
> ============
>
> Online: [ FW1 FW2 ]
>
> ClusterIP.144.1 (ocf::heartbeat:IPaddr2): Started FW1
> (unmanaged) FAILED
This shows the primitive is unmanaged - that means the stop-all-resources setting won't apply, because Pacemaker isn't managing the resource right now.
Try:
crm(live)# resource
crm(live)resource# manage ClusterIP.144.1
crm(live)resource# up
crm(live)# configure
crm(live)configure# property stop-all-resources=true
crm(live)configure# commit
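Once the resource is managed again and actually stopped, you should be able to remove it; something along these lines should work (standard crm shell, untested against your exact setup):
crm(live)# resource
crm(live)resource# stop ClusterIP.144.1
crm(live)resource# cleanup ClusterIP.144.1
crm(live)resource# up
crm(live)# configure
crm(live)configure# delete ClusterIP.144.1
crm(live)configure# commit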
>
> Failed actions:
> ClusterIP.144.1_stop_0 (node=FW1, call=25, rc=6,
> status=complete):
> not configured
>
> And I can't delete it:
> crm(live)# configure property stop-all-resources=true
> crm(live)# configure commit
> INFO: apparently there is nothing to commit
> INFO: try changing something first
> crm(live)# configure erase
> WARNING: resource ClusterIP.144.1 is running, can't delete it
> ERROR: CIB erase aborted (nothing was deleted)
>
> I can't work out how to move forward... Any pointers?
> Cheers
> Anton
>
>