[Pacemaker] Wiki example problems

Thu May 28 16:35:22 EDT 2009

After following the wiki example for sharing an IP address (http://clusterlabs.org/wiki/Example_configurations), I'm 
able to manually fail over the resource with crm using the following statement (my nodes are ha1 and ha2):

	crm resource migrate failover-ip ha2

However, if I halt the box that currently owns the floating IP, or otherwise abruptly kill networking on it, 
failover never happens automatically.  I followed the example exactly, and the resource was initially created with:

	primitive failover-ip ocf:heartbeat:IPaddr params ip=192.168.7.250 op monitor interval=10

...so I'm not quite sure what the issue is.  The messaging layer seems to work since crm status shows the node as being 
down, but the resource allocation layer seems to be failing, probably somewhere in the CRM...?

There is no firewall between these nodes, so I haven't run tcpdump to check whether the cluster messages are getting 
through; I can't imagine that's the issue here.  This is what things look like after the simulated failure:

root@ha1:~# crm status

============
Last updated: Thu May 28 16:31:20 2009
Current DC: ha1 (ha1)
Version: 1.0.2-c02b459053bfa44d509a2a0e0247b291d93662b7
2 Nodes configured.
1 Resources configured.
============

Node: ha1 (ha1): online
Node: ha2 (ha2): UNCLEAN (offline)

root@ha1:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0c:29:cd:78:4e
           inet addr:192.168.7.134  Bcast:192.168.7.255  Mask:255.255.255.0
           inet6 addr: fe80::20c:29ff:fecd:784e/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:7212 errors:0 dropped:0 overruns:0 frame:0
           TX packets:12373 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:919781 (898.2 KB)  TX bytes:1489819 (1.4 MB)
           Base address:0x2000 Memory:d8920000-d8940000

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:624 errors:0 dropped:0 overruns:0 frame:0
           TX packets:624 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:61572 (60.1 KB)  TX bytes:61572 (60.1 KB)

root@ha1:~# crm_resource -L
failover-ip	(ocf::heartbeat:IPaddr) Started

As you can see, nothing has happened: ha2 is marked UNCLEAN (offline), the floating IP has not been brought up on ha1, 
and the resource is still reported as Started.  Hopefully someone can spot my mistake.  Thanks in advance for any help.
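
In case it helps anyone reproduce the check, here is a rough sketch of two shell helpers that pull the relevant facts 
out of the output above.  The names unclean_nodes and has_addr are made up for illustration.  Both assume the exact 
line formats shown in this message ("Node: name (id): state" from crm status, and "inet addr:a.b.c.d" from ifconfig) 
and would likely need adjusting for other versions of either tool.

```shell
#!/bin/sh
# Hypothetical helpers; they only parse text read from stdin.

# Print the name of every node that the `crm status` output marks UNCLEAN.
# Assumes lines of the form: "Node: ha2 (ha2): UNCLEAN (offline)"
unclean_nodes() {
    awk '/^Node: / && /UNCLEAN/ { print $2 }'
}

# Succeed if the `ifconfig` output lists the given IPv4 address.
# Assumes the older "inet addr:a.b.c.d  ..." layout; the trailing space
# in the pattern keeps 192.168.7.25 from matching 192.168.7.250.
has_addr() {
    grep -qF "inet addr:$1 "
}
```

Usage on the surviving node would be something like:

	crm status | unclean_nodes
	ifconfig | has_addr 192.168.7.250 && echo "floating IP is up here"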

-Ryan