[Pacemaker] Help with Pacemaker 2-node Router Setup
Eric Renfro
erenfro at gmail.com
Sat Dec 26 09:52:38 UTC 2009
Michael Schwartzkopff wrote:
> On Saturday, 26 December 2009 at 08:12:49, Eric Renfro wrote:
>
>> Hello,
>>
>> I'm trying to set up 2 nodes that will run pacemaker with openais as the
>> communication layer. Ideally, what I want is for router1 to be the master
>> node and to take over from router2 when it comes back up fully functional
>> again. In my setup, both routers are internet-facing servers; whichever
>> node is active holds the external internet IP and also the internal IP
>> that serves as the gateway for internal systems to route through.
>>
>> My problem so far is with Route in my setup, and later with getting
>> shorewall to start/stop depending on which node is active.
>>
>> Route, in the setup shown below, fails to start initially, I presume
>> because the internet IP address is not fully initialized at the time it
>> tries to enable the route. If I run crm resource cleanup failover-gw,
>> it comes up just fine. If I try to move the router_cluster resource
>> from router1 to router2 after it's fully up, it fails because of
>> failover-gw on router2.
>>
>
> Very unlikely. Once the IPaddr2 script finishes, the IP address is up.
> Please look for other causes and grep "lrm.*failover-gw" in the logs.
>
>
>> Here's my setup at present. For the moment, shorewall is started
>> manually. I want to automate this once the setup is working, so perhaps
>> you could help me with that as well.
>>
>> primitive failover-int-ip ocf:heartbeat:IPaddr2 \
>> params ip="192.168.0.1" \
>> op monitor interval="2s"
>> primitive failover-ext-ip ocf:heartbeat:IPaddr2 \
>> params ip="24.227.124.158" cidr_netmask="30" broadcast="24.227.124.159" nic="net0" \
>> op monitor interval="2s" \
>> meta target-role="Started"
>> primitive failover-gw ocf:heartbeat:Route \
>> params destination="0.0.0.0/0" gateway="24.227.124.157" device="net0" \
>> meta target-role="Started" \
>> op monitor interval="2s"
>> group router_cluster failover-int-ip failover-ext-ip failover-gw
>> location router-master router_cluster \
>> rule $id="router-master-rule" $role="master" 100: #uname eq router1
>>
>> I would appreciate as much help as possible. I am fairly new to
>> pacemaker, but so far all but the Route part of this works well.
>>
>
> Please give us a chance to help you by providing the relevant logs!
>
>
Sure.
Here's a large excerpt of the log, grepped for just failover-gw. If this
doesn't help, I can narrow it down to what's happening around the failure;
the logs fill up pretty quickly as it's coming alive.
messages:Dec 26 02:00:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router2 returned 5 (not installed) instead of
the expected value: 7 (not running)
messages:Dec 26 02:00:21 router1 pengine: [4724]: ERROR: unpack_rsc_op:
Hard error - failover-gw_monitor_0 failed with rc=5: Preventing
failover-gw from re-starting on router2
messages:Dec 26 02:00:21 router1 pengine: [4724]: notice:
native_print: failover-gw#011(ocf::heartbeat:Route):#011Started router1
messages:Dec 26 02:00:21 router1 pengine: [4724]: notice: LogActions:
Leave resource failover-gw#011(Started router1)
messages:Dec 26 02:15:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router1 returned 0 (ok) instead of the expected
value: 7 (not running)
messages:Dec 26 02:15:21 router1 pengine: [4724]: WARN: unpack_rsc_op:
Operation failover-gw_monitor_0 found resource failover-gw active on router1
messages:Dec 26 02:15:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router2 returned 5 (not installed) instead of
the expected value: 7 (not running)
messages:Dec 26 02:15:21 router1 pengine: [4724]: ERROR: unpack_rsc_op:
Hard error - failover-gw_monitor_0 failed with rc=5: Preventing
failover-gw from re-starting on router2
messages:Dec 26 02:15:21 router1 pengine: [4724]: notice:
native_print: failover-gw#011(ocf::heartbeat:Route):#011Started router1
messages:Dec 26 02:15:21 router1 pengine: [4724]: notice: LogActions:
Leave resource failover-gw#011(Started router1)
messages:Dec 26 02:30:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router1 returned 0 (ok) instead of the expected
value: 7 (not running)
messages:Dec 26 02:30:21 router1 pengine: [4724]: WARN: unpack_rsc_op:
Operation failover-gw_monitor_0 found resource failover-gw active on router1
messages:Dec 26 02:30:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router2 returned 5 (not installed) instead of
the expected value: 7 (not running)
messages:Dec 26 02:30:21 router1 pengine: [4724]: ERROR: unpack_rsc_op:
Hard error - failover-gw_monitor_0 failed with rc=5: Preventing
failover-gw from re-starting on router2
messages:Dec 26 02:30:21 router1 pengine: [4724]: notice:
native_print: failover-gw#011(ocf::heartbeat:Route):#011Started router1
messages:Dec 26 02:30:21 router1 pengine: [4724]: notice: LogActions:
Leave resource failover-gw#011(Started router1)
messages:Dec 26 02:45:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router1 returned 0 (ok) instead of the expected
value: 7 (not running)
messages:Dec 26 02:45:21 router1 pengine: [4724]: WARN: unpack_rsc_op:
Operation failover-gw_monitor_0 found resource failover-gw active on router1
messages:Dec 26 02:45:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router2 returned 5 (not installed) instead of
the expected value: 7 (not running)
messages:Dec 26 02:45:21 router1 pengine: [4724]: ERROR: unpack_rsc_op:
Hard error - failover-gw_monitor_0 failed with rc=5: Preventing
failover-gw from re-starting on router2
messages:Dec 26 02:45:21 router1 pengine: [4724]: notice:
native_print: failover-gw#011(ocf::heartbeat:Route):#011Started router1
messages:Dec 26 02:45:21 router1 pengine: [4724]: notice: LogActions:
Leave resource failover-gw#011(Started router1)
messages:Dec 26 03:00:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router1 returned 0 (ok) instead of the expected
value: 7 (not running)
messages:Dec 26 03:00:21 router1 pengine: [4724]: WARN: unpack_rsc_op:
Operation failover-gw_monitor_0 found resource failover-gw active on router1
messages:Dec 26 03:00:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router2 returned 5 (not installed) instead of
the expected value: 7 (not running)
messages:Dec 26 03:00:21 router1 pengine: [4724]: ERROR: unpack_rsc_op:
Hard error - failover-gw_monitor_0 failed with rc=5: Preventing
failover-gw from re-starting on router2
messages:Dec 26 03:00:21 router1 pengine: [4724]: notice:
native_print: failover-gw#011(ocf::heartbeat:Route):#011Started router1
messages:Dec 26 03:00:21 router1 pengine: [4724]: notice: LogActions:
Leave resource failover-gw#011(Started router1)
messages:Dec 26 03:15:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router1 returned 0 (ok) instead of the expected
value: 7 (not running)
messages:Dec 26 03:15:21 router1 pengine: [4724]: WARN: unpack_rsc_op:
Operation failover-gw_monitor_0 found resource failover-gw active on router1
messages:Dec 26 03:15:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router2 returned 5 (not installed) instead of
the expected value: 7 (not running)
messages:Dec 26 03:15:21 router1 pengine: [4724]: ERROR: unpack_rsc_op:
Hard error - failover-gw_monitor_0 failed with rc=5: Preventing
failover-gw from re-starting on router2
messages:Dec 26 03:15:21 router1 pengine: [4724]: notice:
native_print: failover-gw#011(ocf::heartbeat:Route):#011Started router1
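
As a reading aid for the excerpt above, here is a minimal sketch that tallies the probe results per node and translates the numeric return codes into the standard OCF names. The function name summarize_probes, the regular expression, and the two-entry sample are illustrative assumptions; they are not part of pacemaker or of any tool used in this thread.

```python
import re
from collections import Counter

# Standard OCF resource agent exit codes (0-7).
OCF_RC = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Hypothetical pattern: matches the pengine "unpack_rsc_op" info entries
# ("<operation> on <node> returned <rc>") once wrapped lines are re-joined.
PROBE_RE = re.compile(
    r"unpack_rsc_op:\s+(?P<op>\S+)\s+on\s+(?P<node>\S+)\s+returned\s+(?P<rc>\d+)"
)


def summarize_probes(log_text):
    """Count (operation, node, return-code name) triples found in log_text."""
    joined = " ".join(log_text.split())  # undo mail-client line wrapping
    counts = Counter()
    for match in PROBE_RE.finditer(joined):
        rc = int(match.group("rc"))
        name = OCF_RC.get(rc, f"rc={rc}")
        counts[(match.group("op"), match.group("node"), name)] += 1
    return counts


# Two entries copied from the excerpt above, wrapping included.
sample = """messages:Dec 26 02:15:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router1 returned 0 (ok) instead of the expected
value: 7 (not running)
messages:Dec 26 02:15:21 router1 pengine: [4724]: info: unpack_rsc_op:
failover-gw_monitor_0 on router2 returned 5 (not installed) instead of
the expected value: 7 (not running)"""

for key, count in sorted(summarize_probes(sample).items()):
    print(key, count)
```

Run over the full excerpt, this should show the same pattern at every 15-minute interval: the probe on router2 returns 5 (OCF_ERR_INSTALLED), which the log itself reports as a hard error that prevents failover-gw from starting on router2.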