[Pacemaker] Several problems with simple virtual-IP config (solved)
Andrew Beekhof
andrew at beekhof.net
Wed Feb 9 12:43:26 UTC 2011
On Wed, Feb 9, 2011 at 1:36 PM, Klaus Darilion
<klaus.mailinglists at pernau.at> wrote:
> Hi!
>
> I managed to solve the problem by using the 'ping' OCF resource instead
> of 'pingd'. Although pingd is deprecated, I thought it should work.
It should, but it kinda doesn't, and it's not clear why or how we can fix it.
Which is why we ditched it :-)
>
> Anyway, for the record, here is my config, which seems to work now (some
> tweaking of the ping checks is still missing):
>
>
> node server1 \
>     attributes standby="off"
> node server2 \
>     attributes standby="off"
> primitive failover-ip ocf:heartbeat:IPaddr \
>     params ip="11.222.32.161" \
>     op monitor interval="3s"
> primitive pingtest ocf:pacemaker:ping \
>     params host_list="11.222.53.113" multiplier="10" dampen="5s" \
>     op monitor interval="10s"
> clone clonePing pingtest
> location aktiverLB failover-ip \
>     rule $id="aktiverLB-rule" -inf: not_defined pingd or pingd lte 0
> property $id="cib-bootstrap-options" \
>     dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
>     cluster-infrastructure="openais" \
>     expected-quorum-votes="2" \
>     stonith-enabled="false" \
>     no-quorum-policy="ignore"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="5"
>
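For the archives, a rough Python sketch of how the location rule in this
config reads: a node is excluded for failover-ip when the pingd attribute is
missing or not greater than 0. This is only a model of the rule's logic, not
Pacemaker code; the function name, the dict layout, and the use of a float
infinity as the score are assumptions made for illustration.

```python
import math

NEG_INF = -math.inf  # stand-in for the "-inf" score in the rule (assumption)


def location_score(node_attrs, attr="pingd"):
    """Model of: rule -inf: not_defined pingd or pingd lte 0"""
    value = node_attrs.get(attr)
    # not_defined pingd  -> attribute missing on this node
    # pingd lte 0        -> attribute present but no ping host reachable
    if value is None or int(value) <= 0:
        return NEG_INF
    return 0  # rule does not match, so it adds no preference


# Hypothetical attribute values: server1 lost ping connectivity, server2 has it.
nodes = {
    "server1": {"pingd": "0"},
    "server2": {"pingd": "10"},
}

eligible = [name for name, attrs in nodes.items()
            if location_score(attrs) != NEG_INF]
print(eligible)  # ['server2']
```

With this rule only nodes that currently see at least one ping host remain
candidates, which is why the resource should move once server1's value
drops to 0.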
>
> Thanks
> Klaus
>
>
> On 09.02.2011 12:49, Klaus Darilion wrote:
>>
>>
>> On 08.02.2011 18:20, Florian Haas wrote:
>>> On 02/08/2011 06:03 PM, Klaus Darilion wrote:
>>
>>>> Now I put server2 online again: # crm node online server2.
>>>> That means, server2 is online and has ping connectivity, server1 is
>>>> online and doesn't have ping connectivity. But the virtual-IP stayed
>>>> with server1:
>>>>
>>>> Online: [ server1 server2 ]
>>>>
>>>> failover-ip (ocf::heartbeat:IPaddr): Started server1
>>>> Clone Set: clonePing
>>>> Started: [ server2 server1 ]
>>>>
>>>> What do I have to change in the config to get a failover to server2 here?
>>>
>>> IIUC your pingd attribute is still defined, albeit with a value of 0.
>>> Thus the location constraint is still fulfilled and the pingd score is 0
>>> on both nodes. Equal score means Pacemaker gets to decide where the
>>> resource runs, and currently there is no reason to migrate.
>>>
>>> Try this:
>>>
>>> location aktiverLB failover-ip \
>>>     rule $id="aktiverLB-rule" -inf: pingd lte 0
>>>
>>> or:
>>>
>>> location aktiverLB failover-ip \
>>>     rule $id="aktiverLB-rule" inf: pingd gt 0
>>>
>>
>> Hi Florian!
>>
>> I changed the config now as suggested by
>> http://www.clusterlabs.org/wiki/Example_configurations#pingd_location_constraint:
>>
>>
>> node server1 \
>>     attributes standby="off"
>> node server2 \
>>     attributes standby="off"
>> primitive failover-ip ocf:heartbeat:IPaddr \
>>     params ip="83.136.32.161" \
>>     op monitor interval="3s"
>> primitive pingtest ocf:pacemaker:pingd \
>>     params host_list="88.198.53.113" multiplier="10" dampen="5s" \
>>     op monitor interval="10s"
>> clone clonePing pingtest
>> location aktiverLB failover-ip \
>>     rule $id="aktiverLB-rule" -inf: not_defined pingd or pingd lte 0
>> property $id="cib-bootstrap-options" \
>>     dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
>>     cluster-infrastructure="openais" \
>>     expected-quorum-votes="2" \
>>     stonith-enabled="false" \
>>     no-quorum-policy="ignore"
>>
>> But the behavior is IMO still confusing:
>>
>> server1 is started:
>>
>> Online: [ server1 ]
>> OFFLINE: [ server2 ]
>>
>> Clone Set: clonePing
>> Started: [ server1 ]
>> Stopped: [ pingtest:1 ]
>>
>> Migration summary:
>> * Node server1:
>>
>>
>> Why are there no scores? I guess this prevents the virtual-IP resource
>> from being started on server1.
>>
>>
>> Now I block PING. After 5 seconds the score is 10 (although ping is
>> blocked) and the resource gets started:
>>
>> Online: [ server1 ]
>> OFFLINE: [ server2 ]
>>
>> failover-ip (ocf::heartbeat:IPaddr): Started server1
>> Clone Set: clonePing
>> Started: [ server1 ]
>> Stopped: [ pingtest:1 ]
>>
>> Migration summary:
>> * Node server1: pingd=10
>>
>>
>> After 5 more seconds the score becomes 0 and the resource will be removed:
>>
>> Online: [ server1 ]
>> OFFLINE: [ server2 ]
>>
>> Clone Set: clonePing
>> Started: [ server1 ]
>> Stopped: [ pingtest:1 ]
>>
>> Migration summary:
>> * Node server1: pingd=0
>>
>>
>> Now I allow PING. Even after 10 seconds the score is still 0 although
>> PINGs are working fine again:
>>
>> Online: [ server1 ]
>> OFFLINE: [ server2 ]
>>
>> Clone Set: clonePing
>> Started: [ server1 ]
>> Stopped: [ pingtest:1 ]
>>
>> Migration summary:
>> * Node server1: pingd=0
>>
>>
>> I think the problem is that the pingd score is not calculated on
>> startup and is not updated when connectivity recovers.
>>
>> Any ideas what could be wrong?
>>
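For comparison, a small Python sketch of the value one would expect in the
pingd attribute from the parameters above: the number of reachable hosts in
host_list times multiplier. The function name and the "reachable" set are
made up for illustration; this is not code from the resource agent, and it
ignores dampen, which as far as I understand only delays when a change is
written.

```python
def expected_ping_attribute(host_list, reachable, multiplier):
    """Expected attribute value: reachable hosts times multiplier."""
    hosts = host_list.split()  # host_list is a space-separated string
    active = sum(1 for host in hosts if host in reachable)
    return active * multiplier


host_list = "88.198.53.113"  # single ping target, as in the quoted config

# Ping allowed: one host reachable, multiplier 10
print(expected_ping_attribute(host_list, {"88.198.53.113"}, 10))  # 10

# Ping blocked: no host reachable
print(expected_ping_attribute(host_list, set(), 10))  # 0
```

So with one ping host the attribute should only ever be 10 or 0, and it
should return to 10 once pings work again, which is the behaviour the
'ping' agent gave in the end.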
>> Thanks
>> Klaus
>>
>>
>>
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>