[Pacemaker] designing a load balancer - request for comments
Klaus Darilion
klaus.mailinglists at pernau.at
Mon Feb 14 14:46:24 UTC 2011
On 14.02.2011 14:45, Raoul Bhatia [IPAX] wrote:
> On 02/14/2011 02:37 PM, Klaus Darilion wrote:
>> Somehow pacemaker does not react as I would expect it. My config is:
>>
>> primitive failover-ip ocf:heartbeat:IPaddr \
>> params ip="83.136.32.161" \
>> op monitor interval="3s"
>> primitive kamailio lsb:kamailio \
>> meta migration-threshold="2" failure-timeout="60" \
>> op monitor interval="15" timeout="15"
>> clone cloneKamailio kamailio
>> colocation colo_ip_with_kamailio inf: failover-ip cloneKamailio
>> property $id="cib-bootstrap-options" \
>> dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
>> cluster-infrastructure="openais" \
>> expected-quorum-votes="2" \
>> stonith-enabled="false" \
>> no-quorum-policy="ignore"
>> rsc_defaults $id="rsc-options" \
>> resource-stickiness="5"
> ...
>> So, what am I doing wrong? I would expect the failure-count to be
>> reset after 60s.
>
> there is no "cluster-recheck-interval" in your properties:
>
> property $id="cib-bootstrap-options" \
> dc-version="1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b" \
> stonith-enabled="true" \
> cluster-infrastructure="openais" \
> ...
> cluster-recheck-interval="1min"
>
> try to set this and redo your testing.
Ah, interesting :-)
But still not as expected. On cluster recheck, pacemaker detects the
failure timeout:
notice: get_failcount: Failcount for cloneKamailio on armani has expired (limit was 60s)
notice: RecurringOp: Start recurring monitor (15s) for kamailio:0 on armani
So, Kamailio gets restarted after the failure-timeout, but the
failure-count is still not reset.
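To see at which cluster recheck the expiry is actually noticed, the expiry notices can be pulled out of the log. The following is only a minimal sketch: it assumes the message format is exactly that of the two log lines quoted above, and the function name expired_failcounts is made up for illustration.

```python
import re

# Pattern modelled on the "has expired" notice quoted above (an assumption:
# other Pacemaker versions may word this message differently).
EXPIRED = re.compile(
    r"Failcount for (?P<resource>\S+) on (?P<node>\S+) has expired "
    r"\(limit was (?P<limit>\d+)s\)"
)


def expired_failcounts(lines):
    """Yield (resource, node, limit_seconds) for each expiry notice found."""
    for line in lines:
        match = EXPIRED.search(line)
        if match:
            yield match["resource"], match["node"], int(match["limit"])


sample = [
    "notice: get_failcount: Failcount for cloneKamailio on armani has expired (limit was 60s)",
    "notice: RecurringOp: Start recurring monitor (15s) for kamailio:0 on armani",
]
found = list(expired_failcounts(sample))
print(found)
```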
virtual-IP on server1, Kamailio on server1 and server2
  server1 failure count: 0
  server2 failure count: 0

then I stop Kamailio on server1 --> pacemaker restarts Kamailio
virtual-IP on server1, Kamailio on server1 and server2
  server1 failure count: 1
  server2 failure count: 0

then I stop Kamailio on server1 --> pacemaker migrates the IP
virtual-IP on server2, Kamailio on server2
  server1 failure count: 2
  server2 failure count: 0

After failure-timeout, Kamailio gets restarted:
virtual-IP on server2, Kamailio on server1 and server2
  server1 failure count: 2
  server2 failure count: 0

Then server2 is set to standby --> IP is migrated to server1
virtual-IP on server1, Kamailio on server1
  server1 failure count: 2
  server2 failure count: 0

Then server2 is set online again:
virtual-IP on server1, Kamailio on server1 and server2
  server1 failure count: 2
  server2 failure count: 0

then I stop Kamailio on server1 --> pacemaker migrates the IP
virtual-IP on server2, Kamailio on server2
  server1 failure count: 3
  server2 failure count: 0
After the failure-timeout I would have expected everything to start from
the beginning: the failure-count would be reset to 0, and it would again
take 2 failures (the migration-threshold) to trigger a migration.
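To make the expectation concrete, here is a toy model of the semantics I had in mind, using migration-threshold=2 and failure-timeout=60 from my config. It is only a sketch of the expected behaviour, not of how pacemaker implements it: the class FailureTracker and its methods are invented for illustration, and the model ignores that expiry is only evaluated on a cluster recheck.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureTracker:
    """Toy model of the expected failure-count semantics for one node."""
    migration_threshold: int = 2     # failures before the node is avoided
    failure_timeout: float = 60.0    # seconds after which failures are forgotten
    failcount: int = 0
    last_failure: Optional[float] = None

    def expire(self, now: float) -> None:
        # Forget all failures once failure_timeout has passed since the last one.
        if (self.last_failure is not None
                and now - self.last_failure >= self.failure_timeout):
            self.failcount = 0
            self.last_failure = None

    def record_failure(self, now: float) -> None:
        self.expire(now)
        self.failcount += 1
        self.last_failure = now

    def banned(self, now: float) -> bool:
        # True while the node has reached the threshold and nothing has expired.
        self.expire(now)
        return self.failcount >= self.migration_threshold


server1 = FailureTracker()

server1.record_failure(now=0)            # first stop: restart in place
after_first = (server1.failcount, server1.banned(now=1))

server1.record_failure(now=10)           # second stop: threshold reached
after_second = (server1.failcount, server1.banned(now=11))

banned_after_timeout = server1.banned(now=75)   # 65s after the last failure
count_after_timeout = server1.failcount

server1.record_failure(now=80)           # should count as failure 1 again
after_restart = (server1.failcount, server1.banned(now=81))
```

In this model the stop at t=80 counts as the first failure again, so Kamailio would be restarted in place and the IP would stay put. What I observe instead is that the count goes from 2 to 3 and the IP migrates immediately.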
regards
Klaus