[Pacemaker] migration-threshold question
Juha Heinanen
jh at tutpro.com
Sat Mar 21 15:04:17 UTC 2009
i have a resource that used to have this crm definition:
primitive test lsb:test \
    op monitor interval="30s" timeout="5s" \
    meta target-role="Started"
if i stopped the resource with
    /etc/init.d/test stop
pacemaker restarted it, as i expected.
then i modified the "test" init script so that starting the resource
always failed. the result was that pacemaker kept trying to restart it
forever, without migrating the group of primitives, of which "test" is
the last member, to the other node.
i searched the archives and found this about the migration-threshold
parameter:
> If you used pacemaker 1.0 you would not have to deal with
> failure-stickiness anymore, but could use the very nice new
> "migration-threshold" feature. Set this to 1 and after 1 failure, the
> resource will failover, regardless of its score.
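
my reading of that description, as a toy model in python. this is only a
sketch of the counting idea and not pacemaker code: the names
FailureTracker, record_failure and allowed_nodes are made up for
illustration, and scores, stickiness and constraints are ignored.

```python
# toy model of the migration-threshold idea; NOT pacemaker's implementation.
# FailureTracker, record_failure and allowed_nodes are hypothetical names.
from collections import defaultdict


class FailureTracker:
    """Counts failures per node and excludes a node once it hits the threshold."""

    def __init__(self, nodes, migration_threshold):
        self.nodes = list(nodes)
        self.migration_threshold = migration_threshold
        self.fail_count = defaultdict(int)

    def record_failure(self, node):
        # one more failure of the resource on this node
        self.fail_count[node] += 1

    def allowed_nodes(self):
        # a node stays eligible while its fail count is below the threshold
        return [n for n in self.nodes
                if self.fail_count[n] < self.migration_threshold]


tracker = FailureTracker(["lenny1", "lenny2"], migration_threshold=3)
for _ in range(3):
    tracker.record_failure("lenny1")
print(tracker.allowed_nodes())  # only lenny2 is left after 3 failures on lenny1
```

in this model a threshold of 1 excludes the node after the first failure,
which is how i understand the quoted advice.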
so i set migration-threshold to 3, hoping that after three failed
attempts to restart the resource the group would migrate to the
other node:
primitive test lsb:test \
    op monitor interval="30s" timeout="5s" \
    meta target-role="Started" migration-threshold="3"
the result, however, was that after 3 restart attempts the resource
stayed "Stopped" on the node where it failed:
============
Last updated: Sat Mar 21 19:02:07 2009
Current DC: lenny2 (f13aff7b-6c94-43ac-9a24-b118e62d5325)
Version: 1.0.2-ec6b0bbee1f3aa72c4c2559997e675db6ab39160
2 Nodes configured.
2 Resources configured.
============
Node: lenny1 (8df8447f-6ecf-41a7-a131-c89fd59a120d): online
Node: lenny2 (f13aff7b-6c94-43ac-9a24-b118e62d5325): online
Master/Slave Set: ms-drbd0
    drbd0:0 (ocf::heartbeat:drbd): Master lenny1
    drbd0:1 (ocf::heartbeat:drbd): Slave lenny2
Resource Group: sip-proxy-group
    fs0 (ocf::heartbeat:Filesystem): Started lenny1
    mysql-server (lsb:mysql): Started lenny1
    radius-server (lsb:freeradius): Started lenny1
    virtual-ip (ocf::heartbeat:IPaddr2): Started lenny1
    test (lsb:test): Stopped
Failed actions:
    test_monitor_30000 (node=lenny1, call=30, rc=7, status=complete): not running
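
to make the failed-action line easier to read, here is a small python
sketch that splits it into fields. the line itself is copied from the
output above; the regular expression is only my guess at its layout, not
a documented format, and parse_failed_action is a made-up helper name.

```python
import re

# line copied from the crm_mon output above
LINE = "test_monitor_30000 (node=lenny1, call=30, rc=7, status=complete): not running"

# assumed layout: "<op> (node=<n>, call=<c>, rc=<r>, status=<s>): <reason>"
PATTERN = re.compile(
    r"^\s*(?P<op>\S+) \(node=(?P<node>[^,]+), call=(?P<call>\d+), "
    r"rc=(?P<rc>\d+), status=(?P<status>[^)]+)\): (?P<reason>.*)$"
)


def parse_failed_action(line):
    """Return the fields of one failed-action line as a dict, or None if it does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["call"] = int(fields["call"])
    fields["rc"] = int(fields["rc"])
    return fields


failed = parse_failed_action(LINE)
print(failed["op"], failed["node"], failed["rc"], failed["reason"])
```

so the recorded failure is the 30s monitor operation on lenny1, which
returned rc=7 ("not running").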
the question: what am i missing here, i.e., what should i add to the crm
config in order to get the group migrated to the other node if
restarting "test" fails 3 times?
-- juha