[Pacemaker] promote is delayed more than 60 sec after stonith reset

hj lee kerdosa at gmail.com
Tue Oct 13 20:48:51 EDT 2009


Hi,

I configured a two-node cluster on RHEL 5.3 with the following resources.
Note that I am using pacemaker-1.0.6.
- IPMI stonith as a clone. Each IPMI clone is monitoring the other node.
- One Master/Slave resource: Master is running on node1, Slave is running on
node2.
- One FakeIPMI resource.

When I manually trigger a failure in the monitor and stop operations of
FakeIPMI on node1, the IPMI stonith running on node2 correctly detects
node1's state as unclean and tries to demote the Master resource on node1
and reset node1. The problem I am seeing is that the promotion happens
60 sec after stonith has reset node1 successfully.

I want the Slave to be promoted immediately after the stonith reset
returns successfully! From the log, the promotion is triggered by the demote
operation timing out. Obviously the Master node is rebooting, so the demote
will time out. I think the demote operation should be cancelled when
stonith resets the node, and the promotion should happen immediately
afterwards.
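
As a rough illustration of the timing described above, here is a toy model
in Python. It is not Pacemaker code. The function names, the 5 sec fence
time, and the 60 sec demote timeout are assumptions made only for this
sketch (the timeout is inferred from the delay I observe).

```python
# Toy timing model of the behaviour described in this post.
# Not Pacemaker code: every name and number here is illustrative.

DEMOTE_TIMEOUT = 60.0  # seconds; assumed from the ~60 sec delay observed


def promote_time_waiting_for_demote(fence_done, demote_started,
                                    demote_timeout=DEMOTE_TIMEOUT):
    """Observed behaviour: the demote sent to the fenced node never
    returns, so promotion starts only once the demote has timed out."""
    demote_expires = demote_started + demote_timeout
    return max(fence_done, demote_expires)


def promote_time_cancel_on_fence(fence_done):
    """Desired behaviour: a successful fence cancels the pending demote,
    so promotion can start as soon as the reset is confirmed."""
    return fence_done


if __name__ == "__main__":
    demote_started = 0.0
    fence_done = 5.0  # hypothetical: reset confirmed 5 sec after the demote
    observed = promote_time_waiting_for_demote(fence_done, demote_started)
    desired = promote_time_cancel_on_fence(fence_done)
    print(f"observed promote at t={observed:.0f}s, "
          f"desired at t={desired:.0f}s")
```

In this model the gap between the two results is exactly the time the
cluster spends waiting for a demote that can never complete.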

Thanks