[Pacemaker] DRBD promotion timeout after pacemaker stop on other node

Vladislav Bogdanov bubble at hoster-ok.com
Mon Nov 11 10:46:13 EST 2013


11.11.2013 09:00, Vladislav Bogdanov wrote:
...
>>>> Looking at crm-fence-peer.sh script, it would determine peer state as
>>>> offline immediately if node state (all of)
>>>> * doesn't contain "expected" tag or has it set to "down"
>>>> * has "in_ccm" tag set to false
>>>> * has "crmd" tag set to anything except "online"
>>>>
>>>> On the other hand, crmd sets "expected" = "down" only after fencing is
>>>> complete (probably the same for "in_ccm"?). Shouldn't it do the same (or
>>>> maybe just remove that tag) when a clean shutdown is about to complete?
>>>
>>> That would make sense.  Are you using the plugin, cman or corosync 2?
> 

The patch below works in every test I could think of, but I'm not sure it is
completely safe to set expected="down" for the old DC (the test case where drbd
is promoted on the DC and the DC reboots).

From ddfccc8a40cfece5c29d61f44a4467954d5c5da8 Mon Sep 17 00:00:00 2001
From: Vladislav Bogdanov <bubble at hoster-ok.com>
Date: Mon, 11 Nov 2013 14:32:48 +0000
Subject: [PATCH] Update node values in cib on clean shutdown

---
 crmd/callbacks.c  |    6 +++++-
 crmd/membership.c |    2 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/crmd/callbacks.c b/crmd/callbacks.c
index 3dae17b..9cfb973 100644
--- a/crmd/callbacks.c
+++ b/crmd/callbacks.c
@@ -162,6 +162,8 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
             } else if (safe_str_eq(node->uname, fsa_our_dc) && crm_is_peer_active(node) == FALSE) {
                 /* Did the DC leave us? */
                 crm_notice("Our peer on the DC (%s) is dead", fsa_our_dc);
+                /* FIXME: is it safe? */
+                crm_update_peer_expected(__FUNCTION__, node, CRMD_JOINSTATE_DOWN);
                 register_fsa_input(C_CRMD_STATUS_CALLBACK, I_ELECTION, NULL);
             }
             break;
@@ -169,6 +171,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
 
     if (AM_I_DC) {
         xmlNode *update = NULL;
+        int flags = node_update_peer;
         gboolean alive = crm_is_peer_active(node);
         crm_action_t *down = match_down_event(0, node->uuid, NULL, appeared);
 
@@ -199,6 +202,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
 
                 crm_update_peer_join(__FUNCTION__, node, crm_join_none);
                 crm_update_peer_expected(__FUNCTION__, node, CRMD_JOINSTATE_DOWN);
+                flags |= node_update_cluster | node_update_join | node_update_expected;
                 check_join_state(fsa_state, __FUNCTION__);
 
                 update_graph(transition_graph, down);
@@ -221,7 +225,7 @@ peer_update_callback(enum crm_status_type type, crm_node_t * node, const void *d
             crm_trace("Other %p", down);
         }
 
-        update = do_update_node_cib(node, node_update_peer, NULL, __FUNCTION__);
+        update = do_update_node_cib(node, flags, NULL, __FUNCTION__);
         fsa_cib_anon_update(XML_CIB_TAG_STATUS, update,
                             cib_scope_local | cib_quorum_override | cib_can_create);
         free_xml(update);
diff --git a/crmd/membership.c b/crmd/membership.c
index be1863a..d68b3aa 100644
--- a/crmd/membership.c
+++ b/crmd/membership.c
@@ -152,7 +152,7 @@ do_update_node_cib(crm_node_t * node, int flags, xmlNode * parent, const char *s
     crm_xml_add(node_state, XML_ATTR_UNAME, node->uname);
 
     if (flags & node_update_cluster) {
-        if (safe_str_eq(node->state, CRM_NODE_ACTIVE)) {
+        if (crm_is_peer_active(node)) {
             value = XML_BOOLEAN_YES;
         } else if (node->state) {
             value = XML_BOOLEAN_NO;
-- 
1.7.1
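
For reference, the crm-fence-peer.sh check described in the quoted text at the
top of this message can be sketched in C roughly as follows. This is only an
illustration of the three conditions as stated there, not the script's actual
code. The struct node_state_attrs type and the peer_looks_offline() and
attr_is() helpers are hypothetical names; attributes are modelled as plain
strings with NULL meaning "absent", and a missing "crmd" attribute is assumed
to count as "not online".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one node_state entry in the CIB status
 * section. NULL means the attribute is not present. */
struct node_state_attrs {
    const char *expected;
    const char *in_ccm;
    const char *crmd;
};

/* True only if the attribute is present and equal to the wanted value. */
static bool attr_is(const char *value, const char *want)
{
    return value != NULL && strcmp(value, want) == 0;
}

/* All three conditions must hold for the peer to be treated as offline
 * immediately. */
bool peer_looks_offline(const struct node_state_attrs *ns)
{
    /* "expected" is absent or set to "down" */
    bool expected_down = (ns->expected == NULL) || attr_is(ns->expected, "down");
    /* "in_ccm" is set to "false" */
    bool not_in_ccm = attr_is(ns->in_ccm, "false");
    /* "crmd" is anything except "online" (assumption: absent counts too) */
    bool crmd_not_online = !attr_is(ns->crmd, "online");

    return expected_down && not_in_ccm && crmd_not_online;
}
```

Under this sketch, a peer whose "expected" attribute still holds some other
value (for example "member") is not treated as offline even when "in_ccm" is
"false" and "crmd" is "offline", which would be consistent with the behaviour
discussed in this thread.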





