[Pacemaker] About Quorum control at the time of the service stop.(no-quorum-policy=freeze)

renayama19661014 at ybb.ne.jp renayama19661014 at ybb.ne.jp
Fri Sep 10 01:22:12 EDT 2010


Hi,

We tested the behavior of no-quorum-policy=freeze in a four-node configuration.

Of course, we understand that quorum control does not work well with Heartbeat.

We tested stopping the service on the four nodes, one node at a time, with the following procedure.

Step1) We start four nodes (3 active, 1 standby).

Step2) We load cib.xml into the cluster.

============
Last updated: Fri Sep 10 14:16:30 2010
Stack: Heartbeat
Current DC: srv04 (96faf899-13a6-4550-9d3b-b784f7241d06) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
4 Nodes configured, unknown expected votes
7 Resources configured.
============

Online: [ srv01 srv02 srv03 srv04 ]

 Resource Group: Group01
     Dummy01    (ocf::heartbeat:Dummy): Started srv01
     Dummy01-2  (ocf::heartbeat:Dummy): Started srv01
 Resource Group: Group02
     Dummy02    (ocf::heartbeat:Dummy): Started srv02
     Dummy02-2  (ocf::heartbeat:Dummy): Started srv02
 Resource Group: Group03
     Dummy03    (ocf::heartbeat:Dummy): Started srv03
     Dummy03-2  (ocf::heartbeat:Dummy): Started srv03
 Resource Group: grpStonith1
     prmStonith1-3      (stonith:external/ssh): Started srv01
 Resource Group: grpStonith2
     prmStonith2-3      (stonith:external/ssh): Started srv02
 Resource Group: grpStonith3
     prmStonith3-3      (stonith:external/ssh): Started srv03
 Resource Group: grpStonith4
     prmStonith4-3      (stonith:external/ssh): Started srv04

Step3) After the cluster is stable, we stop the first node.

[root@srv02 ~]# crm_mon -1
============
Last updated: Fri Sep 10 14:17:07 2010
Stack: Heartbeat
Current DC: srv04 (96faf899-13a6-4550-9d3b-b784f7241d06) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
4 Nodes configured, unknown expected votes
7 Resources configured.
============

Online: [ srv02 srv03 srv04 ]
OFFLINE: [ srv01 ]

 Resource Group: Group01
     Dummy01    (ocf::heartbeat:Dummy): Started srv04 ---->FO
     Dummy01-2  (ocf::heartbeat:Dummy): Started srv04 ---->FO
 Resource Group: Group02
     Dummy02    (ocf::heartbeat:Dummy): Started srv02
     Dummy02-2  (ocf::heartbeat:Dummy): Started srv02
 Resource Group: Group03
     Dummy03    (ocf::heartbeat:Dummy): Started srv03
     Dummy03-2  (ocf::heartbeat:Dummy): Started srv03
 Resource Group: grpStonith1
     prmStonith1-3      (stonith:external/ssh): Started srv03
 Resource Group: grpStonith2
     prmStonith2-3      (stonith:external/ssh): Started srv02
 Resource Group: grpStonith3
     prmStonith3-3      (stonith:external/ssh): Started srv03
 Resource Group: grpStonith4
     prmStonith4-3      (stonith:external/ssh): Started srv04


Step4) After the cluster is stable again, we stop the next node.
 * Because the ccm notification that quorum has been lost arrives late, the two remaining nodes
still move the resources.

[root@srv03 ~]# crm_mon -1
============
Last updated: Fri Sep 10 14:17:59 2010
Stack: Heartbeat
Current DC: srv04 (96faf899-13a6-4550-9d3b-b784f7241d06) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
4 Nodes configured, unknown expected votes
7 Resources configured.
============

Online: [ srv03 srv04 ]
OFFLINE: [ srv01 srv02 ]

 Resource Group: Group01
     Dummy01    (ocf::heartbeat:Dummy): Started srv04
     Dummy01-2  (ocf::heartbeat:Dummy): Started srv04
 Resource Group: Group02
     Dummy02    (ocf::heartbeat:Dummy): Started srv04 ---->FO
     Dummy02-2  (ocf::heartbeat:Dummy): Started srv04 ---->FO
 Resource Group: Group03
     Dummy03    (ocf::heartbeat:Dummy): Started srv03
     Dummy03-2  (ocf::heartbeat:Dummy): Started srv03
 Resource Group: grpStonith1
     prmStonith1-3      (stonith:external/ssh): Started srv03
 Resource Group: grpStonith2
     prmStonith2-3      (stonith:external/ssh): Started srv04
 Resource Group: grpStonith3
     prmStonith3-3      (stonith:external/ssh): Started srv03
 Resource Group: grpStonith4
     prmStonith4-3      (stonith:external/ssh): Started srv04

Step5) After the cluster is stable again, we stop one more node.
 * We stopped it only after have-quorum had become 0 in the cib.

[root@srv03 ~]# cibadmin -Q | more
<cib epoch="102" num_updates="3" admin_epoch="0" validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="0" dc-uuid="96faf899-13a6-4550-9d3b-b784f7241d06">

Step6) Some resources moved to the last node.

[root@srv04 ~]# crm_mon -1
============
Last updated: Fri Sep 10 14:19:43 2010
Stack: Heartbeat
Current DC: srv04 (96faf899-13a6-4550-9d3b-b784f7241d06) - partition WITHOUT quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
4 Nodes configured, unknown expected votes
7 Resources configured.
============

Online: [ srv04 ]
OFFLINE: [ srv01 srv02 srv03 ]

 Resource Group: Group01
     Dummy01    (ocf::heartbeat:Dummy): Started srv04
     Dummy01-2  (ocf::heartbeat:Dummy): Started srv04
 Resource Group: Group02
     Dummy02    (ocf::heartbeat:Dummy): Started srv04
     Dummy02-2  (ocf::heartbeat:Dummy): Started srv04
 Resource Group: Group03
     Dummy03    (ocf::heartbeat:Dummy): Started srv04 ---->Why FO?
     Dummy03-2  (ocf::heartbeat:Dummy): Started srv04 ---->Why FO?
 Resource Group: grpStonith1
     prmStonith1-3      (stonith:external/ssh): Started srv04
 Resource Group: grpStonith2
     prmStonith2-3      (stonith:external/ssh): Started srv04
 Resource Group: grpStonith4
     prmStonith4-3      (stonith:external/ssh): Started srv04
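
For reference, the point at which we expected quorum to be lost can be sketched as below. This
assumes a simple majority rule (strictly more than half of the configured nodes must be online).
That rule and the function name has_majority_quorum are our assumptions for illustration only and
are not taken from the Heartbeat ccm code, which may decide differently. Under this assumption two
of four nodes are already without quorum, which is consistent with the have-quorum="0" seen in Step5.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: simple-majority rule, NOT Heartbeat's actual ccm logic. */
bool has_majority_quorum(int online_nodes, int configured_nodes)
{
    assert(online_nodes >= 0 && configured_nodes > 0);

    /* Quorum requires strictly more than half of the configured nodes. */
    return online_nodes * 2 > configured_nodes;
}
```

With four configured nodes this gives quorum for 4 and 3 online nodes, and no quorum for 2 or 1.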


We expected that the resources that were running on the node stopped in Step5 would not move to
the last node, because no-quorum-policy=freeze is specified.

However, judging from the source code, a resource that is already started appears to be allowed to
move even when no-quorum-policy=freeze is in effect.

(snip)
action_t *
custom_action(resource_t *rsc, char *key, const char *task,
	      node_t *on_node, gboolean optional, gboolean save_action,
	      pe_working_set_t *data_set)
{
	action_t *action = NULL;
	GListPtr possible_matches = NULL;
	CRM_CHECK(key != NULL, return NULL);
	CRM_CHECK(task != NULL, return NULL);
(snip)
		} else if(is_set(data_set->flags, pe_flag_have_quorum) == FALSE
			&& data_set->no_quorum_policy == no_quorum_freeze) {
			crm_debug_3("Check resource is already active");
			if(rsc->fns->active(rsc, TRUE) == FALSE) {
				action->runnable = FALSE;
				crm_debug("%s\t%s (cancelled : quorum freeze)",
					  action->node->details->uname,
					  action->uuid);
			}

		} else {

(snip)

Is this the intended, specified behavior of no-quorum-policy=freeze?
Is there a detailed explanation of the no-quorum-policy=freeze behavior somewhere?

Best Regards,
Hideo Yamauchi.




