[Pacemaker] trigger STONITH for testing purposes
Yan Gao
ygao at novell.com
Wed Jun 3 02:14:06 EDT 2009
On Fri, 2009-05-22 at 12:33 +0200, Andrew Beekhof wrote:
> On Wed, May 20, 2009 at 6:39 PM, Bob Haxo <bhaxo at sgi.com> wrote:
> > Hi Andrew,
> >
> > I'd say you removed no-quorum-policy=ignore
> >
> > Actually, the pair of no_quorum_policy and no-quorum-policy are set to
> > "ignore", and expected-quorum-votes is set to "2":
> >
> > <crm_config>
> > <cluster_property_set id="cib-bootstrap-options">
> > ...
> > <nvpair id="cib-bootstrap-options-expected-quorum-votes"
> > name="expected-quorum-votes" value="2"/>
> > <nvpair id="cib-bootstrap-options-no_quorum_policy"
> > name="no_quorum_policy" value="ignore"/>
> > <nvpair id="nvpair-1d2c923d-7619-4b45-989a-698357f9f8cb"
> > name="no-quorum-policy" value="ignore"/>
> > ...
> > </cluster_property_set>
> > </crm_config>
> >
> > Removing the no-quorum-policy=ignore and no_quorum_policy=ignore (as in,
> > deleting the variables) left the cluster unable to failover with either an
> > ifdown iface or with a node reboot. The state displayed by the GUI did not
> > agree with the state displayed by crm_mon (the GUI showed the ifdown or
> > rebooted node as still controlling resources, whereas crm_mon showed the
> > resources unavailable ... both showed the inaccessible node as offline).
>
> Assuming stonith-enabled was set to false, crm_mon is correct as the
> cluster assumes that the node is cleanly down*.
> You should file a bug for the GUI in that case.
This happens when a node is uncleanly offline while the resources still
appear to be running on it (according to rsc->running_on) and the
resources' role is still "Started".
Changed in mgmtd:
http://hg.clusterlabs.org/pacemaker/pygui/rev/f6b91f133ce8
In that case, the resource's status is now reported as "unclean":
..
    if (g_list_length(rsc->running_on) > 0
        && rsc->fns->active(rsc, TRUE) == FALSE) {
        strncat(buf, "unclean", sizeof(buf)-strlen(buf)-1);
..
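For anyone who wants to try the idea outside of mgmtd, here is a minimal, self-contained sketch of the same check. Everything in it is an assumption made for illustration: struct demo_rsc and demo_append_status() are hypothetical stand-ins rather than Pacemaker types or functions, and the "running" and "not running" words are invented for the example. Only the "unclean" case mirrors the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a resource; this is NOT Pacemaker's resource type. */
struct demo_rsc {
    size_t running_on_count; /* number of nodes the resource is recorded as running on */
    bool   active;           /* true only if the resource is active on a usable node */
};

/*
 * Append a status word to buf, which must already hold a NUL-terminated
 * string shorter than buflen. The "unclean" branch follows the check in the
 * mgmtd change; the other two words are assumptions for this example.
 */
static void demo_append_status(char *buf, size_t buflen,
                               const struct demo_rsc *rsc)
{
    const char *word;

    assert(buf != NULL && rsc != NULL);
    assert(strlen(buf) < buflen);

    if (rsc->running_on_count > 0 && !rsc->active) {
        word = "unclean";      /* recorded as running, but not actually active */
    } else if (rsc->running_on_count > 0) {
        word = "running";
    } else {
        word = "not running";
    }

    /* Leave room for the terminating NUL, as the original strncat call does. */
    strncat(buf, word, buflen - strlen(buf) - 1);
}
```

Passing the buffer length explicitly avoids relying on sizeof(buf), which only gives the right answer when buf is an array in the current scope.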
Andrew,
If we run crm_mon without "-r", resources that were running on the
uncleanly offline node are hidden. With "-r", the primitive resources
are shown as "Started" on that node.
"crm_resource -W" has the same behavior.
That's inconsistent. Perhaps we also need to check whether resources are
"active" when those options are enabled?
--
Regards,
Yan Gao
China R&D Software Engineer
ygao at novell.com
Novell, Inc.
Making IT Work As One™