[Pacemaker] Cluster with two STONITH devices
Jorge Lopes
jmclopes at gmail.com
Wed Apr 8 16:16:07 UTC 2015
(I'm a bit confused because I received an auto-reply from
pacemaker-bounces at oss.clusterlabs.org saying this list is now inactive, but
I just received a digest with my mail. It happens that I have resent the
email to the new list with a bit more information, which was missing from the
first message. So here is that extra bit, anyway.)
I also have noticed this pattern (with both STONITH resources running):
1. With the cluster running without errors, I run "stop docker" on node
cluster-a-1.
2. This leads the vCenter STONITH to act, as expected.
3. After the cluster is running without errors again, I run "stop docker"
on node cluster-a-1 a second time.
4. Now, the vCenter STONITH doesn't run and, instead, it is the IPMI
STONITH that runs. This is unexpected, as I thought the vCenter STONITH
would run again.
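From what I have read since, I suspect the priority meta attribute alone
does not define the order in which fencing devices are tried; fencing
levels are configured with a fencing topology. A sketch of what I mean,
using the resource names from my configuration below (untested):

fencing_topology \
    cluster-a-1: stonith-vcenter-host1 stonith-ipmi-host1 \
    cluster-a-2: stonith-vcenter-host1 stonith-ipmi-host1 \
    cluster-b-1: stonith-vcenter-host2 stonith-ipmi-host2 \
    cluster-b-2: stonith-vcenter-host2 stonith-ipmi-host2

With this, Pacemaker should always try the vCenter device first for a given
node and escalate to IPMI only if that attempt fails.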
On Wed, Apr 8, 2015 at 4:20 PM, Jorge Lopes <jmclopes at gmail.com> wrote:
> Hi all.
>
> I'm having difficulties orchestrating two STONITH devices in my cluster. I
> have been struggling with this in past days and I need some help, please.
>
> A simplified version of my cluster and its goals is as follows:
> - The cluster has two physical servers, each hosting two nodes (VMware
> virtual machines): overall, there are 4 nodes in this simplified version.
> - There are two resource groups: group-cluster-a and group-cluster-b.
> - To achieve a good CPU balance in the physical servers, the cluster is
> asymmetric, with one group running in one server and the other group
> running on the other server.
> - If the VM of one host becomes unusable, then its resources are started
> on its sister VM deployed on the other physical host.
> - If one physical host becomes unusable, then all resources are started
> on the other physical host.
> - Two STONITH levels are used to fence the problematic nodes.
>
> The resources have the following behavior:
> - If the resource monitor detects a problem, then Pacemaker tries to
> restart the resource on the same node.
> - If that fails, then STONITH takes place (vCenter reboots the VM) and
> Pacemaker starts the resource on the sister VM present on the other
> physical host.
> - If restarting the VM fails, I want to power off the physical server and
> have Pacemaker start all resources on the other physical host.
>
>
> The HA stack is:
> Ubuntu 14.04 (the node OS, which is a virtualized guest running on VMware
> ESXi 5.5)
> Pacemaker 1.1.12
> Corosync 2.3.4
> CRM 2.1.2
>
> The 4 nodes are:
> cluster-a-1
> cluster-a-2
> cluster-b-1
> cluster-b-2
>
> The relevant configuration is:
>
> property symmetric-cluster=false
> property stonith-enabled=true
> property no-quorum-policy=stop
>
> group group-cluster-a vip-cluster-a docker-web
> location loc-group-cluster-a-1 group-cluster-a inf: cluster-a-1
> location loc-group-cluster-a-2 group-cluster-a 500: cluster-a-2
>
> group group-cluster-b vip-cluster-b docker-srv
> location loc-group-cluster-b-1 group-cluster-b 500: cluster-b-1
> location loc-group-cluster-b-2 group-cluster-b inf: cluster-b-2
>
>
> # stonith vcenter definitions for host 1
> # run in any of the host2 VM
> primitive stonith-vcenter-host1 stonith:external/vcenter \
> params \
> VI_SERVER="192.168.40.20" \
> VI_CREDSTORE="/etc/vicredentials.xml" \
> HOSTLIST="cluster-a-1=cluster-a-1;cluster-a-2=cluster-a-2" \
> RESETPOWERON="1" \
> priority="2" \
> pcmk_host_check="static-list" \
> pcmk_host_list="cluster-a-1 cluster-a-2" \
> op monitor interval="60s"
>
> location loc1-stonith-vcenter-host1 stonith-vcenter-host1 500: cluster-b-1
> location loc2-stonith-vcenter-host1 stonith-vcenter-host1 501: cluster-b-2
>
> # stonith vcenter definitions for host 2
> # run in any of the host1 VM
> primitive stonith-vcenter-host2 stonith:external/vcenter \
> params \
> VI_SERVER="192.168.40.21" \
> VI_CREDSTORE="/etc/vicredentials.xml" \
> HOSTLIST="cluster-b-1=cluster-b-1;cluster-b-2=cluster-b-2" \
> RESETPOWERON="1" \
> priority="2" \
> pcmk_host_check="static-list" \
> pcmk_host_list="cluster-b-1 cluster-b-2" \
> op monitor interval="60s"
>
> location loc1-stonith-vcenter-host2 stonith-vcenter-host2 500: cluster-a-1
> location loc2-stonith-vcenter-host2 stonith-vcenter-host2 501: cluster-a-2
>
>
> # stonith IPMI definitions for host 1 (DELL with iDRAC 7 enterprise
> interface at 192.168.40.15)
> # run in any of the host2 VM
> primitive stonith-ipmi-host1 stonith:external/ipmi \
> params hostname="host1" ipaddr="192.168.40.15" userid="root"
> passwd="mypassword" interface="lanplus" \
> priority="1" \
> pcmk_host_check="static-list" \
> pcmk_host_list="cluster-a-1 cluster-a-2" \
> op start interval="0" timeout="60s" requires="nothing" \
> op monitor interval="3600s" timeout="20s" requires="nothing"
>
> location loc1-stonith-ipmi-host1 stonith-ipmi-host1 500: cluster-b-1
> location loc2-stonith-ipmi-host1 stonith-ipmi-host1 501: cluster-b-2
>
>
> # stonith IPMI definitions for host 2 (DELL with iDRAC 7 enterprise
> interface at 192.168.40.16)
> # run in any of the host1 VM
> primitive stonith-ipmi-host2 stonith:external/ipmi \
> params hostname="host2" ipaddr="192.168.40.16" userid="root"
> passwd="mypassword" interface="lanplus" \
> priority="1" \
> pcmk_host_check="static-list" \
> pcmk_host_list="cluster-b-1 cluster-b-2" \
> op start interval="0" timeout="60s" requires="nothing" \
> op monitor interval="3600s" timeout="20s" requires="nothing"
>
> location loc1-stonith-ipmi-host2 stonith-ipmi-host2 500: cluster-a-1
> location loc2-stonith-ipmi-host2 stonith-ipmi-host2 501: cluster-a-2
>
>
> What is working:
> - When an error is detected in one resource, the resource restarts on the
> same node, as expected.
> - With the STONITH external/ipmi resource *stopped*, a failure on one node
> makes vCenter reboot it, and the resources start on the sister node.
>
>
> What is not so good:
> - When vCenter reboots one node, the resources start on the other
> node as expected, but then they return to the original node as soon as it
> comes back online. This causes a bit of ping-pong, and I think it is a
> consequence of how the locations are defined. Any suggestion to avoid this?
> After a resource has moved to another node, I would prefer that it stays
> there instead of returning to the original node. I can think of playing
> with the resource affinity scores - is this the way it should be done?
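>
> My current idea (untested) is to set a default resource stickiness higher
> than the 500 location scores, so that after a failover a resource's
> preference for its current node outweighs the location constraint pulling
> it back:
>
> rsc_defaults resource-stickiness="1000"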
>
> What is wrong:
> Let's consider this scenario.
> I have a set of resources provided by a docker agent. My test consists of
> stopping the docker service on node cluster-a-1, which makes the docker
> agent return OCF_ERR_INSTALLED to Pacemaker (this is a change I made to
> the docker agent, compared to the github repository version). With the
> IPMI STONITH resource stopped, this leads to a restart of node cluster-a-1,
> which is expected.
>
> But with the IPMI STONITH resource started, I notice an erratic behavior:
> - Sometimes, the resources on node cluster-a-1 are stopped and no
> STONITH happens. Also, the resources are not moved to node cluster-a-2.
> In this situation, if I manually restart node cluster-a-1 (virtual
> machine restart), then the IPMI STONITH takes place and restarts the
> corresponding physical server.
> - Sometimes, the IPMI STONITH runs before the vCenter STONITH, which is
> not expected because the vCenter STONITH has the higher priority.
>
> I might have something wrong in my STONITH definitions, but I can't figure
> out what.
> Any idea how to correct this?
>
> And how can I set external/ipmi to power off the physical host, instead of
> rebooting it?
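>
> From the Pacemaker documentation, I believe the generic fencing parameter
> pcmk_reboot_action can remap reboot requests to a power-off. A sketch of
> what I mean for one of my IPMI devices (untested):
>
> primitive stonith-ipmi-host1 stonith:external/ipmi \
>     params hostname="host1" ipaddr="192.168.40.15" userid="root" \
>     passwd="mypassword" interface="lanplus" \
>     pcmk_reboot_action="off" \
>     pcmk_host_check="static-list" \
>     pcmk_host_list="cluster-a-1 cluster-a-2" \
>     op monitor interval="3600s" timeout="20s"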
>
>