[Pacemaker] DRBD active/passive on Pacemaker+CMAN cluster unexpectedly performs STONITH when promoting
Giuseppe Ragusa
giuseppe.ragusa at hotmail.com
Thu Jul 3 02:05:36 UTC 2014
Hi all,
I deployed a two-node (physical) RHCS/Pacemaker cluster on CentOS 6.5 x86_64 (fully up to date) with:
cman-3.0.12.1-59.el6_5.2.x86_64
pacemaker-1.1.10-14.el6_5.3.x86_64
pcs-0.9.90-2.el6.centos.3.noarch
qemu-kvm-0.12.1.2-2.415.el6_5.10.x86_64
qemu-kvm-tools-0.12.1.2-2.415.el6_5.10.x86_64
drbd-utils-8.9.0-1.el6.x86_64
drbd-udev-8.9.0-1.el6.x86_64
drbd-rgmanager-8.9.0-1.el6.x86_64
drbd-bash-completion-8.9.0-1.el6.x86_64
drbd-pacemaker-8.9.0-1.el6.x86_64
drbd-8.9.0-1.el6.x86_64
drbd-km-2.6.32_431.20.3.el6.x86_64-8.4.5-1.x86_64
kernel-2.6.32-431.20.3.el6.x86_64
The aim is to run KVM virtual machines backed by DRBD (8.4.5) in active/passive mode (no dual-primary, and therefore no live migration).
To err on the side of consistency over availability (and to pave the way for a possible dual-primary, live-migration-capable setup), I configured DRBD for resource-and-stonith fencing with rhcs_fence as the fence-peer handler (that is why drbd-rgmanager is installed) and with STONITH devices configured in Pacemaker (pcmk-redirect in cluster.conf).
The setup "almost" works (everything looks fine in "pcs status", "crm_mon -Arf1", "corosync-cfgtool -s" and "corosync-objctl | grep member"), but every time a resource promotion is needed (to Master, i.e. becoming DRBD primary) it either fails or fences the other node (the one supposed to become Slave, i.e. secondary) and only then succeeds.
This happens, for example, both on initial resource definition (when the first start is attempted) and when a node enters standby (when the cluster tries to move the resources automatically by stopping and then restarting them).
I have collected a full "pcs cluster report" and can provide a CIB dump, but I will initially paste an excerpt of my configuration here, just in case it is a simple configuration error that someone can spot on the fly ;> (hoping...)
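For anyone digging into the report, the fence events should show up with something along these lines (standard CentOS 6 log locations, assuming nothing has been redirected elsewhere):

grep -iE 'stonith|fence' /var/log/messages
grep -iE 'stonith|fence' /var/log/cluster/corosync.log
grep -i drbd /var/log/messages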
Keep in mind that the setup has separate redundant network connections for the LAN (1 Gbit/s, LACP to the switches), Corosync (1 Gbit/s, round-robin, back-to-back) and DRBD (10 Gbit/s, round-robin, back-to-back), and that all FQDNs are correctly resolved through /etc/hosts.
DRBD:
/etc/drbd.d/global_common.conf:
------------------------------------------------------------------------------------------------------
global {
    usage-count no;
}
common {
    protocol C;
    disk {
        on-io-error detach;
        fencing resource-and-stonith;
        disk-barrier no;
        disk-flushes no;
        al-extents 3389;
        c-plan-ahead 200;
        c-fill-target 15M;
        c-max-rate 100M;
        c-min-rate 10M;
    }
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        csums-alg sha1;
        data-integrity-alg sha1;
        max-buffers 8000;
        max-epoch-size 8000;
        unplug-watermark 16;
        sndbuf-size 0;
        verify-alg sha1;
    }
    startup {
        wfc-timeout 300;
        outdated-wfc-timeout 80;
        degr-wfc-timeout 120;
    }
    handlers {
        fence-peer "/usr/lib/drbd/rhcs_fence";
    }
}
------------------------------------------------------------------------------------------------------
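As a sanity check that these options are actually in effect (and identical on both nodes), the configuration can be dumped like this; just a quick verification step, nothing cluster-specific:

drbdadm dump all      # effective configuration after includes and defaults
drbdsetup show        # options the kernel module is currently using (resources must be up)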
Sample DRBD resource (there are others, similar)
/etc/drbd.d/dc_vm.res:
------------------------------------------------------------------------------------------------------
resource dc_vm {
    device /dev/drbd1;
    disk /dev/VolGroup00/dc_vm;
    meta-disk internal;
    on cluster1.verolengo.privatelan {
        address ipv4 172.16.200.1:7790;
    }
    on cluster2.verolengo.privatelan {
        address ipv4 172.16.200.2:7790;
    }
}
------------------------------------------------------------------------------------------------------
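For what it's worth, outside of Pacemaker the resource can be exercised manually roughly along these lines (with the cluster not managing it), just to rule out a DRBD-level problem with promotion itself:

drbdadm up dc_vm          # attach and connect (run on both nodes)
cat /proc/drbd            # wait for Connected / UpToDate on both sides
drbdadm primary dc_vm     # manual promotion on one node only
drbdadm secondary dc_vm   # demote again before handing control back to the cluster
drbdadm down dc_vm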
RHCS:
/etc/cluster/cluster.conf
------------------------------------------------------------------------------------------------------
<?xml version="1.0"?>
<cluster name="vclu" config_version="14">
  <cman two_node="1" expected_votes="1" keyfile="/etc/corosync/authkey" transport="udpu" port="5405"/>
  <totem consensus="60000" join="6000" token="100000" token_retransmits_before_loss_const="20" rrp_mode="passive" secauth="on"/>
  <clusternodes>
    <clusternode name="cluster1.verolengo.privatelan" votes="1" nodeid="1">
      <altname name="clusterlan1.verolengo.privatelan" port="6405"/>
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="cluster1.verolengo.privatelan"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="cluster2.verolengo.privatelan" votes="1" nodeid="2">
      <altname name="clusterlan2.verolengo.privatelan" port="6405"/>
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="cluster2.verolengo.privatelan"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
  <fence_daemon clean_start="0" post_fail_delay="30" post_join_delay="30"/>
  <logging debug="on"/>
  <rm disabled="1">
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
------------------------------------------------------------------------------------------------------
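These are the CMAN-level checks I would rely on for the above (CentOS 6 tooling), mainly to confirm that the configuration validates and that both nodes are in the fence domain:

ccs_config_validate       # syntax/schema check of /etc/cluster/cluster.conf
cman_tool status          # quorum and two_node flags as seen by CMAN
cman_tool nodes           # both nodes should be listed as members
fence_tool ls             # fence domain membership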
Pacemaker:
PROPERTIES:
pcs property set default-resource-stickiness=100
pcs property set no-quorum-policy=ignore
STONITH:
pcs stonith create ilocluster1 fence_ilo2 action="off" delay="10" \
ipaddr="ilocluster1.verolengo.privatelan" login="cluster2" passwd="test" power_wait="4" \
pcmk_host_check="static-list" pcmk_host_list="cluster1.verolengo.privatelan" op monitor interval=60s
pcs stonith create ilocluster2 fence_ilo2 action="off" \
ipaddr="ilocluster2.verolengo.privatelan" login="cluster1" passwd="test" power_wait="4" \
pcmk_host_check="static-list" pcmk_host_list="cluster2.verolengo.privatelan" op monitor interval=60s
pcs stonith create pdu1 fence_apc action="off" \
ipaddr="pdu1.verolengo.privatelan" login="cluster" passwd="test" \
pcmk_host_map="cluster1.verolengo.privatelan:3,cluster1.verolengo.privatelan:4,cluster2.verolengo.privatelan:6,cluster2.verolengo.privatelan:7" \
pcmk_host_check="static-list" pcmk_host_list="cluster1.verolengo.privatelan,cluster2.verolengo.privatelan" op monitor interval=60s
pcs stonith level add 1 cluster1.verolengo.privatelan ilocluster1
pcs stonith level add 2 cluster1.verolengo.privatelan pdu1
pcs stonith level add 1 cluster2.verolengo.privatelan ilocluster2
pcs stonith level add 2 cluster2.verolengo.privatelan pdu1
pcs property set stonith-enabled=true
pcs property set stonith-action=off
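For completeness, the fencing devices can be exercised by hand more or less like this (parameters as above; note that the last command really powers the target node off):

fence_ilo2 -a ilocluster1.verolengo.privatelan -l cluster2 -p test -o status
fence_apc -a pdu1.verolengo.privatelan -l cluster -p test -n 3 -o status
stonith_admin --list-registered
stonith_admin --fence cluster2.verolengo.privatelan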
SAMPLE RESOURCE:
pcs cluster cib dc_cfg
pcs -f dc_cfg resource create DCVMDisk ocf:linbit:drbd \
drbd_resource=dc_vm op monitor interval="31s" role="Master" \
op monitor interval="29s" role="Slave" \
op start interval="0" timeout="120s" \
op stop interval="0" timeout="180s"
pcs -f dc_cfg resource master DCVMDiskClone DCVMDisk \
master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
notify=true target-role=Started is-managed=true
pcs -f dc_cfg resource create DCVM ocf:heartbeat:VirtualDomain \
config=/etc/libvirt/qemu/dc.xml migration_transport=tcp migration_network_suffix=-10g \
hypervisor=qemu:///system meta allow-migrate=false target-role=Started is-managed=true \
op start interval="0" timeout="120s" \
op stop interval="0" timeout="120s" \
op monitor interval="60s" timeout="120s"
pcs -f dc_cfg constraint colocation add DCVM DCVMDiskClone INFINITY with-rsc-role=Master
pcs -f dc_cfg constraint order promote DCVMDiskClone then start DCVM
pcs -f dc_cfg constraint location DCVM prefers cluster2.verolengo.privatelan=50
pcs cluster cib-push dc_cfg
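After the push, this is roughly the sequence I use to watch the promotion and reproduce the unexpected fencing (the standby/unstandby pair forces the resources to move):

pcs status
crm_mon -Arf1
crm_simulate -sL                                     # placement and promotion scores from the live CIB
pcs cluster standby cluster1.verolengo.privatelan
crm_mon -Arf                                         # watch the demote/promote sequence
pcs cluster unstandby cluster1.verolengo.privatelan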
Since I know that pcs still has some rough edges, I also installed crmsh, but I have never actually used it.
Many thanks in advance for your attention.
Kind regards,
Giuseppe Ragusa