[ClusterLabs] Unfencing cause resource restarts
Ken Gaillot
kgaillot at redhat.com
Tue Oct 11 14:40:52 UTC 2016
On 10/11/2016 07:06 AM, Pavel Levshin wrote:
> Hi!
>
>
> In continuation of previous mails, I now have a more complex setup. Our
> hardware is capable of two STONITH methods: ILO and SCSI persistent
> reservations on shared storage. The first method works fine; nevertheless,
> we have sometimes faced problems with inaccessible ILO devices or
> something similar. So we would like to have SCSI fencing as an additional
> method.
>
> The problem: when node 2 recovers, some resources are stopped and
> restarted on node 1. As far as I understand, primitive resources are
> affected, but clone instances are not.
>
> In the example below, when bvnode2 recovers, vm_smartbv1 is restarted on
> bvnode1, and vm_smartbv2 is live-migrated without interruption to
> bvnode2. All other resources are clones working on bvnode1 and they are
> unaffected.
>
> If I set "meta requires=fencing" for the vm resources, they are no
> longer restarted. But why does unfencing of bvnode2 affect resources
> running on bvnode1?
That does seem odd.
Something I notice in the config below is that only the ILO devices are
listed in the fence topology, and the only fence level is "10". Valid
indexes are 1 to 9, so this should have produced a log error about "Bad
topology".
If you want the storage fencing as a fallback in case ILO fails, you
want the devices in two levels, e.g. level 1 = ILO, level 2 = storage.
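To illustrate the two points above, here is a rough Python sketch. It is not part of Pacemaker or pcs: the helper name check_topology and the dictionaries are made up for illustration, with device names transcribed from the config quoted below. It flags level indexes outside 1 to 9 and fence devices that appear in no level, then runs the same check on one possible two-level layout:

```python
# Sketch only: check_topology is a hypothetical helper, and the data is
# transcribed by hand from the quoted config, not read from a live cluster.
VALID_LEVELS = range(1, 10)  # fence topology indexes 1 through 9


def check_topology(levels, devices):
    """Return a list of human-readable problems found in a fence topology.

    levels  -- {node: {level_index: [device, ...]}}
    devices -- {node: [every fence device that targets that node]}
    """
    problems = []
    for node, by_level in levels.items():
        for index in sorted(by_level):
            if index not in VALID_LEVELS:
                problems.append(f"{node}: level {index} is outside 1-9")
        used = {dev for devs in by_level.values() for dev in devs}
        for dev in devices.get(node, []):
            if dev not in used:
                problems.append(f"{node}: {dev} is not in any level")
    return problems


# Topology as posted: a single level 10 holding only the ILO device.
current = {
    "bvnode1": {10: ["ilo.bvnode1"]},
    "bvnode2": {10: ["ilo.bvnode2"]},
}
# One possible layout: ILO first, storage fencing as the fallback.
suggested = {
    "bvnode1": {1: ["ilo.bvnode1"], 2: ["storage.bvnode1"]},
    "bvnode2": {1: ["ilo.bvnode2"], 2: ["storage.bvnode2"]},
}
devices = {
    "bvnode1": ["ilo.bvnode1", "storage.bvnode1"],
    "bvnode2": ["ilo.bvnode2", "storage.bvnode2"],
}

for name, topo in (("current", current), ("suggested", suggested)):
    print(name, check_topology(topo, devices) or "looks fine")
```

The posted topology yields two complaints per node (bad index, storage device unused), while the two-level layout yields none. This only checks the shape of the topology; it says nothing about whether it explains the restart of vm_smartbv1.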
>
> ====
>
> Current cluster status:
> Online: [ bvnode1 bvnode2 ]
>
>  ilo.bvnode2 (stonith:fence_ilo4): Started bvnode1
>  ilo.bvnode1 (stonith:fence_ilo4): Stopped
>  Clone Set: dlm-clone [dlm]
>      Started: [ bvnode1 ]
>      Stopped: [ bvnode2 ]
>  Clone Set: clvmd-clone [clvmd]
>      Started: [ bvnode1 ]
>      Stopped: [ bvnode2 ]
>  Clone Set: cluster-config-clone [cluster-config]
>      Started: [ bvnode1 ]
>      Stopped: [ bvnode2 ]
>  vm_smartbv1 (ocf::heartbeat:VirtualDomain): Started bvnode1
>  vm_smartbv2 (ocf::heartbeat:VirtualDomain): Started bvnode1
>  Clone Set: libvirtd-clone [libvirtd]
>      Started: [ bvnode1 ]
>      Stopped: [ bvnode2 ]
>  storage.bvnode1 (stonith:fence_mpath): Started bvnode1
>  storage.bvnode2 (stonith:fence_mpath): Started bvnode1
>
> Transition Summary:
>  * Start ilo.bvnode1 (bvnode2)
>  * Start dlm:1 (bvnode2)
>  * Start clvmd:1 (bvnode2)
>  * Start cluster-config:1 (bvnode2)
>  * Restart vm_smartbv1 (Started bvnode1)
>  * Migrate vm_smartbv2 (Started bvnode1 -> bvnode2)
>  * Start libvirtd:1 (bvnode2)
>  * Move storage.bvnode2 (Started bvnode1 -> bvnode2)
> ====
>
> Cluster config:
>
> ====
> Cluster Name: smartbvcluster
> Corosync Nodes:
>  bvnode1 bvnode2
> Pacemaker Nodes:
>  bvnode1 bvnode2
>
> Resources:
>  Clone: dlm-clone
>   Meta Attrs: interleave=true ordered=true
>   Resource: dlm (class=ocf provider=pacemaker type=controld)
>    Operations: start interval=0s timeout=90 (dlm-start-interval-0s)
>                stop interval=0s timeout=100 (dlm-stop-interval-0s)
>                monitor interval=30s (dlm-monitor-interval-30s)
>  Clone: clvmd-clone
>   Meta Attrs: interleave=true ordered=true
>   Resource: clvmd (class=ocf provider=heartbeat type=clvm)
>    Operations: start interval=0s timeout=90 (clvmd-start-interval-0s)
>                stop interval=0s timeout=90 (clvmd-stop-interval-0s)
>                monitor interval=30s (clvmd-monitor-interval-30s)
>  Clone: cluster-config-clone
>   Meta Attrs: interleave=true
>   Resource: cluster-config (class=ocf provider=heartbeat type=Filesystem)
>    Attributes: device=/dev/vg_bv_shared/cluster-config directory=/opt/cluster-config fstype=gfs2 options=noatime
>    Operations: start interval=0s timeout=60 (cluster-config-start-interval-0s)
>                stop interval=0s timeout=60 (cluster-config-stop-interval-0s)
>                monitor interval=10s on-fail=fence OCF_CHECK_LEVEL=20 (cluster-config-monitor-interval-10s)
>  Resource: vm_smartbv1 (class=ocf provider=heartbeat type=VirtualDomain)
>   Attributes: config=/opt/cluster-config/libvirt/qemu/smartbv1.xml hypervisor=qemu:///system migration_transport=tcp
>   Meta Attrs: allow-migrate=true
>   Operations: start interval=0s timeout=90 (vm_smartbv1-start-interval-0s)
>               stop interval=0s timeout=90 (vm_smartbv1-stop-interval-0s)
>               monitor interval=10 timeout=30 (vm_smartbv1-monitor-interval-10)
>  Resource: vm_smartbv2 (class=ocf provider=heartbeat type=VirtualDomain)
>   Attributes: config=/opt/cluster-config/libvirt/qemu/smartbv2.xml hypervisor=qemu:///system migration_transport=tcp
>   Meta Attrs: target-role=started allow-migrate=true
>   Operations: start interval=0s timeout=90 (vm_smartbv2-start-interval-0s)
>               stop interval=0s timeout=90 (vm_smartbv2-stop-interval-0s)
>               monitor interval=10 timeout=30 (vm_smartbv2-monitor-interval-10)
>  Clone: libvirtd-clone
>   Meta Attrs: interleave=true
>   Resource: libvirtd (class=systemd type=libvirtd)
>    Operations: monitor interval=60s (libvirtd-monitor-interval-60s)
>
>
>
> Stonith Devices:
>  Resource: ilo.bvnode2 (class=stonith type=fence_ilo4)
>   Attributes: ipaddr=ilo.bvnode2 login=hacluster passwd=s pcmk_host_list=bvnode2 privlvl=operator
>   Operations: monitor interval=60s (ilo.bvnode2-monitor-interval-60s)
>  Resource: ilo.bvnode1 (class=stonith type=fence_ilo4)
>   Attributes: ipaddr=ilo.bvnode1 login=hacluster passwd=s pcmk_host_list=bvnode1 privlvl=operator
>   Operations: monitor interval=60s (ilo.bvnode1-monitor-interval-60s)
>  Resource: storage.bvnode1 (class=stonith type=fence_mpath)
>   Attributes: key=ab2ee06 pcmk_reboot_action=off devices=/dev/mapper/mpatha pcmk_host_check=static-list pcmk_host_list=bvnode1
>   Meta Attrs: provides=unfencing
>   Operations: monitor interval=60s (storage.bvnode1-monitor-interval-60s)
>  Resource: storage.bvnode2 (class=stonith type=fence_mpath)
>   Attributes: key=ab2ee07 pcmk_reboot_action=off devices=/dev/mapper/mpatha pcmk_host_check=static-list pcmk_host_list=bvnode2
>   Meta Attrs: provides=unfencing
>   Operations: monitor interval=60s (storage.bvnode2-monitor-interval-60s)
>
> Fencing Levels:
>  Node: bvnode1
>   Level 10 - ilo.bvnode1
>  Node: bvnode2
>   Level 10 - ilo.bvnode2
>
> Location Constraints:
>   Resource: ilo.bvnode1
>     Disabled on: bvnode1 (score:-INFINITY) (id:location-ilo.bvnode1-bvnode1--INFINITY)
>   Resource: ilo.bvnode2
>     Disabled on: bvnode2 (score:-INFINITY) (id:location-ilo.bvnode2-bvnode2--INFINITY)
> Ordering Constraints:
>   start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
>   start clvmd-clone then start cluster-config-clone (kind:Mandatory) (id:order-clvmd-clone-cluster-config-clone-mandatory)
>   start cluster-config-clone then start libvirtd-clone (kind:Mandatory) (id:order-cluster-config-clone-libvirtd-clone-mandatory)
>   stop vm_smartbv2 then stop libvirtd-clone (kind:Mandatory) (non-symmetrical) (id:order-vm_smartbv2-libvirtd-clone-mandatory)
>   stop vm_smartbv1 then stop libvirtd-clone (kind:Mandatory) (non-symmetrical) (id:order-vm_smartbv1-libvirtd-clone-mandatory)
>   start libvirtd-clone then start vm_smartbv2 (kind:Optional) (non-symmetrical) (id:order-libvirtd-clone-vm_smartbv2-Optional)
>   start libvirtd-clone then start vm_smartbv1 (kind:Optional) (non-symmetrical) (id:order-libvirtd-clone-vm_smartbv1-Optional)
> Colocation Constraints:
>   clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
>   cluster-config-clone with clvmd-clone (score:INFINITY) (id:colocation-cluster-config-clone-clvmd-clone-INFINITY)
>   libvirtd-clone with cluster-config-clone (score:INFINITY) (id:colocation-libvirtd-clone-cluster-config-clone-INFINITY)
>   vm_smartbv1 with libvirtd-clone (score:INFINITY) (id:colocation-vm_smartbv1-libvirtd-clone-INFINITY)
>   vm_smartbv2 with libvirtd-clone (score:INFINITY) (id:colocation-vm_smartbv2-libvirtd-clone-INFINITY)
>
>
>
> Resources Defaults:
>  No defaults set
> Operations Defaults:
>  No defaults set
>
> Cluster Properties:
>  cluster-infrastructure: corosync
>  cluster-name: smartbvcluster
>  dc-version: 1.1.13-10.el7_2.4-44eb2dd
>  have-watchdog: false
>  last-lrm-refresh: 1476099872
>  maintenance-mode: false
>  no-quorum-policy: freeze
>  start-failure-is-fatal: false
>  stonith-enabled: true