<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">

<HTML>

<HEAD>

  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">

  <META NAME="GENERATOR" CONTENT="GtkHTML/3.16.3">

</HEAD>

<BODY>

Greetings ... Happy New Year!<BR>

<BR>

I am testing a configuration created from the example in &quot;Chapter 6. Configuring a GFS2 File System in a Cluster&quot; of the &quot;Red Hat Enterprise Linux 7.0 Beta Global File System 2&quot; document.&nbsp; The only addition is stonith:fence_ipmilan.&nbsp; After encountering this issue when I configured with &quot;crm&quot;, I reconfigured using &quot;pcs&quot;. I've included the configuration below.<BR>

<BR>

My expectation is that, in a 2-node cluster, running &quot;stonith_admin -F &lt;peer-node&gt;&quot; should cause &lt;peer-node&gt; to reboot and cleanly rejoin the cluster.&nbsp; This is not happening.<BR>

<BR>

What ultimately happens is that after the initially fenced node reboots, the system from which the stonith_admin -F command was run is fenced and reboots. The fencing stops there, leaving the cluster in an appropriate state.<BR>

<BR>

The issue seems to reside with clvmd/lvm.&nbsp; With the reboot of the initially fenced node, the clvmd resource fails on the surviving node, with a maximum of errors.&nbsp; I hypothesize there is an issue with locks, but have insufficient knowledge of clvmd/lvm locks to prove or disprove this hypothesis.<BR>

<BR>

Have I missed something?<BR>

<BR>

1) Is this expected behavior, that is, does the node that issued the fence request always get rebooted as well?<BR>

<BR>

2) Or, maybe I didn't correctly duplicate the Chapter 6 example?<BR>

<BR>

3) Or, perhaps something is wrong or omitted from the Chapter 6 example?<BR>

<BR>

Suggestions will be much appreciated.<BR>

<BR>

Thanks,<BR>

Bob Haxo<BR>

<BR>

RHEL6.5<BR>

pacemaker-cli-1.1.10-14.el6_5.1.x86_64<BR>

crmsh-1.2.5-55.1sgi709r3.rhel6.x86_64<BR>

pacemaker-libs-1.1.10-14.el6_5.1.x86_64<BR>

cman-3.0.12.1-59.el6_5.1.x86_64<BR>

pacemaker-1.1.10-14.el6_5.1.x86_64<BR>

corosynclib-1.4.1-17.el6.x86_64<BR>

corosync-1.4.1-17.el6.x86_64<BR>

pacemaker-cluster-libs-1.1.10-14.el6_5.1.x86_64<BR>

<BR>

Cluster Name: mici<BR>

Corosync Nodes:<BR>

<BR>

Pacemaker Nodes:<BR>

 mici-admin mici-admin2<BR>

<BR>

Resources:<BR>

 Clone: clusterfs-clone<BR>

&nbsp; Meta Attrs: interleave=true target-role=Started<BR>

&nbsp; Resource: clusterfs (class=ocf provider=heartbeat type=Filesystem)<BR>

&nbsp;&nbsp; Attributes: device=/dev/vgha2/lv_clust2 directory=/images fstype=gfs2 options=defaults,noatime,nodiratime<BR>

&nbsp;&nbsp; Operations: monitor on-fail=fence interval=30s (clusterfs-monitor-interval-30s)<BR>

 Clone: clvmd-clone<BR>

&nbsp; Meta Attrs: interleave=true ordered=true target-role=Started<BR>

&nbsp; Resource: clvmd (class=lsb type=clvmd)<BR>

&nbsp;&nbsp; Operations: monitor on-fail=fence interval=30s (clvmd-monitor-interval-30s)<BR>

 Clone: dlm-clone<BR>

&nbsp; Meta Attrs: interleave=true ordered=true<BR>

&nbsp; Resource: dlm (class=ocf provider=pacemaker type=controld)<BR>

&nbsp;&nbsp; Operations: monitor on-fail=fence interval=30s (dlm-monitor-interval-30s)<BR>

<BR>

Stonith Devices:<BR>

 Resource: p_ipmi_fencing_1 (class=stonith type=fence_ipmilan)<BR>

&nbsp; Attributes: ipaddr=128.##.##.78 login=XXXXX passwd=XXXXX lanplus=1 action=reboot pcmk_host_check=static-list pcmk_host_list=mici-admin<BR>

&nbsp; Meta Attrs: target-role=Started<BR>

&nbsp; Operations: monitor start-delay=30 interval=60s timeout=30 (p_ipmi_fencing_1-monitor-60s)<BR>

 Resource: p_ipmi_fencing_2 (class=stonith type=fence_ipmilan)<BR>

&nbsp; Attributes: ipaddr=128.##.##.220 login=XXXXX passwd=XXXXX lanplus=1 action=reboot pcmk_host_check=static-list pcmk_host_list=mici-admin2<BR>

&nbsp; Meta Attrs: target-role=Started<BR>

&nbsp; Operations: monitor start-delay=30 interval=60s timeout=30 (p_ipmi_fencing_2-monitor-60s)<BR>

Fencing Levels:<BR>

<BR>

Location Constraints:<BR>

&nbsp; Resource: p_ipmi_fencing_1<BR>

&nbsp;&nbsp;&nbsp; Disabled on: mici-admin (score:-INFINITY) (id:location-p_ipmi_fencing_1-mici-admin--INFINITY)<BR>

&nbsp; Resource: p_ipmi_fencing_2<BR>

&nbsp;&nbsp;&nbsp; Disabled on: mici-admin2 (score:-INFINITY) (id:location-p_ipmi_fencing_2-mici-admin2--INFINITY)<BR>

Ordering Constraints:<BR>

&nbsp; start dlm-clone then start clvmd-clone (Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)<BR>

&nbsp; start clvmd-clone then start clusterfs-clone (Mandatory) (id:order-clvmd-clone-clusterfs-clone-mandatory)<BR>

Colocation Constraints:<BR>

&nbsp; clusterfs-clone with clvmd-clone (INFINITY) (id:colocation-clusterfs-clone-clvmd-clone-INFINITY)<BR>

&nbsp; clvmd-clone with dlm-clone (INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)<BR>

<BR>

Cluster Properties:<BR>

 cluster-infrastructure: cman<BR>

 dc-version: 1.1.10-14.el6_5.1-368c726<BR>

 last-lrm-refresh: 1388530552<BR>

 no-quorum-policy: ignore<BR>

 stonith-enabled: true<BR>

Node Attributes:<BR>

 mici-admin: standby=off<BR>

 mici-admin2: standby=off<BR>

<BR>

<BR>

Last updated: Tue Dec 31 17:15:55 2013<BR>

Last change: Tue Dec 31 16:57:37 2013 via cibadmin on mici-admin<BR>

Stack: cman<BR>

Current DC: mici-admin2 - partition with quorum<BR>

Version: 1.1.10-14.el6_5.1-368c726<BR>

2 Nodes configured<BR>

8 Resources configured<BR>

<BR>

Online: [ mici-admin mici-admin2 ]<BR>

<BR>

Full list of resources:<BR>

<BR>

p_ipmi_fencing_1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (stonith:fence_ipmilan):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Started mici-admin2<BR>

p_ipmi_fencing_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (stonith:fence_ipmilan):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Started mici-admin<BR>

 Clone Set: clusterfs-clone [clusterfs]<BR>

&nbsp;&nbsp;&nbsp;&nbsp; Started: [ mici-admin mici-admin2 ]<BR>

 Clone Set: clvmd-clone [clvmd]<BR>

&nbsp;&nbsp;&nbsp;&nbsp; Started: [ mici-admin mici-admin2 ]<BR>

 Clone Set: dlm-clone [dlm]<BR>

&nbsp;&nbsp;&nbsp;&nbsp; Started: [ mici-admin mici-admin2 ]<BR>

<BR>

Migration summary:<BR>

* Node mici-admin:<BR>

* Node mici-admin2:<BR>

<BR>

=====================================================<BR>

crm_mon output after the fenced node reboots.&nbsp; It shows the clvmd failure that then<BR>

occurs, which in turn triggers fencing of that node.<BR>

<BR>

Last updated: Tue Dec 31 17:06:55 2013<BR>

Last change: Tue Dec 31 16:57:37 2013 via cibadmin on mici-admin<BR>

Stack: cman<BR>

Current DC: mici-admin - partition with quorum<BR>

Version: 1.1.10-14.el6_5.1-368c726<BR>

2 Nodes configured<BR>

8 Resources configured<BR>

<BR>

Node mici-admin: UNCLEAN (online)<BR>

Online: [ mici-admin2 ]<BR>

<BR>

Full list of resources:<BR>

<BR>

p_ipmi_fencing_1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (stonith:fence_ipmilan):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Stopped<BR>

p_ipmi_fencing_2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (stonith:fence_ipmilan):&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Started mici-admin<BR>

 Clone Set: clusterfs-clone [clusterfs]<BR>

&nbsp;&nbsp;&nbsp;&nbsp; Started: [ mici-admin ]<BR>

&nbsp;&nbsp;&nbsp;&nbsp; Stopped: [ mici-admin2 ]<BR>

 Clone Set: clvmd-clone [clvmd]<BR>

&nbsp;&nbsp;&nbsp;&nbsp; clvmd&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (lsb:clvmd):&nbsp;&nbsp;&nbsp; FAILED mici-admin<BR>

&nbsp;&nbsp;&nbsp;&nbsp; Stopped: [ mici-admin2 ]<BR>

 Clone Set: dlm-clone [dlm]<BR>

&nbsp;&nbsp;&nbsp;&nbsp; Started: [ mici-admin mici-admin2 ]<BR>

<BR>

Migration summary:<BR>

* Node mici-admin:<BR>

&nbsp;&nbsp; clvmd: migration-threshold=1000000 fail-count=1 last-failure='Tue Dec 31 17:04:29 2013'<BR>

* Node mici-admin2:<BR>

<BR>

Failed actions:<BR>

&nbsp;&nbsp;&nbsp; clvmd_monitor_30000 on mici-admin 'unknown error' (1): call=60, status=Timed Out, last-rc-change='Tue Dec 31 17:04:29 2013', queued=0ms, exec=0ms<BR>

<BR>


</BODY>

</HTML>