<div dir="ltr"><div><div><div><div><div><div>Hi, Digimer:<br></div>Below is the output of drbdadm dump:<br></div># /etc/drbd.conf<br>common {<br>    protocol               C;<br>    net {<br>        after-sb-0pri    discard-zero-changes;<br>
        after-sb-1pri    consensus;<br>        after-sb-2pri    disconnect;<br>        cram-hmac-alg    sha512;<br>        shared-secret    acde;<br>    }<br>    disk {<br>        on-io-error      detach;<br>        fencing          resource-and-stonith;<br>
    }<br>    syncer {<br>        rate             33M;<br>    }<br>    startup {<br>        wfc-timeout      120;<br>    }<br>    handlers {<br>        fence-peer       /usr/lib/drbd/crm-fence-peer.sh;<br>        after-resync-target /usr/lib/drbd/crm-unfence-peer.sh;<br>
    }<br>}<br><br># resource r0 on suse4: not ignored, not stacked<br>resource r0 {<br>    on suse2 {<br>        device           /dev/drbd0 minor 0;<br>        disk             /dev/sdc1;<br>        address          ipv4 XXX:7789;<br>
        meta-disk        internal;<br>    }<br>    on suse4 {<br>        device           /dev/drbd0 minor 0;<br>        disk             /dev/sdc1;<br>        address          ipv4 YYY:7789;<br>        meta-disk        internal;<br>
    }<br>}<br></div>And for &#39;crm configure show&#39;, please find the configuration below:<br>primitive drbd1 ocf:linbit:drbd \<br>        params drbd_resource=&quot;r0&quot; \<br>        op monitor interval=&quot;15s&quot;<br>primitive fs1 ocf:heartbeat:Filesystem \<br>
        op monitor interval=&quot;15s&quot; \<br>        params device=&quot;/dev/drbd0&quot; directory=&quot;/opt/drbd&quot; fstype=&quot;ext3&quot; \<br>        meta target-role=&quot;Started&quot;<br>primitive suse2-stonith stonith:external/ipmi \<br>
        params hostname=&quot;suse2&quot; ipaddr=&quot;XXX&quot; userid=&quot;admin&quot; passwd=&quot;xxx&quot; interface=&quot;lan&quot;<br>primitive suse4-stonith stonith:external/ipmi \<br>        params hostname=&quot;suse4&quot; ipaddr=&quot;YYY&quot; userid=&quot;admin&quot; passwd=&quot;yyy&quot; interface=&quot;lan&quot;<br>
ms ms_drbd1 drbd1 \<br>        meta master-max=&quot;1&quot; master-node-max=&quot;1&quot; clone-max=&quot;2&quot; clone-node-max=&quot;1&quot; notify=&quot;true&quot; target-role=&quot;Started&quot;<br>location drbd-fence-by-handler-ms_drbd1 ms_drbd1 \<br>
        rule $id=&quot;drbd-fence-by-handler-rule-ms_drbd1&quot; $role=&quot;Master&quot; -inf: #uname ne suse4<br>location st-suse2 suse2-stonith -inf: suse2<br>location st-suse4 suse4-stonith -inf: suse4<br>colocation fs_on_drbd inf: fs1 ms_drbd1:Master<br>
property $id=&quot;cib-bootstrap-options&quot; \<br>        dc-version=&quot;1.1.6-b988976485d15cb702c9307df55512d323831a5e&quot; \<br>        cluster-infrastructure=&quot;openais&quot; \<br>        expected-quorum-votes=&quot;3&quot; \<br>        stonith-enabled=&quot;true&quot; \<br>
        last-lrm-refresh=&quot;1378051434&quot;<br>rsc_defaults $id=&quot;rsc-options&quot; \<br>        resource-stickiness=&quot;100&quot;<br></div>I think the drbd-fence-by-handler-rule-ms_drbd1 rule was generated by crm-fence-peer.sh, and it persists because crm-unfence-peer.sh has never been called since the last failover.<br>
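For now I can only clear the stale constraint by hand like this (the constraint id is the one from my &#39;crm configure show&#39; output above) — but I assume crm-unfence-peer.sh is supposed to do this automatically after resync:<br><br>

```shell
# List any leftover DRBD fence constraints left behind by crm-fence-peer.sh
crm configure show | grep drbd-fence-by-handler

# Remove the stale constraint manually; crm-unfence-peer.sh would normally
# do this from the after-resync-target handler once resync completes
crm configure delete drbd-fence-by-handler-ms_drbd1
```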
</div>What&#39;s wrong with my configuration?<br></div>Thanks.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Sep 2, 2013 at 9:42 PM, Digimer <span dir="ltr">&lt;<a href="mailto:lists@alteeve.ca" target="_blank">lists@alteeve.ca</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 02/09/13 08:55, Xiaomin Zhang wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi, guys:<br>
I followed the standard way to enable IPMI-based STONITH for a<br>
service which relies on DRBD primary/secondary replication.<br>
Below is the Pacemaker configuration (of course, STONITH is enabled in<br>
Pacemaker):<br>
<br>
primitive suse2-stonith stonith:external/ipmi \<br>
         params hostname=&quot;suse2&quot; ipaddr=&quot;XXX&quot; userid=&quot;admin&quot;<br>
passwd=&quot;xxx&quot; interface=&quot;lan&quot;<br>
primitive suse4-stonith stonith:external/ipmi \<br>
         params hostname=&quot;suse4&quot; ipaddr=&quot;YYY&quot; userid=&quot;admin&quot;<br>
passwd=&quot;yyy&quot; interface=&quot;lan&quot;<br>
location st-suse2 suse2-stonith -inf: suse2<br>
location st-suse4 suse4-stonith -inf: suse4<br>
<br>
I also use &#39;resource-and-stonith&#39; as the fencing policy in the DRBD common configuration.<br>
This configuration has worked many times with the failure tests below:<br>
1.  iptables -A INPUT -j DROP<br>
2.  echo c &gt; /proc/sysrq-trigger<br>
3.  /etc/init.d/network stop<br>
4.  reboot<br>
The failed node is power cycled by its counterpart via IPMI.<br>
However, I still get DRBD split-brain from time to time. Does that mean<br>
IPMI STONITH is still not reliable enough for data integrity?<br>
<br>
I am also confused that, many times, crm-unfence-peer.sh is<br>
not called after crm-fence-peer.sh. Does this imply that I have<br>
something misconfigured?<br>
Your advice is really appreciated.<br>
Thanks in advance.<br>
</blockquote>
<br></div></div>
I don&#39;t think that using the firewall to block traffic is a good way to test. That said, if the failure triggers a reboot, then it&#39;s working.<br>
<br>
Did you setup the fence-handler in DRBD to use &#39;crm-fence-peer.sh&#39;?<br>
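The handler wiring in drbd.conf should look roughly like this (a sketch; the script paths are the ones shipped by the stock drbd-pacemaker package, adjust if yours differ):<br><br>

```
# drbd.conf fencing hooks (common section); resource-and-stonith
# freezes I/O and calls the fence-peer handler when replication is lost
disk {
    fencing resource-and-stonith;
}
handlers {
    fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}
```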
<br>
Please share your &#39;crm configure show&#39; and &#39;drbdadm dump&#39;.<span class="HOEnZb"><font color="#888888"><br>
<br>
-- <br>
Digimer<br>
Papers and Projects: <a href="https://alteeve.ca/w/" target="_blank">https://alteeve.ca/w/</a><br>
What if the cure for cancer is trapped in the mind of a person without access to education?<br>
</font></span></blockquote></div><br></div>