[Pacemaker] strange drbd migration fail

Matthew O'Connor matt at ecsorl.com
Tue Jul 16 12:29:05 EDT 2013


Hi,

It is probably safe to disregard this issue.  I found that I was somehow
not building the latest 1.1.9.  After building and installing
1.1.9-cad5efc, the problem appears to have gone away.

On 07/15/2013 05:25 PM, Matthew O'Connor wrote:
> I have run into a strange problem with a DRBD resource migrating its
> master role from one node to the other.  The setup is a 3-node cluster
> with Pacemaker v1.1.9, Corosync v1.4.5, and DRBD 8.3.11.  Both Pacemaker
> and Corosync are built from source.  Two of the nodes run DRBD resources
> between them in simple single-master relationships.
>
> The nodes are called c3, c4, and c5; c5 is location-constrained to
> never receive DRBD resource clones.  I have a DRBD resource called
> ms_drbd-p_dummy1, and a dummy resource called p_dummy1.  The dummy
> resource is colocated with the DRBD master, and ordered such that the
> master is promoted before the dummy resource is started.  The config
> generally follows accepted online examples (see below).
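
The location constraint that keeps DRBD clones off c5 is mentioned above
but does not appear in the configuration excerpt.  A minimal sketch of how
such a constraint might look in crm shell syntax, using the ms resource
name from the excerpt; the constraint ID l_drbd-aoe1_not_c5 is
hypothetical, and the actual configuration may express this differently:

```
# Hypothetical ID; bans every instance of the ms resource from node c5.
location l_drbd-aoe1_not_c5 ms_drbd-aoe1 -inf: c5
```
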
>
> When c3 is master, I attempt to migrate to c4 by issuing "resource
> migrate p_dummy1".  I see a fleeting FAILED notice in crm_mon for c3,
> and the cluster then enters a cycle in which it keeps bringing the
> relevant DRBD resource up and down very quickly.  Syslog shows the
> connection being set up and torn down over and over.  The up-down cycle
> is broken by issuing "resource unmigrate p_dummy1", after which c4
> becomes DRBD master and c3 its slave.  That is to say, the migration
> works, but only after the unmigrate is issued.  Migrating from c4 to c3
> works every time.
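
For context on what migrate and unmigrate change: with no target node
given, "resource migrate" typically works by adding a temporary location
constraint that bans the resource from its current node, and "resource
unmigrate" removes that constraint again.  A sketch of what the generated
constraint usually looks like in "crm configure show" on Pacemaker 1.1.x
of this era; the exact constraint and rule IDs are an assumption here and
differ between Pacemaker versions:

```
# Added by "resource migrate p_dummy1" while p_dummy1 runs on c3
# (IDs assumed; newer releases use a different naming scheme).
location cli-standby-p_dummy1 p_dummy1 \
        rule $id="cli-standby-rule-p_dummy1" -inf: #uname eq c3
```

Because p_dummy1 is colocated with the Master role, a ban like this
forces the policy engine to move the master role as well, which is the
transition that failed in the report.
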
>
> Putting either node into standby works fine; resources migrate without
> issue in that case.  I have fencing enabled and tested, but it is not
> being called into action here.  I also tried re-creating my DRBD
> resource and resyncing, with no change in the results.  I can manually
> switch either node to primary using drbdadm while the resource is
> unmanaged by Pacemaker.  I have also duplicated this behavior with one
> of my other DRBD resources and a second dummy resource.  Finally, I
> confirmed it with a new DRBD and dummy resource set between c4 and c5
> (where the c4->c5 transition fails until unmigrate is issued, but the
> c5->c4 migration works fine).
>
> An attempt to manually demote ms_drbd-aoe1 resulted in Pacemaker
> reporting a failure, even though /proc/drbd subsequently showed both
> nodes in Secondary.
>
> This syslog fragment shows the attempt, failure, unmigrate and the
> eventual success of migration: http://pastebin.com/tBtydG1f
>
> Key configuration elements:
>
> primitive p_drbd-aoe1 ocf:linbit:drbd \
>         params drbd_resource="aoe1" \
>         op start interval="0" timeout="5m" \
>         op promote interval="0" timeout="90s" \
>         op demote interval="0" timeout="90s" \
>         op stop interval="0" timeout="3m" \
>         op monitor interval="20" role="Slave" timeout="20" \
>         op monitor interval="10" role="Master" timeout="20"
>
> primitive p_dummy1 ocf:heartbeat:Dummy
>
> ms ms_drbd-aoe1 p_drbd-aoe1 \
>         meta master-max="1" notify="true" clone-max="2" \
>         master-node-max="1" clone-node-max="1" target-role="Started" \
>         is-managed="true"
>
> colocation colo_dummy inf: p_dummy1 ms_drbd-aoe1:Master
> order o_dummy inf: ms_drbd-aoe1:promote p_dummy1:start
>
> Any ideas?
>
> Thanks!!
>
>
>
> -- 
> Thank you!
>   Matthew O'Connor
>   (GPG Key ID: 55F981C4)
>
>
> CONFIDENTIAL NOTICE: The information contained in this electronic message is legally privileged, confidential and exempt from disclosure under applicable law. It is intended only for the use of the individual or entity named above. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender immediately by return e-mail and delete the original message and any copies of it from your computer system. Thank you.
>  
> EXPORT CONTROL WARNING:  This document may contain technical data that is subject to the International Traffic in Arms Regulations (ITAR) controls and may not be exported or otherwise disclosed to any foreign person or firm, whether in the US or abroad, without first complying with all requirements of the ITAR, 22 CFR 120-130, including the requirement for obtaining an export license if applicable. In addition, this document may contain technology that is subject to the Export Administration Regulations (EAR) and may not be exported or otherwise disclosed to any non-U.S. person, whether in the US or abroad, without first complying with all requirements of the EAR, 15 CFR 730-774, including the requirement for obtaining an export license if applicable. Violation of these export laws is subject to severe criminal penalties.
>
>

-- 
Thank you!
  Matthew O'Connor
  (GPG Key ID: 55F981C4)


