[Pacemaker] How to properly define DRBD resources

Michał Margula alchemyx at uznam.net.pl
Mon Dec 16 08:21:52 UTC 2013


Hello,

Thanks to advice from this mailing list, I made some changes to our 
cluster configuration, because the cluster was having trouble (it 
declared the other node unclean quite often).

I changed the bond0 interface between the nodes to mode "1" 
(active-backup). Previously I also had two DRBD resources, r1 and r2, 
which were used as PVs for LVM.

Now it is the other way round: I have a VG named "local" hosting LVs 
that are used as backing devices for DRBD. This works smoothly 
until... the node joins the cluster.

The cluster tries to promote DRBD, then quickly demotes it, then starts 
and stops it, over and over in a seemingly endless loop. But if I try to 
"help it" manually with drbdadm from the console by running

drbdadm up X
drbdadm primary X

it shows up in crm_mon as Master and from then on it works fine. This is 
how I define the DRBD resource:

<master id="ms-DRBD-ingold-root">
<meta_attributes id="ms-DRBD-ingold-root-meta_attributes">
   <nvpair id="ms-DRBD-ingold-root-meta_attributes-resource-stickiness" 
name="resource-stickiness" value="100"/>
   <nvpair id="ms-DRBD-ingold-root-meta_attributes-notify" name="notify" 
value="true"/>
   <nvpair id="ms-DRBD-ingold-root-meta_attributes-master-max" 
name="master-max" value="2"/>
   <nvpair id="ms-DRBD-ingold-root-meta_attributes-clone-max" 
name="clone-max" value="2"/>
   <nvpair id="ms-DRBD-ingold-root-meta_attributes-clone-node-max" 
name="clone-node-max" value="1"/>
   <nvpair id="ms-DRBD-ingold-root-meta_attributes-interleave" 
name="interleave" value="true"/>
   <nvpair id="ms-DRBD-ingold-root-meta_attributes-globally-unique" 
name="globally-unique" value="False"/>
   <nvpair id="ms-DRBD-ingold-root-meta_attributes-target-role" 
name="target-role" value="Stopped"/>
</meta_attributes>
<primitive class="ocf" id="primitive-DRBD-ingold-root" provider="linbit" 
type="drbd">
   <instance_attributes id="primitive-DRBD-ingold-root-instance_attributes">
     <nvpair 
id="primitive-DRBD-ingold-root-instance_attributes-drbd_resource" 
name="drbd_resource" value="drbd26-ingold-root"/>
   </instance_attributes>
   <operations>
     <op id="primitive-DRBD-ingold-root-start-0" interval="0" 
name="start" timeout="240s"/>
     <op id="primitive-DRBD-ingold-root-stop-0" interval="0" name="stop" 
timeout="100s"/>
     <op id="primitive-DRBD-ingold-root-monitor-20" interval="20" 
name="monitor" role="Master" timeout="20s"/>
     <op id="primitive-DRBD-ingold-root-monitor-30" interval="30" 
name="monitor" role="Slave" timeout="20s">
       <instance_attributes 
id="primitive-DRBD-ingold-root-monitor-30-instance_attributes">
	<nvpair 
id="primitive-DRBD-ingold-root-monitor-30-instance_attributes-target-role" 
name="target-role" value="Master"/>
       </instance_attributes>
     </op>
   </operations>
</primitive>
</master>

I also have a few grouped DRBD resources, and in that case it happened 
that one member of a group became Master while another one was Stopped. 
This is how I define such a group:

<master id="ms-DRBD-bilbo">
<meta_attributes id="ms-DRBD-bilbo-meta_attributes">
   <nvpair id="ms-DRBD-bilbo-meta_attributes-resource-stickiness" 
name="resource-stickiness" value="100"/>
   <nvpair id="ms-DRBD-bilbo-meta_attributes-notify" name="notify" 
value="true"/>
   <nvpair id="ms-DRBD-bilbo-meta_attributes-master-max" 
name="master-max" value="2"/>
   <nvpair id="ms-DRBD-bilbo-meta_attributes-clone-max" name="clone-max" 
value="2"/>
   <nvpair id="ms-DRBD-bilbo-meta_attributes-clone-node-max" 
name="clone-node-max" value="1"/>
   <nvpair id="ms-DRBD-bilbo-meta_attributes-interleave" 
name="interleave" value="true"/>
   <nvpair id="ms-DRBD-bilbo-meta_attributes-globally-unique" 
name="globally-unique" value="False"/>
   <nvpair id="ms-DRBD-bilbo-meta_attributes-target-role" 
name="target-role" value="Master"/>
   <nvpair id="ms-DRBD-bilbo-meta_attributes-is-managed" 
name="is-managed" value="true"/>
</meta_attributes>
<group id="group-DRBD-bilbo">
   <primitive class="ocf" id="primitive-DRBD-bilbo-root" 
provider="linbit" type="drbd">
     <instance_attributes 
id="primitive-DRBD-bilbo-root-instance_attributes">
       <nvpair 
id="primitive-DRBD-bilbo-root-instance_attributes-drbd_resource" 
name="drbd_resource" value="drbd19-bilbo-root"/>
     </instance_attributes>
     <operations>
       <op id="primitive-DRBD-bilbo-root-start-0" interval="0" 
name="start" timeout="240s"/>
       <op id="primitive-DRBD-bilbo-root-stop-0" interval="0" 
name="stop" timeout="100s"/>
       <op id="primitive-DRBD-bilbo-root-monitor-20" interval="20" 
name="monitor" role="Master" timeout="20s"/>
       <op id="primitive-DRBD-bilbo-root-monitor-30" interval="30" 
name="monitor" role="Slave" timeout="20s">
	<instance_attributes 
id="primitive-DRBD-bilbo-root-monitor-30-instance_attributes">
	  <nvpair 
id="primitive-DRBD-bilbo-root-monitor-30-instance_attributes-target-role" name="target-role" 
value="Master"/>
	</instance_attributes>
       </op>
     </operations>
     <meta_attributes id="primitive-DRBD-bilbo-root-meta_attributes">
       <nvpair 
id="primitive-DRBD-bilbo-root-meta_attributes-target-role" 
name="target-role" value="Started"/>
       <nvpair id="primitive-DRBD-bilbo-root-meta_attributes-is-managed" 
name="is-managed" value="true"/>
     </meta_attributes>
   </primitive>
   <primitive class="ocf" id="primitive-DRBD-bilbo-squid" 
provider="linbit" type="drbd">
     <instance_attributes 
id="primitive-DRBD-bilbo-squid-instance_attributes">
       <nvpair 
id="primitive-DRBD-bilbo-squid-instance_attributes-drbd_resource" 
name="drbd_resource" value="drbd20-bilbo-squid"/>
     </instance_attributes>
     <operations>
       <op id="primitive-DRBD-bilbo-squid-start-0" interval="0" 
name="start" timeout="240s"/>
       <op id="primitive-DRBD-bilbo-squid-stop-0" interval="0" 
name="stop" timeout="100s"/>
       <op id="primitive-DRBD-bilbo-squid-monitor-20" interval="20" 
name="monitor" role="Master" timeout="20s"/>
       <op id="primitive-DRBD-bilbo-squid-monitor-30" interval="30" 
name="monitor" role="Slave" timeout="20s">
	<instance_attributes 
id="primitive-DRBD-bilbo-squid-monitor-30-instance_attributes">
	  <nvpair 
id="primitive-DRBD-bilbo-squid-monitor-30-instance_attributes-target-role" 
name="target-role" value="Master"/>
	</instance_attributes>
       </op>
     </operations>
   </primitive>
</group>
</master>
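
To make the two definitions above easier to compare, here is a minimal 
sketch that lists the meta attributes set directly on each <master> 
element. It only reads the XML and assumes nothing about how Pacemaker 
interprets the values. The function name master_meta and the trimmed 
SAMPLE snippet are made up for illustration; only the element names, 
ids and values are taken from the configuration above.

```python
import xml.etree.ElementTree as ET


def master_meta(xml_text):
    """Map each <master> id to the name/value pairs of its own meta_attributes.

    Hypothetical helper for illustration; it is not part of Pacemaker.
    """
    root = ET.fromstring(xml_text)
    result = {}
    # iter() also yields the root element itself when it is a <master>.
    for ms in root.iter("master"):
        attrs = {}
        # findall() without a path prefix matches direct children only, so
        # meta_attributes nested inside primitives are ignored.
        for meta in ms.findall("meta_attributes"):
            for nv in meta.findall("nvpair"):
                attrs[nv.get("name")] = nv.get("value")
        result[ms.get("id")] = attrs
    return result


# Trimmed-down sample built from the definitions above (nvpair ids shortened).
SAMPLE = """<resources>
  <master id="ms-DRBD-ingold-root">
    <meta_attributes id="ms-DRBD-ingold-root-meta_attributes">
      <nvpair id="m1" name="master-max" value="2"/>
      <nvpair id="m2" name="target-role" value="Stopped"/>
    </meta_attributes>
  </master>
  <master id="ms-DRBD-bilbo">
    <meta_attributes id="ms-DRBD-bilbo-meta_attributes">
      <nvpair id="m3" name="master-max" value="2"/>
      <nvpair id="m4" name="target-role" value="Master"/>
    </meta_attributes>
  </master>
</resources>"""

for ms_id, attrs in master_meta(SAMPLE).items():
    print(ms_id, attrs)
```

The same function could presumably be fed the resources section dumped 
from the live CIB (for example the output of cibadmin -Q -o resources) 
instead of the sample string, but I have not tried that here.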

So what am I doing wrong? Our cluster is now offline (corosync is 
stopped) because it kept promoting, demoting, starting and stopping the 
DRBD services, until finally the node was declared Unclean and the 
shooting started.

The funny thing is that if I start DRBD manually (with 
/etc/init.d/drbd start), it starts almost instantly and all resources 
come up Primary/Primary. Of course I don't use that init script while 
running corosync.

Any help will be greatly appreciated.

Thank you!


-- 
Michał Margula, alchemyx at uznam.net.pl, http://alchemyx.uznam.net.pl/
"W życiu piękne są tylko chwile" [Ryszard Riedel]



