[Pacemaker] weird drbd/cluster behaviour

Digimer lists at alteeve.ca
Wed Jun 26 11:35:19 EDT 2013


I don't see fencing/stonith configured. Without it, your cluster will
not be stable. You will get DRBD split-brains easily and, depending on
what you use DRBD for, you could corrupt your data.
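
As a rough sketch of what I mean, the CIB needs stonith enabled plus a
fence device for each node. The fragment below assumes IPMI-based
fencing via fence_ipmilan; the resource ID, address and credentials are
hypothetical placeholders, not values from your setup, so substitute
whichever fence agent and parameters match your hardware:

```xml
<!-- Sketch only: IDs, address and credentials below are placeholders. -->
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <!-- Require successful fencing before Pacemaker recovers resources. -->
    <nvpair id="cib-bootstrap-options-stonith-enabled"
            name="stonith-enabled" value="true"/>
  </cluster_property_set>
</crm_config>
<resources>
  <!-- Device able to fence flashfon2; a matching one is needed for flashfon1. -->
  <primitive class="stonith" id="fence_flashfon2" type="fence_ipmilan">
    <instance_attributes id="fence_flashfon2-instance_attributes">
      <nvpair id="fence_flashfon2-instance_attributes-pcmk_host_list"
              name="pcmk_host_list" value="flashfon2"/>
      <nvpair id="fence_flashfon2-instance_attributes-ipaddr"
              name="ipaddr" value="192.0.2.12"/>
      <nvpair id="fence_flashfon2-instance_attributes-login"
              name="login" value="admin"/>
      <nvpair id="fence_flashfon2-instance_attributes-passwd"
              name="passwd" value="secret"/>
    </instance_attributes>
  </primitive>
</resources>
```

On the DRBD side, 8.4 can hook into this with "fencing
resource-and-stonith;" in the disk section and the crm-fence-peer.sh
handler that ships with DRBD. Please check the DRBD user's guide for
the exact handler paths on your install.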

On 06/25/2013 09:25 AM, Саша Александров wrote:
> Hi all!
> 
> I am setting up a new cluster on OracleLinux 6.4 (well, it is CentOS 6.4).
> I went through http://clusterlabs.org/quickstart-redhat.html
> Then I installed DRBD 8.4.2 from elrepo.
> This setup is unusable :-( with DRBD 8.4.2.
> I created three DRBD resources:
> 
> cat /proc/drbd
> version: 8.4.2 (api:1/proto:86-101)
> GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by
> root at flashfon1, 2013-06-24 22:08:41
>  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
>     ns:97659171 nr:0 dw:36 dr:97660193 al:1 bm:5961 lo:0 pe:0 ua:0 ap:0
> ep:1 wo:f oos:0
>  1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
>     ns:292421653 nr:16 dw:16 dr:292422318 al:0 bm:17848 lo:0 pe:0 ua:0
> ap:0 ep:1 wo:f oos:0
>  2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
>     ns:292421600 nr:8 dw:8 dr:292422265 al:0 bm:17848 lo:0 pe:0 ua:0
> ap:0 ep:1 wo:f oos:0
> 
> It appeared that drbd resource-agent script did not work. Debugging
> showed that check_crm_feature_set() function always returned zeroes. Ok,
> just added 'exit' as its first line for now.
> 
> Next, I created three drbd resources in pacemaker, three master-slave
> sets, three filesystem resources (and ip resources, but they are no
> problem):
> 
>  pcs status
> Last updated: Tue Jun 25 21:20:17 2013
> Last change: Tue Jun 25 02:46:25 2013 via crm_resource on flashfon1
> Stack: cman
> Current DC: flashfon1 - partition with quorum
> Version: 1.1.8-7.el6-394e906
> 2 Nodes configured, unknown expected votes
> 11 Resources configured.
> 
> 
> Online: [ flashfon1 flashfon2 ]
> 
> Full list of resources:
> 
>  Master/Slave Set: ms_wsoft [drbd_wsoft]
>      Masters: [ flashfon1 ]
>      Slaves: [ flashfon2 ]
>  Master/Slave Set: ms_oradata [drbd_oradata]
>      Slaves: [ flashfon1 flashfon2 ]
>  Master/Slave Set: ms_flash [drbd_flash]
>      Slaves: [ flashfon1 flashfon2 ]
>  Resource Group: WcsGroup
>      wcs_vip_local      (ocf::heartbeat:IPaddr2):       Started flashfon1
>      wcs_fs     (ocf::heartbeat:Filesystem):    Started flashfon1
>  Resource Group: OraGroup
>      ora_vip_local      (ocf::heartbeat:IPaddr2):       Started flashfon1
>      oradata_fs (ocf::heartbeat:Filesystem):    Stopped
>      oraflash_fs        (ocf::heartbeat:Filesystem):    Stopped
> 
> See, only one master-slave set is recognizing DRBD state!
> 
> Resources are configured identically in CIB (except for drbd resource
> name parameter):
> 
>       <master id="ms_wsoft">
>         <primitive class="ocf" id="drbd_wsoft" provider="linbit"
> type="drbd">
>           <instance_attributes id="drbd_wsoft-instance_attributes">
>             <nvpair id="drbd_wsoft-instance_attributes-drbd_resource"
> name="drbd_resource" value="wsoft"/>
>           </instance_attributes>
>           <operations>
>             <op id="drbd_wsoft-interval-60s" interval="60s" name="monitor"/>
>           </operations>
>         </primitive>
>         <meta_attributes id="ms_wsoft-meta_attributes">
>           <nvpair id="ms_wsoft-meta_attributes-master-max"
> name="master-max" value="1"/>
>           <nvpair id="ms_wsoft-meta_attributes-master-node-max"
> name="master-node-max" value="1"/>
>           <nvpair id="ms_wsoft-meta_attributes-clone-max"
> name="clone-max" value="2"/>
>           <nvpair id="ms_wsoft-meta_attributes-clone-node-max"
> name="clone-node-max" value="1"/>
>           <nvpair id="ms_wsoft-meta_attributes-notify" name="notify"
> value="true"/>
>         </meta_attributes>
>       </master>
>       <master id="ms_oradata">
>         <primitive class="ocf" id="drbd_oradata" provider="linbit"
> type="drbd">
>           <instance_attributes id="drbd_oradata-instance_attributes">
>             <nvpair id="drbd_oradata-instance_attributes-drbd_resource"
> name="drbd_resource" value="oradata"/>
>           </instance_attributes>
>           <operations>
>             <op id="drbd_oradata-interval-60s" interval="60s"
> name="monitor"/>
>           </operations>
>         </primitive>
>         <meta_attributes id="ms_oradata-meta_attributes">
>           <nvpair id="ms_oradata-meta_attributes-master-max"
> name="master-max" value="1"/>
>           <nvpair id="ms_oradata-meta_attributes-master-node-max"
> name="master-node-max" value="1"/>
>           <nvpair id="ms_oradata-meta_attributes-clone-max"
> name="clone-max" value="2"/>
>           <nvpair id="ms_oradata-meta_attributes-clone-node-max"
> name="clone-node-max" value="1"/>
>           <nvpair id="ms_oradata-meta_attributes-notify" name="notify"
> value="true"/>
>         </meta_attributes>
>       </master>
> 
> I am stuck. :-((((
> 
> Best regards,
> Alexandr A. Alexandrov
> 
> 
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?



