[Pacemaker] Help with DRBD resources on Pacemaker

Thu Aug 23 23:17:03 UTC 2012

Please forgive me for the confusing reply I sent before. My crappy email 
client isn't adding the > character to the original message, so I'll do it 
by hand :-(

> Hi,
>
> I would just try removing the last 2 lines in the "net" section of the DRBD 
> config. I had the same behavior with the option "ping-timeout" in the 
> "net" section.
> Starting DRBD via the init script worked, but managing it via Pacemaker did not.
> Just try it!
> I am still investigating the reason for that.
>

I tried removing the lines you suggested, but the results are the same :-(

> Additionally, I would remove the line with "become-primary-on both" from the 
> DRBD config, since Pacemaker will promote the DRBD resources to 
> Master/Primary itself.
>

I need this statement because the cluster will be running an OCFS2 filesystem,
and both hosts will be using those directories for HTTP/HTTPS services.

> Regards
>
> Matthias
>

Regards,
Carlos.

> Carlos Xavier <cbastos at connection.com.br> wrote:
>
>
> Hi.
>
> I configured one cluster using one DRBD resource and it works fine,
> although I had to downgrade the kernel to version 2.6.37.6 to get it
> stable.
> Now I'm configuring another cluster, but two DRBD resources are required.
> They were created, and if I use the system init script to start them, both
> come up:
>
> apolo:~ # cat /proc/drbd
> version: 8.3.9 (api:88/proto:86-95)
> srcversion: A67EB2D25C5AFBFF3D8B788
>  0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:0 dw:0 dr:672 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>  1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:0 dw:0 dr:680 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>
> When I try to start them using the cluster stack, I get some random results
> like this one:
>
> apolo:~ # cat /proc/drbd
> version: 8.3.9 (api:88/proto:86-95)
> srcversion: A67EB2D25C5AFBFF3D8B788
>  0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:0 dw:0 dr:672 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>  1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
>     ns:0 nr:0 dw:0 dr:680 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>
> crm(live)resource# status
>  Master/Slave Set: msDRBD_0 [resDRBD_0]
>      Masters: [ apolo diana ]
>  Master/Slave Set: msDRBD_1 [resDRBD_1]
>      Masters: [ apolo ]
>      Stopped: [ resDRBD_1:1 ]
>
> This was solved with a cleanup on the resource resDRBD_1:1
>
> crm(live)resource# cleanup resDRBD_1:1
> Cleaning up resDRBD_1:1 on apolo
> Cleaning up resDRBD_1:1 on diana
> Waiting for 3 replies from the CRMd... OK
> crm(live)resource# status
>  Master/Slave Set: msDRBD_0 [resDRBD_0]
>      Masters: [ apolo diana ]
>  Master/Slave Set: msDRBD_1 [resDRBD_1]
>      Masters: [ apolo diana ]
>
>
> apolo:~ # cat /proc/drbd
> version: 8.3.9 (api:88/proto:86-95)
> srcversion: A67EB2D25C5AFBFF3D8B788
>  0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:0 dw:0 dr:672 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>  1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
>     ns:0 nr:0 dw:0 dr:680 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>
> To test the configuration a little more, I took down the resources
> and tried to bring them up again:
>
> crm(live)resource# stop resDRBD_1
> crm(live)resource# stop resDRBD_0
> crm(live)resource# status
>  Master/Slave Set: msDRBD_0 [resDRBD_0]
>      Stopped: [ resDRBD_0:0 resDRBD_0:1 ]
>  Master/Slave Set: msDRBD_1 [resDRBD_1]
>      Stopped: [ resDRBD_1:0 resDRBD_1:1 ]
> crm(live)resource# start resDRBD_1
> crm(live)resource# start resDRBD_0
> crm(live)resource# status
>  Master/Slave Set: msDRBD_0 [resDRBD_0]
>      Masters: [ apolo diana ]
>  Master/Slave Set: msDRBD_1 [resDRBD_1]
>      Masters: [ apolo diana ]
>
> The crm says it's all OK, but if you check on the command line, what you
> see is a split brain:
>
> apolo:~ # cat /proc/drbd
> version: 8.3.9 (api:88/proto:86-95)
> srcversion: A67EB2D25C5AFBFF3D8B788
>  0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>     ns:0 nr:0 dw:0 dr:672 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
>  1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
>     ns:0 nr:0 dw:0 dr:680 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
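
Since crm reports both masters as healthy while /proc/drbd shows StandAlone, 
it can help to check the DRBD connection state directly. The following is 
only a minimal sketch, assuming the DRBD 8.3 /proc/drbd layout shown above; 
the function names `parse_proc_drbd` and `not_connected` are made up for 
this illustration and are not part of DRBD or Pacemaker.

```python
import re

# Matches the per-device status line of /proc/drbd (DRBD 8.3 layout), e.g.
#  0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
DEVICE_LINE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+ro:(?P<ro>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "ro": ..., "ds": ...}} for each device line."""
    devices = {}
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if match:
            devices[int(match.group("minor"))] = {
                "cs": match.group("cs"),
                "ro": match.group("ro"),
                "ds": match.group("ds"),
            }
    return devices


def not_connected(devices):
    """Return the sorted minor numbers whose connection state is not Connected.

    Note: this also flags transient states such as SyncSource or WFConnection,
    not only StandAlone.
    """
    return sorted(m for m, d in devices.items() if d["cs"] != "Connected")


# Sample taken from the split-brain output above.
SAMPLE = """\
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:0 dr:672 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:0 dw:0 dr:680 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
"""

print(not_connected(parse_proc_drbd(SAMPLE)))
```

On a cluster node, the same functions could be fed the contents of 
/proc/drbd instead of the sample text.
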
>
> Those are my DRBD resource configurations and the cluster configuration:
> https://dl.dropbox.com/u/96446079/backup.res
> https://dl.dropbox.com/u/96446079/export.res
> https://dl.dropbox.com/u/96446079/crm_resources.txt
>
> Can you help me to fix this issue?
>
> Best regards,
> Carlos Xavier.
> _______________________________________________
Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

________________________________

Disclaimer:
For legal and security reasons, the information given in this e-mail is not 
legally binding.
On request, we will gladly provide a legally binding confirmation in 
writing.
Please note that any form of unauthorized use, publication, duplication, or 
distribution of the contents of this e-mail is not permitted.
This message is intended exclusively for the designated addressee or their 
representative.
If you are not the intended addressee of this e-mail or their 
representative, please contact the sender of the e-mail and/or delete this 
message with all attachments.
