[Pacemaker] Build dlm_controld for pacemaker stack (dlm_controld.pcmk)

Vladislav Bogdanov bubble at hoster-ok.com
Tue Oct 30 03:15:48 EDT 2012


29.10.2012 19:51, Bernardo Cabezas Serra wrote:
> Hello,
> 
> disclaimer: I also posted this issue to the linux-ha list a couple of
> days ago. I'm sorry if this is not the correct list, and thanks if you
> can give me a hint about which cluster stack I should use for ocfs2 now.
> 
> I'm trying to compile the whole stack for corosync + pacemaker + dlm +
> ocfs2 (with dlm_controld.pcmk), without the cman stack. I'm following
> the "From source" Pacemaker guide.
> 
> After some days of trying to compile the correct combination of
> sources/versions, I have had no success, and I'm not sure whether this
> is possible at the moment.
> 
> The first problem was that cluster removed support for dlm_controld with
> the pacemaker stack. The last version with support was 3.0.17.
> But this was done some years ago, and as far as I have been able to
> understand, things are still broken.
> 
> 
> The most relevant info I found about this issue is in these threads from
> Andrew Beekhof and Vladislav Bogdanov, which suggest compiling
> dlm_controld from Cluster after applying some patches. They report it
> worked (with some remaining issues):
> 
> http://oss.clusterlabs.org/pipermail/pacemaker/2009-October/003064.html
> http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg09959.html
> 
> But the most recent post about this is a year old, and it seems that
> things are still broken.
> I haven't been able to compile it (I get lots of errors), so I'm asking
> whether this is the right way, because it seems that nobody else is
> willing to use this...

I still run that on two clusters with post-1.1.7 pacemaker, but not with
1.1.8. I just looked: it is fe859a7 (Apr 11, two weeks after 1.1.7) with
two dozen cherry-picked patches. I should migrate them to
corosync2/pacemaker-1.1.8 shortly.

What errors exactly do you see? The pacemaker APIs used there received
some changes between 1.1.7 and 1.1.8. I have one more patch which I tried
with pacemaker master as of Aug 22 (close enough to 1.1.8, but some APIs
changed again after that point). That version did not work for me with
corosync14 because of a bug that was fixed later, and right after that
failure I decided to move to corosync2 to be more upstream-compatible. I
can't say whether it will help you, but you may want to try. Should I
post it?

The main issue with the "pcmk" version of all those daemons is that the
fs control daemons and dlm_controld require the ability to request
fencing. Originally this was done by calling some high-level pacemaker
APIs (crm_terminate_member_no_mainloop()), and that did not work in some
circumstances.
Andrew developed the brand-new stonith-ng subsystem, which is used for
fencing in the version of dlm_controld you are talking about (with my
patches on top of Andrew's patches).
I suspect that ocfs2_controld.pcmk (like gfs_controld.pcmk in 3.0.17)
still uses the old way. If that is true, then it can't work reliably. I
tried to port gfs_controld to use stonith-ng with corosync1/openais
(included in the patch I mentioned above), but I did not test it at all
(although I just ported it to corosync2/dlm4 and it works in a testing
setup, see my answer to David).

Vladislav

> 
> 
> On the cluster page, they state that the DLM code has now been
> separated from cluster:
> https://fedorahosted.org/cluster/wiki/HomePage
> 
> But this dlm project (which seems to have pcmk support) depends on
> corosync 2.0, so it can't run with the latest pacemaker (1.1.8). (Can it?)
> http://git.fedorahosted.org/git/dlm.git
> 
> Before spending more time on this, I wanted to ask for the right way
> to do things.
> So my questions are:
> 
> (1) Is it currently possible to get an ocfs2 corosync + pacemaker
> cluster, without cman, and with dlm_controld on the pcmk stack? (If
> yes, with which repos/versions?)
> (2) What is the future roadmap for this? Will a future corosync 2.0
> cluster have the dlm issues addressed?
> 
> Also, I have read (also in a post by Andrew) that an OCFS2 cluster could
> have problems on top of corosync 2.0, as OCFS2 hasn't been ported (GFS2
> was ported).
> http://www.gossamer-threads.com/lists/linuxha/pacemaker/78538
> so:
> (3) Is GFS2 a better future option in terms of support for linux-ha
> clusters?
> 
> 
> More details about pcmk dlm_controld:
> I found that SUSE has always maintained a cman-free cluster stack,
> so I tried to find dlm in its packages.
> Found:
> http://rpmfind.net//linux/RPM/opensuse/factory/x86_64/libdlm-3.00.01-24.5.x86_64.html
> 
> 
> But I have also had lots of compilation problems while trying several
> pacemaker versions, including the SUSE-patched ones. I haven't been
> able to successfully compile a dlm_controld.
> 
> 
> Thanks and Regards,
> Bernardo
> -- 
> APSL
> *Bernardo Cabezas Serra*
> *Systems Manager*
> Ada Byron, edificio NTIC 2ºA
> 07121 ParcBit
> Mail: bcabezas at apsl.net
> Skype: bernat.cabezas
> Tel: 971439771
> 
> 
> 




