[Pacemaker] Build dlm_controld for pacemaker stack (dlm_controld.pcmk)
Vladislav Bogdanov
bubble at hoster-ok.com
Thu Nov 1 06:13:09 UTC 2012
01.11.2012 02:53, Andrew Beekhof wrote:
> On Tue, Oct 30, 2012 at 6:15 PM, Vladislav Bogdanov
> <bubble at hoster-ok.com> wrote:
>> 29.10.2012 19:51, Bernardo Cabezas Serra wrote:
>>> Hello,
>>>
>>> Disclaimer: I also posted this issue to the linux-ha list a couple of
>>> days ago. I'm sorry if this is not the correct list, and thanks if you
>>> can give me a hint about which cluster stack I should use for ocfs2 now.
>>>
>>> I'm trying to compile the whole stack for corosync + pacemaker + dlm + ocfs2
>>> (with dlm_controld.pcmk), without the cman stack. I'm following the "From
>>> source" Pacemaker guide.
>>>
>>> After some days trying to compile the correct combination of
>>> sources/versions, I have had no success, and I'm not sure whether this
>>> is possible at the moment.
>>>
>>> The first problem was that cluster removed support for dlm_controld with
>>> the pacemaker stack. The last version with support was 3.0.17.
>>> But this was done some years ago, and as far as I have been able to
>>> understand, things are still broken.
>>>
>>>
>>> The most relevant info I found about this issue is in these threads from
>>> Andrew Beekhof and Vladislav Bogdanov, which suggest compiling
>>> dlm_controld from Cluster after applying some patches. They report it worked
>>> (with some remaining issues):
>>>
>>> http://oss.clusterlabs.org/pipermail/pacemaker/2009-October/003064.html
>>> http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg09959.html
>>>
>>> But the most recent post about this is a year old, and it seems that things
>>> are still broken.
>>> I haven't been able to compile it (I get lots of errors), so I'm
>>> asking whether this is the right way, because it seems that nobody else is
>>> willing to use this...
>>
>> I still run that on two clusters with post-1.1.7 pacemaker, but not with
>> 1.1.8. Just looked, it is fe859a7 (Apr 11, two weeks after 1.1.7) with
>> two dozen cherry-picked patches. I should migrate them to
>> corosync2/pacemaker-1.1.8 shortly.
>>
>> What errors exactly do you see? The Pacemaker APIs used there received some
>> changes between 1.1.7 and 1.1.8. I have one more patch which I tried
>> with pacemaker master as of Aug 22 (close enough to 1.1.8, but some APIs
>> changed again after that point). That version did not work for me with
>> corosync 1.4 because of a bug that was fixed later, and I decided to move to
>> corosync2 right after that failure to be more upstream-compatible. I
>> can't say whether it will help you, but you may want to try. Should I post it?
>>
>> The main issue with the "pcmk" versions of all those daemons is that the fs
>> control daemons and dlm_controld require the ability to request fencing.
>> Originally this was done by calling a high-level pacemaker API
>> (crm_terminate_member_no_mainloop()), and that did not work in some
>> circumstances.
>> Andrew developed the brand-new stonith-ng subsystem, which is used for
>> fencing in the version of dlm_controld you are talking about (with my patches
>> on top of Andrew's patches).
>
> crm_terminate_member_no_mainloop() should still work though
> (specifically because I knew the old controlds used it).
> You just need the (new) compatibility header and the result will be
> /very/ reliable - there is no crmd/pengine involvement anymore, you go
> straight to the fencing daemon.
>
Good to know, thanks.
Then that part may be left as is. But, anyway, membership/quorum
information should be obtained from corosync to be consistent.
>> I suspect that ocfs2_controld.pcmk (like gfs_controld.pcmk in 3.0.17)
>> still uses the old way. If that is true, then it can't work reliably. I
>> tried to port gfs_controld to use stonith-ng with corosync1/openais
>> (included in the patch I mentioned above), but I did not test it at all
>> (although I have just ported it to corosync2/dlm4 and it works in a test
>> setup; see my answer to David).
>>
>> Vladislav
>>
>>>
>>>
>>> On the cluster page, they state that the DLM code has now been separated
>>> from cluster:
>>> https://fedorahosted.org/cluster/wiki/HomePage
>>>
>>> But this dlm project (which seems to have pcmk support) depends on
>>> corosync 2.0, so it can't run with the latest pacemaker (1.1.8). (Can it?)
>>> http://git.fedorahosted.org/git/dlm.git
>>>
>>> Before spending more time on this, I wanted to ask for the right way
>>> to do things.
>>> So the questions are:
>>>
>>> (1) Is it currently possible to get an ocfs2 corosync + pacemaker cluster,
>>> without cman, and with dlm_controld using the pcmk stack? (If yes, which
>>> repos/versions?)
>>> (2) What is the future roadmap for this? Will future corosync 2.0
>>> clusters have the dlm issues addressed?
>>>
>>> Also, I have read (also in a post by Andrew) that an OCFS2 cluster could have
>>> problems on top of corosync 2.0, as OCFS2 hasn't been ported (GFS2 was
>>> ported).
>>> http://www.gossamer-threads.com/lists/linuxha/pacemaker/78538
>>> So:
>>> (3) Is GFS2 a better future option in terms of support for linux-ha
>>> clusters?
>>>
>>>
>>> More details about pcmk dlm_controld:
>>> I found that SUSE has always maintained a cman-free cluster stack,
>>> so I tried to find dlm in its packages.
>>> Found:
>>> http://rpmfind.net//linux/RPM/opensuse/factory/x86_64/libdlm-3.00.01-24.5.x86_64.html
>>>
>>>
>>> But I have also had lots of compilation problems, trying several
>>> pacemaker versions, including the SUSE-patched ones. I haven't been able to
>>> successfully compile a dlm_controld.
>>>
>>>
>>> Thanks and Regards,
>>> Bernardo
>>> --
>>> APSL
>>> *Bernardo Cabezas Serra*
>>> *Responsable Sistemas*
>>> Ada Byron, edificio NTIC 2ºA
>>> 07121 ParcBit
>>> Mail: bcabezas at apsl.net
>>> Skype: bernat.cabezas
>>> Tel: 971439771
>>>
>>>
>>>
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>>