[Pacemaker] CMAN integration questions
Andrew Beekhof
andrew at beekhof.net
Thu Dec 23 12:14:21 UTC 2010
On Thu, Dec 23, 2010 at 10:41 AM, Vladislav Bogdanov
<bubble at hoster-ok.com> wrote:
> Hi Andrew,
>
> It was a big surprise for me to see all pacemaker-specific bits removed
> from dlm and gfs2 in cluster-3.1.0, so there is currently no way to use
> pacemaker on f13 with dlm/gfs2/clvm but without cman.
>
> So, would you please bring some light on details of integration with cman?
>
> Especially I need to understand how pacemaker integrates with cman's
> fencing/dlm subsystem:
> *) Do I need to configure fencing in both cman and pacemaker?
No. Just in Pacemaker.
fenced spins waiting for Pacemaker to make an API call that tells it
that fencing completed, at which point the dlm can continue.
David (author of fenced and the dlm) and I discussed this at length
and we are in agreement that this is the right (and safe) intermediate
step.
Otherwise, Pacemaker has no need to interact with the dlm - the only
requirement is that they share the same view of membership, which is
achieved by them both talking to cman.
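
To make the handshake above concrete, here is a toy model in Python. It is
only a sketch of the ordering (node fails, dlm is held, Pacemaker confirms
fencing, dlm continues) and not the real fenced or Pacemaker API: the class
FencedModel and the names node_failed, confirm_fenced and dlm_can_continue
are all made up for illustration.

```python
import threading


class FencedModel:
    """Toy stand-in for fenced: holds the dlm until fencing is confirmed.

    Hypothetical model only; the real daemon and its API look nothing like this.
    """

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def _event_for(self, node):
        # One Event per failed node, created on first use.
        with self._lock:
            return self._events.setdefault(node, threading.Event())

    def node_failed(self, node):
        # Membership change seen via cman: start waiting on this node.
        self._event_for(node)

    def confirm_fenced(self, node):
        # Stand-in for the call Pacemaker makes once fencing has completed.
        self._event_for(node).set()

    def wait_until_fenced(self, node, timeout=None):
        # Returns True once fencing was confirmed, False on timeout.
        return self._event_for(node).wait(timeout)


def dlm_can_continue(fenced, node, timeout):
    # The dlm stays frozen for as long as this returns False.
    return fenced.wait_until_fenced(node, timeout)
```

In this model fencing is configured and carried out in one place only (the
caller of confirm_fenced, i.e. Pacemaker); the fenced side merely waits.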
> Or
> pacemaker should be (is) able to fence nodes via cman's fenced interface?
> *) Is there a way to postpone any (monitor too) operations on specific
> resources until fencing domain stabilizes?
You shouldn't need to. Pacemaker still controls fencing; it just gets
the membership list from corosync's cman plugin.
> Otherwise I probably need to
> configure huge timeouts for operations and then cluster becomes not
> smart. Under 'specific resources' I mean LVM VGs and LVs together with
> gfs2 filesystems. I currently have problems with fence domain stability
> (https://bugzilla.redhat.com/show_bug.cgi?id=664958) that's why I
> noticed that LVM and Filesystem RAs go mad while DLM is being frozen.
>
> From what I see, relying on cman's quorum is not sufficient when DLM
> comes to play, pacemaker needs to know the state of fencing system as
> well. I started to think about monitoring of that subsystem from within
> RA which then sets some cluster/node attributes, but realized that this
> won't help - more tight integration is needed.
>
> And one more question/proposal about CMAN/DLM/GFS2:
> now it is possible to use DLM/GFS2 on nodes without pacemaker installed.
> I mean, if I configure additional node in cman but have no pacemaker
> started on that node, then I'm still able to mount GFS2 on that node.
> One minor problem is that rest of pacemaker cluster waits for that node
> to start pacemaker too.
Did you try this?
It's no different to running just corosync on a node.
> So all clone resources are extended with one
> more instance which "will never be started". On the other hand I see in
> pacemaker sources, that there are two types of nodes: member and ping,
> and all resource processing is done only for nodes which are members.
> Would it be too hard to add one more node type, f.e. "arbiter" (it
> participates in cman cluster so it influences quorum/fencing), which is
> only valid for CMAN clusters and is not supposed to run any resources?
> Then clones will not try to extend on that arbiter nodes, fewer
> resources, less computations, cleaner 'crm status' output.
>
> Could you please comment on this?
Before trying thought experiments, it's best to get the starting point
correct :-)
>
> Thanks,
> Vladislav