[Pacemaker] CMAN integration questions

Vladislav Bogdanov bubble at hoster-ok.com
Fri Dec 24 12:34:59 UTC 2010


23.12.2010 14:14, Andrew Beekhof wrote:

...

>> Otherwise I probably need to
>> configure huge timeouts for operations, and then the cluster stops being
>> smart. By 'specific resources' I mean LVM VGs and LVs together with
>> gfs2 filesystems. I currently have problems with fence domain stability
>> (https://bugzilla.redhat.com/show_bug.cgi?id=664958), which is why I
>> noticed that the LVM and Filesystem RAs go mad while DLM is frozen.
>>
>> From what I see, relying on cman's quorum is not sufficient when DLM
>> comes into play; pacemaker needs to know the state of the fencing system
>> as well. I started to think about monitoring that subsystem from within
>> an RA which then sets some cluster/node attributes, but realized that
>> this won't help: tighter integration is needed.

This drives me crazy...
Every fencing operation leads to loss of the fence domain on all nodes,
followed by a freeze of DLM.

>>
>> And one more question/proposal about CMAN/DLM/GFS2:
>> it is currently possible to use DLM/GFS2 on nodes without pacemaker
>> installed. I mean, if I configure an additional node in cman but do not
>> start pacemaker on that node, I'm still able to mount GFS2 there.
>> One minor problem is that the rest of the pacemaker cluster waits for
>> that node to start pacemaker too.
> 
> Did you try this?

OK, I had success: the VG is activated and GFS is mounted,
without pacemaker even running on that node, only cman.
The rest of the cluster feels fine (as long as no fencing is required),
but extra clone instances are allocated.
I understand that I can limit the number of clones, but I'd rather not,
because the cluster will grow from 4 to 16 nodes step by step, and every
node addition would require reconfiguring not only cman but pacemaker
too.
OK, I'll set this limit on all clone resources if there is no other way.
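
To illustrate the bookkeeping, here is a rough sketch in Python. It is not
pacemaker code: the helper and the node names are hypothetical. It assumes
what I observe above, namely that an unset clone-max defaults to the number
of nodes the cluster knows about, cman-only nodes included.

```python
# Hypothetical illustration only: clone_instances() and the node names
# are made up for this sketch and are not part of pacemaker.

def clone_instances(cman_nodes, pacemaker_nodes, clone_max=None):
    """Return (allocated, never_started) clone instance counts.

    cman_nodes      -- all nodes configured in cman
    pacemaker_nodes -- the subset of nodes that actually run pacemaker
    clone_max       -- explicit clone-max, or None to use the default
                       (assumed here: one instance per known node)
    """
    allocated = len(cman_nodes) if clone_max is None else clone_max
    # Only nodes that are in cman AND run pacemaker can host an instance.
    runnable = len(set(cman_nodes) & set(pacemaker_nodes))
    never_started = max(0, allocated - runnable)
    return allocated, never_started


full_nodes = ["node1", "node2", "node3", "node4"]   # cman + pacemaker
all_nodes = full_nodes + ["arbiter1"]               # arbiter1 runs cman only

default_counts = clone_instances(all_nodes, full_nodes)
limited_counts = clone_instances(all_nodes, full_nodes, clone_max=4)
```

With the default, one instance per clone is allocated that can never start;
with clone-max=4 there is none, but that value has to be changed on every
clone each time a node is added.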

> It's no different to running just corosync on a node.
> 
>> So all clone resources are extended with one
>> more instance which "will never be started".  On the other hand, I see in
>> the pacemaker sources that there are two types of nodes, member and ping,
>> and all resource processing is done only for nodes which are members.
>> Would it be too hard to add one more node type, e.g. "arbiter" (it
>> participates in the cman cluster, so it influences quorum/fencing), which
>> is only valid for CMAN clusters and is not supposed to run any resources?
>> Then clones would not try to extend onto those arbiter nodes: fewer
>> resources, fewer computations, cleaner 'crm status' output.
>>
>> Could you please comment on this?
> 
> Before trying thought experiments, it's best to get the starting point
> correct :-)

So, my assumption is correct. Maybe it is time to discuss this concept
further? ;)


Best,
Vladislav



