[Pacemaker] is ccs as racy as it feels?
Christine Caulfield
ccaulfie at redhat.com
Tue Dec 10 10:27:15 UTC 2013
On 09/12/13 23:01, Brian J. Murrell wrote:
> So, I'm trying to wrap my head around this need to migrate to pacemaker
> +CMAN. I've been looking at
> http://clusterlabs.org/quickstart-redhat.html and
> https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/
>
> It seems "ccs" is the tool to configure the CMAN part of things.
>
> The first URL talks about using ccs to create a local configuration and
> then "copy" that around to the rest of the cluster. Yuck.
>
> The first URL doesn't really cover how one builds up clusters (i.e. over
> time) but assumes that you know what your cluster is going to look like
> before you build that configuration and says nothing about what to do
> when you decide to add new nodes at some later point. I would guess
> more "ccm -f /etc/cluster/cluster.conf" and some more copying around
> again. Does anything need to be prodded to get this new configuration
> that was just copied? I do hope it's just "prodding" and not a restart
> of all services, including pacemaker-managed resources.
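> (If there is a "prod", I'd guess it's something along the lines of
> bumping config_version in cluster.conf and then running
>
>   cman_tool version -r
>
> to have CMAN re-read the new config, rather than restarting anything --
> but neither URL spells that out.)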
>
> The second URL talks about ricci for propagating the configuration
> around. But it seems to assume that all configuration is done from a
> single node and then "sync'd" to the rest of the cluster with ricci in a
> "last write wins" sort of work-flow.
>
> So unlike pacemaker itself, where any node can modify the CIB (raciness
> in tools like crm aside), having multiple nodes using ccs feels quite
> dangerous in a "last-write-wins" kind of way. Am I correct?
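> (By "any node" I mean that, for example,
>
>   cibadmin --create -o resources -x my-resource.xml
>
> can be run on whichever node happens to be convenient and pacemaker
> merges and replicates the change itself, with no manual copying; the
> file name above is just an illustration.)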
>
> This makes it quite difficult to dispatch the task of configuring the
> cluster out to the nodes that will be participating in the cluster --
> having them configure their own participation. This distribution of
> configuration tasks all works fine for pacemaker-proper (if you avoid
> tools like crm) but feels like it's going to blow up when multiple
> nodes try to add themselves and their own configuration to the CMAN
> configuration -- all in parallel.
>
> Am I correct about all of this? I hope not, because if I am, this all
> feels like a very big step backward from the days when corosync
> +pacemaker configuration could be carried out in parallel on multiple
> nodes, without having to designate one node per cluster as the single
> configuration point and feed that node configuration items through a
> single-threaded work queue just to avoid races that didn't exist with
> corosync+pacemaker alone.
>
>
Sadly, you're not wrong. But it's actually no worse than updating
corosync.conf manually; in fact it's pretty much the same thing, so
nothing is actually getting worse. All of the CIB information is still
properly replicated.
The main difficulty is in safely replicating the information that's
needed to boot the system. If you're not in a cluster you can't contact
the cluster to find out how to boot; it's a catch-22. So the idea is to
have a small amount of static information that will get the node up and
running, at which point it can get the majority of its information
(mainly the services to be managed) from replicated storage.
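In practice the static part is more or less just what the quickstart has
you put into cluster.conf: the cluster name, the node list and a fence
device that redirects fencing to pacemaker, roughly (the node name here
is only an illustration):

  ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
  ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node1
  ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node1 pcmk-redirect port=node1

Everything else - the resources, constraints and so on - lives in the CIB
and is replicated for you.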
In general use we've not found it to be a huge problem (though I'm still
not keen on it either, TBH), because most management is done by one
person from one node. There isn't really any concept of nodes trying to
"add themselves" to a cluster; it needs to be done by a person - which
may be what you're unhappy with.
Chrissie