[Pacemaker] Help with N+1 configuration
Phil Frost
phil at macprofessionals.com
Thu Jul 26 17:45:00 UTC 2012
On 07/26/2012 12:34 PM, Cal Heldenbrand wrote:
> Hi everybody,
>
> I've read through the Clusters from Scratch document, but it doesn't
> seem to help me very well with an N+1 (shared hot spare) style cluster
> setup.
>
> My test case is that I have 3 memcache servers. Two are in primary use
> (hashed 50/50 by the clients) and one is a hot failover.
It sounds like you want to do this:
1) run memcache on each node
I'd use a clone to run memcache, instead of having three memcache
primitives as you had done. Something like this:
primitive memcache ...
clone memcache_clone memcache meta ordered=false
There are many parameters a clone can take, but this is a good start,
assuming you just want to run memcache on each node, and they can be
started in any order. You don't need to specify any location constraints
to say where memcache can run, or to keep the memcache instances from
running multiple times on one node. The clone handles all of that.
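For concreteness, here's a minimal sketch of what the primitive might look
like. I'm assuming memcached is managed through its distro LSB init script;
the agent class and the monitor interval are guesses on my part, so
substitute whatever agent and timings you actually use:

# hypothetical example: LSB init script, adjust to your environment
primitive memcache lsb:memcached \
    op monitor interval=30s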
2) have ip1 on a node with a working memcache
primitive ip1 ...
colocation ip1_on_memcache inf: ip1 memcache_clone
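Again only a sketch -- the address, netmask, and monitor interval below are
placeholders, not values from your setup, and ip2 would look the same with a
different address:

# hypothetical example: use the real address and netmask for your network
primitive ip1 ocf:heartbeat:IPaddr2 \
    params ip=192.168.0.101 cidr_netmask=24 \
    op monitor interval=10s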
3) have ip2 active on a different node with a working memcache
primitive ip2 ...
colocation ip2_on_memcache inf: ip2 memcache_clone
colocation ip2_not_on_ip1 -10000: ip2 ip1
I've chosen a score of -10000 for ip2_not_on_ip1 because I assume you
could, if you had no other choice, run both IPs on one node. If you'd
rather run just one IP if there is only one working memcache, you can
make this -inf, and you can set the priority attribute on the ip
primitives to determine which one is sacrificed.
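If you go the -inf route, priority is a resource meta attribute: when not
everything can be placed, the cluster keeps the higher-priority resources
running and stops the lower-priority ones. For example (the values here are
purely illustrative):

# hypothetical: if only one node with memcache survives, ip2 (lower
# priority) is the one that gets stopped
primitive ip1 ... meta priority=20
primitive ip2 ... meta priority=10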
You could also use a clone for the IP addresses, but since there are only
two, having two primitives may be easier to understand. If you added a third
active node (and a third IP), you'd need three anti-colocation constraints
(n(n-1)/2 in the general case) to keep all the IPs running on different
nodes. The configuration would get hairy quickly, and you'd want to use a
clone.
4) you have some preferences about which servers are active in a
non-failure situation
location ip1_on_mem1 ip1 100: mem1
location ip2_on_mem2 ip2 100: mem2
5) (guessing you want this; most people do) if resources have migrated
due to a failure, you'd prefer to leave them where they are, rather than
move them again as soon as the failed node recovers. This way you can
move them back at a time when the service interruption is convenient.
primitive ... meta resource-stickiness=500
or
rsc_defaults resource-stickiness=500
I prefer to set stickiness on the specific primitives I want to be sticky;
in this case, the IP addresses seem appropriate. Setting a default
stickiness is a common suggestion, but I find it hard to predict how sticky
things will actually be: with colocation constraints, groups, and so on, the
stickiness scores of related resources combine in ways that are
deterministic and well defined, but complex and difficult to predict.
Your stickiness score must be greater than your location score (from #4)
to have any effect.
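To make that concrete with the numbers used here (500 and 100 are just the
illustrative values from above):

# 500 > 100, so once an IP has moved due to a failure it stays where it
# is until you move it back by hand
primitive ip1 ... meta resource-stickiness=500
primitive ip2 ... meta resource-stickiness=500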
crm_simulate is very handy for examining the scores used in placing
resources. Start with "crm_simulate -LSs". You can also use the -u and
-d options to simulate nodes coming online or offline. There are many
more options -- definitely check it out. Documentation is scant
(--help), but usage is fairly obvious after playing with it a bit.
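For example, with mem1 being whatever one of your node names actually is:

# show current placement and the scores behind it
crm_simulate -LSs
# what would happen if mem1 went down, or came back up
crm_simulate -LSs -d mem1
crm_simulate -LSs -u mem1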
Also, some advanced techniques allow the stickiness score to be based on
the time of day, so you can allow resources to move automatically back
to their preferred nodes, but only at planned times. More information:
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-expression-iso8601.html