[Pacemaker] Load Balancing, Node Scores and Stickiness

Thu Oct 22 09:51:21 EDT 2009

On Thu, 2009-10-22 at 15:10 +0200, Florian Haas wrote:
> On 10/22/2009 02:37 PM, Andrew Beekhof wrote:
> >> I wondered, does it happen dynamically? If one resource starts using a
> >> lot of resources, are the others migrated to other nodes?
> > 
> > Not yet.
> > Such a feature is planned though.
> > 
> > At the moment pacemaker purely goes on the number of services it has
> > allocated to the node.
> > Total/Available RAM, CPU, HDD, none of these things are yet taken into account.
> 
> Are there any plans on what this feature would look like in more detail?
> A daemon monitoring various performance indicators and updating node
> attributes accordingly? Couldn't that be done today, as a cloneable
> resource agent?

I can see a few problems with such a feature if you wish to implement it
today.
First of all, you cannot really move services to less loaded nodes if
you cannot determine which resource causes which load. If you pick a
resource at random, you might move a "too heavy" resource to another,
less loaded node and cause even more load on that node, resulting in
something (else?) being moved back. That makes for a pretty unstable
cluster under load.
I am also unsure if it would be wise to mix this directly into the
current node scoring. Load numbers will vary wildly, and unless the
resulting attribute values are in some way stabilised over longer
periods (RRDTool?), this will also cause instability.
It might be possible, but it will be one hell of a complex RA :). A
daemon might be better, but both will require a LOT of configuration
just to differentiate the load of the different resources.
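
To illustrate what I mean by stabilising the values over longer periods,
here is a rough sketch in Python. It is not Pacemaker code; the function
name, the alpha weight and the sample numbers are all made up for the
example. It simply runs an exponentially weighted moving average over raw
load samples, so a single spike barely moves the value you would publish
as a node attribute.

```python
def smooth_load(samples, alpha=0.1):
    """Return the exponentially weighted moving average of raw load samples.

    alpha is a hypothetical tuning knob: small values react slowly
    (stable scores), large values follow the raw load closely.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    current = None
    for sample in samples:
        # The first sample seeds the average; later ones are blended in.
        if current is None:
            current = sample
        else:
            current = alpha * sample + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# Made-up load averages that swing wildly between samples.
raw = [1.0, 8.0, 1.0, 9.0, 1.0, 8.0]
smoothed = smooth_load(raw)
```

The smoothed series moves in a much narrower band than the raw one, at
the price of reacting late to a real, lasting change in load.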

> Or are you referring to missing features actually evaluating such
> information, as in, rather than saying "run this resource on a node with
> a load average of X or less", being able to say "run this resource on
> the node with the currently lowest load average"?

How will that translate into repeatable node states? At this moment, if
you use a timed evaluation of the cluster state, resources should always
be assigned to the same nodes (at least, I've never seen it change
unless it was under direction of a time constraint).

"run this resource on the node with the currently lowest load average"
is something that is very unlikely to ever return the same answer twice.
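
A rough Python sketch of the kind of damping such a rule would need,
under my own assumptions (this is not how Pacemaker scores nodes, and
pick_node, the margin and the node names are invented for the example):
only leave the current node when another node is better by more than a
fixed margin, which is roughly what stickiness buys you.

```python
def pick_node(loads, current=None, margin=1.0):
    """Pick a node for a resource from a dict of node name -> load.

    The resource stays on its current node unless some other node's
    load is lower by more than `margin` (a hypothetical stickiness value).
    """
    # Iterate names in sorted order so ties always resolve the same way.
    best = min(sorted(loads), key=loads.get)
    if current in loads and loads[current] - loads[best] <= margin:
        return current
    return best


# Small difference: the resource stays where it is.
stays = pick_node({"node1": 1.0, "node2": 1.2}, current="node2")
# Large difference: the resource moves to the less loaded node.
moves = pick_node({"node1": 1.0, "node2": 3.0}, current="node2")
```

Without the margin, the first call would already bounce the resource to
node1 over a difference that is probably just noise.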

Complex indeed! Someone is going to have a considerable amount of fun
with this :D

	J.