[Pacemaker] (LRMD|PCMK)_MAX_CHILDREN?
Lars Marowsky-Bree
lmb at suse.com
Wed Sep 11 13:33:17 CEST 2013
On 2013-09-11T19:55:38, Andrew Beekhof <andrew at beekhof.net> wrote:
> > sorry for being thick, but I can't find this in the code now. Did this
> > slip through again in April?
> Apparently. But before we add it, I'd like to see if we can do something coherent.
> Having 3 (or more) different variables (batch-limit, migration-limit and this) for controlling these things doesn't seem optimal or user friendly.
Well, they're all doing something completely different.
A cluster-wide limit on operations (batch-limit) limits the total
cluster and network/storage load.
The max_children limit prevents a given node from being overloaded by
concurrent operations. (Reducing batch-limit to emulate this kills
cluster-wide parallelism and is not optimal.) Clearly, it's not perfect
either (since it assumes all rsc ops on a node are identical in
weight; whereas in reality we may want to limit VM start-up to 4, but
would happily see 32 IP addresses go up at once, or 48 monitors ...),
but it is an appropriate simplification.
migration-limit is indeed a special case (needed to keep nodes from
being overloaded by migrations, which were at the time the only ops that
affected two nodes at once - batch-limit="4" was too coarse a hammer). I
do recall that we discussed making it more generic - so that one could
configure cluster-/node-wide limits for certain operations of specific
resource types, but that was (rightly) judged to be a rather complex can
of worms by you.
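
To make the distinction between the three limits concrete, here is a minimal
sketch of an admission check that honours all of them at once. It is an
illustration only, not how the cluster actually schedules operations: the
names Limits, InFlight, can_start and start are hypothetical, and it treats
every operation on a node as equal in weight, which is the simplification
described above.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Limits:
    batch_limit: int      # cluster-wide cap on in-flight operations
    max_children: int     # per-node cap on concurrent operations
    migration_limit: int  # per-node cap on concurrent migrations


@dataclass
class InFlight:
    total: int = 0
    per_node: Counter = field(default_factory=Counter)
    migrations: Counter = field(default_factory=Counter)


def can_start(limits, state, nodes, is_migration=False):
    """Return True if an operation touching `nodes` fits under every limit."""
    if state.total >= limits.batch_limit:
        return False
    for node in nodes:
        if state.per_node[node] >= limits.max_children:
            return False
        if is_migration and state.migrations[node] >= limits.migration_limit:
            return False
    return True


def start(state, nodes, is_migration=False):
    """Record an operation as in flight (hypothetical bookkeeping)."""
    state.total += 1
    for node in nodes:
        state.per_node[node] += 1
        if is_migration:
            state.migrations[node] += 1
```

With max_children=2, a third operation on a busy node is refused while other
nodes can still accept work, so cluster-wide parallelism is preserved. A
migration is passed both of its nodes and is checked against each of them.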
> If anything, we should likely be putting work into auto-tuning this
> stuff instead. Somehow.
I'm not sure about how batch-limit can be auto-tuned.
migration-limit is mostly a function of the network bandwidth, too.
MAX_CHILDREN did, sort of, auto-tune (by defaulting to number of cores,
or something similar, which was appropriate enough[1]).
It can all be made into a generic, powerful, flexible mechanism that
describes them all. But I'm afraid that it'd also be quite complex. I'm
happy to think about it, but the three limits we have/had seemed
sufficient for the real world.
Regards,
Lars
[1] the main complaint was that it was configured via sysconfig, and not
dynamic via a node attribute as it should be. When we reintroduce it, we
may want to make nodes default to PCMK/LRMD_MAX_CHILDREN if unset in
the CIB, and otherwise have that value override the environment
variable? That'd be a benefit now that pcmk and lrmd are more closely
married.
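
A rough sketch of that precedence, under stated assumptions: the node
attribute name "max-children" is hypothetical (no such attribute exists
today), the order in which the two environment variables are consulted is a
guess, and the final fallback to the core count mirrors the old default only
approximately. The function resolve_max_children is made up for illustration.

```python
import os


def _positive_int(value):
    """Parse a positive integer, returning None for unset or invalid input."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return None
    return number if number > 0 else None


def resolve_max_children(node_attrs, environ=None):
    """Pick the per-node limit: CIB attribute, then environment, then cores."""
    environ = os.environ if environ is None else environ
    candidates = (
        node_attrs.get("max-children"),    # hypothetical CIB node attribute
        environ.get("PCMK_MAX_CHILDREN"),  # sysconfig / environment
        environ.get("LRMD_MAX_CHILDREN"),
    )
    for candidate in candidates:
        limit = _positive_int(candidate)
        if limit is not None:
            return limit
    # Assumed fallback: number of cores, or 1 if it cannot be determined.
    return os.cpu_count() or 1
```

For example, resolve_max_children({"max-children": "8"},
{"PCMK_MAX_CHILDREN": "4"}) picks the CIB value, while an empty attribute set
falls back to whatever the environment provides.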
--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde