[Pacemaker] Multi-site support in pacemaker (tokens, deadman, CTR)
Gao,Yan
ygao at novell.com
Thu Apr 28 19:33:00 UTC 2011
Hi Lars,
Thanks for the explanation.
On 04/28/11 02:55, Lars Marowsky-Bree wrote:
> On 2011-04-26T23:34:16, Yan Gao <ygao at novell.com> wrote:
>
> Perhaps choosing the name "token" for the cluster-wide attributes was not
> a wise move, as it does invoke the "token" association from
> corosync/totem.
>
> What do you all think about switching this word to "ticket"? And have
> the Cluster Ticket Registry manage them? Less confusion later on, I
> think.
>
> I'll try the word "ticket" for the rest of the mail and we can see how
> that works out ;-)
>
> (I think the word works - you can own a ticket, grant a ticket, cancel,
> and revoke tickets ...)
Sounds fine to me:-)
>>> "Tokens" are, essentially, cluster-wide attributes (similar to node
>>> attributes, just for the whole partition).
>> Specifically, a "<tokens>" section with an attribute set (
>> "<token_set>" or something) under "/cib/configuration"?
>
> Yes; a ticket section, just like that.
All right. How about the schema:
<element name="configuration">
  <interleave>
    ...
    <element name="tickets">
      <zeroOrMore>
        <element name="ticket_set">
          <externalRef href="nvset.rng"/>
        </element>
      </zeroOrMore>
    </element>
    ...
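
To make the proposal concrete, here is a rough sketch of what a matching instance under /cib/configuration might look like. It assumes "ticket_set" takes the usual nvset content (an id plus nvpair children); the ids, names and values are made up for illustration only.

```xml
<!-- Hypothetical instance of the proposed "tickets" section. -->
<!-- All ids, names and values below are illustrative assumptions. -->
<tickets>
  <ticket_set id="site-tickets">
    <!-- One nvpair per ticket, treated as a cluster-wide attribute -->
    <nvpair id="site-tickets-ticketA" name="ticketA" value="true"/>
    <nvpair id="site-tickets-ticketB" name="ticketB" value="false"/>
  </ticket_set>
</tickets>
```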
>> - A completely new type of constraint:
>> <rsc_token id="rscX-with-tokenA" rsc="rscX" token="tokenA"
>> kind="Deadman"/>
>
> Personally, I lean towards this. (Andrew has expressed a wish to do
> without the "rsc_" prefix, so let's drop this ;-)
Well then, how about "ticket_dep" or "ticket_req"?
>
> Not sure the kind="Deadman" is actually required, but it probably makes
> sense to be able to switch off the big hammer for debugging purposes.
> ;-)
I was thinking it's for switching on/off "immediately fence once the
dependency is no longer satisfied".
>
> I don't see why any resource would depend on several tickets; but I can
> see a use case for wanting to depend on _not_ owning a ticket, similar
> to the node attributes. And the resource would need a role, obviously.
OK. The schema I can imagine:
<define name="element-ticket_dep">
  <element name="ticket_dep">
    <attribute name="id"><data type="ID"/></attribute>
    <choice>
      <oneOrMore>
        <ref name="element-resource-set"/>
      </oneOrMore>
      <group>
        <attribute name="rsc"><data type="IDREF"/></attribute>
        <optional>
          <attribute name="rsc-role">
            <ref name="attribute-roles"/>
          </attribute>
        </optional>
      </group>
    </choice>
    <attribute name="ticket"><text/></attribute>
  </element>
</define>
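
For illustration, constraints written against this schema might look roughly like the sketch below. The resource and ticket names are hypothetical, and the resource_set form assumes the existing resource_set/resource_ref syntax used by the other constraints.

```xml
<!-- Hypothetical constraints matching the proposed "ticket_dep" schema. -->
<!-- rscX, rscY, rscZ and ticketA are illustrative names only. -->
<constraints>
  <!-- Simple form: one resource, optionally limited to a role -->
  <ticket_dep id="rscX-with-ticketA" rsc="rscX" rsc-role="Master"
              ticket="ticketA"/>
  <!-- Set form: several resources depending on the same ticket -->
  <ticket_dep id="set-with-ticketA" ticket="ticketA">
    <resource_set id="set-with-ticketA-0">
      <resource_ref id="rscY"/>
      <resource_ref id="rscZ"/>
    </resource_set>
  </ticket_dep>
</constraints>
```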
>
> Andrew, Yan - do you think we should allow _values_ for tickets, or
> should they be strictly defined/undefined/set/unset?
I think allowing values would help to distinguish different demands.
>> If so, isn't it supposed to be revoked manually by default? So the
>> short-circuited fail-over needs an admin to participate?
>
> No to both; it can be revoked manually, yes, but it isn't going to be
> always the case. I'm also not quite sure I understand where this
> question is headed; how does it matter here whether the ticket is
> revoked manually or not?
I was just thinking that, before we have the CTR, we rely quite heavily
on the admin.
>
>> Does it mean an option for users to choose whether they want
>> immediate fencing or a normal stop of the resources? Is it global,
>> or particular to a specific token, or even just to a specific
>> dependency?
>
> Good question. This came up above already briefly ...
>
> I _think_ there should be a special value that a ticket can be set to
> that doesn't fence, but stops everything cleanly.
>
> However, while the ticket is in this state, the site _still_ owns it (no
> other site can get it yet, and were it to lose the ticket due to
> expiration, it'd still need to fence all remaining nodes so that the
> services can be started elsewhere).
>
> Perhaps the CTR doesn't even need to know about this - it's a special
> setting of the ticket at a given site. Perhaps it makes sense to
> distinguish between owning the ticket (as granted on request via the CTR
> or manually), and its value (which is set locally)? Perhaps:
>
> Ownership is a true/false flag. Value is a non-negative integer (0 is
> allowed).
>
> A site that "owns" a ticket of value 0 will stop resources cleanly, and
> afterwards relinquish the ticket itself.
>
> A site that "owns" a ticket of any value and loses it will perform the
> deadman dance.
>
> A site that does not own a ticket but has a non-zero value for it
> defined will request the ticket from the CTR; the CTR will grant it to
> the site with the highest bid (but not to a site with 0)
Suppose the ticket is revoked from the site with the highest "bid". Should
its "bid" be cleared as well? Otherwise it would get the ticket again soon
afterwards.
> (if these are
> equal, to the site with the highest node count, if these again are
> equal, to the site with the lowest nodeid).
>
> (Tangent - ownership appears to belong to the status section; the value
> seems to belong to the cib->ticket section(?).)
Perhaps, although so far there is no appropriate place in the status
section to set a cluster-wide attribute.
Other solutions: a "ticket" is not an nvpair. It is either
- An object with "ownership" and "bid" attributes.
Or:
- An nvpair-set which includes the "ownership" and "bid" nvpairs.
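
Roughly, the two alternatives could be sketched as follows. The element and attribute names here are only placeholders for discussion, not an agreed syntax, and only one of the two forms would actually be used.

```xml
<!-- Hypothetical sketches of the two alternatives; names are placeholders. -->
<tickets>
  <!-- Alternative 1: a ticket object with "ownership" and "bid" attributes -->
  <ticket id="ticketA" ownership="true" bid="100"/>

  <!-- Alternative 2: an nvpair-set holding "ownership" and "bid" nvpairs -->
  <ticket_set id="ticketB">
    <nvpair id="ticketB-ownership" name="ownership" value="false"/>
    <nvpair id="ticketB-bid" name="bid" value="50"/>
  </ticket_set>
</tickets>
```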
>
> The value can be set manually - in that case, it allows the admin to
> define a primary site for a given set of resources. (It might also be
> modified automatically at a later stage based on whatever metric.)
>
> If a site owns a ticket, but doesn't have the highest value, it would
> either fail-back automatically - or require manual intervention,
OK, that seems to answer my previous question. It should be
configurable on the CTR server side.
> which
> I'd assume to be quite common. (Again, this builds a very simplistic
> active/passive overlay.)
>
> Does that make sense, or am I creating more confusion than answers? ;-)
Definitely makes a lot of sense:-)
Regards,
Yan
--
Gao,Yan <ygao at novell.com>
Software Engineer
China Server Team, OPS Engineering, Novell, Inc.